csvlint 0.1.4 → 0.2.0
- checksums.yaml +8 -8
- data/.gitignore +7 -1
- data/CHANGELOG.md +19 -1
- data/README.md +93 -36
- data/bin/csvlint +68 -27
- data/csvlint.gemspec +2 -0
- data/features/csvw_schema_validation.feature +127 -0
- data/features/fixtures/spreadsheet.xlsx +0 -0
- data/features/sources.feature +3 -4
- data/features/step_definitions/parse_csv_steps.rb +13 -1
- data/features/step_definitions/schema_validation_steps.rb +27 -1
- data/features/step_definitions/sources_steps.rb +1 -1
- data/features/step_definitions/validation_errors_steps.rb +48 -1
- data/features/step_definitions/validation_info_steps.rb +5 -1
- data/features/step_definitions/validation_warnings_steps.rb +15 -1
- data/features/support/load_tests.rb +114 -0
- data/features/validation_errors.feature +12 -24
- data/features/validation_warnings.feature +18 -6
- data/lib/csvlint.rb +10 -0
- data/lib/csvlint/csvw/column.rb +359 -0
- data/lib/csvlint/csvw/date_format.rb +182 -0
- data/lib/csvlint/csvw/metadata_error.rb +13 -0
- data/lib/csvlint/csvw/number_format.rb +211 -0
- data/lib/csvlint/csvw/property_checker.rb +761 -0
- data/lib/csvlint/csvw/table.rb +204 -0
- data/lib/csvlint/csvw/table_group.rb +165 -0
- data/lib/csvlint/schema.rb +40 -23
- data/lib/csvlint/validate.rb +142 -19
- data/lib/csvlint/version.rb +1 -1
- data/spec/csvw/column_spec.rb +112 -0
- data/spec/csvw/date_format_spec.rb +49 -0
- data/spec/csvw/number_format_spec.rb +403 -0
- data/spec/csvw/table_group_spec.rb +143 -0
- data/spec/csvw/table_spec.rb +90 -0
- data/spec/schema_spec.rb +27 -1
- data/spec/spec_helper.rb +0 -1
- data/spec/validator_spec.rb +16 -10
- metadata +53 -2
checksums.yaml
CHANGED
@@ -1,15 +1,15 @@
|
|
1
1
|
---
|
2
2
|
!binary "U0hBMQ==":
|
3
3
|
metadata.gz: !binary |-
|
4
|
-
|
4
|
+
YjlmZmFlNGZjOWQ5MmNlNDZiOTUxMWY0NGExYTRkYjhhNzdlNjAyNA==
|
5
5
|
data.tar.gz: !binary |-
|
6
|
-
|
6
|
+
ODFjZmJkZmI0Nzg2NmMzN2ViOGNiNDlmODA0NDcxMzM0Zjk4NTgwOQ==
|
7
7
|
SHA512:
|
8
8
|
metadata.gz: !binary |-
|
9
|
-
|
10
|
-
|
11
|
-
|
9
|
+
ZTIyMGVkYjIyMjc2ZWViNTBhYmZkMWIxN2E1OTU0OTFhNGMxNzBlYzg0OTI4
|
10
|
+
NDRkMzY2YzgxNmQwZGZiZDE5M2M2NzYwMzk3ZWZjMDc3YWM0YzQ0NTczY2U3
|
11
|
+
MGZjNTUwMGI2MzgzZDQxYzkzMzBiNzI3NmJkZTIxYjZiYjc5MDA=
|
12
12
|
data.tar.gz: !binary |-
|
13
|
-
|
14
|
-
|
15
|
-
|
13
|
+
NTI1M2I5Yzc3NGNhOTg3Y2VkMmM3ZGM1ZTdiZWNmMzM0ZTY5ODljODNmNWYy
|
14
|
+
MDA0NGVlMGFhNDQ2ZjZjYjI0Nzc2OTdhMWRmODI5YTEzMGRmNTQxZjAyOTA5
|
15
|
+
YjVmMjk4NDIyOWEzMzIxMTBlYjQ4YTgwZmE4MWZlYTQ4MjMzZmE=
|
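The `!binary` tags in `checksums.yaml` mark Base64-encoded strings. As a minimal sketch (not part of the gem; `decode_base64` is a hypothetical helper name), the algorithm key and the new SHA1 `metadata.gz` value from the diff above can be decoded with Ruby's core `String#unpack1`:

```ruby
# Hypothetical helper: decodes a Base64 string using the core "m" directive,
# so no library needs to be required.
def decode_base64(text)
  text.unpack1("m")
end

# Values copied verbatim from the checksums.yaml diff above.
algorithm_key = "U0hBMQ=="
metadata_digest = "YjlmZmFlNGZjOWQ5MmNlNDZiOTUxMWY0NGExYTRkYjhhNzdlNjAyNA=="

puts decode_base64(algorithm_key)    # name of the digest algorithm
puts decode_base64(metadata_digest)  # hex digest recorded for metadata.gz
```

Decoding only makes the recorded values readable; verifying them would still require hashing the actual `metadata.gz` from the gem package.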
data/.gitignore
CHANGED
data/CHANGELOG.md
CHANGED
@@ -2,7 +2,25 @@
|
|
2
2
|
|
3
3
|
## [Unreleased](https://github.com/theodi/csvlint.rb/tree/HEAD)
|
4
4
|
|
5
|
-
[Full Changelog](https://github.com/theodi/csvlint.rb/compare/0.1.
|
5
|
+
[Full Changelog](https://github.com/theodi/csvlint.rb/compare/0.1.4...HEAD)
|
6
|
+
|
7
|
+
**Closed issues:**
|
8
|
+
|
9
|
+
- CSV on the web support [\#141](https://github.com/theodi/csvlint.rb/issues/141)
|
10
|
+
|
11
|
+
**Merged pull requests:**
|
12
|
+
|
13
|
+
- Recover from `ArgumentError`s when attempting to locate a schema and detect bad schema when JSON is malformed [\#152](https://github.com/theodi/csvlint.rb/pull/152) ([pezholio](https://github.com/pezholio))
|
14
|
+
|
15
|
+
- Catch errors if link headers are don't have particular values [\#151](https://github.com/theodi/csvlint.rb/pull/151) ([pezholio](https://github.com/pezholio))
|
16
|
+
|
17
|
+
- Rescue excel warning [\#149](https://github.com/theodi/csvlint.rb/pull/149) ([quadrophobiac](https://github.com/quadrophobiac))
|
18
|
+
|
19
|
+
- CSVW-based validation! [\#142](https://github.com/theodi/csvlint.rb/pull/142) ([JeniT](https://github.com/JeniT))
|
20
|
+
|
21
|
+
## [0.1.4](https://github.com/theodi/csvlint.rb/tree/0.1.4) (2015-08-06)
|
22
|
+
|
23
|
+
[Full Changelog](https://github.com/theodi/csvlint.rb/compare/0.1.3...0.1.4)
|
6
24
|
|
7
25
|
**Merged pull requests:**
|
8
26
|
|
data/README.md
CHANGED
@@ -31,13 +31,13 @@ You can either use this gem within your own Ruby code, or as a standolone comman
|
|
31
31
|
After installing the gem, you can validate a CSV on the command line like so:
|
32
32
|
|
33
33
|
csvlint myfile.csv
|
34
|
-
|
34
|
+
|
35
35
|
You will then see the validation result, together with any warnings or errors e.g.
|
36
36
|
|
37
37
|
```
|
38
38
|
myfile.csv is INVALID
|
39
39
|
1. blank_rows. Row: 3
|
40
|
-
1. title_row.
|
40
|
+
1. title_row.
|
41
41
|
2. inconsistent_values. Column: 14
|
42
42
|
```
|
43
43
|
|
@@ -50,40 +50,40 @@ You can also optionally pass a schema file like so:
|
|
50
50
|
Currently the gem supports retrieving a CSV accessible from a URL, File, or an IO-style object (e.g. StringIO)
|
51
51
|
|
52
52
|
require 'csvlint'
|
53
|
-
|
53
|
+
|
54
54
|
validator = Csvlint::Validator.new( "http://example.org/data.csv" )
|
55
55
|
validator = Csvlint::Validator.new( File.new("/path/to/my/data.csv" ))
|
56
56
|
validator = Csvlint::Validator.new( StringIO.new( my_data_in_a_string ) )
|
57
57
|
|
58
|
-
When validating from a URL the range of errors and warnings is wider as the library will also check HTTP headers for
|
58
|
+
When validating from a URL the range of errors and warnings is wider as the library will also check HTTP headers for
|
59
59
|
best practices
|
60
|
-
|
61
|
-
#invoke the validation
|
60
|
+
|
61
|
+
#invoke the validation
|
62
62
|
validator.validate
|
63
|
-
|
63
|
+
|
64
64
|
#check validation status
|
65
65
|
validator.valid?
|
66
|
-
|
66
|
+
|
67
67
|
#access array of errors, each is a Csvlint::ErrorMessage object
|
68
68
|
validator.errors
|
69
|
-
|
69
|
+
|
70
70
|
#access array of warnings
|
71
71
|
validator.warnings
|
72
|
-
|
72
|
+
|
73
73
|
#access array of information messages
|
74
74
|
validator.info_messages
|
75
|
-
|
75
|
+
|
76
76
|
#get some information about the CSV file that was validated
|
77
77
|
validator.encoding
|
78
78
|
validator.content_type
|
79
79
|
validator.extension
|
80
|
-
|
80
|
+
|
81
81
|
#retrieve HTTP headers from request
|
82
82
|
validator.headers
|
83
83
|
|
84
84
|
## Controlling CSV Parsing
|
85
85
|
|
86
|
-
The validator supports configuration of the [CSV Dialect](http://dataprotocols.org/csv-dialect/) used in a data file. This is specified by
|
86
|
+
The validator supports configuration of the [CSV Dialect](http://dataprotocols.org/csv-dialect/) used in a data file. This is specified by
|
87
87
|
passing a dialect hash to the constructor:
|
88
88
|
|
89
89
|
dialect = {
|
@@ -94,17 +94,17 @@ passing a dialect hash to the constructor:
|
|
94
94
|
|
95
95
|
The options should be a Hash that conforms to the [CSV Dialect](http://dataprotocols.org/csv-dialect/) JSON structure.
|
96
96
|
|
97
|
-
While these options configure the parser to correctly process the file, the validator will still raise errors or warnings for CSV
|
97
|
+
While these options configure the parser to correctly process the file, the validator will still raise errors or warnings for CSV
|
98
98
|
structure that it considers to be invalid, e.g. a missing header or different delimiters.
|
99
99
|
|
100
|
-
Note that the parser will also check for a `header` parameter on the `Content-Type` header returned when fetching a remote CSV file. As
|
100
|
+
Note that the parser will also check for a `header` parameter on the `Content-Type` header returned when fetching a remote CSV file. As
|
101
101
|
specified in [RFC 4180](http://www.ietf.org/rfc/rfc4180.txt) the values for this can be `present` and `absent`, e.g:
|
102
102
|
|
103
103
|
Content-Type: text/csv; header=present
|
104
104
|
|
105
105
|
## Error Reporting
|
106
106
|
|
107
|
-
The validator provides feedback on a validation result using instances of `Csvlint::ErrorMessage`. Errors are divided into errors, warnings and information
|
107
|
+
The validator provides feedback on a validation result using instances of `Csvlint::ErrorMessage`. Errors are divided into errors, warnings and information
|
108
108
|
messages. A validation attempt is successful if there are no errors.
|
109
109
|
|
110
110
|
Messages provide context including:
|
@@ -122,7 +122,7 @@ The following types of error can be reported:
|
|
122
122
|
* `:wrong_content_type` -- content type is not `text/csv`
|
123
123
|
* `:ragged_rows` -- row has a different number of columns (than the first row in the file)
|
124
124
|
* `:blank_rows` -- completely empty row, e.g. blank line or a line where all column values are empty
|
125
|
-
* `:invalid_encoding` -- encoding error when parsing row, e.g. because of invalid characters
|
125
|
+
* `:invalid_encoding` -- encoding error when parsing row, e.g. because of invalid characters
|
126
126
|
* `:not_found` -- HTTP 404 error when retrieving the data
|
127
127
|
* `:stray_quote` -- missing or stray quote
|
128
128
|
* `:unclosed_quote` -- unclosed quoted field
|
@@ -153,36 +153,66 @@ There are also information messages available:
|
|
153
153
|
|
154
154
|
## Schema Validation
|
155
155
|
|
156
|
-
The library supports validating data against a schema. A schema configuration can be provided as a Hash or parsed from JSON. The structure currently
|
157
|
-
follows JSON Table Schema with some extensions.
|
156
|
+
The library supports validating data against a schema. A schema configuration can be provided as a Hash or parsed from JSON. The structure currently
|
157
|
+
follows JSON Table Schema with some extensions and rudimentary [CSV on the Web Metadata](http://www.w3.org/TR/tabular-metadata/).
|
158
158
|
|
159
|
-
An example schema file is:
|
159
|
+
An example JSON Table Schema schema file is:
|
160
160
|
|
161
161
|
{
|
162
162
|
"fields": [
|
163
|
-
{
|
164
|
-
"name": "id",
|
165
|
-
"constraints": { "required": true }
|
163
|
+
{
|
164
|
+
"name": "id",
|
165
|
+
"constraints": { "required": true }
|
166
166
|
},
|
167
|
-
{
|
168
|
-
"name": "price",
|
169
|
-
"constraints": { "required": true, "minLength": 1 }
|
167
|
+
{
|
168
|
+
"name": "price",
|
169
|
+
"constraints": { "required": true, "minLength": 1 }
|
170
170
|
},
|
171
|
-
{
|
172
|
-
"name": "postcode",
|
173
|
-
"constraints": {
|
174
|
-
"required": true,
|
175
|
-
"pattern": "[A-Z]{1,2}[0-9][0-9A-Z]? ?[0-9][A-Z]{2}"
|
176
|
-
}
|
171
|
+
{
|
172
|
+
"name": "postcode",
|
173
|
+
"constraints": {
|
174
|
+
"required": true,
|
175
|
+
"pattern": "[A-Z]{1,2}[0-9][0-9A-Z]? ?[0-9][A-Z]{2}"
|
176
|
+
}
|
177
177
|
}
|
178
178
|
]
|
179
179
|
}
|
180
180
|
|
181
|
-
|
181
|
+
An equivalent CSV on the Web Metadata file is:
|
182
|
+
|
183
|
+
{
|
184
|
+
"@context": "http://www.w3.org/ns/csvw",
|
185
|
+
"url": "http://example.com/example1.csv",
|
186
|
+
"tableSchema": {
|
187
|
+
"columns": [
|
188
|
+
{
|
189
|
+
"name": "id",
|
190
|
+
"required": true
|
191
|
+
},
|
192
|
+
{
|
193
|
+
"name": "price",
|
194
|
+
"required": true,
|
195
|
+
"datatype": { "base": "string", "minLength": 1 }
|
196
|
+
},
|
197
|
+
{
|
198
|
+
"name": "postcode",
|
199
|
+
"required": true
|
200
|
+
}
|
201
|
+
]
|
202
|
+
}
|
203
|
+
}
|
182
204
|
|
183
|
-
|
205
|
+
Parsing and validating with a schema (of either kind):
|
206
|
+
|
207
|
+
schema = Csvlint::Schema.load_from_json(uri)
|
184
208
|
validator = Csvlint::Validator.new( "http://example.org/data.csv", nil, schema )
|
185
209
|
|
210
|
+
### CSV on the Web Validation Support
|
211
|
+
|
212
|
+
This gem passes all the validation tests in the [official CSV on the Web test suite](http://w3c.github.io/csvw/tests/) (though there might still be errors or parts of the [CSV on the Web standard](http://www.w3.org/TR/tabular-metadata/) that aren't tested by that test suite).
|
213
|
+
|
214
|
+
### JSON Table Schema Support
|
215
|
+
|
186
216
|
Supported constraints:
|
187
217
|
|
188
218
|
* `required` -- there must be a value for this field in every row
|
@@ -192,7 +222,7 @@ Supported constraints:
|
|
192
222
|
* `pattern` -- values must match the provided regular expression
|
193
223
|
* `type` -- specifies an XML Schema data type. Values of the column must be a valid value for that type
|
194
224
|
* `minimum` -- specify a minimum range for values, the value will be parsed as specified by `type`
|
195
|
-
* `maximum` -- specify a maximum range for values, the value will be parsed as specified by `type`
|
225
|
+
* `maximum` -- specify a maximum range for values, the value will be parsed as specified by `type`
|
196
226
|
* `datePattern` -- specify a `strftime` compatible date pattern to be used when parsing date values and min/max constraints
|
197
227
|
|
198
228
|
Supported data types (this is still a work in progress):
|
@@ -214,7 +244,7 @@ Supported data types (this is still a work in progress):
|
|
214
244
|
* Time -- `http://www.w3.org/2001/XMLSchema#time`
|
215
245
|
|
216
246
|
Use of an unknown data type will result in the column failing to validate.
|
217
|
-
|
247
|
+
|
218
248
|
Schema validation provides some additional types of error and warning messages:
|
219
249
|
|
220
250
|
* `:missing_value` (error) -- a column marked as `required` in the schema has no value
|
@@ -248,3 +278,30 @@ validator = Csvlint::Validator.new( "http://example.org/data.csv", nil, nil, opt
|
|
248
278
|
3. Commit your changes (`git commit -am 'Add some feature'`)
|
249
279
|
4. Push to the branch (`git push origin my-new-feature`)
|
250
280
|
5. Create new Pull Request
|
281
|
+
|
282
|
+
### Testing
|
283
|
+
|
284
|
+
The codebase includes both rspec and cucumber tests, which can be run together using:
|
285
|
+
|
286
|
+
$ rake
|
287
|
+
|
288
|
+
or separately:
|
289
|
+
|
290
|
+
$ rake spec
|
291
|
+
$ rake features
|
292
|
+
|
293
|
+
When the cucumber tests are first run, a script will create tests based on the latest version of the [CSV on the Web test suite](http://w3c.github.io/csvw/tests/), including creating a local cache of the test files. This requires an internet connection and some patience. Following that download, the tests will run locally; there's also a batch script:
|
294
|
+
|
295
|
+
$ bin/run-csvw-tests
|
296
|
+
|
297
|
+
which will run the tests from the command line.
|
298
|
+
|
299
|
+
If you need to refresh the CSV on the Web tests:
|
300
|
+
|
301
|
+
$ rm bin/run-csvw-tests
|
302
|
+
$ rm features/csvw_validation_tests.feature
|
303
|
+
$ rm -r features/fixtures/csvw
|
304
|
+
|
305
|
+
and then run the cucumber tests again or:
|
306
|
+
|
307
|
+
$ ruby features/support/load_tests.rb
|
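The README change above presents a JSON Table Schema and an "equivalent" CSV on the Web metadata file side by side. The correspondence can be sketched as follows. This is an illustration only: `table_schema_to_csvw` is a hypothetical helper, not a csvlint API, and it maps just the `required` and `minLength` constraints. Like the README's CSVW example, it leaves `pattern` untranslated.

```ruby
require "json"

# Hypothetical helper (not part of csvlint): turns JSON Table Schema "fields"
# into CSVW-style column descriptions. Only "required" and "minLength" are
# mapped; any other constraint is ignored.
def table_schema_to_csvw(schema, url)
  columns = schema.fetch("fields").map do |field|
    constraints = field.fetch("constraints", {})
    column = { "name" => field["name"] }
    column["required"] = true if constraints["required"]
    if constraints.key?("minLength")
      column["datatype"] = { "base" => "string", "minLength" => constraints["minLength"] }
    end
    column
  end

  {
    "@context" => "http://www.w3.org/ns/csvw",
    "url" => url,
    "tableSchema" => { "columns" => columns }
  }
end

json_table_schema = JSON.parse(<<~SCHEMA)
  { "fields": [
      { "name": "id",    "constraints": { "required": true } },
      { "name": "price", "constraints": { "required": true, "minLength": 1 } }
  ] }
SCHEMA

puts JSON.pretty_generate(
  table_schema_to_csvw(json_table_schema, "http://example.com/example1.csv")
)
```

In CSVW, `required` sits directly on the column while length limits live under `datatype`, which is why the two constraints end up at different levels.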
data/bin/csvlint
CHANGED
@@ -16,8 +16,8 @@ opts.on("-d", "--dump-errors", "Pretty print error and warning objects.") do |d|
|
|
16
16
|
options[:dump] = d
|
17
17
|
end
|
18
18
|
|
19
|
-
opts.on("-s", "--schema
|
20
|
-
options[:
|
19
|
+
opts.on("-s", "--schema FILENAME", "Schema file") do |s|
|
20
|
+
options[:schema] = s
|
21
21
|
end
|
22
22
|
|
23
23
|
opts.on_tail("-h", "--help",
|
@@ -35,14 +35,15 @@ rescue OptionParser::InvalidOption => e
|
|
35
35
|
end
|
36
36
|
|
37
37
|
def print_error(index, error, dump, color)
|
38
|
-
|
39
38
|
location = ""
|
40
39
|
location += error.row.to_s if error.row
|
41
40
|
location += "#{error.row ? "," : ""}#{error.column.to_s}" if error.column
|
42
41
|
if error.row || error.column
|
43
42
|
location = "#{error.row ? "Row" : "Column"}: #{location}"
|
44
43
|
end
|
45
|
-
output_string = "#{index+1}. #{error.type}
|
44
|
+
output_string = "#{index+1}. #{error.type}"
|
45
|
+
output_string += ". #{location}" unless location.empty?
|
46
|
+
output_string += ". #{error.content}" if error.content
|
46
47
|
|
47
48
|
if $stdout.tty?
|
48
49
|
puts output_string.colorize(color)
|
@@ -56,6 +57,30 @@ def print_error(index, error, dump, color)
|
|
56
57
|
|
57
58
|
end
|
58
59
|
|
60
|
+
def validate_csv(source, schema, dump)
|
61
|
+
validator = Csvlint::Validator.new( source, nil, schema )
|
62
|
+
|
63
|
+
if $stdout.tty?
|
64
|
+
puts "#{source.path || source || "CSV"} is #{validator.valid? ? "VALID".green : "INVALID".red}"
|
65
|
+
else
|
66
|
+
puts "#{source.path || source || "CSV"} is #{validator.valid? ? "VALID" : "INVALID"}"
|
67
|
+
end
|
68
|
+
|
69
|
+
if validator.errors.size > 0
|
70
|
+
validator.errors.each_with_index do |error, i|
|
71
|
+
print_error(i, error, dump, :red)
|
72
|
+
end
|
73
|
+
end
|
74
|
+
|
75
|
+
if validator.warnings.size > 0
|
76
|
+
validator.warnings.each_with_index do |error, i|
|
77
|
+
print_error(i, error, dump, :yellow)
|
78
|
+
end
|
79
|
+
end
|
80
|
+
|
81
|
+
return validator.valid?
|
82
|
+
end
|
83
|
+
|
59
84
|
if ARGV.length == 0 && !$stdin.tty?
|
60
85
|
source = StringIO.new(ARGF.read)
|
61
86
|
else
|
@@ -63,13 +88,13 @@ else
|
|
63
88
|
source = ARGV[0]
|
64
89
|
unless source =~ /^http(s)?/
|
65
90
|
begin
|
66
|
-
source = File.new( source ) unless source =~ /^http(s)?/
|
91
|
+
source = File.new( source ) unless source =~ /^http(s)?/
|
67
92
|
rescue Errno::ENOENT
|
68
93
|
puts "#{source} not found"
|
69
94
|
exit 1
|
70
95
|
end
|
71
96
|
end
|
72
|
-
|
97
|
+
elsif !options[:schema]
|
73
98
|
puts "No CSV data to validate."
|
74
99
|
puts opts
|
75
100
|
exit 1
|
@@ -77,34 +102,50 @@ else
|
|
77
102
|
end
|
78
103
|
|
79
104
|
schema = nil
|
80
|
-
if options[:
|
105
|
+
if options[:schema]
|
81
106
|
begin
|
82
|
-
|
107
|
+
schema = Csvlint::Schema.load_from_json(options[:schema])
|
108
|
+
rescue JSON::ParserError => e
|
109
|
+
output_string = "invalid metadata: malformed JSON"
|
110
|
+
if $stdout.tty?
|
111
|
+
puts output_string.colorize(:red)
|
112
|
+
else
|
113
|
+
puts output_string
|
114
|
+
end
|
115
|
+
exit 1
|
116
|
+
rescue Csvlint::Csvw::MetadataError => e
|
117
|
+
output_string = "invalid metadata: #{e.message}#{" at " + e.path if e.path}"
|
118
|
+
if $stdout.tty?
|
119
|
+
puts output_string.colorize(:red)
|
120
|
+
else
|
121
|
+
puts output_string
|
122
|
+
end
|
123
|
+
exit 1
|
83
124
|
rescue Errno::ENOENT
|
84
|
-
puts "#{options[:
|
125
|
+
puts "#{options[:schema]} not found"
|
85
126
|
exit 1
|
86
127
|
end
|
87
|
-
schema = Csvlint::Schema.from_json_table(nil, JSON.parse(schemafile))
|
88
|
-
end
|
89
|
-
|
90
|
-
validator = Csvlint::Validator.new( source, nil, schema )
|
91
|
-
|
92
|
-
if $stdout.tty?
|
93
|
-
puts "#{ARGV[0] || "CSV"} is #{validator.valid? ? "VALID".green : "INVALID".red}"
|
94
|
-
else
|
95
|
-
puts "#{ARGV[0] || "CSV"} is #{validator.valid? ? "VALID" : "INVALID"}"
|
96
128
|
end
|
97
129
|
|
98
|
-
|
99
|
-
|
100
|
-
|
130
|
+
valid = true
|
131
|
+
if source.nil?
|
132
|
+
unless schema.instance_of? Csvlint::Csvw::TableGroup
|
133
|
+
puts "No CSV data to validate."
|
134
|
+
puts opts
|
135
|
+
exit 1
|
101
136
|
end
|
102
|
-
|
103
|
-
|
104
|
-
|
105
|
-
|
106
|
-
|
137
|
+
schema.tables.keys.each do |source|
|
138
|
+
begin
|
139
|
+
source = source.sub("file:","")
|
140
|
+
source = File.new( source )
|
141
|
+
rescue Errno::ENOENT
|
142
|
+
puts "#{source} not found"
|
143
|
+
exit 1
|
144
|
+
end unless source =~ /^http(s)?/
|
145
|
+
valid &= validate_csv(source, schema, options[:dump])
|
107
146
|
end
|
147
|
+
else
|
148
|
+
valid = validate_csv(source, schema, options[:dump])
|
108
149
|
end
|
109
150
|
|
110
|
-
exit 1 unless
|
151
|
+
exit 1 unless valid
|
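The reworked `print_error` in the diff above now appends the location and the error content only when they are present. A self-contained sketch of that string-building logic follows. `format_error` and its keyword arguments are illustrative stand-ins for the `Csvlint::ErrorMessage` fields the script reads; colouring and the `--dump-errors` output are left out.

```ruby
# Illustrative stand-in for the string building in bin/csvlint's print_error.
# The keyword arguments mirror the fields the script reads from an error object.
def format_error(index, type:, row: nil, column: nil, content: nil)
  # Build "row", "row,column" or "column", depending on what is known.
  location = ""
  location += row.to_s if row
  location += "#{row ? "," : ""}#{column}" if column
  location = "#{row ? "Row" : "Column"}: #{location}" if row || column

  # Append each optional part only when it is present.
  line = "#{index + 1}. #{type}"
  line += ". #{location}" unless location.empty?
  line += ". #{content}" if content
  line
end

puts format_error(0, type: :blank_rows, row: 3)
puts format_error(1, type: :inconsistent_values, column: 14)
puts format_error(2, type: :title_row)
```

Because each part is appended conditionally, an error with no row, column, or content prints just its number and type, with no trailing separator.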
data/csvlint.gemspec
CHANGED
@@ -23,6 +23,8 @@ Gem::Specification.new do |spec|
|
|
23
23
|
spec.add_dependency "open_uri_redirections"
|
24
24
|
spec.add_dependency "activesupport"
|
25
25
|
spec.add_dependency "addressable"
|
26
|
+
spec.add_dependency "escape_utils"
|
27
|
+
spec.add_dependency "uri_template"
|
26
28
|
|
27
29
|
spec.add_development_dependency "bundler", "~> 1.3"
|
28
30
|
spec.add_development_dependency "rake"
|
data/features/csvw_schema_validation.feature
ADDED
@@ -0,0 +1,127 @@
|
|
1
|
+
Feature: CSVW Schema Validation
|
2
|
+
|
3
|
+
Scenario: Valid CSV
|
4
|
+
Given I have a CSV with the following content:
|
5
|
+
"""
|
6
|
+
"Bob","1234","bob@example.org"
|
7
|
+
"Alice","5","alice@example.com"
|
8
|
+
"""
|
9
|
+
And it is stored at the url "http://example.com/example1.csv"
|
10
|
+
And I have metadata with the following content:
|
11
|
+
"""
|
12
|
+
{
|
13
|
+
"@context": "http://www.w3.org/ns/csvw",
|
14
|
+
"url": "http://example.com/example1.csv",
|
15
|
+
"dialect": { "header": false },
|
16
|
+
"tableSchema": {
|
17
|
+
"columns": [
|
18
|
+
{ "name": "Name", "required": true },
|
19
|
+
{ "name": "Id", "required": true, "datatype": { "base": "string", "minLength": 1 } },
|
20
|
+
{ "name": "Email", "required": true }
|
21
|
+
]
|
22
|
+
}
|
23
|
+
}
|
24
|
+
"""
|
25
|
+
When I ask if there are errors
|
26
|
+
Then there should be 0 error
|
27
|
+
|
28
|
+
Scenario: Schema invalid CSV
|
29
|
+
Given I have a CSV with the following content:
|
30
|
+
"""
|
31
|
+
"Bob","1234","bob@example.org"
|
32
|
+
"Alice","5","alice@example.com"
|
33
|
+
"""
|
34
|
+
And it is stored at the url "http://example.com/example1.csv"
|
35
|
+
And I have metadata with the following content:
|
36
|
+
"""
|
37
|
+
{
|
38
|
+
"@context": "http://www.w3.org/ns/csvw",
|
39
|
+
"url": "http://example.com/example1.csv",
|
40
|
+
"dialect": { "header": false },
|
41
|
+
"tableSchema": {
|
42
|
+
"columns": [
|
43
|
+
{ "name": "Name", "required": true },
|
44
|
+
{ "name": "Id", "required": true, "datatype": { "base": "string", "minLength": 3 } },
|
45
|
+
{ "name": "Email", "required": true }
|
46
|
+
]
|
47
|
+
}
|
48
|
+
}
|
49
|
+
"""
|
50
|
+
When I ask if there are errors
|
51
|
+
Then there should be 1 error
|
52
|
+
|
53
|
+
Scenario: CSV with incorrect header
|
54
|
+
Given I have a CSV with the following content:
|
55
|
+
"""
|
56
|
+
"name","id","contact"
|
57
|
+
"Bob","1234","bob@example.org"
|
58
|
+
"Alice","5","alice@example.com"
|
59
|
+
"""
|
60
|
+
And it is stored at the url "http://example.com/example1.csv"
|
61
|
+
And I have metadata with the following content:
|
62
|
+
"""
|
63
|
+
{
|
64
|
+
"@context": "http://www.w3.org/ns/csvw",
|
65
|
+
"url": "http://example.com/example1.csv",
|
66
|
+
"tableSchema": {
|
67
|
+
"columns": [
|
68
|
+
{ "titles": "name", "required": true },
|
69
|
+
{ "titles": "id", "required": true, "datatype": { "base": "string", "minLength": 1 } },
|
70
|
+
{ "titles": "email", "required": true }
|
71
|
+
]
|
72
|
+
}
|
73
|
+
}
|
74
|
+
"""
|
75
|
+
When I ask if there are errors
|
76
|
+
Then there should be 1 error
|
77
|
+
|
78
|
+
Scenario: Schema with valid regex
|
79
|
+
Given I have a CSV with the following content:
|
80
|
+
"""
|
81
|
+
"firstname","id","email"
|
82
|
+
"Bob","1234","bob@example.org"
|
83
|
+
"Alice","5","alice@example.com"
|
84
|
+
"""
|
85
|
+
And it is stored at the url "http://example.com/example1.csv"
|
86
|
+
And I have metadata with the following content:
|
87
|
+
"""
|
88
|
+
{
|
89
|
+
"@context": "http://www.w3.org/ns/csvw",
|
90
|
+
"url": "http://example.com/example1.csv",
|
91
|
+
"tableSchema": {
|
92
|
+
"columns": [
|
93
|
+
{ "titles": "firstname", "required": true, "datatype": { "base": "string", "format": "^[A-Za-z0-9_]*$" } },
|
94
|
+
{ "titles": "id", "required": true, "datatype": { "base": "string", "minLength": 1 } },
|
95
|
+
{ "titles": "email", "required": true }
|
96
|
+
]
|
97
|
+
}
|
98
|
+
}
|
99
|
+
"""
|
100
|
+
When I ask if there are warnings
|
101
|
+
Then there should be 0 warnings
|
102
|
+
|
103
|
+
Scenario: Schema with invalid regex
|
104
|
+
Given I have a CSV with the following content:
|
105
|
+
"""
|
106
|
+
"firstname","id","email"
|
107
|
+
"Bob","1234","bob@example.org"
|
108
|
+
"Alice","5","alice@example.com"
|
109
|
+
"""
|
110
|
+
And it is stored at the url "http://example.com/example1.csv"
|
111
|
+
And I have metadata with the following content:
|
112
|
+
"""
|
113
|
+
{
|
114
|
+
"@context": "http://www.w3.org/ns/csvw",
|
115
|
+
"url": "http://example.com/example1.csv",
|
116
|
+
"tableSchema": {
|
117
|
+
"columns": [
|
118
|
+
{ "titles": "firstname", "required": true, "datatype": { "base": "string", "format": "((" } },
|
119
|
+
{ "titles": "id", "required": true, "datatype": { "base": "string", "minLength": 1 } },
|
120
|
+
{ "titles": "email", "required": true }
|
121
|
+
]
|
122
|
+
}
|
123
|
+
}
|
124
|
+
"""
|
125
|
+
When I ask if there are warnings
|
126
|
+
Then there should be 1 warnings
|
127
|
+
And that warning should have the type "invalid_regex"
|