idata 0.1.33 → 0.2.1
- checksums.yaml +4 -4
- data/README.md +22 -24
- data/README2.md +37 -0
- data/bin/ivalidate2 +426 -0
- data/full-pg-lawson.sh +707 -0
- data/full-pg.sh +153 -155
- data/full.sh +75 -5
- data/lib/idata/version.rb +1 -1
- data/sample.sh +13 -10
- metadata +18 -14
checksums.yaml
CHANGED
|
@@ -1,7 +1,7 @@
|
|
|
1
1
|
---
|
|
2
2
|
SHA1:
|
|
3
|
-
metadata.gz:
|
|
4
|
-
data.tar.gz:
|
|
3
|
+
metadata.gz: 26fd9bafa90c2af8eef61231c2b070c6a395bbcb
|
|
4
|
+
data.tar.gz: b1952edd955ea4eeb72d31f8c640f053d5515a17
|
|
5
5
|
SHA512:
|
|
6
|
-
metadata.gz:
|
|
7
|
-
data.tar.gz:
|
|
6
|
+
metadata.gz: 79b8f98bad03bcfccea93cf2cb074271391def428288784059842435e4e6334928f1d5806c50707816a61621e38e4761cae761443c634bcf2ba0baa5187c3e61
|
|
7
|
+
data.tar.gz: 2513beafcacb28bc7d745e93ab6dd09e0f3b246c25bcee2d81b5bee92b66778b26feab7f43887790a284dfbc8e6906c3860d71082f530a78f1ff82e598085df7
|
data/README.md
CHANGED
|
@@ -1,5 +1,5 @@
|
|
|
1
1
|
# Overview
|
|
2
|
-
We provide some
|
|
2
|
+
We provide some utilities for validating data in a PostgreSQL data table.
|
|
3
3
|
These utilities can be used as simple terminal commands and can be installed by:
|
|
4
4
|
|
|
5
5
|
gem install idata
|
|
@@ -13,7 +13,7 @@ idata comes along with the following commands:
|
|
|
13
13
|
* imerge
|
|
14
14
|
* isanitize
|
|
15
15
|
|
|
16
|
-
Run a command with
|
|
16
|
+
Run a command with the `--help` switch for details
|
|
17
17
|
|
|
18
18
|
Prerequisites:
|
|
19
19
|
* PostgreSQL 9.0 or above
|
|
@@ -23,34 +23,34 @@ Prequisites:
|
|
|
23
23
|
# Usage
|
|
24
24
|
Suppose we have an `items` table, and we want to validate its records against certain criteria like:
|
|
25
25
|
|
|
26
|
-
* `
|
|
27
|
-
* `
|
|
28
|
-
* The composite `[
|
|
29
|
-
* One `
|
|
26
|
+
* `vendor_code` must not be null
|
|
27
|
+
* `vendor_name` must not be null
|
|
28
|
+
* The composite `[vendor_code, vendor_name]` must be unique
|
|
29
|
+
* One `vendor_code` corresponds to only ONE `vendor_name` (in other words, there must not be two items with different `vendor_name` but with the same `vendor_code`)
|
|
30
30
|
and vice-versa
|
|
31
31
|
* `vendor_code` must reference the `code` column in the `vendors` table
|
|
32
32
|
|
|
33
33
|
Then the validation command could be:
|
|
34
34
|
```
|
|
35
|
-
ivalidate --host=localhost --user=postgres --database=mydb --table=items
|
|
36
|
-
--
|
|
35
|
+
ivalidate --host=localhost --user=postgres --database=mydb --table=items
|
|
36
|
+
--log-to=validation_errors \
|
|
37
|
+
--not-null="vendor_code" \
|
|
37
38
|
--not-null="vendor_name" \
|
|
38
|
-
--unique="
|
|
39
|
-
--consistent-by="
|
|
40
|
-
--consistent-by="
|
|
39
|
+
--unique="vendor_code,vendor_name" \
|
|
40
|
+
--consistent-by="vendor_code|vendor_name" \
|
|
41
|
+
--consistent-by="vendor_name|vendor_code" \
|
|
41
42
|
--cross-reference="vendor_code|vendors.code"
|
|
42
43
|
```
|
|
43
44
|
Validation results for every single record are logged to an additional column named `validation_errors`
|
|
44
|
-
of the `items` table, as specified by the `--log-to` switch
|
|
45
|
-
|
|
46
|
-
As you can see, most common checks can be performed using the supported switches:
|
|
45
|
+
of the `items` table, as specified by the `--log-to` switch. As you can see, most common checks can be performed using the supported switches:
|
|
47
46
|
```
|
|
48
47
|
--not-null
|
|
49
48
|
--unique
|
|
50
49
|
--consistent-by
|
|
51
50
|
--cross-reference
|
|
52
51
|
```
|
|
53
|
-
|
|
52
|
+
# Custom Validation
|
|
53
|
+
For more customized checks, we support some other switches.
|
|
54
54
|
|
|
55
55
|
The `--match="field/pattern/"` switch tells the program to check if value of a `field` matches the provided `pattern` (which is a regular expression).
|
|
56
56
|
For example:
|
|
@@ -61,22 +61,20 @@ For example:
|
|
|
61
61
|
# Check if value of status is either 'A' or 'I' (any other value is not allowed)
|
|
62
62
|
ivalidate --match="status/^(A|I)$/"
|
|
63
63
|
```
|
|
64
|
-
In case you need even more customized validation other than the supported ones (match
|
|
65
|
-
then `--query` switch may
|
|
64
|
+
In case you need even more customized validation other than the supported ones (`match`, `unique`, `not-null`, `cross-reference`...)
|
|
65
|
+
then the `--query` switch may come in handy. For example:
|
|
66
66
|
```
|
|
67
|
-
ivalidate --query="
|
|
67
|
+
ivalidate --query="start_date >= string_to_date('01/02/2014') -- invalid date"
|
|
68
68
|
```
|
|
69
69
|
You can also use `--rquery` which is the reversed counterpart of `--query`
|
|
70
|
-
For example, the following two checks are equivalent:
|
|
70
|
+
For example, the following two checks are equivalent; both mark any record whose `start_date < '01/02/2014'` as "invalid date":
|
|
71
71
|
```
|
|
72
|
-
ivalidate --query="
|
|
73
|
-
ivalidate --rquery="
|
|
72
|
+
ivalidate --query="start_date >= string_to_date('01/02/2014') -- invalid date"
|
|
73
|
+
ivalidate --rquery="start_date < string_to_date('01/02/2014') -- invalid date"
|
|
74
74
|
```
|
|
75
|
-
(mark any record whose `start_date < '01/02/2014'` as "invalid date")
|
|
76
75
|
|
|
77
76
|
Note: run `ivalidate --help` to see the full list of supported switches
|
|
78
77
|
|
|
79
|
-
|
|
80
78
|
# Put it all together
|
|
81
79
|
You can put several `ivalidate` commands (for several data tables) in one single bash/sh file.
|
|
82
80
|
Besides `ivalidate`, we also support some other utilities to:
|
|
@@ -84,6 +82,6 @@ Besides `ivalidate`, we also support some other utilities to:
|
|
|
84
82
|
+ Modify data tables
|
|
85
83
|
+ Generate summary reports
|
|
86
84
|
|
|
87
|
-
|
|
85
|
+
See our `sample.sh` for a comprehensive example
|
|
88
86
|
|
|
89
87
|
|
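As a rough illustration of the `--query` / `--rquery` equivalence described in the README above, the sketch below builds the `WHERE` clause that selects the rows to flag. It mirrors how `bin/ivalidate2` in this same diff handles `custom query` (`WHERE NOT (...)`) and `reverse query` (`WHERE (...)`). That the original `ivalidate` command behaves the same way is an assumption, and the `flag_clause` helper is hypothetical.

```ruby
# Hypothetical helper (not part of idata): returns the WHERE clause that
# selects the rows to be flagged as invalid.
def flag_clause(condition, reversed: false)
  # --query describes a valid row, so violations are its negation.
  # --rquery describes an invalid row, so the condition is used as-is.
  reversed ? "WHERE (#{condition})" : "WHERE NOT (#{condition})"
end

query  = flag_clause("start_date >= string_to_date('01/02/2014')")
rquery = flag_clause("start_date < string_to_date('01/02/2014')", reversed: true)

puts query
puts rquery
```

Negating `>=` gives `<`, which is why the two README commands are expected to flag the same rows.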
data/README2.md
ADDED
|
@@ -0,0 +1,37 @@
|
|
|
1
|
+
### Overview
|
|
2
|
+
The validation criteria file has the following general structure:
|
|
3
|
+
```yaml
|
|
4
|
+
table:
|
|
5
|
+
- field: field1, field2, etc.
|
|
6
|
+
validations:
|
|
7
|
+
- rule:
|
|
8
|
+
code:
|
|
9
|
+
error:
|
|
10
|
+
impact:
|
|
11
|
+
solution:
|
|
12
|
+
```
|
|
13
|
+
|
|
14
|
+
##### Explanation:
|
|
15
|
+
+ A `table` has one or more `field` entries
|
|
16
|
+
+ A `field` has a `validations` section containing one or more rules
|
|
17
|
+
+ Each rule contains `rule` (required) and the corresponding `code`, `error`, `solution`, `impact`, `priority` (optional)
|
|
18
|
+
+ Rule: written in the format covered below
|
|
19
|
+
+ The other fields are free text on a single line; use \n as the line break
|
|
20
|
+
+ The value of `field` can be a single field name or a set of several field names, separated by commas
|
|
21
|
+
|
|
22
|
+
## Writing rules
|
|
23
|
+
Supported rules include:
|
|
24
|
+
|
|
25
|
+
| Rule | Description | Example |
|
|
26
|
+
| ---- | ----------- | ------- |
|
|
27
|
+
| `not null` | The value of the corresponding field must not be empty | |
|
|
28
|
+
| `unique` | The value of the corresponding field must be unique within the table | |
|
|
29
|
+
| `matches "/regexp/"` | The field value must satisfy the format defined by `regexp` | |
|
|
30
|
+
| `not matches "/regexp/"` | Reverse counterpart of `matches` | |
|
|
31
|
+
| `consistent by "ref"` | The value of the corresponding field must be consistent with `ref` | |
|
|
32
|
+
| `cross references "table.field"` | The field value must reference another field, `table.field` | |
|
|
33
|
+
| `custom query "query"` | Use a custom SQL `query` (for complex business logic that cannot be expressed with the other rules) | |
|
|
34
|
+
| `reverse query "query"` | Reverse counterpart of `custom query` | |
|
|
35
|
+
|
|
36
|
+
## Others
|
|
37
|
+
TBD
|
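To make the structure described in `README2.md` more concrete, here is a minimal sketch that loads a small criteria file and splits each rule string into a type and an argument. The rule keywords are copied from `SUPPORTED_RULES_REGEXP` in `bin/ivalidate2`, and `parse_rule` follows the same approach as the method of that name there. The table, field names and error text in the sample config are made up for illustration.

```ruby
require 'yaml'

# Rule keywords copied from SUPPORTED_RULES_REGEXP in bin/ivalidate2.
RULES = /^\s*(not null|cross references|matches|not matches|custom query|reverse query|unique|consistent by)\s*/

# Hypothetical criteria file: table, fields and messages are illustrative only.
CONFIG = <<~'EOS'
  items:
    - field: vendor_code
      validations:
        - rule: not null
          error: vendor_code is missing
        - rule: cross references "vendors.code"
    - field: status
      validations:
        - rule: matches "/^(A|I)$/"
EOS

# Split a rule string into its type and its unquoted argument.
def parse_rule(rule)
  type = rule[RULES]
  raise ArgumentError, "Invalid rule: #{rule}" if type.nil?
  args = rule.sub(RULES, '').gsub(/(^\s*["']|["']\s*$)/, '')
  [type.strip, args]
end

# Flatten table -> fields -> validations into one list of rule hashes.
rules = YAML.safe_load(CONFIG).flat_map do |table, fields|
  fields.flat_map do |field|
    field['validations'].map do |v|
      type, args = parse_rule(v['rule'])
      v.merge('table' => table, 'field' => field['field'], 'type' => type, 'args' => args)
    end
  end
end

rules.each { |r| puts "#{r['table']}.#{r['field']}: #{r['type']} #{r['args'].inspect}" }
```

One thing to watch for: as far as this diff shows, `not matches` is accepted by the regexp but has no branch in `Validator#validate`, so such a rule would apparently end up logged as "Rule not recognized".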
data/bin/ivalidate2
ADDED
|
@@ -0,0 +1,426 @@
|
|
|
1
|
+
#!/usr/bin/env ruby
|
|
2
|
+
# DATA VALIDATOR
|
|
3
|
+
#
|
|
4
|
+
# @author Nghi Pham
|
|
5
|
+
# @date April 2014
|
|
6
|
+
#
|
|
7
|
+
# Data validation includes:
|
|
8
|
+
# * Uniqueness
|
|
9
|
+
# * Integrity (cross reference)
|
|
10
|
+
# * Data type: numeric, text, enum, etc.
|
|
11
|
+
# * Data format: text size, text values, enum, inclusion, exclusion, etc.
|
|
12
|
+
#
|
|
13
|
+
# Issue ruby load.rb --help for guideline/examples
|
|
14
|
+
#
|
|
15
|
+
require 'optparse'
|
|
16
|
+
require 'active_record'
|
|
17
|
+
require 'rubygems'
|
|
18
|
+
require 'logger'
|
|
19
|
+
|
|
20
|
+
$options = {}
|
|
21
|
+
parser = OptionParser.new("", 24) do |opts|
|
|
22
|
+
opts.banner = "\nProgram: Data Validator\nAuthor: MCKI\n\n"
|
|
23
|
+
|
|
24
|
+
opts.on("-c", "--config FILE", "Configuration file") do |v|
|
|
25
|
+
$options[:config] = v
|
|
26
|
+
end
|
|
27
|
+
|
|
28
|
+
opts.on("-h", "--host HOST", "PostgreSQL host") do |v|
|
|
29
|
+
$options[:host] = v
|
|
30
|
+
end
|
|
31
|
+
|
|
32
|
+
opts.on("-d", "--database DATABASE", "PostgreSQL database") do |v|
|
|
33
|
+
$options[:database] = v
|
|
34
|
+
end
|
|
35
|
+
|
|
36
|
+
opts.on("-u", "--username USER", "PostgreSQL username") do |v|
|
|
37
|
+
$options[:username] = v
|
|
38
|
+
end
|
|
39
|
+
|
|
40
|
+
opts.on("-p", "--password PASSWORD", "PostgreSQL password") do |v|
|
|
41
|
+
$options[:password] = v
|
|
42
|
+
end
|
|
43
|
+
|
|
44
|
+
opts.on("-l", "--listen PORT", "PostgreSQL listen port (default to 5432)") do |v|
|
|
45
|
+
$options[:listen] = v
|
|
46
|
+
end
|
|
47
|
+
|
|
48
|
+
opts.on_tail('--help', 'Displays this help') do
|
|
49
|
+
puts opts, "", help
|
|
50
|
+
exit
|
|
51
|
+
end
|
|
52
|
+
end
|
|
53
|
+
|
|
54
|
+
def help
|
|
55
|
+
return ''
|
|
56
|
+
end
|
|
57
|
+
|
|
58
|
+
begin
|
|
59
|
+
parser.parse!
|
|
60
|
+
rescue SystemExit => ex
|
|
61
|
+
exit
|
|
62
|
+
end
|
|
63
|
+
|
|
64
|
+
# Load parameters from ENVIRONMENT if exist
|
|
65
|
+
$options[:host] ||= ENV['HOST']
|
|
66
|
+
$options[:username] ||= ENV['USERNAME']
|
|
67
|
+
$options[:password] ||= ENV['PASSWORD']
|
|
68
|
+
$options[:listen] ||= ENV['LISTEN']
|
|
69
|
+
$options[:database] ||= ENV['DATABASE']
|
|
70
|
+
|
|
71
|
+
# validate parameters
|
|
72
|
+
if $options[:config].nil?
|
|
73
|
+
puts "\nPlease specify config file: -c\n\n"
|
|
74
|
+
exit
|
|
75
|
+
end
|
|
76
|
+
|
|
77
|
+
if $options[:host].nil?
|
|
78
|
+
puts "\nPlease specify host name: -h\n\n"
|
|
79
|
+
exit
|
|
80
|
+
end
|
|
81
|
+
|
|
82
|
+
if $options[:database].nil?
|
|
83
|
+
puts "\nPlease specify PostgreSQL database name: -d\n\n"
|
|
84
|
+
exit
|
|
85
|
+
end
|
|
86
|
+
|
|
87
|
+
if $options[:username].nil?
|
|
88
|
+
puts "\nPlease specify PostgreSQL username: -u\n\n"
|
|
89
|
+
exit
|
|
90
|
+
end
|
|
91
|
+
|
|
92
|
+
# Default value
|
|
93
|
+
$options[:listen] ||= 5432
|
|
94
|
+
|
|
95
|
+
# Database dump
|
|
96
|
+
ActiveRecord::Base.establish_connection(
|
|
97
|
+
'adapter' => 'postgresql',
|
|
98
|
+
'host' => $options[:host],
|
|
99
|
+
'database' => $options[:database],
|
|
100
|
+
'username' => $options[:username],
|
|
101
|
+
'password' => $options[:password],
|
|
102
|
+
'port' => $options[:listen],
|
|
103
|
+
'timeout' => 15000
|
|
104
|
+
)
|
|
105
|
+
|
|
106
|
+
class String
|
|
107
|
+
def not_null_sql
|
|
108
|
+
a = self.split(/\s*,\s*/)
|
|
109
|
+
sql = a.map{|s|
|
|
110
|
+
"#{s} IS NOT NULL AND length(trim(#{s}::text)) <> 0"
|
|
111
|
+
}.join(" AND ")
|
|
112
|
+
|
|
113
|
+
"(#{sql})"
|
|
114
|
+
end
|
|
115
|
+
|
|
116
|
+
def lower
|
|
117
|
+
a = self.split(/\s*,\s*/)
|
|
118
|
+
sql = a.map{|s|
|
|
119
|
+
"lower(#{s})"
|
|
120
|
+
}.join(",")
|
|
121
|
+
|
|
122
|
+
sql
|
|
123
|
+
end
|
|
124
|
+
|
|
125
|
+
def black; "\033[30m#{self}\033[0m" end
|
|
126
|
+
def red; "\033[31m#{self}\033[0m" end
|
|
127
|
+
def green; "\033[32m#{self}\033[0m" end
|
|
128
|
+
def brown; "\033[33m#{self}\033[0m" end
|
|
129
|
+
def blue; "\033[34m#{self}\033[0m" end
|
|
130
|
+
def magenta; "\033[35m#{self}\033[0m" end
|
|
131
|
+
def cyan; "\033[36m#{self}\033[0m" end
|
|
132
|
+
def gray; "\033[37m#{self}\033[0m" end
|
|
133
|
+
end
|
|
134
|
+
|
|
135
|
+
class Logger
|
|
136
|
+
alias_method :_old_info, :info
|
|
137
|
+
alias_method :_old_warn, :warn
|
|
138
|
+
alias_method :_old_error, :error
|
|
139
|
+
|
|
140
|
+
|
|
141
|
+
def error(msg)
|
|
142
|
+
_old_error(msg.red)
|
|
143
|
+
end
|
|
144
|
+
|
|
145
|
+
def info(msg)
|
|
146
|
+
_old_info(msg.green)
|
|
147
|
+
end
|
|
148
|
+
|
|
149
|
+
def warn(msg)
|
|
150
|
+
_old_warn(msg.brown)
|
|
151
|
+
end
|
|
152
|
+
end
|
|
153
|
+
|
|
154
|
+
module IData
|
|
155
|
+
class Validator
|
|
156
|
+
SUPPORTED_RULES_REGEXP = /^\s*(not null|cross references|matches|not matches|custom query|reverse query|unique|consistent by)\s*/
|
|
157
|
+
DEFAULT_ERROR_FIELD = 'validation_errors'
|
|
158
|
+
META_TABLE = 'validation_meta'
|
|
159
|
+
SUMMARY_TABLE = 'summary'
|
|
160
|
+
|
|
161
|
+
def initialize(file)
|
|
162
|
+
@config = YAML.load_file(file)
|
|
163
|
+
@logger = Logger.new(STDOUT)
|
|
164
|
+
@rules = []
|
|
165
|
+
@logger.formatter = proc do |severity, datetime, progname, msg|
|
|
166
|
+
"#{severity}: #{datetime} - #{msg}\n"
|
|
167
|
+
end
|
|
168
|
+
@config.each do |table, fields|
|
|
169
|
+
unless table_exists?(table)
|
|
170
|
+
@logger.warn "Table #{table} does not exist!"
|
|
171
|
+
end
|
|
172
|
+
@logger.info "Validating table #{table}"
|
|
173
|
+
fields.each do |field|
|
|
174
|
+
field['validations'].each do |rule|
|
|
175
|
+
p rule
|
|
176
|
+
type, args = parse_rule(rule['rule'])
|
|
177
|
+
options = rule.merge('table' => table, 'field' => field['field'], 'type' => type, 'args' => args)
|
|
178
|
+
options['code'] = Digest::SHA1.hexdigest([table, field['field'], rule['rule']].join(""))
|
|
179
|
+
@rules << options
|
|
180
|
+
end
|
|
181
|
+
end
|
|
182
|
+
end
|
|
183
|
+
end
|
|
184
|
+
|
|
185
|
+
def validate!
|
|
186
|
+
# reset the meta table
|
|
187
|
+
execute("DROP TABLE IF EXISTS #{META_TABLE}")
|
|
188
|
+
|
|
189
|
+
# validate
|
|
190
|
+
@logger.info "Validation started!"
|
|
191
|
+
@rules.each {|r| validate(r) }
|
|
192
|
+
|
|
193
|
+
# create meta table
|
|
194
|
+
create_table_from_array(@rules, table_name: META_TABLE, drop_table: true, extra_fields: ['impact', 'solution', 'count', 'percentage'])
|
|
195
|
+
|
|
196
|
+
# Done!
|
|
197
|
+
@logger.info "Validation done!"
|
|
198
|
+
end
|
|
199
|
+
|
|
200
|
+
def create_table_from_array(entries, options = {})
|
|
201
|
+
raise "Please specify :table_name" unless options[:table_name]
|
|
202
|
+
|
|
203
|
+
extra_fields = options[:extra_fields] || []
|
|
204
|
+
columns = entries.inject([]) {|x, i| x += i.keys } + extra_fields
|
|
205
|
+
columns.uniq!
|
|
206
|
+
|
|
207
|
+
if options[:drop_table]
|
|
208
|
+
execute "DROP TABLE IF EXISTS #{options[:table_name]}"
|
|
209
|
+
end
|
|
210
|
+
|
|
211
|
+
execute "CREATE TABLE IF NOT EXISTS #{options[:table_name]} ( #{columns.map{|c| quote_col_name(c.to_s) + ' VARCHAR' }.join(', ')} )"
|
|
212
|
+
|
|
213
|
+
insert_sql = entries.map { |r|
|
|
214
|
+
"INSERT INTO #{options[:table_name]}(#{columns.map{|c| quote_col_name(c.to_s)}.join(', ')}) VALUES(#{ columns.map{|c| quote(r[c])}.join(',') });"
|
|
215
|
+
}.join("")
|
|
216
|
+
|
|
217
|
+
execute insert_sql
|
|
218
|
+
end
|
|
219
|
+
|
|
220
|
+
def validate(options)
|
|
221
|
+
unless table_exists?(options['table'])
|
|
222
|
+
return
|
|
223
|
+
end
|
|
224
|
+
|
|
225
|
+
add_error_field(options)
|
|
226
|
+
case options['type']
|
|
227
|
+
|
|
228
|
+
when 'not null'
|
|
229
|
+
validate_not_null(options)
|
|
230
|
+
when 'custom query'
|
|
231
|
+
validate_custom_query(options)
|
|
232
|
+
when 'reverse query'
|
|
233
|
+
validate_reverse_query(options)
|
|
234
|
+
when 'custom query reversed'
|
|
235
|
+
validate_custom_query(options)
|
|
236
|
+
when 'matches'
|
|
237
|
+
validate_match(options)
|
|
238
|
+
when 'cross references'
|
|
239
|
+
validate_cross_reference(options)
|
|
240
|
+
when 'consistent by'
|
|
241
|
+
validate_consistent_by(options)
|
|
242
|
+
when 'unique'
|
|
243
|
+
validate_unique(options)
|
|
244
|
+
else
|
|
245
|
+
raise "Rule not recognized"
|
|
246
|
+
end
|
|
247
|
+
rescue Exception => ex
|
|
248
|
+
@logger.warn ex.message.split(/[\n]/).first.strip
|
|
249
|
+
end
|
|
250
|
+
|
|
251
|
+
def report!
|
|
252
|
+
sql = @rules.map {|r|
|
|
253
|
+
"(SELECT unnest(string_to_array(#{DEFAULT_ERROR_FIELD}, ' || ')) as code, count(*), round((count(*) * 100)::numeric / (SELECT count(*) FROM #{r['table']}), 2)::varchar || '%' AS percentage FROM #{r['table']} GROUP BY code)"
|
|
254
|
+
}
|
|
255
|
+
|
|
256
|
+
execute("
|
|
257
|
+
UPDATE #{META_TABLE} meta
|
|
258
|
+
SET count = stat.count,
|
|
259
|
+
percentage = stat.percentage
|
|
260
|
+
FROM (#{sql.join(" UNION ")}) stat
|
|
261
|
+
WHERE meta.code = stat.code"
|
|
262
|
+
)
|
|
263
|
+
end
|
|
264
|
+
|
|
265
|
+
private
|
|
266
|
+
def add_error_field(options)
|
|
267
|
+
error_field = options['log_to'] || DEFAULT_ERROR_FIELD
|
|
268
|
+
execute("ALTER TABLE #{options['table']} ADD COLUMN #{error_field} VARCHAR DEFAULT '';")
|
|
269
|
+
rescue Exception => ex
|
|
270
|
+
# @todo
|
|
271
|
+
end
|
|
272
|
+
|
|
273
|
+
def parse_rule(rule)
|
|
274
|
+
# @todo
|
|
275
|
+
type = rule[SUPPORTED_RULES_REGEXP]
|
|
276
|
+
if type.nil?
|
|
277
|
+
@logger.error "Invalid rule: #{rule}"
|
|
278
|
+
exit(0)
|
|
279
|
+
end
|
|
280
|
+
|
|
281
|
+
type.strip!
|
|
282
|
+
args = rule.gsub(SUPPORTED_RULES_REGEXP, '').gsub(/(^\s*["']|["']\s*$)/, "")
|
|
283
|
+
return type, args
|
|
284
|
+
end
|
|
285
|
+
|
|
286
|
+
def validate_not_null(options)
|
|
287
|
+
@logger.info "Validating data presence: #{options['table']}.[#{options['field']}]"
|
|
288
|
+
options['error'] ||= "[#{options['field']}] is null"
|
|
289
|
+
execute <<-eos
|
|
290
|
+
#{ update_sql(options) }
|
|
291
|
+
WHERE #{options['field']} IS NULL OR length(trim(#{options['field']})) = 0;
|
|
292
|
+
eos
|
|
293
|
+
end
|
|
294
|
+
|
|
295
|
+
def validate_custom_query(options)
|
|
296
|
+
@logger.info "Validating with custom query: #{options['args'][0..50]}#{(options['args'].size > 50) ? '...' : ''}"
|
|
297
|
+
options['error'] ||= "Unknown"
|
|
298
|
+
execute <<-eos
|
|
299
|
+
#{ update_sql(options) }
|
|
300
|
+
WHERE NOT (#{options['args']})
|
|
301
|
+
eos
|
|
302
|
+
end
|
|
303
|
+
|
|
304
|
+
def validate_reverse_query(options)
|
|
305
|
+
@logger.info "Validating with custom query: #{options['args'][0..50]}#{(options['args'].size > 50) ? '...' : ''}"
|
|
306
|
+
options['error'] ||= "Unknown"
|
|
307
|
+
execute <<-eos
|
|
308
|
+
#{ update_sql(options) }
|
|
309
|
+
WHERE (#{options['args']})
|
|
310
|
+
eos
|
|
311
|
+
end
|
|
312
|
+
|
|
313
|
+
def validate_consistent_by(options)
|
|
314
|
+
@logger.info "Validating integrity: #{options['table']}.[#{options['field']}] #{options['rule']}"
|
|
315
|
+
options['error'] ||= "Same [#{options['field']}] but different [#{options['args']}]"
|
|
316
|
+
|
|
317
|
+
f1_case = f1 = options['field']
|
|
318
|
+
f2_case = f2 = options['args']
|
|
319
|
+
|
|
320
|
+
if options['case_insensitive']
|
|
321
|
+
f1_case = f1_case.lower
|
|
322
|
+
f2_case = f2_case.lower
|
|
323
|
+
end
|
|
324
|
+
|
|
325
|
+
execute <<-eos
|
|
326
|
+
#{ update_sql(options) }
|
|
327
|
+
WHERE id IN (
|
|
328
|
+
SELECT unnest(array_agg(id)) FROM #{options['table']}
|
|
329
|
+
WHERE #{f1.not_null_sql} AND #{f2.not_null_sql}
|
|
330
|
+
GROUP BY #{f2_case}
|
|
331
|
+
HAVING COUNT(distinct #{f1_case}) > 1
|
|
332
|
+
);
|
|
333
|
+
eos
|
|
334
|
+
end
|
|
335
|
+
|
|
336
|
+
def validate_unique(options)
|
|
337
|
+
@logger.info "Validating uniqueness: #{options['table']}.[#{options['field']}]"
|
|
338
|
+
options['error'] ||= "[#{options['field']}] is not unique"
|
|
339
|
+
|
|
340
|
+
if options['case_insensitive']
|
|
341
|
+
f_lower = options['field'].lower
|
|
342
|
+
else
|
|
343
|
+
f_lower = options['field']
|
|
344
|
+
end
|
|
345
|
+
|
|
346
|
+
execute <<-eos
|
|
347
|
+
#{ update_sql(options) }
|
|
348
|
+
WHERE id IN (
|
|
349
|
+
SELECT unnest(array_agg(id)) FROM #{options['table']} GROUP BY #{f_lower}
|
|
350
|
+
HAVING count(*) > 1
|
|
351
|
+
) AND #{options['field'].not_null_sql};
|
|
352
|
+
eos
|
|
353
|
+
end
|
|
354
|
+
|
|
355
|
+
def validate_cross_reference(options)
|
|
356
|
+
@logger.info "Validating reference: #{options['table']}.[#{options['field']}] #{options['rule']}"
|
|
357
|
+
|
|
358
|
+
options['error'] ||= "[#{options['field']}] does not reference [#{options['args']}]"
|
|
359
|
+
|
|
360
|
+
field = options['field']
|
|
361
|
+
ref_table, ref_field = options['args'].split(/[\.]/)
|
|
362
|
+
|
|
363
|
+
if options['args'].split(/[\.]/).size != 2
|
|
364
|
+
raise "Invalid rule"
|
|
365
|
+
exit(0)
|
|
366
|
+
end
|
|
367
|
+
|
|
368
|
+
if options['case_insensitive']
|
|
369
|
+
join_condition = "on lower(origin.#{field}) = lower(target.#{ref_field})"
|
|
370
|
+
else
|
|
371
|
+
join_condition = "on origin.#{field}::text = target.#{ref_field}::text"
|
|
372
|
+
end
|
|
373
|
+
|
|
374
|
+
# @todo: poor performance here, think of a better SQL!!!
|
|
375
|
+
execute <<-eos
|
|
376
|
+
#{ update_sql(options) }
|
|
377
|
+
WHERE #{field} IN (
|
|
378
|
+
SELECT origin.#{field} from #{options['table']} origin LEFT JOIN #{ref_table} target
|
|
379
|
+
#{join_condition}
|
|
380
|
+
where target.#{ref_field} is null
|
|
381
|
+
) AND #{field} IS NOT NULL AND length(trim(#{field})) <> 0;
|
|
382
|
+
eos
|
|
383
|
+
end
|
|
384
|
+
|
|
385
|
+
def validate_match(options)
|
|
386
|
+
@logger.info "Validating regexp: #{options['table']}.[#{options['field']}] #{options['rule']}"
|
|
387
|
+
options['error'] ||= "[#{options['field']}] does not match #{options['args']}"
|
|
388
|
+
execute <<-eos
|
|
389
|
+
#{ update_sql(options) }
|
|
390
|
+
WHERE #{options['field']} IS NOT NULL AND length(trim(#{options['field']})) <> 0 AND #{options['field']} !~ '#{options['args']}';
|
|
391
|
+
eos
|
|
392
|
+
end
|
|
393
|
+
|
|
394
|
+
def update_sql(options)
|
|
395
|
+
log_to = options['log_to'] || DEFAULT_ERROR_FIELD
|
|
396
|
+
sql = "UPDATE #{options['table']} SET #{log_to} = array_to_string(string_to_array(#{log_to}, ' || ') || string_to_array(#{quote(options['code'])}, ' || '), ' || ')"
|
|
397
|
+
end
|
|
398
|
+
|
|
399
|
+
def execute(sql)
|
|
400
|
+
ActiveRecord::Base.connection.execute(sql)
|
|
401
|
+
end
|
|
402
|
+
|
|
403
|
+
def quote(str = "")
|
|
404
|
+
ActiveRecord::Base.connection.quote(str)
|
|
405
|
+
end
|
|
406
|
+
|
|
407
|
+
def quote_col_name(str = "")
|
|
408
|
+
ActiveRecord::Base.connection.quote_column_name(str)
|
|
409
|
+
end
|
|
410
|
+
|
|
411
|
+
def table_exists?(table)
|
|
412
|
+
results = execute "SELECT * FROM pg_tables WHERE schemaname='public' AND tablename = #{quote(table)};"
|
|
413
|
+
return !results.first.nil?
|
|
414
|
+
end
|
|
415
|
+
|
|
416
|
+
def drop_table(table_name)
|
|
417
|
+
execute "DROP TABLE IF EXISTS #{table_name}"
|
|
418
|
+
end
|
|
419
|
+
end
|
|
420
|
+
end
|
|
421
|
+
|
|
422
|
+
|
|
423
|
+
x = IData::Validator.new $options[:config]
|
|
424
|
+
x.validate!
|
|
425
|
+
x.report!
|
|
426
|
+
|
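The way `bin/ivalidate2` records failures can be sketched in plain Ruby. In the script, each rule gets a code (the SHA1 of table, field and rule text), `update_sql` appends that code to the row's error column using `' || '` as the delimiter, and `report!` later splits the column on the same delimiter to count rows per code. The sketch below imitates only the string handling, with no database involved. `append_code` is a hypothetical stand-in for the SQL expression, and the table and field names are made up.

```ruby
require 'digest/sha1'

SEPARATOR = ' || ' # delimiter used by update_sql in bin/ivalidate2

# Rule code as computed in Validator#initialize: SHA1 of table + field + rule.
def rule_code(table, field, rule)
  Digest::SHA1.hexdigest([table, field, rule].join)
end

# Ruby stand-in (an assumption, for illustration) for the SQL expression
#   array_to_string(string_to_array(col, ' || ') || string_to_array(code, ' || '), ' || ')
def append_code(logged, code)
  (logged.split(SEPARATOR) + [code]).join(SEPARATOR)
end

# Hypothetical row that fails two rules; the column starts out as ''.
errors = ''
errors = append_code(errors, rule_code('items', 'vendor_code', 'not null'))
errors = append_code(errors, rule_code('items', 'vendor_code', 'unique'))

puts errors
puts errors.split(SEPARATOR).length
```

Because the codes are hex digests, they never contain the delimiter, so splitting the column back into individual codes is unambiguous.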