idata 0.1.33 → 0.2.1
- checksums.yaml +4 -4
- data/README.md +22 -24
- data/README2.md +37 -0
- data/bin/ivalidate2 +426 -0
- data/full-pg-lawson.sh +707 -0
- data/full-pg.sh +153 -155
- data/full.sh +75 -5
- data/lib/idata/version.rb +1 -1
- data/sample.sh +13 -10
- metadata +18 -14
checksums.yaml
CHANGED
@@ -1,7 +1,7 @@
|
|
1
1
|
---
|
2
2
|
SHA1:
|
3
|
-
metadata.gz:
|
4
|
-
data.tar.gz:
|
3
|
+
metadata.gz: 26fd9bafa90c2af8eef61231c2b070c6a395bbcb
|
4
|
+
data.tar.gz: b1952edd955ea4eeb72d31f8c640f053d5515a17
|
5
5
|
SHA512:
|
6
|
-
metadata.gz:
|
7
|
-
data.tar.gz:
|
6
|
+
metadata.gz: 79b8f98bad03bcfccea93cf2cb074271391def428288784059842435e4e6334928f1d5806c50707816a61621e38e4761cae761443c634bcf2ba0baa5187c3e61
|
7
|
+
data.tar.gz: 2513beafcacb28bc7d745e93ab6dd09e0f3b246c25bcee2d81b5bee92b66778b26feab7f43887790a284dfbc8e6906c3860d71082f530a78f1ff82e598085df7
|
data/README.md
CHANGED
@@ -1,5 +1,5 @@
|
|
1
1
|
# Overview
|
2
|
-
We provide some
|
2
|
+
We provide some utilities for validating data in a PostgreSQL data table.
|
3
3
|
These utilities can be used as simple terminal commands and can be installed by:
|
4
4
|
|
5
5
|
gem install idata
|
@@ -13,7 +13,7 @@ idata comes along with the following commands:
|
|
13
13
|
* imerge
|
14
14
|
* isanitize
|
15
15
|
|
16
|
-
Run a command with
|
16
|
+
Run a command with the `--help` switch for details.
|
17
17
|
|
18
18
|
Prerequisites:
|
19
19
|
* PostgreSQL 9.0 or above
|
@@ -23,34 +23,34 @@ Prequisites:
|
|
23
23
|
# Usage
|
24
24
|
Suppose we have an `items` table, and we want to validate its records against certain criteria like:
|
25
25
|
|
26
|
-
* `
|
27
|
-
* `
|
28
|
-
* The composite `[
|
29
|
-
* One `
|
26
|
+
* `vendor_code` must not be null
|
27
|
+
* `vendor_name` must not be null
|
28
|
+
* The composite `[vendor_code, vendor_name]` must be unique
|
29
|
+
* One `vendor_code` corresponds to only ONE `vendor_name` (in other words, there must not be two items with different `vendor_name` but with the same `vendor_code`)
|
30
30
|
and vice-versa
|
31
31
|
* `vendor_code` must reference the `code` column in the `vendors` table
|
32
32
|
|
33
33
|
Then the validation command could be:
|
34
34
|
```
|
35
|
-
ivalidate --host=localhost --user=postgres --database=mydb --table=items
|
36
|
-
--
|
35
|
+
ivalidate --host=localhost --user=postgres --database=mydb --table=items
|
36
|
+
--log-to=validation_errors \
|
37
|
+
--not-null="vendor_code" \
|
37
38
|
--not-null="vendor_name" \
|
38
|
-
--unique="
|
39
|
-
--consistent-by="
|
40
|
-
--consistent-by="
|
39
|
+
--unique="vendor_code,vendor_name" \
|
40
|
+
--consistent-by="vendor_code|vendor_name" \
|
41
|
+
--consistent-by="vendor_name|vendor_code" \
|
41
42
|
--cross-reference="vendor_code|vendors.code"
|
42
43
|
```
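To make the `--consistent-by` criterion concrete, here is a minimal Ruby sketch that applies the same idea to in-memory rows instead of a PostgreSQL table. The `ITEMS` data and the `inconsistent_ids` helper are hypothetical and only illustrate the rule; they are not part of idata.

```ruby
# Hypothetical sample rows standing in for the `items` table.
ITEMS = [
  { id: 1, vendor_code: 'V1', vendor_name: 'Acme' },
  { id: 2, vendor_code: 'V1', vendor_name: 'Acme Corp' },
  { id: 3, vendor_code: 'V2', vendor_name: 'Globex' }
].freeze

# Returns the ids of rows whose `key` value is shared by rows
# that disagree on the `dependent` value.
def inconsistent_ids(rows, key, dependent)
  rows.group_by { |row| row[key] }
      .select { |_, group| group.map { |row| row[dependent] }.uniq.size > 1 }
      .values
      .flatten
      .map { |row| row[:id] }
end

# Same vendor_code, different vendor_name: rows 1 and 2 are flagged.
p inconsistent_ids(ITEMS, :vendor_code, :vendor_name)
# The reverse direction finds nothing in this sample.
p inconsistent_ids(ITEMS, :vendor_name, :vendor_code)
```

Running the check in both directions mirrors the two `--consistent-by` switches in the command above.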
|
43
44
|
Validation results for every single record are logged to an additional column named `validation_errors`
|
44
|
-
of the `items` table, as specified by the `--log-to` switch
|
45
|
-
|
46
|
-
As you can see, most common checks can be performed using the supported switches:
|
45
|
+
of the `items` table, as specified by the `--log-to` switch. As you can see, most common checks can be performed using the supported switches:
|
47
46
|
```
|
48
47
|
--not-null
|
49
48
|
--unique
|
50
49
|
--consistent-by
|
51
50
|
--cross-reference
|
52
51
|
```
|
53
|
-
|
52
|
+
# Custom Validation
|
53
|
+
For more customized checks, we support some other switches.
|
54
54
|
|
55
55
|
The `--match="field/pattern/"` switch tells the program to check if value of a `field` matches the provided `pattern` (which is a regular expression).
|
56
56
|
For example:
|
@@ -61,22 +61,20 @@ For example:
|
|
61
61
|
# Check if value of status is either 'A' or 'I' (any other value is not allowed)
|
62
62
|
ivalidate --match="status/^(A|I)$/"
|
63
63
|
```
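As a rough illustration of what the `--match` check accepts and rejects, the sketch below applies the same pattern in plain Ruby. It assumes blank values are skipped, as `validate_match` in `bin/ivalidate2` does; the `invalid_values` helper is hypothetical, and Ruby and PostgreSQL regular expressions are not identical in every detail.

```ruby
# The pattern from the example above: exactly 'A' or exactly 'I'.
STATUS_PATTERN = /^(A|I)$/

# Returns the values that would be reported as invalid.
# Nil and blank values are skipped (assumption: they are left to --not-null).
def invalid_values(values, pattern)
  values.reject { |v| v.nil? || v.strip.empty? || v =~ pattern }
end

p invalid_values(['A', 'I', 'X', nil, '  ', 'AI'], STATUS_PATTERN)
```

Only `'X'` and `'AI'` remain, since neither is exactly `A` or `I`.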
|
64
|
-
In case you need even more customized validation other than the supported ones (match
|
65
|
-
then `--query` switch may
|
64
|
+
In case you need even more customized validation other than the supported ones (`match`, `unique`, `not-null`, `cross-reference`...)
|
65
|
+
then the `--query` switch may come in handy. For example:
|
66
66
|
```
|
67
|
-
ivalidate --query="
|
67
|
+
ivalidate --query="start_date >= string_to_date('01/02/2014') -- invalid date"
|
68
68
|
```
|
69
69
|
You can also use `--rquery` which is the reversed counterpart of `--query`
|
70
|
-
For example, the following two checks are equivalent:
|
70
|
+
For example, the following two checks are equivalent; both mark any record whose `start_date < '01/02/2014'` as "invalid date":
|
71
71
|
```
|
72
|
-
ivalidate --query="
|
73
|
-
ivalidate --rquery="
|
72
|
+
ivalidate --query="start_date >= string_to_date('01/02/2014') -- invalid date"
|
73
|
+
ivalidate --rquery="start_date < string_to_date('01/02/2014') -- invalid date"
|
74
74
|
```
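The relationship between the two switches can be sketched in Ruby as follows. This is modeled on `validate_custom_query` and `validate_reverse_query` in `bin/ivalidate2`, which append `WHERE NOT (...)` and `WHERE (...)` respectively; whether `ivalidate` builds its SQL the same way is an assumption, and `failing_rows_condition` is a hypothetical helper.

```ruby
# Builds the condition selecting the rows to be marked as invalid.
#   query:  the expression describes VALID rows, so invalid rows are its negation.
#   rquery: the expression describes INVALID rows directly.
def failing_rows_condition(expr, reversed: false)
  reversed ? "(#{expr})" : "NOT (#{expr})"
end

puts failing_rows_condition("start_date >= '2014-01-02'")
puts failing_rows_condition("start_date < '2014-01-02'", reversed: true)
```

Both calls describe the same set of rows: those whose `start_date` falls before the given date.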
|
75
|
-
(mark any record whose `start_date < '01/02/2014'` as "invalid date")
|
76
75
|
|
77
76
|
Note: run `ivalidate --help` to see the full list of supported switches
|
78
77
|
|
79
|
-
|
80
78
|
# Put it all together
|
81
79
|
You can put several `ivalidate` commands (for several data tables) in one single bash/sh file.
|
82
80
|
Besides `ivalidate`, we also support some other utilities to:
|
@@ -84,6 +82,6 @@ Besides `ivalidate`, we also support some other utilities to:
|
|
84
82
|
+ Modify data tables
|
85
83
|
+ Generate summary reports
|
86
84
|
|
87
|
-
|
85
|
+
See our `sample.sh` for a comprehensive example
|
88
86
|
|
89
87
|
|
data/README2.md
ADDED
@@ -0,0 +1,37 @@
|
|
1
|
+
### Overview
|
2
|
+
A validation criteria file has the following general structure:
|
3
|
+
```yaml
|
4
|
+
table:
|
5
|
+
- field: field1, field2, etc.
|
6
|
+
validations:
|
7
|
+
- rule:
|
8
|
+
code:
|
9
|
+
error:
|
10
|
+
impact:
|
11
|
+
solution:
|
12
|
+
```
|
13
|
+
|
14
|
+
##### Explanation:
|
15
|
+
+ A `table` has one or more `field` entries
|
16
|
+
+ A `field` has a `validations` section containing one or more rules
|
17
|
+
+ Each rule contains `rule` (required) and the corresponding `code`, `error`, `solution`, `impact`, `priority` (optional)
|
18
|
+
+ Rule: written in the format covered below
|
19
|
+
+ The other fields are free text on a single line; use \n as the line break
|
20
|
+
+ The value of `field` can be a single field name or a set of several field names, separated by commas
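As a minimal sketch of the structure described above, the following Ruby snippet parses a small criteria document and flattens it into one entry per rule, similar in spirit to what `IData::Validator#initialize` does. The table name, column names and error text are hypothetical, and the indentation of the YAML is an assumption, since the template above does not show it.

```ruby
require 'yaml'

# Hypothetical criteria document: one table, two field entries, three rules.
CRITERIA = <<~YAML
  items:
    - field: vendor_code
      validations:
        - rule: not null
          error: vendor_code is missing
        - rule: cross references "vendors.code"
    - field: vendor_code, vendor_name
      validations:
        - rule: unique
YAML

# Flattens { table => [ { field, validations: [ { rule, ... } ] } ] }
# into a list with one hash per rule.
def flatten_rules(config)
  config.flat_map do |table, fields|
    fields.flat_map do |field|
      field['validations'].map do |validation|
        validation.merge('table' => table, 'field' => field['field'])
      end
    end
  end
end

RULES = flatten_rules(YAML.safe_load(CRITERIA))
RULES.each { |r| puts "#{r['table']}.[#{r['field']}] -> #{r['rule']}" }
```

Optional keys such as `code`, `impact` or `solution` would simply travel along in each rule hash when present.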
|
21
|
+
|
22
|
+
## Writing rules
|
23
|
+
Supported rules include:
|
24
|
+
|
25
|
+
| Rule | Description | Example |
|
26
|
+
| ---- | ----------- | ------- |
|
27
|
+
| `not null` | The value of the corresponding field must not be empty | |
|
28
|
+
| `unique` | The value of the corresponding field must be unique within the table | |
|
29
|
+
| `matches "/regexp/"` | The value of the field must satisfy the format defined by `regexp` | |
|
30
|
+
| `not matches "/regexp/"` | Reverse counterpart of `matches` | |
|
31
|
+
| `consistent by "ref"` | The value of the corresponding field must be consistent with `ref` | |
|
32
|
+
| `cross references "table.field"` | The value of the field must reference another field, `table.field` | |
|
33
|
+
| `custom query "query"` | Uses a custom SQL `query` (for complex business logic that cannot be expressed with the other rules) | |
|
34
|
+
| `reverse query "query"` | Reverse counterpart of `custom query` | |
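To illustrate how a rule string from the table splits into a rule type and its argument, here is a short Ruby sketch. The pattern is copied from `SUPPORTED_RULES_REGEXP` in `bin/ivalidate2`; the `split_rule` helper is a hypothetical, simplified stand-in for `parse_rule` that returns `nil` instead of exiting on an unknown rule.

```ruby
# Rule keywords recognised at the start of a rule string.
RULE_PREFIX = /^\s*(not null|cross references|matches|not matches|custom query|reverse query|unique|consistent by)\s*/

# Splits a rule into [type, argument]; returns nil for an unsupported rule.
def split_rule(rule)
  type = rule[RULE_PREFIX]
  return nil if type.nil?

  # Drop the keyword, then strip the surrounding quotes from the argument.
  args = rule.sub(RULE_PREFIX, '').gsub(/(^\s*["']|["']\s*$)/, '')
  [type.strip, args]
end

p split_rule('cross references "vendors.code"')
p split_rule('consistent by "vendor_name"')
p split_rule('not null')
p split_rule('something else')
```

Rules without an argument, such as `not null` and `unique`, yield an empty argument string.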
|
35
|
+
|
36
|
+
## Others
|
37
|
+
TBD
|
data/bin/ivalidate2
ADDED
@@ -0,0 +1,426 @@
|
|
1
|
+
#!/usr/bin/env ruby
|
2
|
+
# DATA VALIDATOR
|
3
|
+
#
|
4
|
+
# @author Nghi Pham
|
5
|
+
# @date April 2014
|
6
|
+
#
|
7
|
+
# Data validation includes:
|
8
|
+
# * Uniqueness
|
9
|
+
# * Integrity (cross reference)
|
10
|
+
# * Data type: numeric, text, enum, etc.
|
11
|
+
# * Data format: text size, text values, enum, inclusion, exclusion, etc.
|
12
|
+
#
|
13
|
+
# Run ivalidate2 --help for guidelines/examples
|
14
|
+
#
|
15
|
+
require 'optparse'
|
16
|
+
require 'active_record'
|
17
|
+
require 'rubygems'
|
18
|
+
require 'logger'
|
19
|
+
|
20
|
+
$options = {}
|
21
|
+
parser = OptionParser.new("", 24) do |opts|
|
22
|
+
opts.banner = "\nProgram: Data Validator\nAuthor: MCKI\n\n"
|
23
|
+
|
24
|
+
opts.on("-c", "--config FILE", "Configuration file") do |v|
|
25
|
+
$options[:config] = v
|
26
|
+
end
|
27
|
+
|
28
|
+
opts.on("-h", "--host HOST", "PostgreSQL host") do |v|
|
29
|
+
$options[:host] = v
|
30
|
+
end
|
31
|
+
|
32
|
+
opts.on("-d", "--database DATABASE", "PostgreSQL database") do |v|
|
33
|
+
$options[:database] = v
|
34
|
+
end
|
35
|
+
|
36
|
+
opts.on("-u", "--username USER", "PostgreSQL username") do |v|
|
37
|
+
$options[:username] = v
|
38
|
+
end
|
39
|
+
|
40
|
+
opts.on("-p", "--password PASSWORD", "PostgreSQL password") do |v|
|
41
|
+
$options[:password] = v
|
42
|
+
end
|
43
|
+
|
44
|
+
opts.on("-l", "--listen PORT", "PostgreSQL listen port (default to 5432)") do |v|
|
45
|
+
$options[:listen] = v
|
46
|
+
end
|
47
|
+
|
48
|
+
opts.on_tail('--help', 'Displays this help') do
|
49
|
+
puts opts, "", help
|
50
|
+
exit
|
51
|
+
end
|
52
|
+
end
|
53
|
+
|
54
|
+
def help
|
55
|
+
return ''
|
56
|
+
end
|
57
|
+
|
58
|
+
begin
|
59
|
+
parser.parse!
|
60
|
+
rescue SystemExit => ex
|
61
|
+
exit
|
62
|
+
end
|
63
|
+
|
64
|
+
# Load parameters from ENVIRONMENT if exist
|
65
|
+
$options[:host] ||= ENV['HOST']
|
66
|
+
$options[:username] ||= ENV['USERNAME']
|
67
|
+
$options[:password] ||= ENV['PASSWORD']
|
68
|
+
$options[:listen] ||= ENV['LISTEN']
|
69
|
+
$options[:database] ||= ENV['DATABASE']
|
70
|
+
|
71
|
+
# validate parameters
|
72
|
+
if $options[:config].nil?
|
73
|
+
puts "\nPlease specify config file: -c\n\n"
|
74
|
+
exit
|
75
|
+
end
|
76
|
+
|
77
|
+
if $options[:host].nil?
|
78
|
+
puts "\nPlease specify host name: -h\n\n"
|
79
|
+
exit
|
80
|
+
end
|
81
|
+
|
82
|
+
if $options[:database].nil?
|
83
|
+
puts "\nPlease specify PostgreSQL database name: -d\n\n"
|
84
|
+
exit
|
85
|
+
end
|
86
|
+
|
87
|
+
if $options[:username].nil?
|
88
|
+
puts "\nPlease specify PostgreSQL username: -u\n\n"
|
89
|
+
exit
|
90
|
+
end
|
91
|
+
|
92
|
+
# Default value
|
93
|
+
$options[:listen] ||= 5432
|
94
|
+
|
95
|
+
# Database dump
|
96
|
+
ActiveRecord::Base.establish_connection(
|
97
|
+
'adapter' => 'postgresql',
|
98
|
+
'host' => $options[:host],
|
99
|
+
'database' => $options[:database],
|
100
|
+
'username' => $options[:username],
|
101
|
+
'password' => $options[:password],
|
102
|
+
'port' => $options[:listen],
|
103
|
+
'timeout' => 15000
|
104
|
+
)
|
105
|
+
|
106
|
+
class String
|
107
|
+
def not_null_sql
|
108
|
+
a = self.split(/\s*,\s*/)
|
109
|
+
sql = a.map{|s|
|
110
|
+
"#{s} IS NOT NULL AND length(trim(#{s}::text)) <> 0"
|
111
|
+
}.join(" AND ")
|
112
|
+
|
113
|
+
"(#{sql})"
|
114
|
+
end
|
115
|
+
|
116
|
+
def lower
|
117
|
+
a = self.split(/\s*,\s*/)
|
118
|
+
sql = a.map{|s|
|
119
|
+
"lower(#{s})"
|
120
|
+
}.join(",")
|
121
|
+
|
122
|
+
sql
|
123
|
+
end
|
124
|
+
|
125
|
+
def black; "\033[30m#{self}\033[0m" end
|
126
|
+
def red; "\033[31m#{self}\033[0m" end
|
127
|
+
def green; "\033[32m#{self}\033[0m" end
|
128
|
+
def brown; "\033[33m#{self}\033[0m" end
|
129
|
+
def blue; "\033[34m#{self}\033[0m" end
|
130
|
+
def magenta; "\033[35m#{self}\033[0m" end
|
131
|
+
def cyan; "\033[36m#{self}\033[0m" end
|
132
|
+
def gray; "\033[37m#{self}\033[0m" end
|
133
|
+
end
|
134
|
+
|
135
|
+
class Logger
|
136
|
+
alias_method :_old_info, :info
|
137
|
+
alias_method :_old_warn, :warn
|
138
|
+
alias_method :_old_error, :error
|
139
|
+
|
140
|
+
|
141
|
+
def error(msg)
|
142
|
+
_old_error(msg.red)
|
143
|
+
end
|
144
|
+
|
145
|
+
def info(msg)
|
146
|
+
_old_info(msg.green)
|
147
|
+
end
|
148
|
+
|
149
|
+
def warn(msg)
|
150
|
+
_old_warn(msg.brown)
|
151
|
+
end
|
152
|
+
end
|
153
|
+
|
154
|
+
module IData
|
155
|
+
class Validator
|
156
|
+
SUPPORTED_RULES_REGEXP = /^\s*(not null|cross references|matches|not matches|custom query|reverse query|unique|consistent by)\s*/
|
157
|
+
DEFAULT_ERROR_FIELD = 'validation_errors'
|
158
|
+
META_TABLE = 'validation_meta'
|
159
|
+
SUMMARY_TABLE = 'summary'
|
160
|
+
|
161
|
+
def initialize(file)
|
162
|
+
@config = YAML.load_file(file)
|
163
|
+
@logger = Logger.new(STDOUT)
|
164
|
+
@rules = []
|
165
|
+
@logger.formatter = proc do |severity, datetime, progname, msg|
|
166
|
+
"#{severity}: #{datetime} - #{msg}\n"
|
167
|
+
end
|
168
|
+
@config.each do |table, fields|
|
169
|
+
unless table_exists?(table)
|
170
|
+
@logger.warn "Table #{table} does not exist!"
|
171
|
+
end
|
172
|
+
@logger.info "Validating table #{table}"
|
173
|
+
fields.each do |field|
|
174
|
+
field['validations'].each do |rule|
|
175
|
+
p rule
|
176
|
+
type, args = parse_rule(rule['rule'])
|
177
|
+
options = rule.merge('table' => table, 'field' => field['field'], 'type' => type, 'args' => args)
|
178
|
+
options['code'] = Digest::SHA1.hexdigest([table, field['field'], rule['rule']].join(""))
|
179
|
+
@rules << options
|
180
|
+
end
|
181
|
+
end
|
182
|
+
end
|
183
|
+
end
|
184
|
+
|
185
|
+
def validate!
|
186
|
+
# reset the meta table
|
187
|
+
execute("DROP TABLE IF EXISTS #{META_TABLE}")
|
188
|
+
|
189
|
+
# validate
|
190
|
+
@logger.info "Validation started!"
|
191
|
+
@rules.each {|r| validate(r) }
|
192
|
+
|
193
|
+
# create meta table
|
194
|
+
create_table_from_array(@rules, table_name: META_TABLE, drop_table: true, extra_fields: ['impact', 'solution', 'count', 'percentage'])
|
195
|
+
|
196
|
+
# Done!
|
197
|
+
@logger.info "Validation done!"
|
198
|
+
end
|
199
|
+
|
200
|
+
def create_table_from_array(entries, options = {})
|
201
|
+
raise "Please specify :table_name" unless options[:table_name]
|
202
|
+
|
203
|
+
extra_fields = options[:extra_fields] || []
|
204
|
+
columns = entries.inject([]) {|x, i| x += i.keys } + extra_fields
|
205
|
+
columns.uniq!
|
206
|
+
|
207
|
+
if options[:drop_table]
|
208
|
+
execute "DROP TABLE IF EXISTS #{options[:table_name]}"
|
209
|
+
end
|
210
|
+
|
211
|
+
execute "CREATE TABLE IF NOT EXISTS #{options[:table_name]} ( #{columns.map{|c| quote_col_name(c.to_s) + ' VARCHAR' }.join(', ')} )"
|
212
|
+
|
213
|
+
insert_sql = entries.map { |r|
|
214
|
+
"INSERT INTO #{options[:table_name]}(#{columns.map{|c| quote_col_name(c.to_s)}.join(', ')}) VALUES(#{ columns.map{|c| quote(r[c])}.join(',') });"
|
215
|
+
}.join("")
|
216
|
+
|
217
|
+
execute insert_sql
|
218
|
+
end
|
219
|
+
|
220
|
+
def validate(options)
|
221
|
+
unless table_exists?(options['table'])
|
222
|
+
return
|
223
|
+
end
|
224
|
+
|
225
|
+
add_error_field(options)
|
226
|
+
case options['type']
|
227
|
+
|
228
|
+
when 'not null'
|
229
|
+
validate_not_null(options)
|
230
|
+
when 'custom query'
|
231
|
+
validate_custom_query(options)
|
232
|
+
when 'reverse query'
|
233
|
+
validate_reverse_query(options)
|
234
|
+
when 'custom query reversed'
|
235
|
+
validate_custom_query(options)
|
236
|
+
when 'matches'
|
237
|
+
validate_match(options)
|
238
|
+
when 'cross references'
|
239
|
+
validate_cross_reference(options)
|
240
|
+
when 'consistent by'
|
241
|
+
validate_consistent_by(options)
|
242
|
+
when 'unique'
|
243
|
+
validate_unique(options)
|
244
|
+
else
|
245
|
+
raise "Rule not recognized"
|
246
|
+
end
|
247
|
+
rescue StandardError => ex
|
248
|
+
@logger.warn ex.message.split(/[\n]/).first.strip
|
249
|
+
end
|
250
|
+
|
251
|
+
def report!
|
252
|
+
sql = @rules.map {|r|
|
253
|
+
"(SELECT unnest(string_to_array(#{DEFAULT_ERROR_FIELD}, ' || ')) as code, count(*), round((count(*) * 100)::numeric / (SELECT count(*) FROM #{r['table']}), 2)::varchar || '%' AS percentage FROM #{r['table']} GROUP BY code)"
|
254
|
+
}
|
255
|
+
|
256
|
+
execute("
|
257
|
+
UPDATE #{META_TABLE} meta
|
258
|
+
SET count = stat.count,
|
259
|
+
percentage = stat.percentage
|
260
|
+
FROM (#{sql.join(" UNION ")}) stat
|
261
|
+
WHERE meta.code = stat.code"
|
262
|
+
)
|
263
|
+
end
|
264
|
+
|
265
|
+
private
|
266
|
+
def add_error_field(options)
|
267
|
+
error_field = options['log_to'] || DEFAULT_ERROR_FIELD
|
268
|
+
execute("ALTER TABLE #{options['table']} ADD COLUMN #{error_field} VARCHAR DEFAULT '';")
|
269
|
+
rescue Exception => ex
|
270
|
+
# @todo
|
271
|
+
end
|
272
|
+
|
273
|
+
def parse_rule(rule)
|
274
|
+
# @todo
|
275
|
+
type = rule[SUPPORTED_RULES_REGEXP]
|
276
|
+
if type.nil?
|
277
|
+
@logger.error "Invalid rule: #{rule}"
|
278
|
+
exit(1)
|
279
|
+
end
|
280
|
+
|
281
|
+
type.strip!
|
282
|
+
args = rule.gsub(SUPPORTED_RULES_REGEXP, '').gsub(/(^\s*["']|["']\s*$)/, "")
|
283
|
+
return type, args
|
284
|
+
end
|
285
|
+
|
286
|
+
def validate_not_null(options)
|
287
|
+
@logger.info "Validating data presence: #{options['table']}.[#{options['field']}]"
|
288
|
+
options['error'] ||= "[#{options['field']}] is null"
|
289
|
+
execute <<-eos
|
290
|
+
#{ update_sql(options) }
|
291
|
+
WHERE #{options['field']} IS NULL OR length(trim(#{options['field']})) = 0;
|
292
|
+
eos
|
293
|
+
end
|
294
|
+
|
295
|
+
def validate_custom_query(options)
|
296
|
+
@logger.info "Validating with custom query: #{options['args'][0..50]}#{(options['args'].size > 50) ? '...' : ''}"
|
297
|
+
options['error'] ||= "Unknown"
|
298
|
+
execute <<-eos
|
299
|
+
#{ update_sql(options) }
|
300
|
+
WHERE NOT (#{options['args']})
|
301
|
+
eos
|
302
|
+
end
|
303
|
+
|
304
|
+
def validate_reverse_query(options)
|
305
|
+
@logger.info "Validating with reverse query: #{options['args'][0..50]}#{(options['args'].size > 50) ? '...' : ''}"
|
306
|
+
options['error'] ||= "Unknown"
|
307
|
+
execute <<-eos
|
308
|
+
#{ update_sql(options) }
|
309
|
+
WHERE (#{options['args']})
|
310
|
+
eos
|
311
|
+
end
|
312
|
+
|
313
|
+
def validate_consistent_by(options)
|
314
|
+
@logger.info "Validating integrity: #{options['table']}.[#{options['field']}] #{options['rule']}"
|
315
|
+
options['error'] ||= "Same [#{options['field']}] but different [#{options['args']}]"
|
316
|
+
|
317
|
+
f1_case = f1 = options['field']
|
318
|
+
f2_case = f2 = options['args']
|
319
|
+
|
320
|
+
if options['case_insensitive']
|
321
|
+
f1_case = f1_case.lower
|
322
|
+
f2_case = f2_case.lower
|
323
|
+
end
|
324
|
+
|
325
|
+
execute <<-eos
|
326
|
+
#{ update_sql(options) }
|
327
|
+
WHERE id IN (
|
328
|
+
SELECT unnest(array_agg(id)) FROM #{options['table']}
|
329
|
+
WHERE #{f1.not_null_sql} AND #{f2.not_null_sql}
|
330
|
+
GROUP BY #{f2_case}
|
331
|
+
HAVING COUNT(distinct #{f1_case}) > 1
|
332
|
+
);
|
333
|
+
eos
|
334
|
+
end
|
335
|
+
|
336
|
+
def validate_unique(options)
|
337
|
+
@logger.info "Validating uniqueness: #{options['table']}.[#{options['field']}]"
|
338
|
+
options['error'] ||= "[#{options['field']}] is not unique"
|
339
|
+
|
340
|
+
if options['case_insensitive']
|
341
|
+
f_lower = options['field'].lower
|
342
|
+
else
|
343
|
+
f_lower = options['field']
|
344
|
+
end
|
345
|
+
|
346
|
+
execute <<-eos
|
347
|
+
#{ update_sql(options) }
|
348
|
+
WHERE id IN (
|
349
|
+
SELECT unnest(array_agg(id)) FROM #{options['table']} GROUP BY #{f_lower}
|
350
|
+
HAVING count(*) > 1
|
351
|
+
) AND #{options['field'].not_null_sql};
|
352
|
+
eos
|
353
|
+
end
|
354
|
+
|
355
|
+
def validate_cross_reference(options)
|
356
|
+
@logger.info "Validating reference: #{options['table']}.[#{options['field']}] #{options['rule']}"
|
357
|
+
|
358
|
+
options['error'] ||= "[#{options['field']}] does not reference [#{options['args']}]"
|
359
|
+
|
360
|
+
field = options['field']
|
361
|
+
ref_table, ref_field = options['args'].split(/[\.]/)
|
362
|
+
|
363
|
+
if options['args'].split(/[\.]/).size != 2
|
364
|
+
raise "Invalid rule"
|
365
|
+
exit(0)
|
366
|
+
end
|
367
|
+
|
368
|
+
if options['case_insensitive']
|
369
|
+
join_condition = "on lower(origin.#{field}) = lower(target.#{ref_field})"
|
370
|
+
else
|
371
|
+
join_condition = "on origin.#{field}::text = target.#{ref_field}::text"
|
372
|
+
end
|
373
|
+
|
374
|
+
# @todo: poor performance here, think of a better SQL!!!
|
375
|
+
execute <<-eos
|
376
|
+
#{ update_sql(options) }
|
377
|
+
WHERE #{field} IN (
|
378
|
+
SELECT origin.#{field} from #{options['table']} origin LEFT JOIN #{ref_table} target
|
379
|
+
#{join_condition}
|
380
|
+
where target.#{ref_field} is null
|
381
|
+
) AND #{field} IS NOT NULL AND length(trim(#{field})) <> 0;
|
382
|
+
eos
|
383
|
+
end
|
384
|
+
|
385
|
+
def validate_match(options)
|
386
|
+
@logger.info "Validating regexp: #{options['table']}.[#{options['field']}] #{options['rule']}"
|
387
|
+
options['error'] ||= "[#{options['field']}] does not match #{options['args']}"
|
388
|
+
execute <<-eos
|
389
|
+
#{ update_sql(options) }
|
390
|
+
WHERE #{options['field']} IS NOT NULL AND length(trim(#{options['field']})) <> 0 AND #{options['field']} !~ '#{options['args']}';
|
391
|
+
eos
|
392
|
+
end
|
393
|
+
|
394
|
+
def update_sql(options)
|
395
|
+
log_to = options['log_to'] || DEFAULT_ERROR_FIELD
|
396
|
+
sql = "UPDATE #{options['table']} SET #{log_to} = array_to_string(string_to_array(#{log_to}, ' || ') || string_to_array(#{quote(options['code'])}, ' || '), ' || ')"
|
397
|
+
end
|
398
|
+
|
399
|
+
def execute(sql)
|
400
|
+
ActiveRecord::Base.connection.execute(sql)
|
401
|
+
end
|
402
|
+
|
403
|
+
def quote(str = "")
|
404
|
+
ActiveRecord::Base.connection.quote(str)
|
405
|
+
end
|
406
|
+
|
407
|
+
def quote_col_name(str = "")
|
408
|
+
ActiveRecord::Base.connection.quote_column_name(str)
|
409
|
+
end
|
410
|
+
|
411
|
+
def table_exists?(table)
|
412
|
+
results = execute "SELECT * FROM pg_tables WHERE schemaname='public' AND tablename = #{quote(table)};"
|
413
|
+
return !results.first.nil?
|
414
|
+
end
|
415
|
+
|
416
|
+
def drop_table(table_name)
|
417
|
+
execute "DROP TABLE IF EXISTS #{table_name}"
|
418
|
+
end
|
419
|
+
end
|
420
|
+
end
|
421
|
+
|
422
|
+
|
423
|
+
x = IData::Validator.new $options[:config]
|
424
|
+
x.validate!
|
425
|
+
x.report!
|
426
|
+
|