kevintyll-ofac 1.0.0

data/History.txt ADDED
@@ -0,0 +1,9 @@
1
+ == 0.1.0 2009-05-7
2
+
3
+ * 1 major enhancement:
4
+ * Table creation and data load task complete
5
+
6
+ == 1.0.0 2009-05-11
7
+
8
+ * 1 major enhancement:
9
+ * Initial release
data/LICENSE ADDED
@@ -0,0 +1,20 @@
1
+ Copyright (c) 2009 Kevin Tyll
2
+
3
+ Permission is hereby granted, free of charge, to any person obtaining
4
+ a copy of this software and associated documentation files (the
5
+ "Software"), to deal in the Software without restriction, including
6
+ without limitation the rights to use, copy, modify, merge, publish,
7
+ distribute, sublicense, and/or sell copies of the Software, and to
8
+ permit persons to whom the Software is furnished to do so, subject to
9
+ the following conditions:
10
+
11
+ The above copyright notice and this permission notice shall be
12
+ included in all copies or substantial portions of the Software.
13
+
14
+ THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
15
+ EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
16
+ MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
17
+ NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE
18
+ LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION
19
+ OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION
20
+ WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
data/PostInstall.txt ADDED
@@ -0,0 +1,11 @@
1
+ For more information on ofac, see http://kevintyll.github.com/ofac/
2
+
3
+ * To create the necessary db migration, from the command line, run:
4
+ script/generate ofac_migration
5
+ * Require the gem in your environment.rb file in the Rails::Initializer block:
6
+ config.gem 'kevintyll-ofac', :lib => 'ofac'
7
+ * To load your table with the current OFAC data, from the command line, run:
8
+ rake ofac:update_data
9
+
10
+ * The OFAC data is not updated with any regularity, but you can sign up for email notifications when the data changes at
11
+ http://www.treas.gov/offices/enforcement/ofac/sdn/index.shtml.
data/README.rdoc ADDED
@@ -0,0 +1,109 @@
1
+ = ofac
2
+
3
+ * http://kevintyll.github.com/ofac
4
+ * http://www.drexel-labs.com
5
+
6
+ * http://www.treas.gov/offices/enforcement/ofac/sdn/index.shtml
7
+
8
+ == DESCRIPTION:
9
+
10
+ ofac is a ruby gem that tries to find a match of a person's name and address against the
11
+ Office of Foreign Assets Control's Specially Designated Nationals list...the so called
12
+ terrorist watch list.
13
+
14
+ This gem, like the ssn_validator gem, started as a need for the company I work for, Clarity Services Inc.
15
+ We decided once again to create a gem out of it and share it with the community. Much
16
+ thanks goes to the management at Clarity Services Inc. for allowing this code to be open sourced. Thanks
17
+ also to Larry Berland at Clarity Services Inc. The matching logic in the ofac_match.rb file was derived from
18
+ his work.
19
+
20
+ == FEATURES:
21
+
22
+ Creates a score, 1 - 100, based on how well the name, address and city match the data on the SDN list. Since
23
+ we have to match on strings, the likelihood of an exact match is virtually nil. So we've created an
24
+ algorithm that creates a score. The better the match, the higher the score. A score of 100 would be
25
+ a perfect match.
26
+
27
+ The score is calculated by adding up the weightings of each part that is matched. So
28
+ if only name is matched, then the max score is the weight for <tt>:name</tt> which is 60
29
+
30
+ It's possible to get partial matches, which will add partial weight to the score. If there
31
+ is not a match on the element as it is passed in, then each word element gets broken down
32
+ and matches are tried on each partial element. The weighting is distributed equally for
33
+ each partial that is matched.
34
+
35
+ If exact matches are not made, then a sounds like match is attempted. Any match made by sounds like
36
+ is given 75% of its weight to the score.
37
+ Example:
38
+
39
+ If you are trying to match the name Kevin Tyll and there is a record for Smith, Kevin in the database, then
40
+ we will try to match both Kevin and Tyll separately, with each element Smith and Kevin. Since only Kevin
41
+ will find a match, and there were 2 elements in the searched name, the score will be added by half the weighting
42
+ for <tt>:name</tt>. So since the weight for <tt>:name</tt> is 60, then we will add 30 to the score.
43
+
44
+ If you are trying to match the name Kevin Gregory Tyll and there is a record for Tyll, Kevin in the database, then
45
+ we will try to match Kevin and Gregory and Tyll separately, with each element Tyll and Kevin. Since both Kevin
46
+ and Tyll will find a match, and there were 3 elements in the searched name, the score will be added by 2/3 the weighting
47
+ for <tt>:name</tt>. So since the weight for <tt>:name</tt> is 60, then we will add 40 to the score.
48
+
49
+ If you are trying to match the name Kevin Tyll and there is a record for Kevin Gregory Tyll in the database, then
50
+ we will try to match Kevin and Tyll separately, with each element Tyll and Kevin and Gregory. Since both Kevin
51
+ and Tyll will find a match, and there were 2 elements in the searched name, the score will be added by 2/2 the weighting
52
+ for <tt>:name</tt>. So since the weight for <tt>:name</tt> is 60, then we will add 60 to the score.
53
+
54
+ If you are trying to match the name Kevin Tyll, and there is a record for Teel, Kevin in the database, then an exact match
55
+ will be found for Kevin, and a sounds like match will be made for Tyll. Since there were 2 elements in the searched name,
56
+ and the weight for <tt>:name</tt> is 60, then each element is worth 30. Since Kevin was an exact match, it will add 30, and
57
+ since Tyll was a sounds like match, it will add 30 * .75 = 22.5. So the <tt>:name</tt> portion of the search will be worth 52.5, which rounds to 53.
58
+
59
+ Matches for name are made for both the name and any aliases in the OFAC database.
60
+
61
+ Matches for <tt>:city</tt> and <tt>:address</tt> will only be added to the score if there is first a match on <tt>:name</tt>.
62
+
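The partial-match arithmetic in the examples above can be sketched roughly as follows. This is a minimal illustration, not the gem's implementation: `partial_name_score` and the `sounds_like` callable are hypothetical names, the sketch only covers the word-by-word fallback (not the whole-string comparisons), and the toy phonetic test simply hard-codes the Tyll/Teel pair from the README in place of a real sounds-like algorithm.

```ruby
NAME_WEIGHT = 60 # weight for :name, as stated in the README

# Hypothetical helper: scores a searched name against one record name.
# sounds_like is any callable taking two upcased words and returning true/false.
def partial_name_score(searched, record, sounds_like)
  tokens  = searched.upcase.split(/\W+/).reject { |t| t.empty? }
  targets = record.upcase.split(/\W+/).reject { |t| t.empty? }
  return 0 if tokens.empty?

  per_token = 1.0 / tokens.length # each searched word carries an equal share
  fraction = tokens.inject(0.0) do |sum, token|
    if targets.include?(token)
      sum + per_token                 # exact word match: full share
    elsif targets.any? { |target| sounds_like.call(target, token) }
      sum + per_token * 0.75          # sounds-like match: 75% of the share
    else
      sum
    end
  end
  (NAME_WEIGHT * fraction).round
end

# Toy stand-in for a phonetic comparison (assumption, for illustration only).
TOY_SOUNDS_LIKE = lambda { |a, b| [a, b].sort == %w[TEEL TYLL] }

puts partial_name_score('Kevin Tyll', 'Smith, Kevin', TOY_SOUNDS_LIKE)         # => 30
puts partial_name_score('Kevin Gregory Tyll', 'Tyll, Kevin', TOY_SOUNDS_LIKE)  # => 40
puts partial_name_score('Kevin Tyll', 'Kevin Gregory Tyll', TOY_SOUNDS_LIKE)   # => 60
puts partial_name_score('Kevin Tyll', 'Teel, Kevin', TOY_SOUNDS_LIKE)          # => 53
```

The last case is 60 * (0.5 + 0.5 * 0.75) = 52.5, which rounds to 53.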
63
+ == SYNOPSIS:
64
+ Accepts a hash with the identity's demographic information
65
+
66
+ Ofac.new({:name => 'Oscar Hernandez', :city => 'Clearwater', :address => '123 somewhere ln'})
67
+
68
+ <tt>:name</tt> is required to get a score. If <tt>:name</tt> is missing, an error will not be thrown, but a score of 0 will be returned.
69
+
70
+ The more information provided, the higher the score could be. A score of 100 would mean all fields
71
+ were passed in, and all fields were 100% matches. If only the name is passed in without an address,
72
+ it will be impossible to get a score of 100, even if the name matches perfectly.
73
+
74
+ Acceptable hash keys and their weighting in score calculation:
75
+
76
+ * <tt>:name</tt> (weighting = 60%) (required) This can be a person, business, or marine vessel
77
+ * <tt>:address</tt> (weighting = 10%)
78
+ * <tt>:city</tt> (weighting = 30%)
79
+
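To make the ceiling on the score concrete, here is a small sketch of the highest score an identity hash could reach given which fields it supplies. `max_possible_score` is a hypothetical helper, not part of the gem, and treating whitespace-only values as missing is an assumption of this sketch.

```ruby
# Weights as listed in the README.
WEIGHTS = {:name => 60, :address => 10, :city => 30}

# Hypothetical helper: upper bound on the score for a given identity hash.
# Without a name nothing is scored, so the bound is 0.
def max_possible_score(identity)
  return 0 if identity[:name].to_s.strip.empty?

  WEIGHTS.inject(0) do |sum, (field, weight)|
    identity[field].to_s.strip.empty? ? sum : sum + weight
  end
end

puts max_possible_score(:name => 'Kevin Tyll')                          # => 60
puts max_possible_score(:name => 'Kevin Tyll', :city => 'Clearwater')   # => 90
puts max_possible_score(:city => 'Clearwater')                          # => 0
```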
80
+ * Instantiate the object with the identity's name, street address, and city.
81
+ ofac = Ofac.new(:name => 'Kevin Tyll', :city => 'Clearwater', :address => '123 Somewhere Ln.')
82
+
83
+ * Then get the score
84
+ ofac.score => return the score 1 - 100
85
+
86
+ * You can also get the list of all the partial matches with the score of each record.
87
+ ofac.possible_hits => returns an array of hashes.
88
+
89
+ == REQUIREMENTS:
90
+
91
+ * Rails 2.0.0 or greater
92
+
93
+ == INSTALL:
94
+
95
+ * To install the gem:
96
+ sudo gem install kevintyll-ofac
97
+ * To create the necessary db migration, from the command line, run:
98
+ script/generate ofac_migration
99
+ * Require the gem in your environment.rb file in the Rails::Initializer block:
100
+ config.gem 'kevintyll-ofac', :lib => 'ofac'
101
+ * To load your table with the current OFAC data, from the command line, run:
102
+ rake ofac:update_data
103
+
104
+ * The OFAC data is not updated with any regularity, but you can sign up for email notifications when the data changes at
105
+ http://www.treas.gov/offices/enforcement/ofac/sdn/index.shtml.
106
+
107
+ == Copyright
108
+
109
+ Copyright (c) 2009 Kevin Tyll. See LICENSE for details.
data/Rakefile ADDED
@@ -0,0 +1,57 @@
1
+ require 'rubygems'
2
+ require 'rake'
3
+
4
+ begin
5
+ require 'jeweler'
6
+ Jeweler::Tasks.new do |gem|
7
+ gem.name = "ofac"
8
+ gem.summary = %Q{Attempts to find a hit on the Office of Foreign Assets Control's Specially Designated Nationals list.}
9
+ gem.description = %Q{Attempts to find a hit on the Office of Foreign Assets Control's Specially Designated Nationals list.}
10
+ gem.email = "kevintyll@gmail.com"
11
+ gem.homepage = "http://github.com/kevintyll/ofac"
12
+ gem.authors = ["Kevin Tyll"]
13
+ gem.post_install_message = File.readlines("PostInstall.txt").join("")
14
+ # gem is a Gem::Specification... see http://www.rubygems.org/read/chapter/20 for additional settings
15
+ end
16
+ rescue LoadError
17
+ puts "Jeweler not available. Install it with: sudo gem install technicalpickles-jeweler -s http://gems.github.com"
18
+ end
19
+
20
+ require 'rake/testtask'
21
+ Rake::TestTask.new(:test) do |test|
22
+ test.libs << 'lib' << 'test'
23
+ test.pattern = 'test/**/*_test.rb'
24
+ test.verbose = true
25
+ end
26
+
27
+ begin
28
+ require 'rcov/rcovtask'
29
+ Rcov::RcovTask.new do |test|
30
+ test.libs << 'test'
31
+ test.pattern = 'test/**/*_test.rb'
32
+ test.verbose = true
33
+ end
34
+ rescue LoadError
35
+ task :rcov do
36
+ abort "RCov is not available. In order to run rcov, you must: sudo gem install spicycode-rcov"
37
+ end
38
+ end
39
+
40
+
41
+ task :default => :test
42
+
43
+ require 'rake/rdoctask'
44
+ Rake::RDocTask.new do |rdoc|
45
+ if File.exist?('VERSION.yml')
46
+ config = YAML.load(File.read('VERSION.yml'))
47
+ version = "#{config[:major]}.#{config[:minor]}.#{config[:patch]}"
48
+ else
49
+ version = ""
50
+ end
51
+
52
+ rdoc.rdoc_dir = 'rdoc'
53
+ rdoc.title = "ofac #{version}"
54
+ rdoc.rdoc_files.include('README*')
55
+ rdoc.rdoc_files.include('lib/**/*.rb')
56
+ end
57
+
data/VERSION.yml ADDED
@@ -0,0 +1,4 @@
1
+ ---
2
+ :minor: 0
3
+ :patch: 0
4
+ :major: 1
@@ -0,0 +1,12 @@
1
+ class OfacMigrationGenerator < Rails::Generator::Base
2
+ def manifest
3
+ record do |m|
4
+ #m.directory File.join('db')
5
+ m.migration_template 'migration.rb', 'db/migrate'
6
+ end
7
+ end
8
+
9
+ def file_name
10
+ "create_ofac_sdn_table"
11
+ end
12
+ end
@@ -0,0 +1,30 @@
1
+ class CreateOfacSdnTable < ActiveRecord::Migration
2
+
3
+ def self.up
4
+ create_table :ofac_sdns do |t|
5
+ t.text :name
6
+ t.string :sdn_type
7
+ t.string :program
8
+ t.string :title
9
+ t.string :vessel_call_sign
10
+ t.string :vessel_type
11
+ t.string :vessel_tonnage
12
+ t.string :gross_registered_tonnage
13
+ t.string :vessel_flag
14
+ t.string :vessel_owner
15
+ t.text :remarks
16
+ t.text :address
17
+ t.string :city
18
+ t.string :country
19
+ t.string :address_remarks
20
+ t.string :alternate_identity_type
21
+ t.text :alternate_identity_name
22
+ t.string :alternate_identity_remarks
23
+ t.timestamps
24
+ end
25
+ end
26
+
27
+ def self.down
28
+ drop_table :ofac_sdns
29
+ end
30
+ end
data/lib/ofac.rb ADDED
@@ -0,0 +1,9 @@
1
+ require 'rake'
2
+ require 'ofac/ruby_string_extensions'
3
+ require 'ofac/ofac_match'
4
+ require 'ofac/models/ofac_sdn'
5
+ require 'ofac/models/ofac_sdn_loader'
6
+ require 'ofac/models/ofac'
7
+
8
+ # Load rake file
9
+ import "#{File.dirname(__FILE__)}/tasks/ofac.rake"
@@ -0,0 +1,119 @@
1
+ require 'activerecord'
2
+ require 'active_record/connection_adapters/mysql_adapter'
3
+
4
+ class Ofac
5
+
6
+
7
+ # Accepts a hash with the identity's demographic information
8
+ #
9
+ # Ofac.new({:name => 'Oscar Hernandez', :city => 'Clearwater', :address => '123 somewhere ln'})
10
+ #
11
+ # <tt>:name</tt> is required to get a score. If <tt>:name</tt> is missing, an error will not be thrown, but a score of 0 will be returned.
12
+ #
13
+ # The more information provided, the higher the score could be. A score of 100 would mean all fields
14
+ # were passed in, and all fields were 100% matches. If only the name is passed in without an address,
15
+ # it will be impossible to get a score of 100, even if the name matches perfectly.
16
+ #
17
+ # Acceptable hash keys and their weighting in score calculation:
18
+ #
19
+ # * <tt>:name</tt> (weighting = 60%) (required) This can be a person, business, or marine vessel
20
+ # * <tt>:address</tt> (weighting = 10%)
21
+ # * <tt>:city</tt> (weighting = 30%)
22
+ def initialize(identity)
23
+ @identity = identity
24
+ end
25
+
26
+ # Creates a score, 1 - 100, based on how well the name and address match the data on the
27
+ # SDN (Specially Designated Nationals) list.
28
+ #
29
+ # The score is calculated by adding up the weightings of each part that is matched. So
30
+ # if only name is matched, then the max score is the weight for <tt>:name</tt> which is 60
31
+ #
32
+ # It's possible to get partial matches, which will add partial weight to the score. If there
33
+ # is not a match on the element as it is passed in, then each word element gets broken down
34
+ # and matches are tried on each partial element. The weighting is distributed equally for
35
+ # each partial that is matched.
36
+ #
37
+ # If exact matches are not made, then a sounds like match is attempted. Any match made by sounds like
38
+ # is given 75% of its weight to the score.
39
+ #
40
+ # Example:
41
+ #
42
+ # If you are trying to match the name Kevin Tyll and there is a record for Smith, Kevin in the database, then
43
+ # we will try to match both Kevin and Tyll separately, with each element Smith and Kevin. Since only Kevin
44
+ # will find a match, and there were 2 elements in the searched name, the score will be added by half the weighting
45
+ # for <tt>:name</tt>. So since the weight for <tt>:name</tt> is 60, then we will add 30 to the score.
46
+ #
47
+ # If you are trying to match the name Kevin Gregory Tyll and there is a record for Tyll, Kevin in the database, then
48
+ # we will try to match Kevin and Gregory and Tyll separately, with each element Tyll and Kevin. Since both Kevin
49
+ # and Tyll will find a match, and there were 3 elements in the searched name, the score will be added by 2/3 the weighting
50
+ # for <tt>:name</tt>. So since the weight for <tt>:name</tt> is 60, then we will add 40 to the score.
51
+ #
52
+ # If you are trying to match the name Kevin Tyll and there is a record for Kevin Gregory Tyll in the database, then
53
+ # we will try to match Kevin and Tyll separately, with each element Tyll and Kevin and Gregory. Since both Kevin
54
+ # and Tyll will find a match, and there were 2 elements in the searched name, the score will be added by 2/2 the weighting
55
+ # for <tt>:name</tt>. So since the weight for <tt>:name</tt> is 60, then we will add 60 to the score.
56
+ #
57
+ # If you are trying to match the name Kevin Tyll, and there is a record for Teel, Kevin in the database, then an exact match
58
+ # will be found for Kevin, and a sounds like match will be made for Tyll. Since there were 2 elements in the searched name,
59
+ # and the weight for <tt>:name</tt> is 60, then each element is worth 30. Since Kevin was an exact match, it will add 30, and
60
+ # since Tyll was a sounds like match, it will add 30 * .75 = 22.5. So the <tt>:name</tt> portion of the search will be worth 52.5, which rounds to 53.
61
+ #
62
+ # Matches for name are made for both the name and any aliases in the OFAC database.
63
+ #
64
+ # Matches for <tt>:city</tt> and <tt>:address</tt> will only be added to the score if there is first a match on <tt>:name</tt>.
65
+ def score
66
+ @score || calculate_score
67
+ end
68
+
69
+ # Returns an array of hashes of records in the OFAC data that found partial matches with that record's score.
70
+ #
71
+ # Ofac.new({:name => 'Oscar Hernandez', :city => 'Clearwater', :address => '123 somewhere ln'}).possible_hits
72
+ #returns
73
+ # [{:address=>"123 Somewhere Ln", :score=>100, :name=>"HERNANDEZ, Oscar|GUAMATUR, S.A.", :city=>"Clearwater"}, {:address=>"123 Somewhere Ln", :score=>100, :name=>"HERNANDEZ, Oscar|Alternate Name", :city=>"Clearwater"}]
74
+ #
75
+ def possible_hits
76
+ @possible_hits || retrieve_possible_hits
77
+ end
78
+
79
+ private
80
+
81
+ def retrieve_possible_hits
82
+ score
83
+ @possible_hits
84
+ end
85
+
86
+ def calculate_score
87
+ unless @identity[:name].to_s == ''
88
+ if OfacSdn.connection.kind_of?(ActiveRecord::ConnectionAdapters::MysqlAdapter)
89
+ #first get a list from the database of possible matches by name
90
+ #this query is pretty liberal, we just want to get a list of possible
91
+ #matches from the database that we can run through our ruby matching algorithm
92
+ partial_name = @identity[:name].gsub(/\W/,'|') # non-bang gsub: gsub! returns nil when nothing is replaced (e.g. a one-word name)
93
+ name_array = partial_name.split('|')
94
+ name_array.delete('')
95
+ sql_name_partial = name_array.collect {|partial_name| "INSTR(SUBSTR(SOUNDEX(concat('O',name)), 2), REPLACE(SUBSTR(SOUNDEX('O#{partial_name}'), 2), '0', '')) > 0"}.join(' and ')
96
+ sql_alt_name_partial = name_array.collect {|partial_name| "INSTR(SUBSTR(SOUNDEX(concat('O',alternate_identity_name)), 2), REPLACE(SUBSTR(SOUNDEX('O#{partial_name}'), 2), '0', '')) > 0"}.join(' and ')
97
+ ##this sql for getting "accurate sounds like" functionality comes from:
98
+ #http://jgeewax.wordpress.com/2006/07/21/efficient-sounds-like-searches-in-mysql/
99
+ possible_sdns = OfacSdn.connection.select_all("select concat(name,'|', alternate_identity_name) name, address, city
100
+ from ofac_sdns
101
+ where name is not null
102
+ and (((#{sql_name_partial}))
103
+ or ((#{sql_alt_name_partial})))")
104
+ else
105
+ possible_sdns = OfacSdn.find(:all, :select => 'name, alternate_identity_name, address, city').collect{|sdn| {:name => "#{sdn.name}|#{sdn.alternate_identity_name}", :address => sdn.address, :city => sdn.city}}
106
+ end
107
+
108
+ match = OfacMatch.new({:name => {:weight => 60, :token => "#{@identity[:name]}"},
109
+ :address => {:weight => 10, :token => @identity[:address]},
110
+ :city => {:weight => 30, :token => @identity[:city]}})
111
+
112
+ score = match.score(possible_sdns)
113
+ @possible_hits = match.possible_hits
114
+ end
115
+ @score = score || 0
116
+ return @score
117
+ end
118
+
119
+ end
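One detail of splitting a name into searchable parts is easy to trip over: `String#gsub!` returns `nil` when it makes no substitution, so a name containing only word characters cannot be chained through `gsub!` safely, and the bang form also mutates the caller's string. Below is a minimal sketch of a non-mutating tokenizer; `name_tokens` is a hypothetical helper and not part of the gem.

```ruby
# Hypothetical helper: splits a name into word tokens without mutating it.
def name_tokens(name)
  name.to_s.split(/\W+/).reject { |token| token.empty? }
end

single = 'Kevin'.dup
p single.gsub!(/\W/, '|')          # nil, because nothing was replaced
p name_tokens(single)              # ["Kevin"]
p name_tokens('Hernandez, Oscar')  # ["Hernandez", "Oscar"]
p name_tokens(nil)                 # []
```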
@@ -0,0 +1,5 @@
1
+ require 'activerecord'
2
+
3
+ class OfacSdn < ActiveRecord::Base
4
+
5
+ end
@@ -0,0 +1,305 @@
1
+ require 'net/http'
+ require 'tempfile'
2
+ require 'activerecord'
3
+ require 'active_record/connection_adapters/mysql_adapter'
4
+
5
+ class OfacSdnLoader
6
+
7
+
8
+ #Loads the most recent file from http://www.treas.gov/offices/enforcement/ofac/sdn/delimit/index.shtml
9
+ def self.load_current_sdn_file
10
+ puts "Reloading OFAC sdn data"
11
+ puts "Downloading OFAC data from http://www.treas.gov/offices/enforcement/ofac/sdn"
12
+ #get the 3 data files
13
+ sdn = Tempfile.new('sdn')
14
+ sdn.write(Net::HTTP.get(URI.parse('http://www.treas.gov/offices/enforcement/ofac/sdn/delimit/sdn.pip')))
15
+ sdn.rewind
16
+ address = Tempfile.new('sdn')
17
+ address.write(Net::HTTP.get(URI.parse('http://www.treas.gov/offices/enforcement/ofac/sdn/delimit/add.pip')))
18
+ address.rewind
19
+ alt = Tempfile.new('sdn')
20
+ alt.write(Net::HTTP.get(URI.parse('http://www.treas.gov/offices/enforcement/ofac/sdn/delimit/alt.pip')))
21
+ alt.rewind
22
+
23
+ if OfacSdn.connection.kind_of?(ActiveRecord::ConnectionAdapters::MysqlAdapter)
24
+ puts "Converting file to csv format for Mysql import. This could take several minutes."
25
+
26
+ csv_file = convert_to_flattened_csv(sdn, address, alt)
27
+
28
+ bulk_mysql_update(csv_file)
29
+ else
30
+ active_record_file_load(sdn, address, alt)
31
+ end
32
+
33
+ sdn.close
34
+ address.close
35
+ alt.close
36
+ end
37
+
38
+
39
+ private
40
+
41
+ #converts the file's null value (-0-) to an empty string
42
+ #and removes " chars.
43
+ def self.clean_file_string(line)
44
+ line.gsub!(/-0-(\s)?/,'')
45
+ line.gsub!(/\n/,'')
46
+ line.gsub(/\"/,'')
47
+ end
48
+
49
+ #split the line into an array
50
+ def self.convert_line_to_array(line)
51
+ clean_file_string(line).split('|') unless line.nil?
52
+ end
53
+
54
+ #returns 2 arrays of the records matching the sdn primary key
55
+ #1 array of address records and one array of alt records
56
+ def self.foreign_key_records(sdn_id)
57
+ address_records = []
58
+ alt_records = []
59
+
60
+ #the first element in each array is the primary and foreign keys
61
+ #we are denormalizing the data
62
+ if @current_address_hash && @current_address_hash[:id] == sdn_id
63
+ address_records << @current_address_hash
64
+ loop do
65
+ @current_address_hash = address_text_to_hash(@address.gets)
66
+ if @current_address_hash && @current_address_hash[:id] == sdn_id
67
+ address_records << @current_address_hash
68
+ else
69
+ break
70
+ end
71
+ end
72
+ end
73
+
74
+ if @current_alt_hash && @current_alt_hash[:id] == sdn_id
75
+ alt_records << @current_alt_hash
76
+ loop do
77
+ @current_alt_hash = alt_text_to_hash(@alt.gets)
78
+ if @current_alt_hash && @current_alt_hash[:id] == sdn_id
79
+ alt_records << @current_alt_hash
80
+ else
81
+ break
82
+ end
83
+ end
84
+ end
85
+ return address_records, alt_records
86
+ end
87
+
88
+ def self.sdn_text_to_hash(line)
89
+ unless line.nil?
90
+ value_array = convert_line_to_array(line)
91
+ {:id => value_array[0],
92
+ :name => value_array[1],
93
+ :sdn_type => value_array[2],
94
+ :program => value_array[3],
95
+ :title => value_array[4],
96
+ :vessel_call_sign => value_array[5],
97
+ :vessel_type => value_array[6],
98
+ :vessel_tonnage => value_array[7],
99
+ :gross_registered_tonnage => value_array[8],
100
+ :vessel_flag => value_array[9],
101
+ :vessel_owner => value_array[10],
102
+ :remarks => value_array[11]
103
+ }
104
+ end
105
+ end
106
+
107
+ def self.address_text_to_hash(line)
108
+ unless line.nil?
109
+ value_array = convert_line_to_array(line)
110
+ {:id => value_array[0],
111
+ :address => value_array[2],
112
+ :city => value_array[3],
113
+ :country => value_array[4],
114
+ :address_remarks => value_array[5]
115
+ }
116
+ end
117
+ end
118
+
119
+ def self.alt_text_to_hash(line)
120
+ unless line.nil?
121
+ value_array = convert_line_to_array(line)
122
+ {:id => value_array[0],
123
+ :alternate_identity_type => value_array[2],
124
+ :alternate_identity_name => value_array[3],
125
+ :alternate_identity_remarks => value_array[4]
126
+ }
127
+ end
128
+ end
129
+
130
+ def self.convert_hash_to_mysql_import_string(record_hash)
131
+ # empty field for id to be generated by mysql.
132
+ new_line = "``|" +
133
+ # :name
134
+ "`#{record_hash[:name]}`|" +
135
+ # :sdn_type
136
+ "`#{record_hash[:sdn_type]}`|" +
137
+ # :program
138
+ "`#{record_hash[:program]}`|" +
139
+ # :title
140
+ "`#{record_hash[:title]}`|" +
141
+ # :vessel_call_sign
142
+ "`#{record_hash[:vessel_call_sign]}`|" +
143
+ # :vessel_type
144
+ "`#{record_hash[:vessel_type]}`|" +
145
+ # :vessel_tonnage
146
+ "`#{record_hash[:vessel_tonnage]}`|" +
147
+ # :gross_registered_tonnage
148
+ "`#{record_hash[:gross_registered_tonnage]}`|" +
149
+ # :vessel_flag
150
+ "`#{record_hash[:vessel_flag]}`|" +
151
+ # :vessel_owner
152
+ "`#{record_hash[:vessel_owner]}`|" +
153
+ # :remarks
154
+ "`#{record_hash[:remarks]}`|" +
155
+ # :address
156
+ "`#{record_hash[:address]}`|" +
157
+ # :city
158
+ "`#{record_hash[:city]}`|" +
159
+ # :country
160
+ "`#{record_hash[:country]}`|" +
161
+ # :address_remarks
162
+ "`#{record_hash[:address_remarks]}`|" +
163
+ # :alternate_identity_type
164
+ "`#{record_hash[:alternate_identity_type]}`|" +
165
+ # :alternate_identity_name
166
+ "`#{record_hash[:alternate_identity_name]}`|" +
167
+ # :alternate_identity_remarks
168
+ "`#{record_hash[:alternate_identity_remarks]}`|" +
169
+ #:created_at
170
+ "`#{Time.now.to_s(:db)}`|" +
171
+ # updated_at
172
+ "`#{Time.now.to_s(:db)}`" + "\n"
173
+
174
+ new_line
175
+ end
176
+
177
+ def self.convert_to_flattened_csv(sdn_file, address_file, alt_file)
178
+ @address = address_file
179
+ @alt = alt_file
180
+
181
+ csv_file = Tempfile.new("ofac") # create temp file for converted csv format.
182
+ #get the first line from the address and alt files
183
+ @current_address_hash = address_text_to_hash(@address.gets)
184
+ @current_alt_hash = alt_text_to_hash(@alt.gets)
185
+
186
+ start = Time.now
187
+
188
+ sdn_file.each_with_index do |line, i|
189
+
190
+ #initialize the address and alt attributes to empty strings
191
+ address_attributes = address_text_to_hash("|||||")
192
+ alt_attributes = alt_text_to_hash("||||")
193
+
194
+ sdn_attributes = sdn_text_to_hash(line)
195
+
196
+ #get the foreign key records for this sdn
197
+ address_records, alt_records = foreign_key_records(sdn_attributes[:id])
198
+
199
+ if address_records.empty?
200
+ #no matching address records, so initialized blank values will be used.
201
+ if alt_records.empty?
202
+ #no matching alt records either, so initialized blank values will be used.
203
+ csv_file.syswrite(convert_hash_to_mysql_import_string(sdn_attributes.merge(address_attributes).merge(alt_attributes)))
204
+ else
205
+ alt_records.each do |alt|
206
+ csv_file.syswrite(convert_hash_to_mysql_import_string(sdn_attributes.merge(address_attributes).merge(alt)))
207
+ end
208
+ end
209
+ else
210
+ address_records.each do |address|
211
+ if alt_records.empty?
212
+ #no matching alt records, so initialized blank values will be used.
213
+ csv_file.syswrite(convert_hash_to_mysql_import_string(sdn_attributes.merge(address).merge(alt_attributes)))
214
+ else
215
+ alt_records.each do |alt|
216
+ csv_file.syswrite(convert_hash_to_mysql_import_string(sdn_attributes.merge(address).merge(alt)))
217
+ end
218
+ end
219
+ end
220
+ end
221
+ puts "#{i} records processed." if (i % 1000 == 0) && (i > 0)
222
+ end
223
+ puts "File conversion ran for #{(Time.now - start) / 60} minutes."
224
+ return csv_file
225
+ end
226
+
227
+ def self.active_record_file_load(sdn_file, address_file, alt_file)
228
+ @address = address_file
229
+ @alt = alt_file
230
+
231
+ #OFAC data is a complete list, so we have to dump and load
232
+ OfacSdn.delete_all
233
+
234
+ #get the first line from the address and alt files
235
+ @current_address_hash = address_text_to_hash(@address.gets)
236
+ @current_alt_hash = alt_text_to_hash(@alt.gets)
237
+ attributes = {}
238
+ sdn_file.each_with_index do |line, i|
239
+
240
+ #initialize the address and alt attributes to empty strings
241
+ address_attributes = address_text_to_hash("|||||")
242
+ alt_attributes = alt_text_to_hash("||||")
243
+
244
+ sdn_attributes = sdn_text_to_hash(line)
245
+
246
+ #get the foreign key records for this sdn
247
+ address_records, alt_records = foreign_key_records(sdn_attributes[:id])
248
+
249
+ if address_records.empty?
250
+ #no matching address records, so initialized blank values will be used.
251
+ if alt_records.empty?
252
+ #no matching alt records either, so initialized blank values will be used.
253
+ attributes = sdn_attributes.merge(address_attributes).merge(alt_attributes)
254
+ attributes.delete(:id)
255
+ OfacSdn.create(attributes)
256
+ else
257
+ alt_records.each do |alt|
258
+ attributes = sdn_attributes.merge(address_attributes).merge(alt)
259
+ attributes.delete(:id)
260
+ OfacSdn.create(attributes)
261
+ end
262
+ end
263
+ else
264
+ address_records.each do |address|
265
+ if alt_records.empty?
266
+ #no matching alt records, so initialized blank values will be used.
267
+ attributes = sdn_attributes.merge(address).merge(alt_attributes)
268
+ attributes.delete(:id)
269
+ OfacSdn.create(attributes)
270
+ else
271
+ alt_records.each do |alt|
272
+ attributes = sdn_attributes.merge(address).merge(alt)
273
+ attributes.delete(:id)
274
+ OfacSdn.create(attributes)
275
+ end
276
+ end
277
+ end
278
+ end
279
+
280
+ puts "#{i} records processed." if (i % 5000 == 0) && (i > 0)
281
+ end
282
+ end
283
+
284
+ # For mysql, use:
285
+ # LOAD DATA LOCAL INFILE '/path/to/file.csv' INTO TABLE ofac_sdns FIELDS TERMINATED BY '|' ENCLOSED BY "`" LINES TERMINATED BY '\n';
286
+ # This is a much faster way of loading large amounts of data into mysql. For information on the LOAD DATA command
287
+ # see http://dev.mysql.com/doc/refman/5.1/en/load-data.html
288
+ def self.bulk_mysql_update(csv_file)
289
+ puts "Deleting all records in ofac_sdn..."
290
+
291
+ #OFAC data is a complete list, so we have to dump and load
292
+ OfacSdn.delete_all
293
+
294
+ puts "Importing into Mysql..."
295
+
296
+ mysql_command = <<-TEXT
297
+ LOAD DATA LOCAL INFILE '#{csv_file.path}' REPLACE INTO TABLE ofac_sdns FIELDS TERMINATED BY '|' ENCLOSED BY "`" LINES TERMINATED BY '\n';
298
+ TEXT
299
+
300
+ OfacSdn.connection.execute(mysql_command)
301
+ puts "Mysql import complete."
302
+
303
+ end
304
+
305
+ end
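The denormalizing step in the loader amounts to a cross product: each SDN record is repeated once per (address, alternate identity) pair, with blank values standing in when either list is empty. Below is a compact sketch of that idea, assuming the records are already parsed into hashes. `flatten_sdn`, `BLANK_ADDRESS`, and `BLANK_ALT` are hypothetical names, and the sample data is made up.

```ruby
BLANK_ADDRESS = {:address => '', :city => '', :country => '', :address_remarks => ''}
BLANK_ALT     = {:alternate_identity_type => '', :alternate_identity_name => '',
                 :alternate_identity_remarks => ''}

# Hypothetical helper: returns one flat hash per address/alt combination.
def flatten_sdn(sdn, address_records, alt_records)
  addresses = address_records.empty? ? [BLANK_ADDRESS] : address_records
  alts      = alt_records.empty?     ? [BLANK_ALT]     : alt_records

  addresses.flat_map do |address|
    alts.map { |alt| sdn.merge(address).merge(alt) }
  end
end

# Made-up sample record, for illustration only.
sdn       = {:id => '1', :name => 'EXAMPLE ENTITY'}
addresses = [{:id => '1', :city => 'City A'}, {:id => '1', :city => 'City B'}]
alts      = [{:id => '1', :alternate_identity_name => 'ALIAS ONE'}]

rows = flatten_sdn(sdn, addresses, alts)
puts rows.length                                 # => 2
puts flatten_sdn(sdn, [], []).first[:city].inspect  # => ""
```

Expressing the four nested branches as a single cross product over "records or one blank placeholder" keeps the bulk-CSV path and the ActiveRecord path from drifting apart, at the cost of building the rows in memory before writing them.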
@@ -0,0 +1,132 @@
1
+ class OfacMatch
2
+
3
+ attr_reader :possible_hits
4
+
5
+ #Initialize a Match object with a record hash of fields you want to match on.
6
+ #Each key in the hash also has a data hash value for the weight, token, and type.
7
+ #
8
+ # match = OfacMatch.new({:name => {:weight => 10, :token => 'Kevin Tyll'},
9
+ # :city => {:weight => 40, :token => 'Clearwater', },
10
+ # :address => {:weight => 40, :token => '1234 Park St.', },
11
+ # :zip => {:weight => 10, :token => '33759', :type => :number}})
12
+ #
13
+ # data hash keys:
14
+ # * <tt>data[:weight]</tt> - value to apply to the score if there is a match (Default is 100/number of keys in the record hash)
15
+ # * <tt>data[:token]</tt> - string to match
16
+ # * <tt>data[:match]</tt> - set from records hash
17
+ # * <tt>data[:score]</tt> - output field
18
+ # * <tt>data[:type]</tt> - the type of match that should be performed (valid values are +:sound+ | +:number+) (Default is +:sound+)
19
+ def initialize(stats={})
20
+ @possible_hits = []
21
+ @stats = stats.dup
22
+ weight = 100
23
+ weight = 100 / @stats.length if @stats.length > 0
24
+ @stats.each_value do |data|
25
+ data[:weight] ||= weight
26
+ data[:match] ||= ''
27
+ data[:type] ||= :sound
28
+ data[:score] ||= 0
29
+ data[:token] = data[:token].to_s.upcase
30
+ end
31
+ end
32
+
33
+ # match_records is an array of hashes.
34
+ #
35
+ # The hash keys must match the record hash keys set when initialized.
36
+ #
37
+ # score will return the highest score of all the records that
38
+ # are sent in match_records.
39
+ def score(match_records)
40
+ score_results = Array.new
41
+ unless match_records.empty?
42
+ #place the match_records information
43
+ #into our @stats hash
44
+ match_records.each do |match|
45
+ match.each do |key, value|
46
+ @stats[key.to_sym][:match] = value.to_s.upcase
47
+ end
48
+ record_score = calculate_record
49
+ score_results.push(record_score)
50
+ @possible_hits << match.merge(:score => record_score) if record_score > 0
51
+ end
52
+ score = score_results.max #take max score
53
+ end
54
+ @possible_hits.uniq!
55
+ score ||= 0
56
+ end
57
+
58
+ private
59
+
60
+
61
+ # calculate the score for this record
62
+ # comparing the token to the match fields in the @stats hash
63
+ # and storing the score into the record
64
+ def calculate_record
65
+ score = 0
66
+ unless @stats.nil?
67
+ #need to make sure we check the name first, since city and address don't
68
+ #get added to the score unless there is a name match
69
+ [:name,:city,:address].each do |field|
70
+ data = @stats[field]
71
+ if (data[:token].blank?)
72
+ value = 0 #token is blank can't be sure of a match if nothing to match against
73
+ else
74
+ if (data[:match].blank?)
75
+ value = 0 #token has value match is blank
76
+ else
77
+ #token and match both have values
78
+ if (data[:type] == :number)
79
+ value = data[:token] == data[:match] ? 1 : 0
80
+ else
81
+ #first see if there is an exact match
82
+ value = data[:token] == data[:match] ? 1 : 0
83
+
84
+ unless value > 0
85
+ #do a sounds like with the data as given to see if we get a match
86
+ #if match on sounds_like, only give .75 of the weight.
87
+ value = data[:token].ofac_sounds_like(data[:match],false) ? 0.75 : 0
88
+ end
89
+
90
+ #if no match, then break the data down and see if we can find matches on the
91
+ #individual words
92
+ unless value > 0
93
+ token_data = data[:token].gsub(/\W/,'|')
94
+ token_array = token_data.split('|')
95
+ token_array.delete('')
96
+
97
+ match_data = data[:match].gsub(/\W/,'|')
98
+ match_array = match_data.split('|')
99
+ match_array.delete('')
100
+
101
+ value = 0
102
+ partial_weight = 1/token_array.length.to_f
103
+ token_array.each do |partial_token|
104
+ #first see if we get an exact match of the partial
105
+ if match_array.include?(partial_token)
106
+ value += partial_weight
107
+ else
108
+ #otherwise, see if the partial sounds like any part of the OFAC record
109
+ match_array.each do |partial_match|
110
+ if partial_match.ofac_sounds_like(partial_token,false)
111
+ #give partial value for every part of token that is matched.
112
+ value += partial_weight * 0.75
113
+ break
114
+ end
115
+ end
116
+ end
117
+ end
118
+ end
119
+ end
120
+ end
121
+ end
122
+ data[:score] = data[:weight] * value
123
+ score += data[:score]
124
+ break if field == :name && data[:score] == 0
125
+ end
126
+
127
+ end
128
+ score.round
129
+ end
130
+
131
+ end
132
+
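The field-by-field scoring in calculate_record can be illustrated with a reduced sketch. It assumes the weights implied by the tests in this diff (60 for name, 30 for city, 10 for address) and keeps only the word-by-word exact comparison, leaving out the whole-string and sounds-like steps. `WEIGHTS`, `word_overlap` and `sketch_score` are hypothetical names used for illustration, not part of the gem.

```ruby
# Assumed weights, inferred from the test expectations (name 60, city 30, address 10).
WEIGHTS = { :name => 60, :city => 30, :address => 10 }

# Fraction of the input's words that appear verbatim in the candidate record.
# Sounds-like matching is deliberately left out of this sketch.
def word_overlap(token, match)
  token_words = token.upcase.split(/\W+/).reject { |w| w.empty? }
  match_words = match.upcase.split(/\W+/).reject { |w| w.empty? }
  return 0.0 if token_words.empty? || match_words.empty?
  hits = token_words.count { |w| match_words.include?(w) }
  hits / token_words.length.to_f
end

# Name is checked first; city and address only count when the name scored above 0.
def sketch_score(input, record)
  total = 0.0
  [:name, :city, :address].each do |field|
    field_score = WEIGHTS[field] * word_overlap(input[field].to_s, record[field].to_s)
    total += field_score
    break if field == :name && field_score == 0
  end
  total.round
end

record = { :name => 'HERNANDEZ, Oscar', :city => 'Clearwater', :address => '123 Somewhere Ln' }
puts sketch_score({ :name => 'Oscar middlename Hernandez', :city => 'Clearwater' }, record) # 70
puts sketch_score({ :name => 'Kevin', :city => 'Clearwater' }, record)                      # 0
```

Two of the three name words match, so the name contributes 40 of its 60 points, and the city adds its full 30. With no name match the loop stops early, which is why a matching city alone scores 0.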
data/lib/ofac/ruby_string_extensions.rb ADDED
@@ -0,0 +1,22 @@
1
+ class String
2
+
3
+ Ofac_SoundexChars = 'BPFVCSKGJQXZDTLMNR'
4
+ Ofac_SoundexNums = '111122222222334556'
5
+ Ofac_SoundexCharsEx = '^' + Ofac_SoundexChars
6
+ Ofac_SoundexCharsDel = '^A-Z'
7
+
8
+ # desc: http://en.wikipedia.org/wiki/Soundex
9
+ def ofac_soundex(census = true)
10
+ str = upcase.delete(Ofac_SoundexCharsDel).squeeze
11
+
12
+ str[0 .. 0] + str[1 .. -1].
13
+ delete(Ofac_SoundexCharsEx).
14
+ tr(Ofac_SoundexChars, Ofac_SoundexNums)[0 .. (census ? 2 : -1)].
15
+ ljust(3, '0') rescue ''
16
+ end
17
+
18
+ def ofac_sounds_like(other, census = true)
19
+ ofac_soundex(census) == other.ofac_soundex(census)
20
+ end
21
+
22
+ end
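To see what this Soundex variant produces, the same steps can be restated as a standalone function that does not reopen String. This is a sketch for experimentation only: `soundex_sketch` and the two constants are hypothetical names, and the early return for empty input stands in for the `rescue ''` in the original.

```ruby
SOUNDEX_CHARS = 'BPFVCSKGJQXZDTLMNR'
SOUNDEX_NUMS  = '111122222222334556'

def soundex_sketch(text, census = true)
  # Keep only A-Z, then collapse runs of the same letter.
  str = text.upcase.delete('^A-Z').squeeze
  return '' if str.empty? # the original reaches '' through its rescue clause

  # Encode everything after the first letter, dropping letters with no code.
  tail = str[1..-1].delete('^' + SOUNDEX_CHARS).tr(SOUNDEX_CHARS, SOUNDEX_NUMS)
  tail = tail[0..(census ? 2 : -1)]
  str[0, 1] + tail.ljust(3, '0')
end

puts soundex_sketch('Luis')   # L200
puts soundex_sketch('Louis')  # L200
puts soundex_sketch('12358').inspect # ""
puts soundex_sketch('summer', false) == soundex_sketch('somewhere', false) # true
```

Strings with no letters all encode to the empty string, so any two purely numeric tokens compare as sounding alike. The tests rely on this when they note that all numbers sound alike.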
data/lib/tasks/ofac.rake ADDED
@@ -0,0 +1,8 @@
1
+
2
+ namespace :ofac do
3
+ desc "Loads the current file from http://www.treas.gov/offices/enforcement/ofac/sdn/delimit/index.shtml."
4
+ task :update_data => :environment do
5
+ OfacSdnLoader.load_current_sdn_file
6
+ end
7
+
8
+ end
data/test/files/test_address_data_load.pip ADDED
@@ -0,0 +1,10 @@
1
+ 10|7|-0- |-0- |"Panama"|-0-
2
+ 15|12|-0- |-0- |"Panama"|-0-
3
+ 22|14|"123 Somewhere Ln"|"Clearwater"|"United States"|-0-
4
+ 39|27|-0- |"Managua"|"Nicaragua"|-0-
5
+ 39|29|"Bal Harbour Shopping Center, Via Italia"|"Panama City"|"Panama"|-0-
6
+ 41|41|"Avenida de Concha, Espina 8, E-28036"|"Madrid"|"Spain"|-0-
7
+ 41|102|-0- |-0- |-0- |-0-
8
+ 66|111|-0- |"Milan"|"Italy"|-0-
9
+ 66|117|-0- |-0- |"Panama"|-0-
10
+ 66|125|"1840 West 49th Street"|"Hialeah, FL"|"United States"|-0-
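In these fixtures, fields are separated by `|`, text values are wrapped in double quotes, and `-0-` marks an empty field. A minimal way to read one line could look like the sketch below. `parse_pip_line` is a hypothetical helper written for this example, not the gem's loader (which is not shown in this part of the diff), and it does not handle a `|` inside a quoted value.

```ruby
# Split one pipe-delimited line into fields, mapping the -0- placeholder to nil
# and removing the surrounding double quotes from text values.
def parse_pip_line(line)
  line.chomp.split('|').map do |field|
    value = field.strip
    value == '-0-' ? nil : value.gsub(/\A"|"\z/, '')
  end
end

row = parse_pip_line('22|14|"123 Somewhere Ln"|"Clearwater"|"United States"|-0- ')
p row # ["22", "14", "123 Somewhere Ln", "Clearwater", "United States", nil]
```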
data/test/files/test_alt_data_load.pip ADDED
@@ -0,0 +1,10 @@
1
+ 15|14|"aka"|"VIAJES GUAMA TOURS"|-0-
2
+ 22|15|"aka"|"HERNANDEZ, Oscar Grouch"|-0-
3
+ 22|16|"aka"|"Alternate Name"|-0-
4
+ 25|57|"aka"|"AVIA IMPORT"|-0-
5
+ 36|219|"aka"|"BNC"|-0-
6
+ 36|220|"aka"|"NATIONAL BANK OF CUBA"|-0-
7
+ 36|221|"aka"|"BNC"|-0-
8
+ 41|222|"aka"|"NATIONAL BANK OF CUBA"|-0-
9
+ 66|223|"aka"|"BNC"|-0-
10
+ 66|224|"aka"|"NATIONAL BANK OF CUBA"|-0-
data/test/files/test_sdn_data_load.pip ADDED
@@ -0,0 +1,9 @@
1
+ 10|"ABASTECEDORA NAVAL Y INDUSTRIAL, S.A."|-0- |"CUBA"|-0- |-0- |-0- |-0- |-0- |-0- |-0- |-0-
2
+ 15|"ABDELNUR| Nury de Jesus"|"individual"|"CUBA"|-0- |-0- |-0- |-0- |-0- |-0- |-0- |-0-
3
+ 22|"HERNANDEZ, Oscar"|"individual"|"CUBA"|-0- |-0- |"Unknown vessel type"|-0- |-0- |-0- |"Acechilly Navigation Co., Malta"|-0-
4
+ 24|"LOPEZ MENDEZ, Luis Eduardo"|"individual"|"CUBA"|-0- |-0- |"Unknown vessel type"|-0- |-0- |-0- |"Acefrosty Shipping Co., Malta"|-0-
5
+ 25|"ACEFROSTY SHIPPING CO., LTD."|-0- |"CUBA"|-0- |-0- |-0- |-0- |-0- |-0- |-0- |-0-
6
+ 36|"AEROCARIBBEAN AIRLINES"|-0- |"CUBA"|-0- |-0- |-0- |-0- |-0- |-0- |-0- |-0-
7
+ 39|"AEROTAXI EJECUTIVO, S.A."|-0- |"CUBA"|-0- |-0- |-0- |-0- |-0- |-0- |-0- |-0-
8
+ 41|"AGENCIA DE VIAJES GUAMA"|-0- |"CUBA"|-0- |-0- |-0- |-0- |-0- |-0- |-0- |-0-
9
+ 66|"AGUIAR, Raul"|"individual"|"CUBA"|"Director, Banco Nacional de Cuba"|-0- |-0- |-0- |-0- |-0- |-0- |"; Director, Banco Nacional de Cuba."
data/test/files/valid_flattened_file.csv ADDED
@@ -0,0 +1,19 @@
1
+ ``|`ABASTECEDORA NAVAL Y INDUSTRIAL, S.A.`|``|`CUBA`|``|``|``|``|``|``|``|``|``|``|`Panama`|``|``|``|``|`2009-05-06 15:55:24`|`2009-05-06 15:55:24`
2
+ ``|`ABDELNUR`|` Nury de Jesus`|`individual`|`CUBA`|``|``|``|``|``|``|``|``|``|`Panama`|``|`aka`|`VIAJES GUAMA TOURS`|``|`2009-05-06 15:55:24`|`2009-05-06 15:55:24`
3
+ ``|`HERNANDEZ, Oscar`|`individual`|`CUBA`|``|``|`Unknown vessel type`|``|``|``|`Acechilly Navigation Co., Malta`|``|`123 Somewhere Ln`|`Clearwater`|`United States`|``|`aka`|`HERNANDEZ, Oscar Grouch`|``|`2009-05-06 15:55:24`|`2009-05-06 15:55:24`
4
+ ``|`HERNANDEZ, Oscar`|`individual`|`CUBA`|``|``|`Unknown vessel type`|``|``|``|`Acechilly Navigation Co., Malta`|``|`123 Somewhere Ln`|`Clearwater`|`United States`|``|`aka`|`Alternate Name`|``|`2009-05-06 15:55:24`|`2009-05-06 15:55:24`
5
+ ``|`LOPEZ MENDEZ, Luis Eduardo`|`individual`|`CUBA`|``|``|`Unknown vessel type`|``|``|``|`Acefrosty Shipping Co., Malta`|``|``|``|``|``|``|``|``|`2009-05-06 15:55:24`|`2009-05-06 15:55:24`
6
+ ``|`ACEFROSTY SHIPPING CO., LTD.`|``|`CUBA`|``|``|``|``|``|``|``|``|``|``|``|``|`aka`|`AVIA IMPORT`|``|`2009-05-06 15:55:24`|`2009-05-06 15:55:24`
7
+ ``|`AEROCARIBBEAN AIRLINES`|``|`CUBA`|``|``|``|``|``|``|``|``|``|``|``|``|`aka`|`BNC`|``|`2009-05-06 15:55:24`|`2009-05-06 15:55:24`
8
+ ``|`AEROCARIBBEAN AIRLINES`|``|`CUBA`|``|``|``|``|``|``|``|``|``|``|``|``|`aka`|`NATIONAL BANK OF CUBA`|``|`2009-05-06 15:55:24`|`2009-05-06 15:55:24`
9
+ ``|`AEROCARIBBEAN AIRLINES`|``|`CUBA`|``|``|``|``|``|``|``|``|``|``|``|``|`aka`|`BNC`|``|`2009-05-06 15:55:24`|`2009-05-06 15:55:24`
10
+ ``|`AEROTAXI EJECUTIVO, S.A.`|``|`CUBA`|``|``|``|``|``|``|``|``|``|`Managua`|`Nicaragua`|``|``|``|``|`2009-05-06 15:55:24`|`2009-05-06 15:55:24`
11
+ ``|`AEROTAXI EJECUTIVO, S.A.`|``|`CUBA`|``|``|``|``|``|``|``|``|`Bal Harbour Shopping Center, Via Italia`|`Panama City`|`Panama`|``|``|``|``|`2009-05-06 15:55:24`|`2009-05-06 15:55:24`
12
+ ``|`AGENCIA DE VIAJES GUAMA`|``|`CUBA`|``|``|``|``|``|``|``|``|`Avenida de Concha, Espina 8, E-28036`|`Madrid`|`Spain`|``|`aka`|`NATIONAL BANK OF CUBA`|``|`2009-05-06 15:55:24`|`2009-05-06 15:55:24`
13
+ ``|`AGENCIA DE VIAJES GUAMA`|``|`CUBA`|``|``|``|``|``|``|``|``|``|``|``|``|`aka`|`NATIONAL BANK OF CUBA`|``|`2009-05-06 15:55:24`|`2009-05-06 15:55:24`
14
+ ``|`AGUIAR, Raul`|`individual`|`CUBA`|`Director, Banco Nacional de Cuba`|``|``|``|``|``|``|`; Director, Banco Nacional de Cuba.`|``|`Milan`|`Italy`|``|`aka`|`BNC`|``|`2009-05-06 15:55:24`|`2009-05-06 15:55:24`
15
+ ``|`AGUIAR, Raul`|`individual`|`CUBA`|`Director, Banco Nacional de Cuba`|``|``|``|``|``|``|`; Director, Banco Nacional de Cuba.`|``|`Milan`|`Italy`|``|`aka`|`NATIONAL BANK OF CUBA`|``|`2009-05-06 15:55:24`|`2009-05-06 15:55:24`
16
+ ``|`AGUIAR, Raul`|`individual`|`CUBA`|`Director, Banco Nacional de Cuba`|``|``|``|``|``|``|`; Director, Banco Nacional de Cuba.`|``|``|`Panama`|``|`aka`|`BNC`|``|`2009-05-06 15:55:24`|`2009-05-06 15:55:24`
17
+ ``|`AGUIAR, Raul`|`individual`|`CUBA`|`Director, Banco Nacional de Cuba`|``|``|``|``|``|``|`; Director, Banco Nacional de Cuba.`|``|``|`Panama`|``|`aka`|`NATIONAL BANK OF CUBA`|``|`2009-05-06 15:55:24`|`2009-05-06 15:55:24`
18
+ ``|`AGUIAR, Raul`|`individual`|`CUBA`|`Director, Banco Nacional de Cuba`|``|``|``|``|``|``|`; Director, Banco Nacional de Cuba.`|`1840 West 49th Street`|`Hialeah, FL`|`United States`|``|`aka`|`BNC`|``|`2009-05-06 15:55:24`|`2009-05-06 15:55:24`
19
+ ``|`AGUIAR, Raul`|`individual`|`CUBA`|`Director, Banco Nacional de Cuba`|``|``|``|``|``|``|`; Director, Banco Nacional de Cuba.`|`1840 West 49th Street`|`Hialeah, FL`|`United States`|``|`aka`|`NATIONAL BANK OF CUBA`|``|`2009-05-06 15:55:24`|`2009-05-06 15:55:24`
data/test/mocks/test/ofac_sdn_loader.rb ADDED
@@ -0,0 +1,20 @@
1
+ require 'ofac/models/ofac_sdn_loader'
2
+
3
+ class OfacSdnLoader
4
+
5
+ def self.load_current_sdn_file
6
+ sdn = File.new(File.dirname(__FILE__) + '/../../files/test_sdn_data_load.pip')
7
+ address = File.new(File.dirname(__FILE__) + '/../../files/test_address_data_load.pip')
8
+ alt = File.new(File.dirname(__FILE__) + '/../../files/test_alt_data_load.pip')
9
+ active_record_file_load(sdn, address, alt)
10
+ sdn.close
11
+ address.close
12
+ alt.close
13
+ end
14
+
15
+ #Gives access to the private convert_to_flattened_csv method
16
+ def self.create_csv_file(sdn, address, alt)
17
+ convert_to_flattened_csv(sdn, address, alt)
18
+ end
19
+
20
+ end
data/test/ofac_sdn_loader_test.rb ADDED
@@ -0,0 +1,40 @@
1
+ require 'test_helper'
2
+
3
+ class OfacSdnLoaderTest < Test::Unit::TestCase
4
+
5
+ context '' do
6
+ setup do setup_ofac_sdn_table end
7
+
8
+ should "load table from files multiple times and always have the same record count" do
9
+ assert_equal(0,OfacSdn.count)
10
+ OfacSdnLoader.load_current_sdn_file #this method is mocked to load test files instead of the live files from the web.
11
+ assert_equal(19, OfacSdn.count)
12
+ OfacSdnLoader.load_current_sdn_file
13
+ assert_equal(19, OfacSdn.count)
14
+ end
15
+
16
+ should "create flattened_csv_file_for_mysql_import" do
17
+ #since I'm using sqlite3 for its in-memory db, I can't test the mysql load
18
+ #but I can test the csv file creation.
19
+ sdn = File.new(File.dirname(__FILE__) + '/files/test_sdn_data_load.pip')
20
+ address = File.new(File.dirname(__FILE__) + '/files/test_address_data_load.pip')
21
+ alt = File.new(File.dirname(__FILE__) + '/files/test_alt_data_load.pip')
22
+
23
+ csv = OfacSdnLoader.create_csv_file(sdn, address, alt) #this method was created in the mock only to call the private convert_to_flattened_csv method
24
+ correctly_formatted_csv = File.open(File.dirname(__FILE__) + '/files/valid_flattened_file.csv')
25
+
26
+ csv.rewind
27
+ generated_file = csv.readlines
28
+ #compare the values of each csv line with the correctly formatted "control file"
29
+ correctly_formatted_csv.each_with_index do |line,i|
30
+ csv_line = generated_file[i]
31
+ correctly_formatted_record_array = line.split('|')
32
+ csv_record_array = csv_line.split('|')
33
+ (0..18).each do |j| #skip indices 19 and 20: they are the created_at and updated_at fields and will never match.
34
+ assert_equal correctly_formatted_record_array[j], csv_record_array[j]
35
+ end
36
+ end
37
+ end
38
+
39
+ end
40
+ end
data/test/ofac_test.rb ADDED
@@ -0,0 +1,76 @@
1
+ require 'test_helper'
2
+
3
+ class OfacTest < Test::Unit::TestCase
4
+
5
+ context '' do
6
+ setup do
7
+ setup_ofac_sdn_table
8
+ OfacSdnLoader.load_current_sdn_file #this method is mocked to load test files instead of the live files from the web.
9
+ end
10
+
11
+ should "give a score of 0 if no name is given" do
12
+ assert_equal 0, Ofac.new({:address => '123 somewhere'}).score
13
+ end
14
+
15
+ should "give a score of 0 if there is no name match" do
16
+ assert_equal 0, Ofac.new({:name => 'Kevin'}).score
17
+ end
18
+
19
+ should "give a score of 0 if there is no name match but there is an address and city match" do
20
+ assert_equal 0, Ofac.new({:name => 'Kevin', :address => '123 somewhere ln', :city => 'Clearwater'}).score
21
+ end
22
+
23
+ should "give a score of 60 if there is a name match" do
24
+ assert_equal 60, Ofac.new({:name => 'Oscar Hernandez'}).score
25
+ assert_equal 60, Ofac.new({:name => 'Oscar Hernandez', :city => 'no match', :address => 'no match'}).score
26
+ assert_equal 60, Ofac.new({:name => 'Oscar Hernandez', :city => 'Las Vegas', :address => 'no match'}).score
27
+ assert_equal 60, Ofac.new({:name => 'Luis Lopez', :city => 'Las Vegas', :address => 'no match'}).score
28
+ end
29
+
30
+ should "give a score of 60 if there is a name match on alternate identity name" do
31
+ assert_equal 60, Ofac.new({:name => 'Alternate Name'}).score
32
+ end
33
+
34
+ should "give a partial score if there is a partial name match" do
35
+ assert_equal 40, Ofac.new({:name => 'Oscar middlename Hernandez'}).score
36
+ assert_equal 30, Ofac.new({:name => 'Oscar WrongLastName'}).score
37
+ assert_equal 70, Ofac.new({:name => 'Oscar middlename Hernandez',:city => 'Clearwater'}).score
38
+ end
39
+
40
+ should "give a score of 90 if there is a name and city match" do
41
+ assert_equal 90, Ofac.new({:name => 'Oscar Hernandez', :city => 'Clearwater', :address => 'no match'}).score
42
+ end
43
+
44
+ should "give a score of 100 if there is a name and city and address match" do
45
+ assert_equal 100, Ofac.new({:name => 'Oscar Hernandez', :city => 'Clearwater', :address => '123 somewhere ln'}).score
46
+ end
47
+
48
+ should "give partial scores for sounds like matches" do
49
+
50
+ #32456 summer lane sounds like 123 Somewhere Ln, so it adds 75% of the address weight (7.5) to the score; 97.5 rounds to 98.
51
+ assert_equal 98, Ofac.new({:name => 'Oscar Hernandez', :city => 'Clearwater', :address => '32456 summer lane'}).score
52
+
53
+ #summer sounds like somewhere, and all numbers sound alike, so 2 of the 3 address elements match by sound.
54
+ #Each element is worth 10/3 or 3.33. An exact match adds the full 3.33, and a sounds-like match adds 3.33 * .75 or 2.5
55
+ #because a sounds-like match only adds 75% of its weight.
56
+ #2.5 + 2.5 = 5
57
+ assert_equal 95, Ofac.new({:name => 'Oscar Hernandez', :city => 'Clearwater', :address => '12358 summer blvd'}).score
58
+
59
+
60
+ #Louis sounds like Luis, and Lopez is an exact match:
61
+ #:name has a weight of 60, so each element is worth 30. A sounds like match is worth 30 * .75
62
+ assert_equal 53, Ofac.new({:name => 'Louis Lopez', :city => 'Las Vegas', :address => 'no match'}).score
63
+ end
64
+
65
+ should "return an array of possible hits" do
66
+ #it should not matter which order you call score or possible_hits.
67
+ sdn = Ofac.new({:name => 'Oscar Hernandez', :city => 'Clearwater', :address => '123 somewhere ln'})
68
+ assert sdn.score > 0
69
+ assert !sdn.possible_hits.empty?
70
+
71
+ sdn = Ofac.new({:name => 'Oscar Hernandez', :city => 'Clearwater', :address => '123 somewhere ln'})
72
+ assert !sdn.possible_hits.empty?
73
+ assert sdn.score > 0
74
+ end
75
+ end
76
+ end
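The expected values 98, 95 and 53 in the sounds-like tests follow from simple arithmetic, worked through below. The weights are taken from the test comments (60 for name, 10 for address) or implied by the expectations (30 for city), so treat them as assumptions about code not shown here. The constant and variable names are made up for this example.

```ruby
# Assumed weights and the sounds-like factor used by the matcher.
NAME_WEIGHT    = 60
CITY_WEIGHT    = 30
ADDRESS_WEIGHT = 10
SOUNDS_LIKE    = 0.75

# Whole address matches by sound: 60 + 30 + 10 * 0.75 = 97.5
whole_address_by_sound = (NAME_WEIGHT + CITY_WEIGHT + ADDRESS_WEIGHT * SOUNDS_LIKE).round

# Two of three address words match by sound: each adds (1/3) * 0.75 of the address weight.
address_value = 2 * (1 / 3.0) * SOUNDS_LIKE
two_words_by_sound = (NAME_WEIGHT + CITY_WEIGHT + ADDRESS_WEIGHT * address_value).round

# One name word matches exactly and one by sound; city and address do not match.
name_value = 0.5 + 0.5 * SOUNDS_LIKE
one_name_word_by_sound = (NAME_WEIGHT * name_value).round

puts whole_address_by_sound   # 98
puts two_words_by_sound       # 95
puts one_name_word_by_sound   # 53
```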
data/test/test_helper.rb ADDED
@@ -0,0 +1,48 @@
1
+ require 'rubygems'
2
+ require 'test/unit'
3
+ require 'shoulda'
4
+ require 'mocks/test/ofac_sdn_loader'
5
+
6
+ $LOAD_PATH.unshift(File.join(File.dirname(__FILE__), '..', 'lib'))
7
+ $LOAD_PATH.unshift(File.dirname(__FILE__))
8
+ require 'ofac'
9
+
10
+ ActiveRecord::Base.establish_connection :adapter => 'sqlite3', :database => ':memory:'
11
+
12
+ class Test::Unit::TestCase
13
+ def setup_ofac_sdn_table
14
+ ActiveRecord::Base.connection.tables.each { |table| ActiveRecord::Base.connection.drop_table(table) }
15
+ create_ofac_sdn_table
16
+ end
17
+
18
+ private
19
+
20
+ def create_ofac_sdn_table
21
+ silence_stream(STDOUT) do
22
+ ActiveRecord::Schema.define(:version => 1) do
23
+ create_table :ofac_sdns do |t|
24
+ t.text :name
25
+ t.string :sdn_type
26
+ t.string :program
27
+ t.string :title
28
+ t.string :vessel_call_sign
29
+ t.string :vessel_type
30
+ t.string :vessel_tonnage
31
+ t.string :gross_registered_tonnage
32
+ t.string :vessel_flag
33
+ t.string :vessel_owner
34
+ t.text :remarks
35
+ t.text :address
36
+ t.string :city
37
+ t.string :country
38
+ t.string :address_remarks
39
+ t.string :alternate_identity_type
40
+ t.text :alternate_identity_name
41
+ t.string :alternate_identity_remarks
42
+ t.timestamps
43
+ end
44
+ end
45
+ end
46
+ end
47
+
48
+ end
metadata ADDED
@@ -0,0 +1,90 @@
1
+ --- !ruby/object:Gem::Specification
2
+ name: kevintyll-ofac
3
+ version: !ruby/object:Gem::Version
4
+ version: 1.0.0
5
+ platform: ruby
6
+ authors:
7
+ - Kevin Tyll
8
+ autorequire:
9
+ bindir: bin
10
+ cert_chain: []
11
+
12
+ date: 2009-05-11 00:00:00 -07:00
13
+ default_executable:
14
+ dependencies: []
15
+
16
+ description: Attempts to find a hit on the Office of Foreign Assets Control's Specially Designated Nationals list.
17
+ email: kevintyll@gmail.com
18
+ executables: []
19
+
20
+ extensions: []
21
+
22
+ extra_rdoc_files:
23
+ - LICENSE
24
+ - README.rdoc
25
+ files:
26
+ - History.txt
27
+ - LICENSE
28
+ - PostInstall.txt
29
+ - README.rdoc
30
+ - Rakefile
31
+ - VERSION.yml
32
+ - generators/ofac_migration/ofac_migration_generator.rb
33
+ - generators/ofac_migration/templates/migration.rb
34
+ - lib/ofac.rb
35
+ - lib/ofac/models/ofac.rb
36
+ - lib/ofac/models/ofac_sdn.rb
37
+ - lib/ofac/models/ofac_sdn_loader.rb
38
+ - lib/ofac/ofac_match.rb
39
+ - lib/ofac/ruby_string_extensions.rb
40
+ - lib/tasks/ofac.rake
41
+ - test/files/test_address_data_load.pip
42
+ - test/files/test_alt_data_load.pip
43
+ - test/files/test_sdn_data_load.pip
44
+ - test/files/valid_flattened_file.csv
45
+ - test/mocks/test/ofac_sdn_loader.rb
46
+ - test/ofac_sdn_loader_test.rb
47
+ - test/ofac_test.rb
48
+ - test/test_helper.rb
49
+ has_rdoc: true
50
+ homepage: http://github.com/kevintyll/ofac
51
+ post_install_message: |-
52
+ For more information on ofac, see http://kevintyll.github.com/ofac/
53
+
54
+ * To create the necessary db migration, from the command line, run:
55
+ script/generate ofac_migration
56
+ * Require the gem in your environment.rb file in the Rails::Initializer block:
57
+ config.gem 'kevintyll-ofac', :lib => 'ofac'
58
+ * To load your table with the current OFAC data, from the command line, run:
59
+ rake ofac:update_data
60
+
61
+ * The OFAC data is not updated with any regularity, but you can sign up for email notifications when the data changes at
62
+ http://www.treas.gov/offices/enforcement/ofac/sdn/index.shtml.
63
+ rdoc_options:
64
+ - --charset=UTF-8
65
+ require_paths:
66
+ - lib
67
+ required_ruby_version: !ruby/object:Gem::Requirement
68
+ requirements:
69
+ - - ">="
70
+ - !ruby/object:Gem::Version
71
+ version: "0"
72
+ version:
73
+ required_rubygems_version: !ruby/object:Gem::Requirement
74
+ requirements:
75
+ - - ">="
76
+ - !ruby/object:Gem::Version
77
+ version: "0"
78
+ version:
79
+ requirements: []
80
+
81
+ rubyforge_project:
82
+ rubygems_version: 1.2.0
83
+ signing_key:
84
+ specification_version: 2
85
+ summary: Attempts to find a hit on the Office of Foreign Assets Control's Specially Designated Nationals list.
86
+ test_files:
87
+ - test/mocks/test/ofac_sdn_loader.rb
88
+ - test/ofac_sdn_loader_test.rb
89
+ - test/ofac_test.rb
90
+ - test/test_helper.rb