kevintyll-ofac 1.0.0

data/History.txt ADDED
@@ -0,0 +1,9 @@
1
+ == 0.1.0 2009-05-07
2
+
3
+ * 1 major enhancement:
4
+ * Table creation and data load task complete
5
+
6
+ == 1.0.0 2009-05-11
7
+
8
+ * 1 major enhancement:
9
+ * Initial release
data/LICENSE ADDED
@@ -0,0 +1,20 @@
1
+ Copyright (c) 2009 Kevin Tyll
2
+
3
+ Permission is hereby granted, free of charge, to any person obtaining
4
+ a copy of this software and associated documentation files (the
5
+ "Software"), to deal in the Software without restriction, including
6
+ without limitation the rights to use, copy, modify, merge, publish,
7
+ distribute, sublicense, and/or sell copies of the Software, and to
8
+ permit persons to whom the Software is furnished to do so, subject to
9
+ the following conditions:
10
+
11
+ The above copyright notice and this permission notice shall be
12
+ included in all copies or substantial portions of the Software.
13
+
14
+ THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
15
+ EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
16
+ MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
17
+ NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE
18
+ LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION
19
+ OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION
20
+ WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
data/PostInstall.txt ADDED
@@ -0,0 +1,11 @@
1
+ For more information on ofac, see http://kevintyll.github.com/ofac/
2
+
3
+ * To create the necessary db migration, from the command line, run:
4
+ script/generate ofac_migration
5
+ * Require the gem in your environment.rb file in the Rails::Initializer block:
6
+ config.gem 'kevintyll-ofac', :lib => 'ofac'
7
+ * To load your table with the current OFAC data, from the command line, run:
8
+ rake ofac:update_data
9
+
10
+ * The OFAC data is not updated with any regularity, but you can sign up for email notifications when the data changes at
11
+ http://www.treas.gov/offices/enforcement/ofac/sdn/index.shtml.
data/README.rdoc ADDED
@@ -0,0 +1,109 @@
1
+ = ofac
2
+
3
+ * http://kevintyll.github.com/ofac
4
+ * http://www.drexel-labs.com
5
+
6
+ * http://www.treas.gov/offices/enforcement/ofac/sdn/index.shtml
7
+
8
+ == DESCRIPTION:
9
+
10
+ ofac is a ruby gem that tries to find a match of a person's name and address against the
11
+ Office of Foreign Assets Control's Specially Designated Nationals list...the so called
12
+ terrorist watch list.
13
+
14
+ This gem, like the ssn_validator gem, started as a need for the company I work for, Clarity Services Inc.
15
+ We decided once again to create a gem out of it and share it with the community. Much
16
+ thanks goes to the management at Clarity Services Inc. for allowing this code to be open sourced. Thanks
17
+ also to Larry Berland at Clarity Services Inc. The matching logic in the ofac_match.rb file was derived from
18
+ his work.
19
+
20
+ == FEATURES:
21
+
22
+ Creates a score, 0 - 100, based on how well the name, address and city match the data on the SDN list. Since
23
+ we have to match on strings, the likelihood of an exact match is virtually nil. So we've created an
24
+ algorithm that creates a score. The better the match, the higher the score. A score of 100 would be
25
+ a perfect match.
26
+
27
+ The score is calculated by adding up the weightings of each part that is matched. So
28
+ if only name is matched, then the max score is the weight for <tt>:name</tt> which is 60
29
+
30
+ It's possible to get partial matches, which will add partial weight to the score. If there
31
+ is not a match on the element as it is passed in, then each word element gets broken down
32
+ and matches are tried on each partial element. The weighting is distributed equally for
33
+ each partial that is matched.
34
+
35
+ If exact matches are not made, then a sounds like match is attempted. Any match made by sounds like
36
+ contributes 75% of its weight to the score.
37
+ Example:
38
+
39
+ If you are trying to match the name Kevin Tyll and there is a record for Smith, Kevin in the database, then
40
+ we will try to match both Kevin and Tyll separately, with each element Smith and Kevin. Since only Kevin
41
+ will find a match, and there were 2 elements in the searched name, the score will be added by half the weighting
42
+ for <tt>:name</tt>. So since the weight for <tt>:name</tt> is 60, then we will add 30 to the score.
43
+
44
+ If you are trying to match the name Kevin Gregory Tyll and there is a record for Tyll, Kevin in the database, then
45
+ we will try to match Kevin and Gregory and Tyll separately, with each element Tyll and Kevin. Since both Kevin
46
+ and Tyll will find a match, and there were 3 elements in the searched name, the score will be added by 2/3 the weighting
47
+ for <tt>:name</tt>. So since the weight for <tt>:name</tt> is 60, then we will add 40 to the score.
48
+
49
+ If you are trying to match the name Kevin Tyll and there is a record for Kevin Gregory Tyll in the database, then
50
+ we will try to match Kevin and Tyll separately, with each element Tyll and Kevin and Gregory. Since both Kevin
51
+ and Tyll will find a match, and there were 2 elements in the searched name, the score will be added by 2/2 the weighting
52
+ for <tt>:name</tt>. So since the weight for <tt>:name</tt> is 60, then we will add 60 to the score.
53
+
54
+ If you are trying to match the name Kevin Tyll, and there is a record for Teel, Kevin in the database, then an exact match
55
+ will be found for Kevin, and a sounds like match will be made for Tyll. Since there were 2 elements in the searched name,
56
+ and the weight for <tt>:name</tt> is 60, then each element is worth 30. Since Kevin was an exact match, it will add 30, and
57
+ since Tyll was a sounds like match, it will add 30 * .75. So the <tt>:name</tt> portion of the search will be worth 52.5, which rounds to 53.
58
+
59
+ Matches for name are made for both the name and any aliases in the OFAC database.
60
+
61
+ Matches for <tt>:city</tt> and <tt>:address</tt> will only be added to the score if there is first a match on <tt>:name</tt>.
62
+
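The four name examples above can be reproduced with a small standalone sketch. It covers only the word-by-word step described in FEATURES, not the whole-string checks that come first, and it is not the gem's code: `name_tokens`, `name_score` and the `sounds_like` predicate are hypothetical names introduced for illustration, and the predicate simply stands in for whatever sounds-like comparison is used.

```ruby
# Illustrative sketch only: these helpers are hypothetical, not the gem's API.
NAME_WEIGHT = 60

# Split a name into upper-cased word tokens, e.g. "Tyll, Kevin" => ["TYLL", "KEVIN"].
def name_tokens(name)
  name.to_s.upcase.split(/\W+/).reject { |part| part.empty? }
end

# Score a searched name against one record name, word by word.
# sounds_like is a caller-supplied predicate; by default nothing sounds alike.
def name_score(searched, record, sounds_like = lambda { |a, b| false })
  searched_tokens = name_tokens(searched)
  record_tokens = name_tokens(record)
  return 0 if searched_tokens.empty?

  share = 1.0 / searched_tokens.length # each searched word carries an equal share
  value = 0.0
  searched_tokens.each do |token|
    if record_tokens.include?(token)
      value += share                   # exact word match: full share
    elsif record_tokens.any? { |candidate| sounds_like.call(candidate, token) }
      value += share * 0.75            # sounds-like match: 75% of the share
    end
  end
  (NAME_WEIGHT * value).round
end

# Toy predicate for the last example: treat TYLL and TEEL as sounding alike.
tyll_teel = lambda { |a, b| [a, b].sort == ['TEEL', 'TYLL'] }

puts name_score('Kevin Tyll', 'Smith, Kevin')              # 30
puts name_score('Kevin Gregory Tyll', 'Tyll, Kevin')       # 40
puts name_score('Kevin Tyll', 'Kevin Gregory Tyll')        # 60
puts name_score('Kevin Tyll', 'Teel, Kevin', tyll_teel)    # 53
```

Note that the share is based on the number of words in the searched name, not in the record, which is why a two-word search against a three-word record can still reach the full 60.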
63
+ == SYNOPSIS:
64
+ Accepts a hash with the identity's demographic information
65
+
66
+ Ofac.new({:name => 'Oscar Hernandez', :city => 'Clearwater', :address => '123 somewhere ln'})
67
+
68
+ <tt>:name</tt> is required to get a score. If <tt>:name</tt> is missing, an error will not be thrown, but a score of 0 will be returned.
69
+
70
+ The more information provided, the higher the score could be. A score of 100 would mean all fields
71
+ were passed in, and all fields were 100% matches. If only the name is passed in without an address,
72
+ it will be impossible to get a score of 100, even if the name matches perfectly.
73
+
74
+ Acceptable hash keys and their weighting in score calculation:
75
+
76
+ * <tt>:name</tt> (weighting = 60%) (required) This can be a person, business, or marine vessel
77
+ * <tt>:address</tt> (weighting = 10%)
78
+ * <tt>:city</tt> (weighting = 30%)
79
+
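To make the effect of the weights concrete, here is a minimal sketch that computes the highest score an identity hash could reach given which fields were supplied. `max_possible_score` is a hypothetical helper written for this illustration; it assumes only the documented weights and the rule that nothing is scored without a name.

```ruby
# Weights as documented above; max_possible_score is a hypothetical helper.
WEIGHTS = {:name => 60, :address => 10, :city => 30}

def max_possible_score(identity)
  # Without a name no score is produced at all.
  return 0 if identity[:name].to_s.strip.empty?

  # Sum the weights of every field that was actually supplied.
  WEIGHTS.inject(0) do |total, (field, weight)|
    identity[field].to_s.strip.empty? ? total : total + weight
  end
end

puts max_possible_score(:name => 'Kevin Tyll')                          # 60
puts max_possible_score(:name => 'Kevin Tyll', :city => 'Clearwater')   # 90
puts max_possible_score(:city => 'Clearwater')                          # 0
```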
80
+ * Instantiate the object with the identity's name, street address, and city.
81
+ ofac = Ofac.new(:name => 'Kevin Tyll', :city => 'Clearwater', :address => '123 Somewhere Ln.')
82
+
83
+ * Then get the score
84
+ ofac.score => returns the score, 0 - 100
85
+
86
+ * You can also get the list of all the partial matches with the score of each record.
87
+ ofac.possible_hits => returns an array of hashes.
88
+
89
+ == REQUIREMENTS:
90
+
91
+ * Rails 2.0.0 or greater
92
+
93
+ == INSTALL:
94
+
95
+ * To install the gem:
96
+ sudo gem install kevintyll-ofac
97
+ * To create the necessary db migration, from the command line, run:
98
+ script/generate ofac_migration
99
+ * Require the gem in your environment.rb file in the Rails::Initializer block:
100
+ config.gem 'kevintyll-ofac', :lib => 'ofac'
101
+ * To load your table with the current OFAC data, from the command line, run:
102
+ rake ofac:update_data
103
+
104
+ * The OFAC data is not updated with any regularity, but you can sign up for email notifications when the data changes at
105
+ http://www.treas.gov/offices/enforcement/ofac/sdn/index.shtml.
106
+
107
+ == Copyright
108
+
109
+ Copyright (c) 2009 Kevin Tyll. See LICENSE for details.
data/Rakefile ADDED
@@ -0,0 +1,57 @@
1
+ require 'rubygems'
2
+ require 'rake'
3
+
4
+ begin
5
+ require 'jeweler'
6
+ Jeweler::Tasks.new do |gem|
7
+ gem.name = "ofac"
8
+ gem.summary = %Q{Attempts to find a hit on the Office of Foreign Assets Control's Specially Designated Nationals list.}
9
+ gem.description = %Q{Attempts to find a hit on the Office of Foreign Assets Control's Specially Designated Nationals list.}
10
+ gem.email = "kevintyll@gmail.com"
11
+ gem.homepage = "http://github.com/kevintyll/ofac"
12
+ gem.authors = ["Kevin Tyll"]
13
+ gem.post_install_message = File.readlines("PostInstall.txt").join("")
14
+ # gem is a Gem::Specification... see http://www.rubygems.org/read/chapter/20 for additional settings
15
+ end
16
+ rescue LoadError
17
+ puts "Jeweler not available. Install it with: sudo gem install technicalpickles-jeweler -s http://gems.github.com"
18
+ end
19
+
20
+ require 'rake/testtask'
21
+ Rake::TestTask.new(:test) do |test|
22
+ test.libs << 'lib' << 'test'
23
+ test.pattern = 'test/**/*_test.rb'
24
+ test.verbose = true
25
+ end
26
+
27
+ begin
28
+ require 'rcov/rcovtask'
29
+ Rcov::RcovTask.new do |test|
30
+ test.libs << 'test'
31
+ test.pattern = 'test/**/*_test.rb'
32
+ test.verbose = true
33
+ end
34
+ rescue LoadError
35
+ task :rcov do
36
+ abort "RCov is not available. In order to run rcov, you must: sudo gem install spicycode-rcov"
37
+ end
38
+ end
39
+
40
+
41
+ task :default => :test
42
+
43
+ require 'rake/rdoctask'
44
+ Rake::RDocTask.new do |rdoc|
45
+ if File.exist?('VERSION.yml')
46
+ config = YAML.load(File.read('VERSION.yml'))
47
+ version = "#{config[:major]}.#{config[:minor]}.#{config[:patch]}"
48
+ else
49
+ version = ""
50
+ end
51
+
52
+ rdoc.rdoc_dir = 'rdoc'
53
+ rdoc.title = "ofac #{version}"
54
+ rdoc.rdoc_files.include('README*')
55
+ rdoc.rdoc_files.include('lib/**/*.rb')
56
+ end
57
+
data/VERSION.yml ADDED
@@ -0,0 +1,4 @@
1
+ ---
2
+ :minor: 0
3
+ :patch: 0
4
+ :major: 1
@@ -0,0 +1,12 @@
1
+ class OfacMigrationGenerator < Rails::Generator::Base
2
+ def manifest
3
+ record do |m|
4
+ #m.directory File.join('db')
5
+ m.migration_template 'migration.rb', 'db/migrate'
6
+ end
7
+ end
8
+
9
+ def file_name
10
+ "create_ofac_sdn_table"
11
+ end
12
+ end
@@ -0,0 +1,30 @@
1
+ class CreateOfacSdnTable < ActiveRecord::Migration
2
+
3
+ def self.up
4
+ create_table :ofac_sdns do |t|
5
+ t.text :name
6
+ t.string :sdn_type
7
+ t.string :program
8
+ t.string :title
9
+ t.string :vessel_call_sign
10
+ t.string :vessel_type
11
+ t.string :vessel_tonnage
12
+ t.string :gross_registered_tonnage
13
+ t.string :vessel_flag
14
+ t.string :vessel_owner
15
+ t.text :remarks
16
+ t.text :address
17
+ t.string :city
18
+ t.string :country
19
+ t.string :address_remarks
20
+ t.string :alternate_identity_type
21
+ t.text :alternate_identity_name
22
+ t.string :alternate_identity_remarks
23
+ t.timestamps
24
+ end
25
+ end
26
+
27
+ def self.down
28
+ drop_table :ofac_sdns
29
+ end
30
+ end
data/lib/ofac.rb ADDED
@@ -0,0 +1,9 @@
1
+ require 'rake'
2
+ require 'ofac/ruby_string_extensions'
3
+ require 'ofac/ofac_match'
4
+ require 'ofac/models/ofac_sdn'
5
+ require 'ofac/models/ofac_sdn_loader'
6
+ require 'ofac/models/ofac'
7
+
8
+ # Load rake file
9
+ import "#{File.dirname(__FILE__)}/tasks/ofac.rake"
@@ -0,0 +1,119 @@
1
+ require 'activerecord'
2
+ require 'active_record/connection_adapters/mysql_adapter'
3
+
4
+ class Ofac
5
+
6
+
7
+ # Accepts a hash with the identity's demographic information
8
+ #
9
+ # Ofac.new({:name => 'Oscar Hernandez', :city => 'Clearwater', :address => '123 somewhere ln'})
10
+ #
11
+ # <tt>:name</tt> is required to get a score. If <tt>:name</tt> is missing, an error will not be thrown, but a score of 0 will be returned.
12
+ #
13
+ # The more information provided, the higher the score could be. A score of 100 would mean all fields
14
+ # were passed in, and all fields were 100% matches. If only the name is passed in without an address,
15
+ # it will be impossible to get a score of 100, even if the name matches perfectly.
16
+ #
17
+ # Acceptable hash keys and their weighting in score calculation:
18
+ #
19
+ # * <tt>:name</tt> (weighting = 60%) (required) This can be a person, business, or marine vessel
20
+ # * <tt>:address</tt> (weighting = 10%)
21
+ # * <tt>:city</tt> (weighting = 30%)
22
+ def initialize(identity)
23
+ @identity = identity
24
+ end
25
+
26
+ # Creates a score, 0 - 100, based on how well the name and address match the data on the
27
+ # SDN (Specially Designated Nationals) list.
28
+ #
29
+ # The score is calculated by adding up the weightings of each part that is matched. So
30
+ # if only name is matched, then the max score is the weight for <tt>:name</tt> which is 60
31
+ #
32
+ # It's possible to get partial matches, which will add partial weight to the score. If there
33
+ # is not a match on the element as it is passed in, then each word element gets broken down
34
+ # and matches are tried on each partial element. The weighting is distributed equally for
35
+ # each partial that is matched.
36
+ #
37
+ # If exact matches are not made, then a sounds like match is attempted. Any match made by sounds like
38
+ # contributes 75% of its weight to the score.
39
+ #
40
+ # Example:
41
+ #
42
+ # If you are trying to match the name Kevin Tyll and there is a record for Smith, Kevin in the database, then
43
+ # we will try to match both Kevin and Tyll separately, with each element Smith and Kevin. Since only Kevin
44
+ # will find a match, and there were 2 elements in the searched name, the score will be added by half the weighting
45
+ # for <tt>:name</tt>. So since the weight for <tt>:name</tt> is 60, then we will add 30 to the score.
46
+ #
47
+ # If you are trying to match the name Kevin Gregory Tyll and there is a record for Tyll, Kevin in the database, then
48
+ # we will try to match Kevin and Gregory and Tyll separately, with each element Tyll and Kevin. Since both Kevin
49
+ # and Tyll will find a match, and there were 3 elements in the searched name, the score will be added by 2/3 the weighting
50
+ # for <tt>:name</tt>. So since the weight for <tt>:name</tt> is 60, then we will add 40 to the score.
51
+ #
52
+ # If you are trying to match the name Kevin Tyll and there is a record for Kevin Gregory Tyll in the database, then
53
+ # we will try to match Kevin and Tyll separately, with each element Tyll and Kevin and Gregory. Since both Kevin
54
+ # and Tyll will find a match, and there were 2 elements in the searched name, the score will be added by 2/2 the weighting
55
+ # for <tt>:name</tt>. So since the weight for <tt>:name</tt> is 60, then we will add 60 to the score.
56
+ #
57
+ # If you are trying to match the name Kevin Tyll, and there is a record for Teel, Kevin in the database, then an exact match
58
+ # will be found for Kevin, and a sounds like match will be made for Tyll. Since there were 2 elements in the searched name,
59
+ # and the weight for <tt>:name</tt> is 60, then each element is worth 30. Since Kevin was an exact match, it will add 30, and
60
+ # since Tyll was a sounds like match, it will add 30 * .75. So the <tt>:name</tt> portion of the search will be worth 52.5, which rounds to 53.
61
+ #
62
+ # Matches for name are made for both the name and any aliases in the OFAC database.
63
+ #
64
+ # Matches for <tt>:city</tt> and <tt>:address</tt> will only be added to the score if there is first a match on <tt>:name</tt>.
65
+ def score
66
+ @score || calculate_score
67
+ end
68
+
69
+ # Returns an array of hashes of records in the OFAC data that found partial matches with that record's score.
70
+ #
71
+ # Ofac.new({:name => 'Oscar Hernandez', :city => 'Clearwater', :address => '123 somewhere ln'}).possible_hits
72
+ #returns
73
+ # [{:address=>"123 Somewhere Ln", :score=>100, :name=>"HERNANDEZ, Oscar|GUAMATUR, S.A.", :city=>"Clearwater"}, {:address=>"123 Somewhere Ln", :score=>100, :name=>"HERNANDEZ, Oscar|Alternate Name", :city=>"Clearwater"}]
74
+ #
75
+ def possible_hits
76
+ @possible_hits || retrieve_possible_hits
77
+ end
78
+
79
+ private
80
+
81
+ def retrieve_possible_hits
82
+ score
83
+ @possible_hits
84
+ end
85
+
86
+ def calculate_score
87
+ unless @identity[:name].to_s == ''
88
+ if OfacSdn.connection.kind_of?(ActiveRecord::ConnectionAdapters::MysqlAdapter)
89
+ #first get a list from the database of possible matches by name
90
+ #this query is pretty liberal, we just want to get a list of possible
91
+ #matches from the database that we can run through our ruby matching algorithm
92
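+ partial_name = @identity[:name].to_s.gsub(/\W/,'|') # gsub, not gsub!: gsub! returns nil when nothing is replaced and would also mutate the caller's hash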
93
+ name_array = partial_name.split('|')
94
+ name_array.delete('')
95
+ sql_name_partial = name_array.collect {|partial_name| "INSTR(SUBSTR(SOUNDEX(concat('O',name)), 2), REPLACE(SUBSTR(SOUNDEX('O#{partial_name}'), 2), '0', '')) > 0"}.join(' and ')
96
+ sql_alt_name_partial = name_array.collect {|partial_name| "INSTR(SUBSTR(SOUNDEX(concat('O',alternate_identity_name)), 2), REPLACE(SUBSTR(SOUNDEX('O#{partial_name}'), 2), '0', '')) > 0"}.join(' and ')
97
+ ##this sql for getting "accurate sounds like" functionality comes from:
98
+ #http://jgeewax.wordpress.com/2006/07/21/efficient-sounds-like-searches-in-mysql/
99
+ possible_sdns = OfacSdn.connection.select_all("select concat(name,'|', alternate_identity_name) name, address, city
100
+ from ofac_sdns
101
+ where name is not null
102
+ and (((#{sql_name_partial}))
103
+ or ((#{sql_alt_name_partial})))")
104
+ else
105
+ possible_sdns = OfacSdn.find(:all, :select => 'name, alternate_identity_name, address, city').collect{|sdn| {:name => "#{sdn.name}|#{sdn.alternate_identity_name}", :address => sdn.address, :city => sdn.city}}
106
+ end
107
+
108
+ match = OfacMatch.new({:name => {:weight => 60, :token => "#{@identity[:name]}"},
109
+ :address => {:weight => 10, :token => @identity[:address]},
110
+ :city => {:weight => 30, :token => @identity[:city]}})
111
+
112
+ score = match.score(possible_sdns)
113
+ @possible_hits = match.possible_hits
114
+ end
115
+ @score = score || 0
116
+ return @score
117
+ end
118
+
119
+ end
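A note on the name tokenization step in calculate_score: `String#gsub!` returns nil when it makes no substitution, so calling `split` on its result fails for a single-word name with no punctuation or spaces. The standalone sketch below shows the difference and a non-mutating alternative. `name_parts` is a hypothetical helper for illustration, not a method of the gem.

```ruby
# Hypothetical helper mirroring the tokenization step in calculate_score.
def name_parts(name)
  # Non-mutating gsub always returns a String, even when nothing is replaced,
  # and leaves the caller's string untouched.
  name.to_s.gsub(/\W/, '|').split('|').reject { |part| part.empty? }
end

single_word = 'Kevin'.dup
p single_word.gsub!(/\W/, '|')   # nil: no non-word characters were replaced
p name_parts('Kevin')            # ["Kevin"]
p name_parts('Tyll, Kevin')      # ["Tyll", "Kevin"]
```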
@@ -0,0 +1,5 @@
1
+ require 'activerecord'
2
+
3
+ class OfacSdn < ActiveRecord::Base
4
+
5
+ end
@@ -0,0 +1,305 @@
1
+ require 'net/http'
2
+ require 'activerecord'
3
+ require 'active_record/connection_adapters/mysql_adapter'
4
+
5
+ class OfacSdnLoader
6
+
7
+
8
+ #Loads the most recent file from http://www.treas.gov/offices/enforcement/ofac/sdn/delimit/index.shtml
9
+ def self.load_current_sdn_file
10
+ puts "Reloading OFAC sdn data"
11
+ puts "Downloading OFAC data from http://www.treas.gov/offices/enforcement/ofac/sdn"
12
+ #get the 3 data files
13
+ sdn = Tempfile.new('sdn')
14
+ sdn.write(Net::HTTP.get(URI.parse('http://www.treas.gov/offices/enforcement/ofac/sdn/delimit/sdn.pip')))
15
+ sdn.rewind
16
+ address = Tempfile.new('sdn')
17
+ address.write(Net::HTTP.get(URI.parse('http://www.treas.gov/offices/enforcement/ofac/sdn/delimit/add.pip')))
18
+ address.rewind
19
+ alt = Tempfile.new('sdn')
20
+ alt.write(Net::HTTP.get(URI.parse('http://www.treas.gov/offices/enforcement/ofac/sdn/delimit/alt.pip')))
21
+ alt.rewind
22
+
23
+ if OfacSdn.connection.kind_of?(ActiveRecord::ConnectionAdapters::MysqlAdapter)
24
+ puts "Converting file to csv format for Mysql import. This could take several minutes."
25
+
26
+ csv_file = convert_to_flattened_csv(sdn, address, alt)
27
+
28
+ bulk_mysql_update(csv_file)
29
+ else
30
+ active_record_file_load(sdn, address, alt)
31
+ end
32
+
33
+ sdn.close
34
+ address.close
35
+ alt.close
36
+ end
37
+
38
+
39
+ private
40
+
41
+ #convert the file's null value to an empty string
42
+ #and removes " chars.
43
+ def self.clean_file_string(line)
44
+ line.gsub!(/-0-(\s)?/,'')
45
+ line.gsub!(/\n/,'')
46
+ line.gsub(/\"/,'')
47
+ end
48
+
49
+ #split the line into an array
50
+ def self.convert_line_to_array(line)
51
+ clean_file_string(line).split('|') unless line.nil?
52
+ end
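The two helpers above can be tried outside the loader with a standalone sketch. The regular expressions are the ones shown in the diff; the function names `clean_line` and `line_to_fields` and the sample line are made up for illustration and are not real OFAC data.

```ruby
# Standalone sketch of the cleaning and splitting steps.
def clean_line(line)
  # Drop the "-0-" null marker (and one trailing space), newlines, and quotes.
  line.gsub(/-0-(\s)?/, '').gsub(/\n/, '').gsub(/"/, '')
end

def line_to_fields(line)
  clean_line(line).split('|') unless line.nil?
end

# Made-up sample line in the pipe-delimited layout.
sample = %Q{1|"EXAMPLE NAME"|-0- |XX|-0- \n}
p line_to_fields(sample)   # ["1", "EXAMPLE NAME", "", "XX"]
p line_to_fields(nil)      # nil
```

One consequence worth knowing: `String#split` drops trailing empty fields, so a null last column disappears from the array and reads as nil when accessed by index.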
53
+
54
+ #return 2 arrays of the records matching the sdn primary key:
55
+ #1 array of address records and one array of alt records
56
+ def self.foreign_key_records(sdn_id)
57
+ address_records = []
58
+ alt_records = []
59
+
60
+ #the first element in each array is the primary and foreign keys
61
+ #we are denormalizing the data
62
+ if @current_address_hash && @current_address_hash[:id] == sdn_id
63
+ address_records << @current_address_hash
64
+ loop do
65
+ @current_address_hash = address_text_to_hash(@address.gets)
66
+ if @current_address_hash && @current_address_hash[:id] == sdn_id
67
+ address_records << @current_address_hash
68
+ else
69
+ break
70
+ end
71
+ end
72
+ end
73
+
74
+ if @current_alt_hash && @current_alt_hash[:id] == sdn_id
75
+ alt_records << @current_alt_hash
76
+ loop do
77
+ @current_alt_hash = alt_text_to_hash(@alt.gets)
78
+ if @current_alt_hash && @current_alt_hash[:id] == sdn_id
79
+ alt_records << @current_alt_hash
80
+ else
81
+ break
82
+ end
83
+ end
84
+ end
85
+ return address_records, alt_records
86
+ end
87
+
88
+ def self.sdn_text_to_hash(line)
89
+ unless line.nil?
90
+ value_array = convert_line_to_array(line)
91
+ {:id => value_array[0],
92
+ :name => value_array[1],
93
+ :sdn_type => value_array[2],
94
+ :program => value_array[3],
95
+ :title => value_array[4],
96
+ :vessel_call_sign => value_array[5],
97
+ :vessel_type => value_array[6],
98
+ :vessel_tonnage => value_array[7],
99
+ :gross_registered_tonnage => value_array[8],
100
+ :vessel_flag => value_array[9],
101
+ :vessel_owner => value_array[10],
102
+ :remarks => value_array[11]
103
+ }
104
+ end
105
+ end
106
+
107
+ def self.address_text_to_hash(line)
108
+ unless line.nil?
109
+ value_array = convert_line_to_array(line)
110
+ {:id => value_array[0],
111
+ :address => value_array[2],
112
+ :city => value_array[3],
113
+ :country => value_array[4],
114
+ :address_remarks => value_array[5]
115
+ }
116
+ end
117
+ end
118
+
119
+ def self.alt_text_to_hash(line)
120
+ unless line.nil?
121
+ value_array = convert_line_to_array(line)
122
+ {:id => value_array[0],
123
+ :alternate_identity_type => value_array[2],
124
+ :alternate_identity_name => value_array[3],
125
+ :alternate_identity_remarks => value_array[4]
126
+ }
127
+ end
128
+ end
129
+
130
+ def self.convert_hash_to_mysql_import_string(record_hash)
131
+ # empty field for id to be generated by mysql.
132
+ new_line = "``|" +
133
+ # :name
134
+ "`#{record_hash[:name]}`|" +
135
+ # :sdn_type
136
+ "`#{record_hash[:sdn_type]}`|" +
137
+ # :program
138
+ "`#{record_hash[:program]}`|" +
139
+ # :title
140
+ "`#{record_hash[:title]}`|" +
141
+ # :vessel_call_sign
142
+ "`#{record_hash[:vessel_call_sign]}`|" +
143
+ # :vessel_type
144
+ "`#{record_hash[:vessel_type]}`|" +
145
+ # :vessel_tonnage
146
+ "`#{record_hash[:vessel_tonnage]}`|" +
147
+ # :gross_registered_tonnage
148
+ "`#{record_hash[:gross_registered_tonnage]}`|" +
149
+ # :vessel_flag
150
+ "`#{record_hash[:vessel_flag]}`|" +
151
+ # :vessel_owner
152
+ "`#{record_hash[:vessel_owner]}`|" +
153
+ # :remarks
154
+ "`#{record_hash[:remarks]}`|" +
155
+ # :address
156
+ "`#{record_hash[:address]}`|" +
157
+ # :city
158
+ "`#{record_hash[:city]}`|" +
159
+ # :country
160
+ "`#{record_hash[:country]}`|" +
161
+ # :address_remarks
162
+ "`#{record_hash[:address_remarks]}`|" +
163
+ # :alternate_identity_type
164
+ "`#{record_hash[:alternate_identity_type]}`|" +
165
+ # :alternate_identity_name
166
+ "`#{record_hash[:alternate_identity_name]}`|" +
167
+ # :alternate_identity_remarks
168
+ "`#{record_hash[:alternate_identity_remarks]}`|" +
169
+ #:created_at
170
+ "`#{Time.now.to_s(:db)}`|" +
171
+ # updated_at
172
+ "`#{Time.now.to_s(:db)}`" + "\n"
173
+
174
+ new_line
175
+ end
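The same line format can be produced more compactly by mapping over a column list. This is a sketch under stated assumptions, not a drop-in replacement: `IMPORT_COLUMNS` is abbreviated to three columns for readability, `import_line` is a hypothetical name, and the timestamp is passed in as a preformatted string rather than taken from `Time.now.to_s(:db)`.

```ruby
# Abbreviated column list for illustration; the real table has more columns.
IMPORT_COLUMNS = [:name, :sdn_type, :program]

# Build one backtick-enclosed, pipe-delimited line:
# empty id, the data columns, then created_at and updated_at.
def import_line(record, timestamp)
  fields = [''] + IMPORT_COLUMNS.map { |column| record[column].to_s } + [timestamp, timestamp]
  fields.map { |field| "`#{field}`" }.join('|') + "\n"
end

print import_line({:name => 'A', :sdn_type => 'individual'}, '2009-05-11 00:00:00')
# ``|`A`|`individual`|``|`2009-05-11 00:00:00`|`2009-05-11 00:00:00`
```

Keeping the column order in one array makes it harder for the generated line to drift out of step with the table definition.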
176
+
177
+ def self.convert_to_flattened_csv(sdn_file, address_file, alt_file)
178
+ @address = address_file
179
+ @alt = alt_file
180
+
181
+ csv_file = Tempfile.new("ofac") # create temp file for converted csv format.
182
+ #get the first line from the address and alt files
183
+ @current_address_hash = address_text_to_hash(@address.gets)
184
+ @current_alt_hash = alt_text_to_hash(@alt.gets)
185
+
186
+ start = Time.now
187
+
188
+ sdn_file.each_with_index do |line, i|
189
+
190
+ #initialize the address and alt attributes to empty strings
191
+ address_attributes = address_text_to_hash("|||||")
192
+ alt_attributes = alt_text_to_hash("||||")
193
+
194
+ sdn_attributes = sdn_text_to_hash(line)
195
+
196
+ #get the foreign key records for this sdn
197
+ address_records, alt_records = foreign_key_records(sdn_attributes[:id])
198
+
199
+ if address_records.empty?
200
+ #no matching address records, so initialized blank values will be used.
201
+ if alt_records.empty?
202
+ #no matching alt records, so initialized blank values will be used.
203
+ csv_file.syswrite(convert_hash_to_mysql_import_string(sdn_attributes.merge(address_attributes).merge(alt_attributes)))
204
+ else
205
+ alt_records.each do |alt|
206
+ csv_file.syswrite(convert_hash_to_mysql_import_string(sdn_attributes.merge(address_attributes).merge(alt)))
207
+ end
208
+ end
209
+ else
210
+ address_records.each do |address|
211
+ if alt_records.empty?
212
+ #no matching alt records, so initialized blank values will be used.
213
+ csv_file.syswrite(convert_hash_to_mysql_import_string(sdn_attributes.merge(address).merge(alt_attributes)))
214
+ else
215
+ alt_records.each do |alt|
216
+ csv_file.syswrite(convert_hash_to_mysql_import_string(sdn_attributes.merge(address).merge(alt)))
217
+ end
218
+ end
219
+ end
220
+ end
221
+ puts "#{i} records processed." if (i % 1000 == 0) && (i > 0)
222
+ end
223
+ puts "File conversion ran for #{(Time.now - start) / 60} minutes."
224
+ return csv_file
225
+ end
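The nested branches above amount to a cross product: each SDN record is emitted once per (address, alt) pair, with blank values standing in when either list is empty. A minimal sketch of that denormalization follows, using plain hashes instead of files. `flatten_sdn`, `BLANK_ADDRESS` and `BLANK_ALT` are hypothetical names, and the blank hashes are abbreviated to a few keys.

```ruby
# Abbreviated blank records for illustration.
BLANK_ADDRESS = {:address => '', :city => ''}
BLANK_ALT = {:alternate_identity_name => ''}

# Return one flattened row per (address, alt) combination for a single SDN record.
def flatten_sdn(sdn, address_records, alt_records)
  addresses = address_records.empty? ? [BLANK_ADDRESS] : address_records
  alts = alt_records.empty? ? [BLANK_ALT] : alt_records
  addresses.flat_map do |address|
    alts.map { |alt| sdn.merge(address).merge(alt) }
  end
end

sdn = {:name => 'EXAMPLE NAME'}
addresses = [{:address => '1 First St', :city => 'A'}, {:address => '2 Second St', :city => 'B'}]
alts = [{:alternate_identity_name => 'ALIAS ONE'}, {:alternate_identity_name => 'ALIAS TWO'}]

puts flatten_sdn(sdn, [], []).length          # 1
puts flatten_sdn(sdn, addresses, []).length   # 2
puts flatten_sdn(sdn, addresses, alts).length # 4
```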
226
+
227
+ def self.active_record_file_load(sdn_file, address_file, alt_file)
228
+ @address = address_file
229
+ @alt = alt_file
230
+
231
+ #OFAC data is a complete list, so we have to dump and load
232
+ OfacSdn.delete_all
233
+
234
+ #get the first line from the address and alt files
235
+ @current_address_hash = address_text_to_hash(@address.gets)
236
+ @current_alt_hash = alt_text_to_hash(@alt.gets)
237
+ attributes = {}
238
+ sdn_file.each_with_index do |line, i|
239
+
240
+ #initialize the address and alt attributes to empty strings
241
+ address_attributes = address_text_to_hash("|||||")
242
+ alt_attributes = alt_text_to_hash("||||")
243
+
244
+ sdn_attributes = sdn_text_to_hash(line)
245
+
246
+ #get the foreign key records for this sdn
247
+ address_records, alt_records = foreign_key_records(sdn_attributes[:id])
248
+
249
+ if address_records.empty?
250
+ #no matching address records, so initialized blank values will be used.
251
+ if alt_records.empty?
252
+ #no matching alt records, so initialized blank values will be used.
253
+ attributes = sdn_attributes.merge(address_attributes).merge(alt_attributes)
254
+ attributes.delete(:id)
255
+ OfacSdn.create(attributes)
256
+ else
257
+ alt_records.each do |alt|
258
+ attributes = sdn_attributes.merge(address_attributes).merge(alt)
259
+ attributes.delete(:id)
260
+ OfacSdn.create(attributes)
261
+ end
262
+ end
263
+ else
264
+ address_records.each do |address|
265
+ if alt_records.empty?
266
+ #no matching alt records, so initialized blank values will be used.
267
+ attributes = sdn_attributes.merge(address).merge(alt_attributes)
268
+ attributes.delete(:id)
269
+ OfacSdn.create(attributes)
270
+ else
271
+ alt_records.each do |alt|
272
+ attributes = sdn_attributes.merge(address).merge(alt)
273
+ attributes.delete(:id)
274
+ OfacSdn.create(attributes)
275
+ end
276
+ end
277
+ end
278
+ end
279
+
280
+ puts "#{i} records processed." if (i % 5000 == 0) && (i > 0)
281
+ end
282
+ end
283
+
284
+ # For mysql, use:
285
+ # LOAD DATA LOCAL INFILE 'ssdm1.csv' INTO TABLE death_master_files FIELDS TERMINATED BY '|' ENCLOSED BY "`" LINES TERMINATED BY '\n';
286
+ # This is a much faster way of loading large amounts of data into mysql. For information on the LOAD DATA command
287
+ # see http://dev.mysql.com/doc/refman/5.1/en/load-data.html
288
+ def self.bulk_mysql_update(csv_file)
289
+ puts "Deleting all records in ofac_sdn..."
290
+
291
+ #OFAC data is a complete list, so we have to dump and load
292
+ OfacSdn.delete_all
293
+
294
+ puts "Importing into Mysql..."
295
+
296
+ mysql_command = <<-TEXT
297
+ LOAD DATA LOCAL INFILE '#{csv_file.path}' REPLACE INTO TABLE ofac_sdns FIELDS TERMINATED BY '|' ENCLOSED BY "`" LINES TERMINATED BY '\n';
298
+ TEXT
299
+
300
+ OfacSdn.connection.execute(mysql_command)
301
+ puts "Mysql import complete."
302
+
303
+ end
304
+
305
+ end
@@ -0,0 +1,132 @@
1
+ class OfacMatch
2
+
3
+ attr_reader :possible_hits
4
+
5
+ #Initialize a Match object with a record hash of fields you want to match on.
6
+ #Each key in the hash also has a data hash value for the weight, token, and type.
7
+ #
8
+ # match = OfacMatch.new({:name => {:weight => 10, :token => 'Kevin Tyll'},
9
+ # :city => {:weight => 40, :token => 'Clearwater', },
10
+ # :address => {:weight => 40, :token => '1234 Park St.', },
11
+ # :zip => {:weight => 10, :token => '33759', :type => :number}})
12
+ #
13
+ # data hash keys:
14
+ # * <tt>data[:weight]</tt> - value to apply to the score if there is a match (Default is 100/number of keys in the record hash)
15
+ # * <tt>data[:token]</tt> - string to match
16
+ # * <tt>data[:match]</tt> - set from records hash
17
+ # * <tt>data[:score]</tt> - output field
18
+ # * <tt>data[:type]</tt> - the type of match that should be performed (valid values are +:sound+ | +:number+) (Default is +:sound+)
19
+ def initialize(stats={})
20
+ @possible_hits = []
21
+ @stats = stats.dup
22
+ weight = 100
23
+ weight = 100 / @stats.length if @stats.length > 0
24
+ @stats.each_value do |data|
25
+ data[:weight] ||= weight
26
+ data[:match] ||= ''
27
+ data[:type] ||= :sound
28
+ data[:score] ||= 0
29
+ data[:token] = data[:token].to_s.upcase
30
+ end
31
+ end
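The default weight is computed with integer division, which is easy to overlook. The sketch below isolates just that rule; `with_default_weights` is a hypothetical helper, not part of OfacMatch.

```ruby
# Fill in a default :weight of 100 / number_of_fields wherever none is given.
def with_default_weights(stats)
  default = stats.empty? ? 100 : 100 / stats.length # integer division
  stats.each_with_object({}) do |(field, data), result|
    result[field] = data.merge(:weight => data[:weight] || default)
  end
end

three = with_default_weights(:name => {}, :city => {}, :address => {})
p three[:name][:weight]                                        # 33
p three.values.inject(0) { |sum, data| sum + data[:weight] }   # 99
```

With three unweighted fields the defaults sum to 99 rather than 100, so a perfect match on every field would fall one point short unless explicit weights are supplied, as the Ofac class does with 60, 10 and 30.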
32
+
33
+ # match_records is an array of hashes.
34
+ #
35
+ # The hash keys must match the record hash keys set when initialized.
36
+ #
37
+ # score will return the highest score of all the records that
38
+ # are sent in match_records.
39
+ def score(match_records)
40
+ score_results = Array.new
41
+ unless match_records.empty?
42
+ #place the match_records information
43
+ #into our @stats hash
44
+ match_records.each do |match|
45
+ match.each do |key, value|
46
+ @stats[key.to_sym][:match] = value.to_s.upcase
47
+ end
48
+ record_score = calculate_record
49
+ score_results.push(record_score)
50
+ @possible_hits << match.merge(:score => record_score) if record_score > 0
51
+ end
52
+ score = score_results.max #take max score
53
+ end
54
+ @possible_hits.uniq!
55
+ score ||= 0
56
+ end
57
+
58
+ private
59
+
60
+
61
+ # calculate the score for this record
62
+ # comparing the token to the match fields in the @stats hash
63
+ # and storing the score into the record
64
+ def calculate_record
65
+ score = 0
66
+ unless @stats.nil?
67
+ #need to make sure we check the name first, since city and address don't
68
+ #get added to the score unless there is a name match
69
+ [:name,:city,:address].each do |field|
70
+ data = @stats[field]
71
+ if (data[:token].blank?)
72
+ value = 0 #token is blank can't be sure of a match if nothing to match against
73
+ else
74
+ if (data[:match].blank?)
75
+ value = 0 #token has value match is blank
76
+ else
77
+ #token and match both have values
78
+ if (data[:type] == :number)
79
+ value = data[:token] == data[:match] ? 1 : 0
80
+ else
81
+ #first see if there is an exact match
82
+ value = data[:token] == data[:match] ? 1 : 0
83
+
84
+ unless value > 0
85
+ #do a sounds like with the data as given to see if we get a match
86
+ #if match on sounds_like, only give .75 of the weight.
87
+ value = data[:token].ofac_sounds_like(data[:match],false) ? 0.75 : 0
88
+ end
89
+
90
+ #if no match, then break the data down and see if we can find matches on the
91
+ #individual words
92
+ unless value > 0
93
+ token_data = data[:token].gsub(/\W/,'|')
94
+ token_array = token_data.split('|')
95
+ token_array.delete('')
96
+
97
+ match_data = data[:match].gsub(/\W/,'|')
98
+ match_array = match_data.split('|')
99
+ match_array.delete('')
100
+
101
+ value = 0
102
+ partial_weight = 1/token_array.length.to_f
103
+ token_array.each do |partial_token|
104
+ #first see if we get an exact match of the partial
105
+ if match_array.include?(partial_token)
106
+ value += partial_weight
107
+ else
108
+ #otherwise, see if the partial sounds like any part of the OFAC record
109
+ match_array.each do |partial_match|
110
+ if partial_match.ofac_sounds_like(partial_token,false)
111
+ #give partial value for every part of token that is matched.
112
+ value += partial_weight * 0.75
113
+ break
114
+ end
115
+ end
116
+ end
117
+ end
118
+ end
119
+ end
120
+ end
121
+ end
122
+ data[:score] = data[:weight] * value
123
+ score += data[:score]
124
+ break if field == :name && data[:score] == 0
125
+ end
126
+
127
+ end
128
+ score.round
129
+ end
130
+
131
+ end
132
+
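The way calculate_record combines per-field results can be sketched on its own. This is a minimal illustration, not the gem's code: `sketch_record_score` and `SKETCH_WEIGHTS` are hypothetical names, and the 60/30/10 weights are assumed from the expectations in test/ofac_test.rb later in this diff. Each field contributes its weight times a match value between 0 and 1 (1 for an exact match, 0.75 for a sounds-like match, fractions for partial word matches), and city and address are ignored when the name does not match.

```ruby
# Sketch only: hypothetical helper that mirrors the weighting and the
# "name must match first" rule of calculate_record.
# Weights are assumed from the test expectations (name 60, city 30, address 10).
SKETCH_WEIGHTS = { :name => 60, :city => 30, :address => 10 }

# values maps a field to its match value in the range 0.0..1.0.
def sketch_record_score(values, weights = SKETCH_WEIGHTS)
  total = 0.0
  # Name is checked first; city and address only count after a name match.
  [:name, :city, :address].each do |field|
    field_score = weights[field] * values.fetch(field, 0)
    total += field_score
    break if field == :name && field_score == 0
  end
  total.round
end

# Exact name and city, sounds-like address: 60 + 30 + 7.5 = 97.5
puts sketch_record_score(:name => 1, :city => 1, :address => 0.75)
# No name match: city and address are never added
puts sketch_record_score(:name => 0, :city => 1, :address => 1)
```

Rounding happens once, on the total, which is why a sounds-like address match worth 7.5 points can move the final score by 8.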
@@ -0,0 +1,22 @@
1
+ class String
2
+
3
+ Ofac_SoundexChars = 'BPFVCSKGJQXZDTLMNR'
4
+ Ofac_SoundexNums = '111122222222334556'
5
+ Ofac_SoundexCharsEx = '^' + Ofac_SoundexChars
6
+ Ofac_SoundexCharsDel = '^A-Z'
7
+
8
+ # desc: http://en.wikipedia.org/wiki/Soundex
9
+ def ofac_soundex(census = true)
10
+ str = upcase.delete(Ofac_SoundexCharsDel).squeeze
11
+
12
+ str[0 .. 0] + str[1 .. -1].
13
+ delete(Ofac_SoundexCharsEx).
14
+ tr(Ofac_SoundexChars, Ofac_SoundexNums)[0 .. (census ? 2 : -1)].
15
+ ljust(3, '0') rescue ''
16
+ end
17
+
18
+ def ofac_sounds_like(other, census = true)
19
+ ofac_soundex(census) == other.ofac_soundex(census)
20
+ end
21
+
22
+ end
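To make the behavior of ofac_soundex easier to follow, here is the same logic restated as plain functions. This is a sketch under assumptions: `sketch_soundex` and `sketch_sounds_like?` are hypothetical names that do not exist in the gem, and the explicit empty-string check stands in for the `rescue ''` in the original. Note that the code squeezes repeated letters before mapping them to digits, so results can differ from textbook Soundex, which collapses repeated codes.

```ruby
# Sketch only: the Soundex variant from the String extension above, restated
# as plain functions. sketch_soundex and sketch_sounds_like? are hypothetical
# names and are not part of the gem.
SKETCH_SOUNDEX_CHARS = 'BPFVCSKGJQXZDTLMNR'
SKETCH_SOUNDEX_NUMS  = '111122222222334556'

def sketch_soundex(text, census = true)
  # Keep A-Z only, then collapse runs of the same letter.
  str = text.upcase.delete('^A-Z').squeeze
  # Input with no letters (for example "123") yields an empty code.
  return '' if str.empty?

  # Drop letters that have no digit, then map the rest to their digits.
  codes = str[1..-1].
    delete('^' + SKETCH_SOUNDEX_CHARS).
    tr(SKETCH_SOUNDEX_CHARS, SKETCH_SOUNDEX_NUMS)
  # Census mode keeps at most three digits; otherwise all digits are kept.
  codes = codes[0..2] if census
  str[0..0] + codes.ljust(3, '0')
end

def sketch_sounds_like?(a, b, census = true)
  sketch_soundex(a, census) == sketch_soundex(b, census)
end

puts sketch_soundex('Luis', false)
puts sketch_soundex('Louis', false)
puts sketch_sounds_like?('123', '12358', false)
```

Because digits are stripped before encoding, any two purely numeric strings produce the same empty code and therefore "sound alike", which is the behavior the address tests in this diff rely on.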
@@ -0,0 +1,8 @@
1
+
2
+ namespace :ofac do
3
+ desc "Loads the current file from http://www.treas.gov/offices/enforcement/ofac/sdn/delimit/index.shtml."
4
+ task :update_data => :environment do
5
+ OfacSdnLoader.load_current_sdn_file
6
+ end
7
+
8
+ end
@@ -0,0 +1,10 @@
1
+ 10|7|-0- |-0- |"Panama"|-0-
2
+ 15|12|-0- |-0- |"Panama"|-0-
3
+ 22|14|"123 Somewhere Ln"|"Clearwater"|"United States"|-0-
4
+ 39|27|-0- |"Managua"|"Nicaragua"|-0-
5
+ 39|29|"Bal Harbour Shopping Center, Via Italia"|"Panama City"|"Panama"|-0-
6
+ 41|41|"Avenida de Concha, Espina 8, E-28036"|"Madrid"|"Spain"|-0-
7
+ 41|102|-0- |-0- |-0- |-0-
8
+ 66|111|-0- |"Milan"|"Italy"|-0-
9
+ 66|117|-0- |-0- |"Panama"|-0-
10
+ 66|125|"1840 West 49th Street"|"Hialeah, FL"|"United States"|-0-
@@ -0,0 +1,10 @@
1
+ 15|14|"aka"|"VIAJES GUAMA TOURS"|-0-
2
+ 22|15|"aka"|"HERNANDEZ, Oscar Grouch"|-0-
3
+ 22|16|"aka"|"Alternate Name"|-0-
4
+ 25|57|"aka"|"AVIA IMPORT"|-0-
5
+ 36|219|"aka"|"BNC"|-0-
6
+ 36|220|"aka"|"NATIONAL BANK OF CUBA"|-0-
7
+ 36|221|"aka"|"BNC"|-0-
8
+ 41|222|"aka"|"NATIONAL BANK OF CUBA"|-0-
9
+ 66|223|"aka"|"BNC"|-0-
10
+ 66|224|"aka"|"NATIONAL BANK OF CUBA"|-0-
@@ -0,0 +1,9 @@
1
+ 10|"ABASTECEDORA NAVAL Y INDUSTRIAL, S.A."|-0- |"CUBA"|-0- |-0- |-0- |-0- |-0- |-0- |-0- |-0-
2
+ 15|"ABDELNUR| Nury de Jesus"|"individual"|"CUBA"|-0- |-0- |-0- |-0- |-0- |-0- |-0- |-0-
3
+ 22|"HERNANDEZ, Oscar"|"individual"|"CUBA"|-0- |-0- |"Unknown vessel type"|-0- |-0- |-0- |"Acechilly Navigation Co., Malta"|-0-
4
+ 24|"LOPEZ MENDEZ, Luis Eduardo"|"individual"|"CUBA"|-0- |-0- |"Unknown vessel type"|-0- |-0- |-0- |"Acefrosty Shipping Co., Malta"|-0-
5
+ 25|"ACEFROSTY SHIPPING CO., LTD."|-0- |"CUBA"|-0- |-0- |-0- |-0- |-0- |-0- |-0- |-0-
6
+ 36|"AEROCARIBBEAN AIRLINES"|-0- |"CUBA"|-0- |-0- |-0- |-0- |-0- |-0- |-0- |-0-
7
+ 39|"AEROTAXI EJECUTIVO, S.A."|-0- |"CUBA"|-0- |-0- |-0- |-0- |-0- |-0- |-0- |-0-
8
+ 41|"AGENCIA DE VIAJES GUAMA"|-0- |"CUBA"|-0- |-0- |-0- |-0- |-0- |-0- |-0- |-0-
9
+ 66|"AGUIAR, Raul"|"individual"|"CUBA"|"Director, Banco Nacional de Cuba"|-0- |-0- |-0- |-0- |-0- |-0- |"; Director, Banco Nacional de Cuba."
@@ -0,0 +1,19 @@
1
+ ``|`ABASTECEDORA NAVAL Y INDUSTRIAL, S.A.`|``|`CUBA`|``|``|``|``|``|``|``|``|``|``|`Panama`|``|``|``|``|`2009-05-06 15:55:24`|`2009-05-06 15:55:24`
2
+ ``|`ABDELNUR`|` Nury de Jesus`|`individual`|`CUBA`|``|``|``|``|``|``|``|``|``|`Panama`|``|`aka`|`VIAJES GUAMA TOURS`|``|`2009-05-06 15:55:24`|`2009-05-06 15:55:24`
3
+ ``|`HERNANDEZ, Oscar`|`individual`|`CUBA`|``|``|`Unknown vessel type`|``|``|``|`Acechilly Navigation Co., Malta`|``|`123 Somewhere Ln`|`Clearwater`|`United States`|``|`aka`|`HERNANDEZ, Oscar Grouch`|``|`2009-05-06 15:55:24`|`2009-05-06 15:55:24`
4
+ ``|`HERNANDEZ, Oscar`|`individual`|`CUBA`|``|``|`Unknown vessel type`|``|``|``|`Acechilly Navigation Co., Malta`|``|`123 Somewhere Ln`|`Clearwater`|`United States`|``|`aka`|`Alternate Name`|``|`2009-05-06 15:55:24`|`2009-05-06 15:55:24`
5
+ ``|`LOPEZ MENDEZ, Luis Eduardo`|`individual`|`CUBA`|``|``|`Unknown vessel type`|``|``|``|`Acefrosty Shipping Co., Malta`|``|``|``|``|``|``|``|``|`2009-05-06 15:55:24`|`2009-05-06 15:55:24`
6
+ ``|`ACEFROSTY SHIPPING CO., LTD.`|``|`CUBA`|``|``|``|``|``|``|``|``|``|``|``|``|`aka`|`AVIA IMPORT`|``|`2009-05-06 15:55:24`|`2009-05-06 15:55:24`
7
+ ``|`AEROCARIBBEAN AIRLINES`|``|`CUBA`|``|``|``|``|``|``|``|``|``|``|``|``|`aka`|`BNC`|``|`2009-05-06 15:55:24`|`2009-05-06 15:55:24`
8
+ ``|`AEROCARIBBEAN AIRLINES`|``|`CUBA`|``|``|``|``|``|``|``|``|``|``|``|``|`aka`|`NATIONAL BANK OF CUBA`|``|`2009-05-06 15:55:24`|`2009-05-06 15:55:24`
9
+ ``|`AEROCARIBBEAN AIRLINES`|``|`CUBA`|``|``|``|``|``|``|``|``|``|``|``|``|`aka`|`BNC`|``|`2009-05-06 15:55:24`|`2009-05-06 15:55:24`
10
+ ``|`AEROTAXI EJECUTIVO, S.A.`|``|`CUBA`|``|``|``|``|``|``|``|``|``|`Managua`|`Nicaragua`|``|``|``|``|`2009-05-06 15:55:24`|`2009-05-06 15:55:24`
11
+ ``|`AEROTAXI EJECUTIVO, S.A.`|``|`CUBA`|``|``|``|``|``|``|``|``|`Bal Harbour Shopping Center, Via Italia`|`Panama City`|`Panama`|``|``|``|``|`2009-05-06 15:55:24`|`2009-05-06 15:55:24`
12
+ ``|`AGENCIA DE VIAJES GUAMA`|``|`CUBA`|``|``|``|``|``|``|``|``|`Avenida de Concha, Espina 8, E-28036`|`Madrid`|`Spain`|``|`aka`|`NATIONAL BANK OF CUBA`|``|`2009-05-06 15:55:24`|`2009-05-06 15:55:24`
13
+ ``|`AGENCIA DE VIAJES GUAMA`|``|`CUBA`|``|``|``|``|``|``|``|``|``|``|``|``|`aka`|`NATIONAL BANK OF CUBA`|``|`2009-05-06 15:55:24`|`2009-05-06 15:55:24`
14
+ ``|`AGUIAR, Raul`|`individual`|`CUBA`|`Director, Banco Nacional de Cuba`|``|``|``|``|``|``|`; Director, Banco Nacional de Cuba.`|``|`Milan`|`Italy`|``|`aka`|`BNC`|``|`2009-05-06 15:55:24`|`2009-05-06 15:55:24`
15
+ ``|`AGUIAR, Raul`|`individual`|`CUBA`|`Director, Banco Nacional de Cuba`|``|``|``|``|``|``|`; Director, Banco Nacional de Cuba.`|``|`Milan`|`Italy`|``|`aka`|`NATIONAL BANK OF CUBA`|``|`2009-05-06 15:55:24`|`2009-05-06 15:55:24`
16
+ ``|`AGUIAR, Raul`|`individual`|`CUBA`|`Director, Banco Nacional de Cuba`|``|``|``|``|``|``|`; Director, Banco Nacional de Cuba.`|``|``|`Panama`|``|`aka`|`BNC`|``|`2009-05-06 15:55:24`|`2009-05-06 15:55:24`
17
+ ``|`AGUIAR, Raul`|`individual`|`CUBA`|`Director, Banco Nacional de Cuba`|``|``|``|``|``|``|`; Director, Banco Nacional de Cuba.`|``|``|`Panama`|``|`aka`|`NATIONAL BANK OF CUBA`|``|`2009-05-06 15:55:24`|`2009-05-06 15:55:24`
18
+ ``|`AGUIAR, Raul`|`individual`|`CUBA`|`Director, Banco Nacional de Cuba`|``|``|``|``|``|``|`; Director, Banco Nacional de Cuba.`|`1840 West 49th Street`|`Hialeah, FL`|`United States`|``|`aka`|`BNC`|``|`2009-05-06 15:55:24`|`2009-05-06 15:55:24`
19
+ ``|`AGUIAR, Raul`|`individual`|`CUBA`|`Director, Banco Nacional de Cuba`|``|``|``|``|``|``|`; Director, Banco Nacional de Cuba.`|`1840 West 49th Street`|`Hialeah, FL`|`United States`|``|`aka`|`NATIONAL BANK OF CUBA`|``|`2009-05-06 15:55:24`|`2009-05-06 15:55:24`
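The control file above contains one row per combination of address and alternate identity for each SDN id, with a single row when an entity has neither. The loader's convert_to_flattened_csv is not shown in this part of the diff, so the following is only a sketch of one way to produce that row shape; `sketch_flatten` is a hypothetical name, and the ids are taken from the three .pip fixtures.

```ruby
# Sketch only: hypothetical cross join that reproduces the row count and row
# order of the control file. It is not the gem's implementation.
def sketch_flatten(sdn_ids, addresses_by_sdn, alts_by_sdn)
  rows = []
  sdn_ids.each do |id|
    addresses = addresses_by_sdn.fetch(id, [])
    alts = alts_by_sdn.fetch(id, [])
    # An entity with no address or no alias still produces one row.
    addresses = [nil] if addresses.empty?
    alts = [nil] if alts.empty?
    # Addresses vary slowest, alternate identities fastest.
    addresses.each do |address|
      alts.each { |alt| rows << [id, address, alt] }
    end
  end
  rows
end

# Ids copied from the test fixtures (sdn id => address ids / alt ids).
sdn_ids   = [10, 15, 22, 24, 25, 36, 39, 41, 66]
addresses = { 10 => [7], 15 => [12], 22 => [14], 39 => [27, 29],
              41 => [41, 102], 66 => [111, 117, 125] }
alts      = { 15 => [14], 22 => [15, 16], 25 => [57], 36 => [219, 220, 221],
              41 => [222], 66 => [223, 224] }

puts sketch_flatten(sdn_ids, addresses, alts).length
```

With the fixture ids this yields 19 rows, the same count the loader test asserts for OfacSdn.count.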
@@ -0,0 +1,20 @@
1
+ require 'ofac/models/ofac_sdn_loader'
2
+
3
+ class OfacSdnLoader
4
+
5
+ def self.load_current_sdn_file
6
+ sdn = File.new(File.dirname(__FILE__) + '/../../files/test_sdn_data_load.pip')
7
+ address = File.new(File.dirname(__FILE__) + '/../../files/test_address_data_load.pip')
8
+ alt = File.new(File.dirname(__FILE__) + '/../../files/test_alt_data_load.pip')
9
+ active_record_file_load(sdn, address, alt)
10
+ sdn.close
11
+ address.close
12
+ alt.close
13
+ end
14
+
15
+ #Gives access to the private convert_to_flattened_csv method
16
+ def self.create_csv_file(sdn, address, alt)
17
+ convert_to_flattened_csv(sdn, address, alt)
18
+ end
19
+
20
+ end
@@ -0,0 +1,40 @@
1
+ require 'test_helper'
2
+
3
+ class OfacSdnLoaderTest < Test::Unit::TestCase
4
+
5
+ context '' do
6
+ setup do setup_ofac_sdn_table end
7
+
8
+ should "load table from files multiple times and always have the same record count" do
9
+ assert_equal(0,OfacSdn.count)
10
+ OfacSdnLoader.load_current_sdn_file #this method is mocked to load test files instead of the live files from the web.
11
+ assert_equal(19, OfacSdn.count)
12
+ OfacSdnLoader.load_current_sdn_file
13
+ assert_equal(19, OfacSdn.count)
14
+ end
15
+
16
+ should "create flattened_csv_file_for_mysql_import" do
17
+ #since I'm using sqlite3 for its in-memory db, I can't test the mysql load
18
+ #but I can test the csv file creation.
19
+ sdn = File.new(File.dirname(__FILE__) + '/files/test_sdn_data_load.pip')
20
+ address = File.new(File.dirname(__FILE__) + '/files/test_address_data_load.pip')
21
+ alt = File.new(File.dirname(__FILE__) + '/files/test_alt_data_load.pip')
22
+
23
+ csv = OfacSdnLoader.create_csv_file(sdn, address, alt) #this method was created in the mock only to call the private convert_to_flattened_csv method
24
+ correctly_formatted_csv = File.open(File.dirname(__FILE__) + '/files/valid_flattened_file.csv')
25
+
26
+ csv.rewind
27
+ generated_file = csv.readlines
28
+ #compare the values of each csv line with the correctly formatted "control file"
29
+ correctly_formatted_csv.each_with_index do |line,i|
30
+ csv_line = generated_file[i]
31
+ correctly_formatted_record_array = line.split('|')
32
+ csv_record_array = csv_line.split('|')
33
+ (0..18).each do |col| #skip indices 19 and 20; they are the created_at and updated_at fields and will never match.
34
+ assert_equal correctly_formatted_record_array[col], csv_record_array[col]
35
+ end
36
+ end
37
+ end
38
+
39
+ end
40
+ end
data/test/ofac_test.rb ADDED
@@ -0,0 +1,76 @@
1
+ require 'test_helper'
2
+
3
+ class OfacTest < Test::Unit::TestCase
4
+
5
+ context '' do
6
+ setup do
7
+ setup_ofac_sdn_table
8
+ OfacSdnLoader.load_current_sdn_file #this method is mocked to load test files instead of the live files from the web.
9
+ end
10
+
11
+ should "give a score of 0 if no name is given" do
12
+ assert_equal 0, Ofac.new({:address => '123 somewhere'}).score
13
+ end
14
+
15
+ should "give a score of 0 if there is no name match" do
16
+ assert_equal 0, Ofac.new({:name => 'Kevin'}).score
17
+ end
18
+
19
+ should "give a score of 0 if there is no name match but there is an address and city match" do
20
+ assert_equal 0, Ofac.new({:name => 'Kevin', :address => '123 somewhere ln', :city => 'Clearwater'}).score
21
+ end
22
+
23
+ should "give a score of 60 if there is a name match" do
24
+ assert_equal 60, Ofac.new({:name => 'Oscar Hernandez'}).score
25
+ assert_equal 60, Ofac.new({:name => 'Oscar Hernandez', :city => 'no match', :address => 'no match'}).score
26
+ assert_equal 60, Ofac.new({:name => 'Oscar Hernandez', :city => 'Las Vegas', :address => 'no match'}).score
27
+ assert_equal 60, Ofac.new({:name => 'Luis Lopez', :city => 'Las Vegas', :address => 'no match'}).score
28
+ end
29
+
30
+ should "give a score of 60 if there is a name match on alternate identity name" do
31
+ assert_equal 60, Ofac.new({:name => 'Alternate Name'}).score
32
+ end
33
+
34
+ should "give a partial score if there is a partial name match" do
35
+ assert_equal 40, Ofac.new({:name => 'Oscar middlename Hernandez'}).score
36
+ assert_equal 30, Ofac.new({:name => 'Oscar WrongLastName'}).score
37
+ assert_equal 70, Ofac.new({:name => 'Oscar middlename Hernandez',:city => 'Clearwater'}).score
38
+ end
39
+
40
+ should "give a score of 90 if there is a name and city match" do
41
+ assert_equal 90, Ofac.new({:name => 'Oscar Hernandez', :city => 'Clearwater', :address => 'no match'}).score
42
+ end
43
+
44
+ should "give a score of 100 if there is a name and city and address match" do
45
+ assert_equal 100, Ofac.new({:name => 'Oscar Hernandez', :city => 'Clearwater', :address => '123 somewhere ln'}).score
46
+ end
47
+
48
+ should "give partial scores for sounds like matches" do
49
+
50
+ #32456 summer lane sounds like 123 Somewhere Ln, so it adds 75% of the address weight (10) to the score: 7.5, which rounds the total up to 98.
51
+ assert_equal 98, Ofac.new({:name => 'Oscar Hernandez', :city => 'Clearwater', :address => '32456 summer lane'}).score
52
+
53
+ #summer sounds like somewhere, and all numbers sound alike, so 2 of the 3 address elements match by sound.
54
+ #The address weight is 10, so each element is worth 10/3 or 3.33. An exact match adds 3.33, and a sounds like match adds 3.33 * .75 or 2.5
55
+ #because sounds like matches only add 75% of their weight.
56
+ #2.5 + 2.5 = 5
57
+ assert_equal 95, Ofac.new({:name => 'Oscar Hernandez', :city => 'Clearwater', :address => '12358 summer blvd'}).score
58
+
59
+
60
+ #Louis sounds like Luis, and Lopez is an exact match:
61
+ #:name has a weight of 60, so each element is worth 30. A sounds like match is worth 30 * .75 = 22.5, so 30 + 22.5 = 52.5, which rounds to 53.
62
+ assert_equal 53, Ofac.new({:name => 'Louis Lopez', :city => 'Las Vegas', :address => 'no match'}).score
63
+ end
64
+
65
+ should "return an array of possible hits" do
66
+ #it should not matter which order you call score or possible_hits.
67
+ sdn = Ofac.new({:name => 'Oscar Hernandez', :city => 'Clearwater', :address => '123 somewhere ln'})
68
+ assert sdn.score > 0
69
+ assert !sdn.possible_hits.empty?
70
+
71
+ sdn = Ofac.new({:name => 'Oscar Hernandez', :city => 'Clearwater', :address => '123 somewhere ln'})
72
+ assert !sdn.possible_hits.empty?
73
+ assert sdn.score > 0
74
+ end
75
+ end
76
+ end
@@ -0,0 +1,48 @@
1
+ require 'rubygems'
2
+ require 'test/unit'
3
+ require 'shoulda'
4
+ require 'mocks/test/ofac_sdn_loader'
5
+
6
+ $LOAD_PATH.unshift(File.join(File.dirname(__FILE__), '..', 'lib'))
7
+ $LOAD_PATH.unshift(File.dirname(__FILE__))
8
+ require 'ofac'
9
+
10
+ ActiveRecord::Base.establish_connection :adapter => 'sqlite3', :database => ':memory:'
11
+
12
+ class Test::Unit::TestCase
13
+ def setup_ofac_sdn_table
14
+ ActiveRecord::Base.connection.tables.each { |table| ActiveRecord::Base.connection.drop_table(table) }
15
+ create_ofac_sdn_table
16
+ end
17
+
18
+ private
19
+
20
+ def create_ofac_sdn_table
21
+ silence_stream(STDOUT) do
22
+ ActiveRecord::Schema.define(:version => 1) do
23
+ create_table :ofac_sdns do |t|
24
+ t.text :name
25
+ t.string :sdn_type
26
+ t.string :program
27
+ t.string :title
28
+ t.string :vessel_call_sign
29
+ t.string :vessel_type
30
+ t.string :vessel_tonnage
31
+ t.string :gross_registered_tonnage
32
+ t.string :vessel_flag
33
+ t.string :vessel_owner
34
+ t.text :remarks
35
+ t.text :address
36
+ t.string :city
37
+ t.string :country
38
+ t.string :address_remarks
39
+ t.string :alternate_identity_type
40
+ t.text :alternate_identity_name
41
+ t.string :alternate_identity_remarks
42
+ t.timestamps
43
+ end
44
+ end
45
+ end
46
+ end
47
+
48
+ end
metadata ADDED
@@ -0,0 +1,90 @@
1
+ --- !ruby/object:Gem::Specification
2
+ name: kevintyll-ofac
3
+ version: !ruby/object:Gem::Version
4
+ version: 1.0.0
5
+ platform: ruby
6
+ authors:
7
+ - Kevin Tyll
8
+ autorequire:
9
+ bindir: bin
10
+ cert_chain: []
11
+
12
+ date: 2009-05-11 00:00:00 -07:00
13
+ default_executable:
14
+ dependencies: []
15
+
16
+ description: Attempts to find a hit on the Office of Foreign Assets Control's Specially Designated Nationals list.
17
+ email: kevintyll@gmail.com
18
+ executables: []
19
+
20
+ extensions: []
21
+
22
+ extra_rdoc_files:
23
+ - LICENSE
24
+ - README.rdoc
25
+ files:
26
+ - History.txt
27
+ - LICENSE
28
+ - PostInstall.txt
29
+ - README.rdoc
30
+ - Rakefile
31
+ - VERSION.yml
32
+ - generators/ofac_migration/ofac_migration_generator.rb
33
+ - generators/ofac_migration/templates/migration.rb
34
+ - lib/ofac.rb
35
+ - lib/ofac/models/ofac.rb
36
+ - lib/ofac/models/ofac_sdn.rb
37
+ - lib/ofac/models/ofac_sdn_loader.rb
38
+ - lib/ofac/ofac_match.rb
39
+ - lib/ofac/ruby_string_extensions.rb
40
+ - lib/tasks/ofac.rake
41
+ - test/files/test_address_data_load.pip
42
+ - test/files/test_alt_data_load.pip
43
+ - test/files/test_sdn_data_load.pip
44
+ - test/files/valid_flattened_file.csv
45
+ - test/mocks/test/ofac_sdn_loader.rb
46
+ - test/ofac_sdn_loader_test.rb
47
+ - test/ofac_test.rb
48
+ - test/test_helper.rb
49
+ has_rdoc: true
50
+ homepage: http://github.com/kevintyll/ofac
51
+ post_install_message: |-
52
+ For more information on ofac, see http://kevintyll.github.com/ofac/
53
+
54
+ * To create the necessary db migration, from the command line, run:
55
+ script/generate ofac_migration
56
+ * Require the gem in your environment.rb file in the Rails::Initializer block:
57
+ config.gem 'kevintyll-ofac', :lib => 'ofac'
58
+ * To load your table with the current OFAC data, from the command line, run:
59
+ rake ofac:update_data
60
+
61
+ * The OFAC data is not updated with any regularity, but you can sign up for email notifications when the data changes at
62
+ http://www.treas.gov/offices/enforcement/ofac/sdn/index.shtml.
63
+ rdoc_options:
64
+ - --charset=UTF-8
65
+ require_paths:
66
+ - lib
67
+ required_ruby_version: !ruby/object:Gem::Requirement
68
+ requirements:
69
+ - - ">="
70
+ - !ruby/object:Gem::Version
71
+ version: "0"
72
+ version:
73
+ required_rubygems_version: !ruby/object:Gem::Requirement
74
+ requirements:
75
+ - - ">="
76
+ - !ruby/object:Gem::Version
77
+ version: "0"
78
+ version:
79
+ requirements: []
80
+
81
+ rubyforge_project:
82
+ rubygems_version: 1.2.0
83
+ signing_key:
84
+ specification_version: 2
85
+ summary: Attempts to find a hit on the Office of Foreign Assets Control's Specially Designated Nationals list.
86
+ test_files:
87
+ - test/mocks/test/ofac_sdn_loader.rb
88
+ - test/ofac_sdn_loader_test.rb
89
+ - test/ofac_test.rb
90
+ - test/test_helper.rb