address_standardization 0.3.0

data/.gitignore ADDED
@@ -0,0 +1,3 @@
+ .DS_Store
+ *.gemspec
+ *.gem
data/README.rdoc ADDED
@@ -0,0 +1,90 @@
+ # address_standardization
+
+ ## Summary
+
+ A tiny Ruby library to quickly standardize a postal address, either through MelissaData or Google Maps.
+
+ ## Installation
+
+ If you are using Rails, put this in your environment.rb:
+
+     config.gem 'address_standardization'
+
+ Then run `rake gems:install` to install the gem.
+
+ Otherwise, just run
+
+     gem install address_standardization
+
+ ## Usage
+
+ Right now this library supports two services: MelissaData and Google Maps.
+
+ MelissaData provides two services itself: [US address lookup](http://www.melissadata.com/lookups/AddressVerify.asp) and [Canadian address lookup](http://www.melissadata.com/lookups/CanadianAddressVerify.asp). They both work the same way, however. First, here's how to standardize a US address:
+
+     addr = AddressStandardization::MelissaData::USAddress.standardize(
+       :street => "1 Infinite Loop",
+       :city => "Cupertino",
+       :state => "CA"
+     )
+
+ This submits the address to MelissaData. If the address can't be found, you'll get back `nil`. But if the address can be found (as in this case), you'll get an instance of `AddressStandardization::MelissaData::USAddress`. If you store the instance, you can refer to the individual fields like so:
+
+     addr.street #=> "1 INFINITE LOOP"
+     addr.city #=> "CUPERTINO"
+     addr.state #=> "CA"
+     addr.zip #=> "95014-2083"
+
+ And standardizing a Canadian address:
+
+     addr = AddressStandardization::MelissaData::CanadianAddress.standardize(
+       :street => "103 Metig St",
+       :city => "Sault Ste Marie",
+       :province => "ON"
+     )
+     addr.street #=> "103 METIG ST RR 4"
+     addr.city #=> "SAULT STE MARIE"
+     addr.province #=> "ON"
+     addr.postalcode #=> "P6A 5K9"
+
+ Note that for a MelissaData Canadian address the fields are named `province` and `postalcode`. The Google Maps address class also accepts `state` and `zip` for the same values.
+
+ Using Google Maps to validate an address is just as easy:
+
+     addr = AddressStandardization::GoogleMaps::Address.standardize(
+       :street => "1600 Amphitheatre Parkway",
+       :city => "Mountain View",
+       :state => "CA"
+     )
+     addr.street #=> "1600 AMPHITHEATRE PKWY"
+     addr.city #=> "MOUNTAIN VIEW"
+     addr.state #=> "CA"
+     addr.zip #=> "94043"
+     addr.country #=> "USA"
+
+ And, again, a Canadian address:
+
+     addr = AddressStandardization::GoogleMaps::Address.standardize(
+       :street => "1770 Stenson Blvd.",
+       :city => "Peterborough",
+       :province => "ON"
+     )
+     addr.street #=> "1770 STENSON BLVD"
+     addr.city #=> "PETERBOROUGH"
+     addr.province #=> "ON"
+     addr.postalcode #=> "K9K"
+     addr.country #=> "CANADA"
+
+ Sharp eyes will notice that the Google Maps API doesn't return the full postal code for Canadian addresses. If you know why this is, please let me know (my email address is below).
+
+ ## Support
+
+ If you find any bugs in this library, feel free to:
+
+ * file a bug report in the [Issues area on Github](http://github.com/mcmire/address_standardization/issues)
+ * fork the [project on Github](http://github.com/mcmire/address_standardization) and send me a pull request
+ * email me (*firstname* dot *lastname* at gmail dot com)
+
+ ## Author/License
+
+ (c) 2008 Elliot Winkler. Released under the MIT license.
data/Rakefile ADDED
@@ -0,0 +1,56 @@
+ require 'rubygems'
+ require 'rake'
+
+ begin
+   require 'jeweler'
+   Jeweler::Tasks.new do |gem|
+     gem.name = "address_standardization"
+     gem.summary = %Q{A tiny Ruby library to quickly standardize a postal address}
+     gem.description = %Q{A tiny Ruby library to quickly standardize a postal address}
+     gem.authors = ["Elliot Winkler"]
+     gem.email = "elliot.winkler@gmail.com"
+     gem.homepage = "http://github.com/mcmire/address_standardization"
+     gem.add_dependency "mechanize"
+     gem.add_dependency "hpricot"
+     gem.add_development_dependency "mcmire-contest"
+     gem.add_development_dependency "mcmire-matchy"
+     # gem is a Gem::Specification... see http://www.rubygems.org/read/chapter/20 for additional settings
+   end
+   Jeweler::GemcutterTasks.new
+ rescue LoadError
+   puts "Jeweler (or a dependency) not available. Install it with: gem install jeweler"
+ end
+
+ require 'rake/testtask'
+ Rake::TestTask.new(:test) do |test|
+   test.libs << 'lib' << 'test'
+   test.pattern = 'test/*_test.rb'
+   test.verbose = true
+ end
+
+ begin
+   require 'rcov/rcovtask'
+   Rcov::RcovTask.new do |test|
+     test.libs << 'test'
+     test.pattern = 'test/**/*_test.rb' # match the *_test.rb naming used by the test files
+     test.verbose = true
+   end
+ rescue LoadError
+   task :rcov do
+     abort "RCov is not available. In order to run rcov, you must: sudo gem install spicycode-rcov"
+   end
+ end
+
+ task :test => :check_dependencies
+
+ task :default => :test
+
+ require 'rake/rdoctask'
+ Rake::RDocTask.new do |rdoc|
+   version = File.exist?('VERSION') ? File.read('VERSION') : ""
+
+   rdoc.rdoc_dir = 'rdoc'
+   rdoc.title = "address_standardization #{version}"
+   rdoc.rdoc_files.include('README*')
+   rdoc.rdoc_files.include('lib/**/*.rb')
+ end
data/TODO ADDED
@@ -0,0 +1 @@
+ * MelissaData will let us know if there are other residents in the building that the address points to and thus whether to supply the suite number -- tap into this
data/VERSION ADDED
@@ -0,0 +1 @@
+ 0.3.0
data/lib/address_standardization.rb ADDED
@@ -0,0 +1,9 @@
+ # address_standardization: A tiny Ruby library to quickly standardize a postal address.
+ # Copyright (C) 2008 Elliot Winkler. Released under the MIT license.
+
+ require File.dirname(__FILE__)+'/address_standardization/ruby_ext'
+ require File.dirname(__FILE__)+'/address_standardization/class_level_inheritable_attributes'
+
+ require File.dirname(__FILE__)+'/address_standardization/abstract_address'
+ require File.dirname(__FILE__)+'/address_standardization/melissa_data'
+ require File.dirname(__FILE__)+'/address_standardization/google_maps'
data/lib/address_standardization/abstract_address.rb ADDED
@@ -0,0 +1,58 @@
+ module AddressStandardization
+   class StandardizationError < StandardError; end
+
+   class AbstractAddress
+
+     extend ClassLevelInheritableAttributes
+     cattr_inheritable :valid_keys
+
+     def self.standardize
+       raise NotImplementedError, "You must override .standardize in a subclass"
+     end
+
+     attr_reader :address_info
+
+     def initialize(address_info)
+       raise NotImplementedError, "You must define valid_keys" unless self.class.valid_keys
+       raise ArgumentError, "No address given!" if address_info.empty?
+       address_info = address_info.inject({}) {|h,(k,v)| h[k.to_s] = v; h } # stringify keys
+       validate_keys(address_info)
+       standardize_values!(address_info)
+       @address_info = address_info
+     end
+
+     def validate_keys(hash)
+       # assume keys are already stringified
+       invalid_keys = hash.keys - self.class.valid_keys
+       unless invalid_keys.empty?
+         raise ArgumentError, "Invalid keys: #{invalid_keys.join(', ')}. Valid keys are: #{self.class.valid_keys.join(', ')}"
+       end
+     end
+
+     def method_missing(name, *args)
+       name = name.to_s
+       if self.class.valid_keys.include?(name)
+         if args.empty?
+           @address_info[name]
+         else
+           @address_info[name] = standardize_value(args.first)
+         end
+       else
+         super(name.to_sym, *args)
+       end
+     end
+
+     def ==(other)
+       other.kind_of?(AbstractAddress) && @address_info == other.address_info
+     end
+
+     private
+     def standardize_values!(hash)
+       hash.each {|k,v| hash[k] = standardize_value(v) }
+     end
+
+     def standardize_value(value)
+       value ? value.strip_whitespace : ""
+     end
+   end
+ end
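The key handling in `AbstractAddress` can be tried out in isolation. What follows is a minimal sketch, not the gem's code: `SketchAddress` and its fixed `VALID_KEYS` list are hypothetical stand-ins, and the added `respond_to_missing?` is an assumption about what a caller might want, since the class above does not define it.

```ruby
# Hypothetical stand-in for an address class; not part of the gem.
class SketchAddress
  VALID_KEYS = %w(street city state zip).freeze

  attr_reader :address_info

  def initialize(address_info)
    raise ArgumentError, "No address given!" if address_info.empty?
    # Stringify keys so :street and "street" are treated alike.
    info = address_info.each_with_object({}) { |(k, v), h| h[k.to_s] = v }
    invalid = info.keys - VALID_KEYS
    unless invalid.empty?
      raise ArgumentError, "Invalid keys: #{invalid.join(', ')}"
    end
    # Collapse runs of spaces and trim the ends; nil becomes "".
    @address_info = info.each_with_object({}) do |(k, v), h|
      h[k] = v ? v.to_s.squeeze(" ").strip : ""
    end
  end

  # Any valid key can be read as if it were a method.
  def method_missing(name, *args)
    key = name.to_s
    return @address_info[key] if VALID_KEYS.include?(key) && args.empty?
    super
  end

  # Keeps respond_to? in line with method_missing.
  def respond_to_missing?(name, include_private = false)
    VALID_KEYS.include?(name.to_s) || super
  end

  def ==(other)
    other.is_a?(SketchAddress) && address_info == other.address_info
  end
end

addr = SketchAddress.new(:street => "  1   Infinite Loop ", "city" => "Cupertino")
puts addr.street
puts addr.respond_to?(:city)
```

Because keys are stringified before comparison, symbol and string keys produce equal objects, which is what lets the tests compare a standardized address against one built with string keys.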
data/lib/address_standardization/class_level_inheritable_attributes.rb ADDED
@@ -0,0 +1,20 @@
+ # from <http://railstips.org/2008/6/13/a-class-instance-variable-update>
+ module ClassLevelInheritableAttributes
+   def cattr_inheritable(*args)
+     @cattr_inheritable_attrs ||= [:cattr_inheritable_attrs]
+     @cattr_inheritable_attrs += args
+     args.each do |arg|
+       class_eval %(
+         class << self; attr_accessor :#{arg} end
+       )
+     end
+     @cattr_inheritable_attrs
+   end
+
+   def inherited(subclass)
+     @cattr_inheritable_attrs.each do |inheritable_attribute|
+       instance_var = "@#{inheritable_attribute}"
+       subclass.instance_variable_set(instance_var, instance_variable_get(instance_var))
+     end
+   end
+ end
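To see what this module buys you, here is a small standalone sketch of the same technique. `InheritableSketch`, `Base`, `Child` and `Override` are hypothetical names; the `super` call and the nil guard in `inherited` are additions that the module above does not have.

```ruby
# Restates the idea above so the snippet runs on its own; names are hypothetical.
module InheritableSketch
  def cattr_inheritable(*names)
    # The list seeds itself so that it is copied to subclasses too.
    @cattr_inheritable_attrs ||= [:cattr_inheritable_attrs]
    @cattr_inheritable_attrs += names
    names.each { |name| singleton_class.send(:attr_accessor, name) }
    @cattr_inheritable_attrs
  end

  def inherited(subclass)
    super
    # Copy each class-level instance variable to the new subclass.
    (@cattr_inheritable_attrs || []).each do |attr|
      ivar = "@#{attr}"
      subclass.instance_variable_set(ivar, instance_variable_get(ivar))
    end
  end
end

class Base
  extend InheritableSketch
  cattr_inheritable :valid_keys
  self.valid_keys = %w(street city)
end

class Child < Base; end

class Override < Base
  self.valid_keys = %w(street city province)
end

p Child.valid_keys
p Override.valid_keys
p Base.valid_keys
```

One thing to keep in mind: the value is copied by reference, so `Child.valid_keys` and `Base.valid_keys` are the same array until a subclass assigns a new one. Mutating the array in place would affect both classes.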
data/lib/address_standardization/google_maps.rb ADDED
@@ -0,0 +1,59 @@
+ require 'rubygems'
+ require 'hpricot'
+
+ module AddressStandardization
+   module GoogleMaps
+     class << self
+       attr_accessor :api_key
+     end
+
+     class Address < AbstractAddress
+       self.valid_keys = %w(street city state province postalcode zip country full_address precision)
+
+       class << self
+         # much of this code was borrowed from GeoKit, thanks...
+         def standardize(address_info)
+           raise "API key not specified.\nCall AddressStandardization::GoogleMaps.api_key = '...' before you call .standardize()." unless GoogleMaps.api_key
+
+           address_str = "%s, %s, %s %s" % [
+             address_info[:street],
+             address_info[:city],
+             (address_info[:state] || address_info[:province]),
+             address_info[:zip]
+           ]
+           url = "http://maps.google.com/maps/geo?q=#{address_str.url_escape}&output=xml&key=#{GoogleMaps.api_key}&oe=utf-8"
+           # puts url
+           uri = URI.parse(url)
+           res = Net::HTTP.get_response(uri)
+           unless res.is_a?(Net::HTTPSuccess)
+             #File.open("test.xml", "w") {|f| f.write("(no response or response was unsuccessful)") }
+             return nil
+           end
+           xml = res.body
+           #File.open("test.xml", "w") {|f| f.write(xml) }
+           xml = Hpricot::XML(xml)
+
+           if xml.at("//kml/Response/Status/code").inner_text == "200"
+             addr = {}
+
+             addr[:street] = get_inner_text(xml, '//ThoroughfareName')
+             addr[:city] = get_inner_text(xml, '//LocalityName')
+             addr[:province] = addr[:state] = get_inner_text(xml, '//AdministrativeAreaName')
+             addr[:zip] = addr[:postalcode] = get_inner_text(xml, '//PostalCodeNumber')
+             addr[:country] = get_inner_text(xml, '//CountryName')
+
+             new(addr)
+           else
+             #File.open("test.xml", "w") {|f| f.write("(no response or response was unsuccessful)") }
+             nil
+           end
+         end
+
+         private
+         def get_inner_text(xml, xpath)
+           lambda {|x| x && x.inner_text.upcase }.call(xml.at(xpath))
+         end
+       end
+     end # Address
+   end # GoogleMaps
+ end # AddressStandardization
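Two things are easy to miss here: `GoogleMaps.api_key` has to be set before `.standardize` is called, and the format string leaves a trailing space when no zip is given. As a rough sketch of the URL-building step only (no request is made), the helper below uses the standard library instead of `String#url_escape`. `geocode_uri` and the `"YOUR_KEY"` placeholder are hypothetical and not part of the gem.

```ruby
require 'uri'

# Hypothetical helper, not part of the gem: joins the address parts and
# builds the query string for the geocoder URL used in the source above.
def geocode_uri(address_info, api_key)
  region = address_info[:state] || address_info[:province]
  # Drop nil or empty parts so a missing zip does not leave a trailing space.
  last_line = [region, address_info[:zip]].compact.reject(&:empty?).join(" ")
  parts = [address_info[:street], address_info[:city], last_line]
  query = parts.compact.reject(&:empty?).join(", ")
  params = URI.encode_www_form(
    "q" => query, "output" => "xml", "key" => api_key, "oe" => "utf-8"
  )
  URI.parse("http://maps.google.com/maps/geo?#{params}")
end

uri = geocode_uri(
  { :street => "1600 Amphitheatre Parkway", :city => "Mountain View", :state => "CA" },
  "YOUR_KEY"
)
puts uri.query
```

`URI.encode_www_form` escapes each parameter value, so the key and the address are both safe to interpolate into the URL.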
data/lib/address_standardization/melissa_data.rb ADDED
@@ -0,0 +1,71 @@
+ require 'rubygems'
+ require 'mechanize'
+
+ module AddressStandardization
+   class MelissaData
+     class BaseAddress < AbstractAddress
+       cattr_inheritable :start_url
+
+       def initialize(address_info)
+         raise NotImplementedError, "You must define start_url" unless self.class.start_url
+         super(address_info)
+       end
+
+       class << self
+         protected
+         def standardize(address_info, action, attrs_to_fields)
+           is_canada = (action =~ /Canadian/)
+           addr = new(address_info)
+           fields = nil
+           WWW::Mechanize.new do |ua|
+             form_page = ua.get(start_url)
+             form = form_page.form_with(:action => action) do |form|
+               attrs_to_fields.each do |attr, field|
+                 form[field] = addr.send(attr)
+               end
+             end
+             results_page = form.submit(form.buttons.first)
+
+             table = results_page.search("table.Tableresultborder")[1]
+             return unless table
+             status_row = table.at("span.Titresultableok")
+             return unless status_row && status_row.inner_text =~ /Address Verified/
+             main_td = table.search("tr:eq(#{is_canada ? 2 : 3})/td:eq(2)")
+             street_part, city_state_zip_part = main_td.inner_html.split("<br>")[0..1]
+             street = street_part.strip_html.strip_whitespace
+             city, state, zip = city_state_zip_part.strip_html.split("\240\240")
+             #pp :main_td => main_td.to_s,
+             #   :street_part => street_part,
+             #   :city_state_zip_part => city_state_zip_part
+             fields = [ street.upcase, city.upcase, state.upcase, zip.upcase ]
+           end
+           fields
+         end
+       end
+     end
+
+     class USAddress < BaseAddress
+       self.start_url = 'http://www.melissadata.com/lookups/AddressVerify.asp'
+       self.valid_keys = %w(street city state zip)
+
+       def self.standardize(address_info)
+         if fields = super(address_info, "AddressVerify.asp", :street => 'Address', :city => 'city', :state => 'state', :zip => 'zip')
+           street, city, state, zip = fields
+           new(:street => street, :city => city, :state => state, :zip => zip)
+         end
+       end
+     end
+
+     class CanadianAddress < BaseAddress
+       self.start_url = 'http://www.melissadata.com/lookups/CanadianAddressVerify.asp'
+       self.valid_keys = %w(street city province postalcode)
+
+       def self.standardize(address_info)
+         if fields = super(address_info, "CanadianAddressVerify.asp", :street => 'Street', :city => 'city', :province => 'Province', :postalcode => 'Postcode')
+           street, city, province, postalcode = fields
+           new(:street => street, :city => city, :province => province, :postalcode => postalcode)
+         end
+       end
+     end
+   end
+ end
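The scraping code splits the city, state and zip on `"\240\240"`, which reads as a pair of non-breaking spaces in a single-byte encoding. The sketch below shows that one step on its own, assuming UTF-8 input where the same character is `"\u00A0"`. `split_city_line` is a hypothetical helper, and returning `nil` for a malformed line is a choice made here, not the gem's behaviour.

```ruby
# Hypothetical helper, not part of the gem.
# Assumes the three fields are separated by two or more non-breaking spaces.
def split_city_line(text)
  city, region, code = text.split(/\u00A0{2,}/).map { |part| part.strip.upcase }
  # Return nil instead of raising when a field is missing.
  return nil unless city && region && code
  [city, region, code]
end

line = "Cupertino\u00A0\u00A0CA\u00A0\u00A095014-2083"
p split_city_line(line)
p split_city_line("Cupertino CA")
```

A line that does not contain the separator comes back as `nil`, so the caller can treat it like an address that was not found.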
data/lib/address_standardization/ruby_ext.rb ADDED
@@ -0,0 +1,17 @@
+ class String
+   def strip_html
+     gsub(/<\/?([^>]+)>/, '')
+   end
+   def strip_newlines
+     gsub(/[\r\n]+/, '')
+   end
+   def strip_whitespace
+     strip_newlines.squeeze(" ").strip
+   end
+
+   def url_escape
+     gsub(/([^ a-zA-Z0-9_.-]+)/n) do
+       '%' + $1.unpack('H2' * $1.size).join('%').upcase
+     end.tr(' ', '+')
+   end
+ end
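If you would rather not reopen `String`, the same helpers can be written as plain module functions. This is only a sketch: `TextSketch` is a hypothetical name, `url_escape` here delegates to the standard library's `URI.encode_www_form_component`, and `strip_whitespace` deliberately replaces line breaks with a space instead of deleting them, which differs from `strip_newlines` above.

```ruby
require 'uri'

# Hypothetical module, not part of the gem: the helpers above as plain
# functions instead of methods added to String.
module TextSketch
  module_function

  # Remove anything that looks like an HTML tag.
  def strip_html(text)
    text.gsub(/<\/?[^>]+>/, '')
  end

  # Turn line breaks into spaces (so adjacent words are not fused),
  # collapse runs of spaces, and trim the ends.
  def strip_whitespace(text)
    text.gsub(/[\r\n]+/, ' ').squeeze(' ').strip
  end

  # Form-style escaping: spaces become "+", other unsafe bytes become %XX.
  def url_escape(text)
    URI.encode_www_form_component(text)
  end
end

puts TextSketch.strip_html("<b>1 Infinite Loop</b><br>")
puts TextSketch.strip_whitespace("1 Infinite\nLoop  ")
puts TextSketch.url_escape("Sault Ste Marie, ON")
```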
data/test/google_maps_test.rb ADDED
@@ -0,0 +1,48 @@
+ require 'test_helper'
+
+ AddressStandardization::GoogleMaps.api_key = "ABQIAAAALHg3jKnK9wN9K3_ArJA6TxSTZ2OgdK08l2h0_gdsozNQ-6zpaxQvIY84J7Mh1fAHQrYGI4W27qKZaw"
+
+ class GoogleMapsTest < Test::Unit::TestCase
+   test "A valid US address" do
+     addr = AddressStandardization::GoogleMaps::Address.standardize(
+       :street => "1600 Amphitheatre Parkway",
+       :city => "Mountain View",
+       :state => "CA"
+     )
+     addr.should == AddressStandardization::GoogleMaps::Address.new(
+       "street" => "1600 AMPHITHEATRE PKWY",
+       "city" => "MOUNTAIN VIEW",
+       "state" => "CA",
+       "province" => "CA",
+       "postalcode" => "94043",
+       "zip" => "94043",
+       "country" => "USA"
+     )
+   end
+
+   test "A valid Canadian address" do
+     addr = AddressStandardization::GoogleMaps::Address.standardize(
+       :street => "1770 Stenson Boulevard",
+       :city => "Peterborough",
+       :province => "ON"
+     )
+     addr.should == AddressStandardization::GoogleMaps::Address.new(
+       "street" => "1770 STENSON BLVD",
+       "city" => "PETERBOROUGH",
+       "state" => "ON",
+       "province" => "ON",
+       "postalcode" => "K9K",
+       "zip" => "K9K",
+       "country" => "CANADA"
+     )
+   end
+
+   test "An invalid address" do
+     addr = AddressStandardization::GoogleMaps::Address.standardize(
+       :street => "123 Imaginary Lane",
+       :city => "Some Town",
+       :state => "AK"
+     )
+     addr.should == nil
+   end
+ end
data/test/melissa_data_test.rb ADDED
@@ -0,0 +1,49 @@
+ require 'test_helper'
+
+ class MelissaDataTest < Test::Unit::TestCase
+   test "Valid US address" do
+     addr = AddressStandardization::MelissaData::USAddress.standardize(
+       :street => "1 Infinite Loop",
+       :city => "Cupertino",
+       :state => "CA"
+     )
+     addr.should == AddressStandardization::MelissaData::USAddress.new(
+       "street" => "1 INFINITE LOOP",
+       "city" => "CUPERTINO",
+       "state" => "CA",
+       "zip" => "95014-2083"
+     )
+   end
+
+   test "Invalid US address" do
+     addr = AddressStandardization::MelissaData::USAddress.standardize(
+       :street => "123 Imaginary Lane",
+       :city => "Some Town",
+       :state => "AK"
+     )
+     addr.should == nil
+   end
+
+   test "Valid Canadian address" do
+     addr = AddressStandardization::MelissaData::CanadianAddress.standardize(
+       :street => "3025 Clayhill Rd",
+       :city => "Mississauga",
+       :province => "ON"
+     )
+     addr.should == AddressStandardization::MelissaData::CanadianAddress.new(
+       "street" => "3025 CLAYHILL RD",
+       "province" => "ON",
+       "city" => "MISSISSAUGA",
+       "postalcode" => "L5B 4L2"
+     )
+   end
+
+   test "Invalid Canadian address" do
+     addr = AddressStandardization::MelissaData::CanadianAddress.standardize(
+       :street => "123 Imaginary Lane",
+       :city => "Some Town",
+       :province => "BC"
+     )
+     addr.should == nil
+   end
+ end
data/test/test_helper.rb ADDED
@@ -0,0 +1,12 @@
+ # http://sneaq.net/textmate-wtf
+ #$LOAD_PATH.reject! { |e| e.include? 'TextMate' }
+
+ #dir = File.dirname(__FILE__)
+ #lib = dir + "/../lib"
+ #$LOAD_PATH.unshift(lib)
+
+ require 'rubygems'
+ require 'context'
+ require 'matchy'
+
+ require 'address_standardization'
metadata ADDED
@@ -0,0 +1,110 @@
+ --- !ruby/object:Gem::Specification
+ name: address_standardization
+ version: !ruby/object:Gem::Version
+   version: 0.3.0
+ platform: ruby
+ authors:
+ - Elliot Winkler
+ autorequire:
+ bindir: bin
+ cert_chain: []
+
+ date: 2010-01-01 00:00:00 -06:00
+ default_executable:
+ dependencies:
+ - !ruby/object:Gem::Dependency
+   name: mechanize
+   type: :runtime
+   version_requirement:
+   version_requirements: !ruby/object:Gem::Requirement
+     requirements:
+     - - ">="
+       - !ruby/object:Gem::Version
+         version: "0"
+     version:
+ - !ruby/object:Gem::Dependency
+   name: hpricot
+   type: :runtime
+   version_requirement:
+   version_requirements: !ruby/object:Gem::Requirement
+     requirements:
+     - - ">="
+       - !ruby/object:Gem::Version
+         version: "0"
+     version:
+ - !ruby/object:Gem::Dependency
+   name: mcmire-contest
+   type: :development
+   version_requirement:
+   version_requirements: !ruby/object:Gem::Requirement
+     requirements:
+     - - ">="
+       - !ruby/object:Gem::Version
+         version: "0"
+     version:
+ - !ruby/object:Gem::Dependency
+   name: mcmire-matchy
+   type: :development
+   version_requirement:
+   version_requirements: !ruby/object:Gem::Requirement
+     requirements:
+     - - ">="
+       - !ruby/object:Gem::Version
+         version: "0"
+     version:
+ description: A tiny Ruby library to quickly standardize a postal address
+ email: elliot.winkler@gmail.com
+ executables: []
+
+ extensions: []
+
+ extra_rdoc_files:
+ - README.rdoc
+ - TODO
+ files:
+ - .gitignore
+ - README.rdoc
+ - Rakefile
+ - TODO
+ - VERSION
+ - lib/address_standardization.rb
+ - lib/address_standardization/abstract_address.rb
+ - lib/address_standardization/class_level_inheritable_attributes.rb
+ - lib/address_standardization/google_maps.rb
+ - lib/address_standardization/melissa_data.rb
+ - lib/address_standardization/ruby_ext.rb
+ - test/google_maps_test.rb
+ - test/melissa_data_test.rb
+ - test/test_helper.rb
+ has_rdoc: true
+ homepage: http://github.com/mcmire/address_standardization
+ licenses: []
+
+ post_install_message:
+ rdoc_options:
+ - --charset=UTF-8
+ require_paths:
+ - lib
+ required_ruby_version: !ruby/object:Gem::Requirement
+   requirements:
+   - - ">="
+     - !ruby/object:Gem::Version
+       version: "0"
+   version:
+ required_rubygems_version: !ruby/object:Gem::Requirement
+   requirements:
+   - - ">="
+     - !ruby/object:Gem::Version
+       version: "0"
+   version:
+ requirements: []
+
+ rubyforge_project:
+ rubygems_version: 1.3.5
+ signing_key:
+ specification_version: 3
+ summary: A tiny Ruby library to quickly standardize a postal address
+ test_files:
+ - test/google_maps_test.rb
+ - test/melissa_data_test.rb
+ - test/test_helper.rb