RubyGems - fuzzy_match - Versions diffs - 1.1.1 → 1.2.1 - Mend

fuzzy_match 1.1.1 → 1.2.1


Files changed (22)

data/.gitignore +3 -1
data/README.markdown +124 -0
data/Rakefile +5 -8
data/benchmark/before-with-free.txt +25 -25
data/benchmark/before-without-last-result.txt +31 -31
data/benchmark/before.txt +29 -29
data/benchmark/memory.rb +3 -4
data/examples/bts_aircraft/{tighteners.csv → normalizers.csv} +0 -0
data/examples/bts_aircraft/test_bts_aircraft.rb +3 -3
data/lib/fuzzy_match/blocking.rb +1 -1
data/lib/fuzzy_match/identity.rb +1 -1
data/lib/fuzzy_match/{tightener.rb → normalizer.rb} +5 -5
data/lib/fuzzy_match/result.rb +1 -1
data/lib/fuzzy_match/version.rb +1 -1
data/lib/fuzzy_match/wrapper.rb +3 -3
data/lib/fuzzy_match.rb +30 -45
data/test/test_blocking.rb +5 -0
data/test/test_fuzzy_match.rb +40 -42
data/test/test_identity.rb +5 -0
data/test/{test_tightening.rb → test_normalizer.rb} +2 -2
metadata +26 -25
data/README.rdoc +0 -94

data/lib/fuzzy_match/version.rb CHANGED

@@ -1,3 +1,3 @@
 class FuzzyMatch
-  VERSION = '1.1.1'
+  VERSION = '1.2.1'
 end

data/lib/fuzzy_match/wrapper.rb CHANGED

@@ -58,9 +58,9 @@ class FuzzyMatch
     end
     def variants
-      @variants ||= fuzzy_match.tighteners.inject([ render ]) do |memo, tightener|
-        if tightener.apply? render
-          memo.push tightener.apply(render)
+      @variants ||= fuzzy_match.normalizers.inject([ render ]) do |memo, normalizer|
+        if normalizer.apply? render
+          memo.push normalizer.apply(render)
         end
         memo
       end.uniq
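
The renamed `variants` method can be tried in isolation with the sketch below. `SketchNormalizer` is a hypothetical stand-in, not the gem's `FuzzyMatch::Normalizer` (whose body is not shown in this diff); it assumes that applying a normalizer replaces a matching string with its joined capture groups, which is consistent with the expectations in `test_normalizer.rb`.

```ruby
# Hypothetical stand-in for FuzzyMatch::Normalizer; the real class body is
# not part of this diff, so the behavior below is an assumption.
class SketchNormalizer
  attr_reader :regexp

  def initialize(regexp)
    @regexp = regexp
  end

  # True when the regexp matches the string at all.
  def apply?(str)
    !!(regexp =~ str)
  end

  # Assumption: a matching string is replaced by its joined capture groups;
  # a non-matching string is returned unchanged.
  def apply(str)
    match = regexp.match(str)
    match ? match.captures.join : str
  end
end

# Mirrors the inject in Wrapper#variants: start from the rendered string,
# append one normalized form per matching normalizer, then drop duplicates.
def variants(render, normalizers)
  normalizers.inject([render]) do |memo, normalizer|
    memo.push normalizer.apply(render) if normalizer.apply?(render)
    memo
  end.uniq
end

ford = SketchNormalizer.new(/(Ford )[ ]*(F)[\- ]*(\d\d\d)/i)
p variants('Ford F-350', [ford])
p variants('Ford F150', [ford])
p variants('Honda Civic', [ford])
```

Because of the trailing `uniq`, a string that is already in normalized form yields a single variant rather than two identical ones.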

data/lib/fuzzy_match.rb CHANGED

@@ -5,18 +5,21 @@ if ::ActiveSupport::VERSION::MAJOR >= 3
 end
 require 'to_regexp'
+require 'fuzzy_match/normalizer'
+require 'fuzzy_match/stop_word'
+require 'fuzzy_match/blocking'
+require 'fuzzy_match/identity'
+require 'fuzzy_match/result'
+require 'fuzzy_match/wrapper'
+require 'fuzzy_match/similarity'
+require 'fuzzy_match/score'
+if defined?(::ActiveRecord)
+  require 'fuzzy_match/cached_result'
+end
 # See the README for more information.
 class FuzzyMatch
-  autoload :Tightener, 'fuzzy_match/tightener'
-  autoload :StopWord, 'fuzzy_match/stop_word'
-  autoload :Blocking, 'fuzzy_match/blocking'
-  autoload :Identity, 'fuzzy_match/identity'
-  autoload :Result, 'fuzzy_match/result'
-  autoload :Wrapper, 'fuzzy_match/wrapper'
-  autoload :Similarity, 'fuzzy_match/similarity'
-  autoload :Score, 'fuzzy_match/score'
-  autoload :CachedResult, 'fuzzy_match/cached_result'
   DEFAULT_OPTIONS = {
     :first_blocking_decides => false,
     :must_match_blocking => false,
@@ -28,33 +31,32 @@ class FuzzyMatch
   attr_reader :haystack
   attr_reader :blockings
   attr_reader :identities
-  attr_reader :tighteners
+  attr_reader :normalizers
   attr_reader :stop_words
   attr_reader :read
   attr_reader :default_options
   # haystack - a bunch of records that will compete to see who best matches the needle
   #
-  # rules (can only be specified at initialization or by using a setter)
-  # * tighteners: regexps (see readme)
-  # * identities: regexps
-  # * blockings: regexps
-  # * stop_words: regexps
-  # * read: how to interpret each entry in the 'haystack', either a Proc or a symbol
+  # Rules (can only be specified at initialization or by using a setter)
+  # * :<tt>normalizers</tt> - regexps (see README)
+  # * :<tt>identities</tt> - regexps
+  # * :<tt>blockings</tt> - regexps
+  # * :<tt>stop_words</tt> - regexps
   #
-  # options (can be specified at initialization or when calling #find)
-  # * first_blocking_decides
-  # * must_match_blocking
-  # * must_match_at_least_one_word
-  # * gather_last_result
-  # * find_all
+  # Options (can be specified at initialization or when calling #find)
+  # * :<tt>read</tt> - how to interpret each record in the 'haystack', either a Proc or a symbol
+  # * :<tt>must_match_blocking</tt> - don't return a match unless the needle fits into one of the blockings you specified
+  # * :<tt>must_match_at_least_one_word</tt> - don't return a match unless the needle shares at least one word with the match
+  # * :<tt>first_blocking_decides</tt> - force records into the first blocking they match, rather than choosing a blocking that will give them a higher score
+  # * :<tt>gather_last_result</tt> - enable <tt>last_result</tt>
   def initialize(competitors, options_and_rules = {})
     options_and_rules = options_and_rules.symbolize_keys
     # rules
     self.blockings = options_and_rules.delete(:blockings) || []
     self.identities = options_and_rules.delete(:identities) || []
-    self.tighteners = options_and_rules.delete(:tighteners) || []
+    self.normalizers = options_and_rules.delete(:normalizers) || options_and_rules.delete(:tighteners) || []
     self.stop_words = options_and_rules.delete(:stop_words) || []
     @read = options_and_rules.delete(:read) || options_and_rules.delete(:haystack_reader)
@@ -73,8 +75,8 @@ class FuzzyMatch
     @identities = ary.map { |regexp_or_str| Identity.new regexp_or_str }
   end
-  def tighteners=(ary)
-    @tighteners = ary.map { |regexp_or_str| Tightener.new regexp_or_str }
+  def normalizers=(ary)
+    @normalizers = ary.map { |regexp_or_str| Normalizer.new regexp_or_str }
   end
   def stop_words=(ary)
@@ -95,8 +97,6 @@ class FuzzyMatch
   end
   def find(needle, options = {})
-    raise ::RuntimeError, "[fuzzy_match] Dictionary has already been freed, can't perform more finds" if freed?
     options = options.symbolize_keys.reverse_merge default_options
     gather_last_result = options[:gather_last_result]
@@ -106,7 +106,6 @@ class FuzzyMatch
     must_match_at_least_one_word = options[:must_match_at_least_one_word]
     if gather_last_result
-      free_last_result
       @last_result = Result.new
       last_result.read = read
       last_result.haystack = haystack
@@ -118,7 +117,7 @@ EOS
     end
     if gather_last_result
-      last_result.tighteners = tighteners
+      last_result.normalizers = normalizers
       last_result.identities = identities
       last_result.blockings = blockings
       last_result.stop_words = stop_words
@@ -263,21 +262,7 @@ EOS
     last_result.explain
   end
-  def freed?
-    @freed == true
-  end
+  # DEPRECATED - doesn't do anything
   def free
-    free_last_result
-    @haystack.try :clear
-    @haystack = nil
-  ensure
-    @freed = true
-  end
-  private
-  def free_last_result
-    @last_result = nil
   end
 end
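
The constructor change keeps the old `:tighteners` key working as a fallback for `:normalizers`. The sketch below isolates that `delete || delete || []` chain in a hypothetical helper, `extract_normalizers`, which does not exist in the gem (the gem performs the chain inline in `#initialize`). It uses only plain `Hash#delete`, so it runs without ActiveSupport.

```ruby
# Hypothetical helper; the gem does this inline in FuzzyMatch#initialize.
def extract_normalizers(options_and_rules)
  options_and_rules.delete(:normalizers) ||
    options_and_rules.delete(:tighteners) ||
    []
end

# A caller still using the 1.1.1 option name.
legacy = { :tighteners => [/a/], :read => :name }
p extract_normalizers(legacy)
p legacy

# A caller passing both names: the new key wins.
both = { :normalizers => [/new/], :tighteners => [/old/] }
p extract_normalizers(both)
p both.key?(:tighteners)

# Neither key given: an empty rule list.
p extract_normalizers({})
```

One consequence of the short-circuiting `||` is that when both keys are supplied, `:tighteners` is never deleted and stays in the hash for whatever code reads the remaining options afterwards.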

data/test/test_blocking.rb CHANGED

@@ -20,4 +20,9 @@ class TestBlocking < Test::Unit::TestCase
     b = FuzzyMatch::Blocking.new %r{apple}
     assert_equal nil, b.join?('orange', 'orange')
   end
+  def test_004_accepts_case_insensitivity
+    b = FuzzyMatch::Blocking.new %r{apple}i
+    assert_equal true, b.match?('2 Apples')
+  end
 end

data/test/test_fuzzy_match.rb CHANGED

@@ -6,12 +6,12 @@ class TestFuzzyMatch < Test::Unit::TestCase
     d = FuzzyMatch.new %w{ RATZ CATZ }
     assert_equal 'RATZ', d.find('RITZ')
     assert_equal 'RATZ', d.find('RíTZ')
     d = FuzzyMatch.new [ 'X' ]
     assert_equal 'X', d.find('X')
     assert_equal nil, d.find('A')
   end
   def test_002_dont_gather_last_result_by_default
     d = FuzzyMatch.new %w{ NISSAN HONDA }
     d.find('MISSAM')
@@ -19,88 +19,86 @@ class TestFuzzyMatch < Test::Unit::TestCase
       d.last_result
     end
   end
   def test_003_last_result
     d = FuzzyMatch.new %w{ NISSAN HONDA }
     d.find 'MISSAM', :gather_last_result => true
     assert_equal 0.6, d.last_result.score
     assert_equal 'NISSAN', d.last_result.winner
   end
-  def test_004_false_positive_without_tightener
+  def test_005_correct_with_normalizer
     d = FuzzyMatch.new ['BOEING 737-100/200', 'BOEING 737-900']
-    assert_equal 'BOEING 737-900', d.find('BOEING 737100 number 900')
-  end
-  def test_005_correct_with_tightener
-    tighteners = [
+    assert_equal 'BOEING 737-900', d.find('BOEING 737100 number 900') # false positive without normalizer
+    normalizers = [
       %r{(7\d)(7|0)-?(\d{1,3})} # tighten 737-100/200 => 737100, which will cause it to win over 737-900
     ]
-    d = FuzzyMatch.new ['BOEING 737-100/200', 'BOEING 737-900'], :tighteners => tighteners
+    d = FuzzyMatch.new ['BOEING 737-100/200', 'BOEING 737-900'], :normalizers => normalizers
     assert_equal 'BOEING 737-100/200', d.find('BOEING 737100 number 900')
   end
   def test_008_false_positive_without_identity
     d = FuzzyMatch.new %w{ foo bar }
     assert_equal 'bar', d.find('baz')
   end
   def test_008_identify_false_positive
     d = FuzzyMatch.new %w{ foo bar }, :identities => [ /ba(.)/ ]
     assert_equal nil, d.find('baz')
   end
   # TODO this is not very helpful
   def test_009_blocking
     d = FuzzyMatch.new [ 'X' ], :blockings => [ /X/, /Y/ ]
     assert_equal 'X', d.find('X')
     assert_equal nil, d.find('A')
   end
   # TODO this is not very helpful
   def test_0095_must_match_blocking
     d = FuzzyMatch.new [ 'X' ], :blockings => [ /X/, /Y/ ], :must_match_blocking => true
     assert_equal 'X', d.find('X')
     assert_equal nil, d.find('A')
     d = FuzzyMatch.new [ 'X' ], :blockings => [ /X/, /Y/ ]
     assert_equal 'X', d.find('X', :must_match_blocking => true)
     assert_equal nil, d.find('A', :must_match_blocking => true)
   end
-  def test_011_free
-    d = FuzzyMatch.new %w{ NISSAN HONDA }
-    d.free
-    assert_raises(::RuntimeError, /free/) do
-      d.find('foobar')
+  def test_011_free_does_nothing
+    d = FuzzyMatch.new %w{ A B }
+    assert_nothing_raised do
+      d.free
+      d.find 'A'
     end
   end
   def test_012_find_all
     d = FuzzyMatch.new [ 'X', 'X22', 'Y', 'Y4' ], :blockings => [ /X/, /Y/ ], :must_match_blocking => true
     assert_equal ['X', 'X22' ], d.find_all('X')
     assert_equal [], d.find_all('A')
   end
   def test_013_first_blocking_decides
     d = FuzzyMatch.new [ 'Boeing 747', 'Boeing 747SR', 'Boeing ER6' ], :blockings => [ /(boeing \d{3})/i, /boeing/i ]
     assert_equal [ 'Boeing 747', 'Boeing 747SR', 'Boeing ER6' ], d.find_all('Boeing 747')
     d = FuzzyMatch.new [ 'Boeing 747', 'Boeing 747SR', 'Boeing ER6' ], :blockings => [ /(boeing \d{3})/i, /boeing/i ], :first_blocking_decides => true
     assert_equal [ 'Boeing 747', 'Boeing 747SR' ], d.find_all('Boeing 747')
     # first_blocking_decides refers to the needle
     d = FuzzyMatch.new [ 'Boeing 747', 'Boeing 747SR', 'Boeing ER6' ], :blockings => [ /(boeing \d{3})/i, /boeing/i ], :first_blocking_decides => true
     assert_equal ["Boeing ER6", "Boeing 747", "Boeing 747SR"], d.find_all('Boeing ER6')
     d = FuzzyMatch.new [ 'Boeing 747', 'Boeing 747SR', 'Boeing ER6' ], :blockings => [ /(boeing \d{3})/i, /boeing (7|E)/i, /boeing/i ], :first_blocking_decides => true
     assert_equal [ 'Boeing ER6' ], d.find_all('Boeing ER6')
     # or equivalently with an identity
     d = FuzzyMatch.new [ 'Boeing 747', 'Boeing 747SR', 'Boeing ER6' ], :blockings => [ /(boeing \d{3})/i, /boeing/i ], :first_blocking_decides => true, :identities => [ /boeing (7|E)/i ]
     assert_equal [ 'Boeing ER6' ], d.find_all('Boeing ER6')
   end
   MyStruct = Struct.new(:one, :two)
   def test_014_symbol_read_sends_method
     ab = MyStruct.new('a', 'b')
@@ -115,7 +113,7 @@ class TestFuzzyMatch < Test::Unit::TestCase
     assert_equal ba, by_first.find('b')
     assert_equal ba, by_last.find('a')
   end
   def test_015_symbol_read_reads_array
     ab = ['a', 'b']
     ba = ['b', 'a']
@@ -127,7 +125,7 @@ class TestFuzzyMatch < Test::Unit::TestCase
     assert_equal ba, by_first.find('b')
     assert_equal ba, by_last.find('a')
   end
   def test_016_symbol_read_reads_hash
     ab = { :one => 'a', :two => 'b' }
     ba = { :one => 'b', :two => 'a' }
@@ -139,7 +137,7 @@ class TestFuzzyMatch < Test::Unit::TestCase
     assert_equal ba, by_first.find('b')
     assert_equal ba, by_last.find('a')
   end
   def test_017_understands_haystack_reader_option
     ab = ['a', 'b']
     ba = ['b', 'a']
@@ -148,31 +146,31 @@ class TestFuzzyMatch < Test::Unit::TestCase
     assert_equal ab, by_first.find('a')
     assert_equal ba, by_first.find('b')
   end
   def test_018_no_result_if_best_score_is_zero
     assert_equal nil, FuzzyMatch.new(['a']).find('b')
   end
   def test_019_must_match_at_least_one_word
     d = FuzzyMatch.new %w{ RATZ CATZ }, :must_match_at_least_one_word => true
     assert_equal nil, d.find('RITZ')
     d = FuzzyMatch.new ["Foo's Bar"], :must_match_at_least_one_word => true
     assert_equal nil, d.find("Jacob's")
     assert_equal "Foo's Bar", d.find("Foo's")
   end
   def test_020_stop_words
     d = FuzzyMatch.new [ 'A HOTEL', 'B HTL' ]
     assert_equal 'B HTL', d.find('A HTL', :must_match_at_least_one_word => true)
     d = FuzzyMatch.new [ 'A HOTEL', 'B HTL' ], :must_match_at_least_one_word => true
     assert_equal 'B HTL', d.find('A HTL')
     d = FuzzyMatch.new [ 'A HOTEL', 'B HTL' ], :must_match_at_least_one_word => true, :stop_words => [ %r{HO?TE?L} ]
     assert_equal 'A HOTEL', d.find('A HTL')
   end
   def test_021_explain_prints_to_stdout
     require 'stringio'
     capture = StringIO.new
@@ -187,15 +185,15 @@ class TestFuzzyMatch < Test::Unit::TestCase
     capture.rewind
     assert capture.read.include?('CATZ')
   end
   def test_022_compare_words_with_words
     d = FuzzyMatch.new [ 'PENINSULA HOTELS' ], :must_match_at_least_one_word => true
     assert_equal nil, d.find('DOLCE LA HULPE BXL FI')
   end
   def test_023_must_match_at_least_one_word_is_case_insensitive
     d = FuzzyMatch.new [ 'A', 'B' ]
     assert_equal 'A', d.find('a', :must_match_at_least_one_word => true)
   end
 end

data/test/test_identity.rb CHANGED

@@ -30,4 +30,9 @@ class TestIdentity < Test::Unit::TestCase
     i = FuzzyMatch::Identity.new '/\A\\\?\/(.*)etc\/mysql\$$/'
     assert_equal %r{\A\\?/(.*)etc/mysql\$$}, i.regexp
   end
+  def test_007_accepts_case_insensitivity
+    i = FuzzyMatch::Identity.new %r{(A)[ ]*(\d)}i
+    assert_equal true, i.identical?('A1', 'a     1foobar')
+  end
 end
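
The new case-insensitivity test can be read against the following sketch of an identity rule. `SketchIdentity` is hypothetical: the one-line change to `identity.rb` is not shown in this diff, so the capture comparison, the case folding via `Regexp#casefold?`, and the `nil` return for non-matching strings are all assumptions chosen to agree with the tests and README text visible here.

```ruby
# Hypothetical stand-in for FuzzyMatch::Identity; behavior is assumed,
# not taken from the gem's source.
class SketchIdentity
  def initialize(regexp)
    @regexp = regexp
  end

  # true  - both strings match and their captures agree
  # false - both strings match but their captures differ
  # nil   - at least one string does not match, so the rule has no opinion
  def identical?(str1, str2)
    m1 = @regexp.match(str1)
    m2 = @regexp.match(str2)
    return nil unless m1 && m2
    a = m1.captures.join
    b = m2.captures.join
    # Assumption: a regexp with the /i flag compares captures ignoring case.
    @regexp.casefold? ? a.downcase == b.downcase : a == b
  end
end

insensitive = SketchIdentity.new(%r{(A)[ ]*(\d)}i)
p insensitive.identical?('A1', 'a     1foobar')

trucks = SketchIdentity.new(/(F)\-?(\d50)/)
p trucks.identical?('Ford F-150', 'Ford F-250')
p trucks.identical?('Ford F-150', 'Honda Civic')
```

The three-valued result lets a caller distinguish "these two records must not match" (`false`) from "this rule does not apply" (`nil`).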

data/test/{test_tightening.rb → test_normalizer.rb} RENAMED

@@ -1,8 +1,8 @@
 require 'helper'
-class TestTightener < Test::Unit::TestCase
+class TestNormalizer < Test::Unit::TestCase
   def test_001_apply
-    t = FuzzyMatch::Tightener.new %r{(Ford )[ ]*(F)[\- ]*(\d\d\d)}i
+    t = FuzzyMatch::Normalizer.new %r{(Ford )[ ]*(F)[\- ]*(\d\d\d)}i
     assert_equal 'Ford F350', t.apply('Ford F-350')
     assert_equal 'Ford F150', t.apply('Ford F150')
     assert_equal 'Ford F350', t.apply('Ford F 350')

metadata CHANGED

@@ -1,7 +1,7 @@
 --- !ruby/object:Gem::Specification
 name: fuzzy_match
 version: !ruby/object:Gem::Version
-  version: 1.1.1
+  version: 1.2.1
   prerelease:
 platform: ruby
 authors:
@@ -9,11 +9,11 @@ authors:
 autorequire:
 bindir: bin
 cert_chain: []
-date: 2012-01-17 00:00:00.000000000Z
+date: 2012-01-18 00:00:00.000000000Z
 dependencies:
 - !ruby/object:Gem::Dependency
   name: shoulda
-  requirement: &2153863620 !ruby/object:Gem::Requirement
+  requirement: &2177380220 !ruby/object:Gem::Requirement
     none: false
     requirements:
     - - ! '>='
@@ -21,10 +21,10 @@ dependencies:
         version: '0'
   type: :development
   prerelease: false
-  version_requirements: *2153863620
+  version_requirements: *2177380220
 - !ruby/object:Gem::Dependency
   name: remote_table
-  requirement: &2153862820 !ruby/object:Gem::Requirement
+  requirement: &2177379700 !ruby/object:Gem::Requirement
     none: false
     requirements:
     - - ! '>='
@@ -32,10 +32,10 @@ dependencies:
         version: '0'
   type: :development
   prerelease: false
-  version_requirements: *2153862820
+  version_requirements: *2177379700
 - !ruby/object:Gem::Dependency
   name: activerecord
-  requirement: &2153861940 !ruby/object:Gem::Requirement
+  requirement: &2177379100 !ruby/object:Gem::Requirement
     none: false
     requirements:
     - - ! '>='
@@ -43,10 +43,10 @@ dependencies:
         version: '3'
   type: :development
   prerelease: false
-  version_requirements: *2153861940
+  version_requirements: *2177379100
 - !ruby/object:Gem::Dependency
   name: mysql
-  requirement: &2153861380 !ruby/object:Gem::Requirement
+  requirement: &2177378440 !ruby/object:Gem::Requirement
     none: false
     requirements:
     - - ! '>='
@@ -54,10 +54,10 @@ dependencies:
         version: '0'
   type: :development
   prerelease: false
-  version_requirements: *2153861380
+  version_requirements: *2177378440
 - !ruby/object:Gem::Dependency
   name: cohort_scope
-  requirement: &2153860800 !ruby/object:Gem::Requirement
+  requirement: &2177377600 !ruby/object:Gem::Requirement
     none: false
     requirements:
     - - ! '>='
@@ -65,10 +65,10 @@ dependencies:
         version: '0'
   type: :development
   prerelease: false
-  version_requirements: *2153860800
+  version_requirements: *2177377600
 - !ruby/object:Gem::Dependency
   name: weighted_average
-  requirement: &2153860020 !ruby/object:Gem::Requirement
+  requirement: &2177377020 !ruby/object:Gem::Requirement
     none: false
     requirements:
     - - ! '>='
@@ -76,10 +76,10 @@ dependencies:
         version: '0'
   type: :development
   prerelease: false
-  version_requirements: *2153860020
+  version_requirements: *2177377020
 - !ruby/object:Gem::Dependency
   name: rake
-  requirement: &2153858540 !ruby/object:Gem::Requirement
+  requirement: &2177376420 !ruby/object:Gem::Requirement
     none: false
     requirements:
     - - ! '>='
@@ -87,10 +87,10 @@ dependencies:
         version: '0'
   type: :development
   prerelease: false
-  version_requirements: *2153858540
+  version_requirements: *2177376420
 - !ruby/object:Gem::Dependency
   name: activesupport
-  requirement: &2153857380 !ruby/object:Gem::Requirement
+  requirement: &2177375240 !ruby/object:Gem::Requirement
     none: false
     requirements:
     - - ! '>='
@@ -98,10 +98,10 @@ dependencies:
         version: '3'
   type: :runtime
   prerelease: false
-  version_requirements: *2153857380
+  version_requirements: *2177375240
 - !ruby/object:Gem::Dependency
   name: to_regexp
-  requirement: &2153856360 !ruby/object:Gem::Requirement
+  requirement: &2177374500 !ruby/object:Gem::Requirement
     none: false
     requirements:
     - - ! '>='
@@ -109,7 +109,7 @@ dependencies:
         version: 0.0.3
   type: :runtime
   prerelease: false
-  version_requirements: *2153856360
+  version_requirements: *2177374500
 description: Find a needle in a haystack using string similarity and (optionally)
   regexp rules. Replaces loose_tight_dictionary.
 email:
@@ -122,7 +122,7 @@ files:
 - .gitignore
 - Gemfile
 - LICENSE
-- README.rdoc
+- README.markdown
 - Rakefile
 - THANKS-WILLIAM-JAMES.rb
 - benchmark/before-with-free.txt
@@ -137,10 +137,10 @@ files:
 - examples/bts_aircraft/blockings.csv
 - examples/bts_aircraft/identities.csv
 - examples/bts_aircraft/negatives.csv
+- examples/bts_aircraft/normalizers.csv
 - examples/bts_aircraft/number_260.csv
 - examples/bts_aircraft/positives.csv
 - examples/bts_aircraft/test_bts_aircraft.rb
-- examples/bts_aircraft/tighteners.csv
 - examples/first_name_matching.rb
 - examples/icao-bts.xls
 - fuzzy_match.gemspec
@@ -148,11 +148,11 @@ files:
 - lib/fuzzy_match/blocking.rb
 - lib/fuzzy_match/cached_result.rb
 - lib/fuzzy_match/identity.rb
+- lib/fuzzy_match/normalizer.rb
 - lib/fuzzy_match/result.rb
 - lib/fuzzy_match/score.rb
 - lib/fuzzy_match/similarity.rb
 - lib/fuzzy_match/stop_word.rb
-- lib/fuzzy_match/tightener.rb
 - lib/fuzzy_match/version.rb
 - lib/fuzzy_match/wrapper.rb
 - test/helper.rb
@@ -161,7 +161,7 @@ files:
 - test/test_fuzzy_match.rb
 - test/test_fuzzy_match_convoluted.rb.disabled
 - test/test_identity.rb
-- test/test_tightening.rb
+- test/test_normalizer.rb
 homepage: https://github.com/seamusabshere/fuzzy_match
 licenses: []
 post_install_message:
@@ -194,4 +194,5 @@ test_files:
 - test/test_fuzzy_match.rb
 - test/test_fuzzy_match_convoluted.rb.disabled
 - test/test_identity.rb
-- test/test_tightening.rb
+- test/test_normalizer.rb
+has_rdoc:

data/README.rdoc DELETED

@@ -1,94 +0,0 @@
-= fuzzy_match
-Find a needle in a haystack based on string similarity (using the Pair Distance algorithm and Levenshtein distance) and regular expressions.
-Replaces {loose_tight_dictionary}[https://github.com/seamusabshere/loose_tight_dictionary] because that was a confusing name.
-== Quickstart
-    >> require 'fuzzy_match'
-    => true
-    >> FuzzyMatch.new(%w{seamus andy ben}).find('Shamus')
-    => "seamus"
-== String similarity matching
-Uses {Dice's Coefficient}[http://en.wikipedia.org/wiki/Dice's_coefficient] algorithm (aka Pair Distance).
-If that judges two strings to be be equally similar to a third string, then Levenshtein distance is used. For example, pair distance considers "RATZ" and "CATZ" to be equally similar to "RITZ" so we invoke Levenshtein.
-    >> require 'amatch'
-    => true
-    >> 'RITZ'.pair_distance_similar 'RATZ'
-    => 0.3333333333333333
-    >> 'RITZ'.pair_distance_similar 'CATZ'  # <-- pair distance can't tell the difference, so we fall back to levenshtein...
-    => 0.3333333333333333
-    >> 'RITZ'.levenshtein_similar 'RATZ'
-    => 0.75
-    >> 'RITZ'.levenshtein_similar 'CATZ'    # <-- which properly shows that RATZ should win
-    => 0.5
-== Production use
-Over 2 years in {Brighter Planet's environmental impact API}[http://impact.brighterplanet.com] and {reference data service}[http://data.brighterplanet.com].
-== Haystacks and how to read them
-The (admittedly imperfect) metaphor is "look for a needle in a haystack"
-* needle - the search term
-* haystack - the records you are searching (<b>your result will be an object from here</b>)
-So, what if your needle is a string like <tt>youruguay</tt> and your haystack is full of <tt>Country</tt> objects like <tt><Country name:"Uruguay"></tt>?
-    >> FuzzyMatch.new(countries, :read => :name).find('youruguay')
-    => <Country name:"Uruguay">
-== Regular expressions
-You can improve the default matchings with regular expressions.
-* Emphasize important words using <b>blockings</b> and <b>tighteners</b>
-* Filter out stop words with <b>tighteners</b>
-* Prevent impossible matches with <b>blockings</b> and <b>identities</b>
-* Ignore words with <b>stop words</b>
-=== Blockings
-Setting a blocking of <tt>/Airbus/</tt> ensures that strings containing "Airbus" will only be scored against to other strings containing "Airbus". A better blocking in this case would probably be <tt>/airbus/i</tt>.
-=== Tighteners
-Adding a tightener like <tt>/(boeing).*(7\d\d)/i</tt> will cause "BOEING COMPANY 747" and "boeing747" to be scored as if they were "BOEING 747" and "boeing 747", respectively. See also "Case sensitivity" below.
-=== Identities
-Adding an identity like <tt>/(F)\-?(\d50)/</tt> ensures that "Ford F-150" and "Ford F-250" never match.
-=== Stop words
-Adding a stop word like <tt>THE</tt> ensures that it is not taken into account when comparing "THE CAT", "THE DAT", and "THE CATT"
-== Case sensitivity
-Scoring is case-insensitive. Everything is downcased before scoring. This is a change from previous versions. Your regexps may still be case-sensitive, though.
-== Examples
-Check out the tests.
-== Speed (and who to thank for the algorithms)
-If you add the amatch[http://flori.github.com/amatch/] gem to your Gemfile, it will use that, which is much faster (but {segfaults have been seen in the wild}[https://github.com/flori/amatch/issues/3]). Thanks {Flori}[https://github.com/flori]!
-Otherwise, pure ruby versions of the string similarity algorithms derived from the {answer to a StackOverflow question}[http://stackoverflow.com/questions/653157/a-better-similarity-ranking-algorithm-for-variable-length-strings] and {the text gem}[https://github.com/threedaymonk/text/blob/master/lib/text/levenshtein.rb] are used. Thanks {marzagao}[http://stackoverflow.com/users/10997/marzagao] and {threedaymonk}[https://github.com/threedaymonk]!
-== Authors
-* Seamus Abshere <seamus@abshere.net>
-* Ian Hough <ijhough@gmail.com>
-* Andy Rossmeissl <andy@rossmeissl.net>
-== Copyright
-Copyright 2011 Brighter Planet, Inc.
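
The scoring described in the removed README (Dice's coefficient over character pairs, with Levenshtein similarity as a tie-breaker) can be approximated in pure Ruby as below. This is a simplified sketch, not the gem's or amatch's implementation: the method names are hypothetical, the pairs are taken over the whole downcased string without any word splitting, and `best_match` ignores blockings, identities, and stop words.

```ruby
# Adjacent character pairs of a downcased string, e.g. "RITZ" -> ["ri", "it", "tz"].
def bigrams(str)
  str.downcase.chars.to_a.each_cons(2).map(&:join)
end

# Dice's coefficient over character pairs: 2 * shared / (pairs_a + pairs_b).
def pair_similarity(a, b)
  pairs_a = bigrams(a)
  pairs_b = bigrams(b)
  total = pairs_a.length + pairs_b.length
  return 0.0 if total.zero?
  remaining = pairs_b.dup
  # Each pair in b may be consumed at most once.
  shared = pairs_a.count do |pair|
    idx = remaining.index(pair)
    idx && remaining.delete_at(idx)
  end
  2.0 * shared / total
end

# Classic dynamic-programming edit distance, keeping one previous row.
def levenshtein_distance(a, b)
  a = a.downcase
  b = b.downcase
  previous = (0..b.length).to_a
  a.each_char.with_index(1) do |char_a, i|
    current = [i]
    b.each_char.with_index(1) do |char_b, j|
      cost = char_a == char_b ? 0 : 1
      current << [previous[j] + 1, current[j - 1] + 1, previous[j - 1] + cost].min
    end
    previous = current
  end
  previous.last
end

# Similarity in 0.0..1.0: one minus distance over the longer length.
def levenshtein_similarity(a, b)
  longest = [a.length, b.length].max
  return 1.0 if longest.zero?
  1.0 - levenshtein_distance(a, b).to_f / longest
end

# Rank by pair similarity first; Levenshtein only breaks ties.
def best_match(needle, haystack)
  haystack.max_by do |record|
    [pair_similarity(needle, record), levenshtein_similarity(needle, record)]
  end
end

p pair_similarity('RITZ', 'RATZ')
p pair_similarity('RITZ', 'CATZ')
p levenshtein_similarity('RITZ', 'RATZ')
p levenshtein_similarity('RITZ', 'CATZ')
p best_match('RITZ', %w{RATZ CATZ})
```

Sorting on the two-element array gives the tie-break for free: Ruby compares arrays element by element, so the Levenshtein score only matters when the pair scores are equal, as in the RATZ/CATZ case from the README.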