icu_name 1.1.1 → 1.2.0

Sign up to get free protection for your applications and to get access to all the features.
data/README.rdoc CHANGED
@@ -51,10 +51,11 @@ The method <tt>alternatives</tt> can be used to list alternatives to a given fir
51
51
 
52
52
  Name.new('Stephen', 'Orr').alternatives(:first) # => ["Steve"]
53
53
  Name.new('Michael Stephen', 'Orr').alternatives(:first) # => ["Steve", "Mike", "Mick", "Mikey"],
54
+ Name.new('Oissine', 'Murphy').alternatives(:last) # => ["Murchadha"],
54
55
  Name.new('Mark', 'Orr').alternatives(:first) # => []
55
56
 
56
- By default the class is only aware of a few common alternatives for first names (e.g. _Bobby_ and _Robert_,
57
- _Bill_ and _William_, etc). However, this can be customized (see below).
57
+ By default the class uses a set of first and last name alternatives curated for the ICU.
58
+ However, this can be customized (see below).
58
59
 
59
60
  Supplying the <tt>match</tt> method with strings is equivalent to instantiating an instance with the same
60
61
  strings and then matching it. So, for example the following are equivalent:
@@ -109,7 +110,7 @@ The same option also relaxes the need for accented characters to match exactly:
109
110
  We saw above how _Bobby_ and _Robert_ were able to match because, by default, the
110
111
  matcher is aware of some common English nicknames. These name alternatives can be
111
112
  customised to handle additional nicknames and other types of alternative names
112
- such as common spelling error and player name changes.
113
+ such as common spelling errors and player name changes.
113
114
 
114
115
  The alternative names consist of two arrays, one for first names and
115
116
  one for last names. Each array element is itself an array of strings
@@ -117,58 +118,33 @@ representing a set of equivalent names. Here, for example, are some
117
118
  of the default first name alternatives:
118
119
 
119
120
  ["Anthony", "Tony"]
120
- ["James", "Jim", "Jimmy"]
121
- ["Michael", "Mike", "Mick", "Mikey"]
121
+ ["James", "Jim", "Jimmy", "Jamie"]
122
122
  ["Robert", "Bob", "Bobby"]
123
- ["Stephen", "Steve"]
124
- ["Steven", "Steve"]
123
+ ["Stephen", "Steve", "Steven"]
125
124
  ["Thomas", "Tom", "Tommy"]
126
- ["William", "Will", "Willy", "Willie", "Bill"]
127
125
 
128
126
  The first of these means that _Anthony_ and _Tony_ are considered equivalent and can match.
129
127
 
130
- Name.new("Tony", "Miles").match("Anthony", "Miles") # => true
131
-
132
- Note that both _Steven_ and _Stephen_ match _Steve_ but, because they don't occur in the
133
- same group, they don't match each other.
134
-
135
- Name.new("Steven", "Hanly").match("Steve", "Hanly") # => true
136
- Name.new("Stephen", "Hanly").match("Steve", "Hanly") # => true
137
- Name.new("Stephen", "Hanly").match("Steven", "Hanly") # => false
128
+ ICU::Name.new("Tony", "Miles").match("Anthony", "Miles") # => true
138
129
 
139
130
  To change alternative name behaviour, you can replace the default alternatives
140
131
  with a customized set perhaps stored in a database or a YAML file, as illustrated below:
141
132
 
133
+ ICU::Name.reset_alternatives
142
134
  data = YAML.load(File open "my_last_name_alternatives.yaml")
143
- Name.load_alternatives(:last, data)
135
+ ICU::Name.load_alternatives(:last, data)
144
136
  data = YAML.load(File open "my_first_name_alternatives.yaml")
145
- Name.load_alternatives(:first, data)
146
-
147
- An example of one way in which you might want to customize the alternatives is to
148
- cater for common spelling mistakes such as _Steven_ and _Stephen_. These two names
149
- don't match by default, but you can make them so by replacing the two default rules:
150
-
151
- ["Stephen", "Steve"]
152
- ["Steven", "Steve"]
153
-
154
- with the following single rule:
155
-
156
- ["Stephen", "Steven", "Steve"]
157
-
158
- so that now:
159
-
160
- Name.new("Stephen", "Hanly").match("Steven", "Hanly") # => true
137
+ ICU::Name.load_alternatives(:first, data)
161
138
 
162
- This kind of rule risks producing false positives - you must judge
163
- whether that risk is outweighed by the benefits of being able to overcome
164
- spelling mistakes in the context of your application.
139
+ Note that without the call to <tt>reset_alternatives</tt>, the new loaded alternatives
140
+ add to, rather than replace, the defaults.
165
141
 
166
- Another use is to cater for English and Irish versions of the same name.
167
- For example, for last names:
142
+ Other uses of alternatives is to cater for English and Irish versions of the same name,
143
+ for example (last names):
168
144
 
169
145
  [Murphy, Murchadha]
170
146
 
171
- or for first names, including spelling variations:
147
+ or for variations including spelling variations, for example (first names):
172
148
 
173
149
  [Patrick, Pat, Paddy, Padraig, Padraic, Padhraig, Padhraic]
174
150
 
@@ -1,37 +1,50 @@
1
1
  ---
2
+ - [Abdul, Abul]
2
3
  - [Alexander, Alex]
4
+ - [Anandagopal, Ananda]
3
5
  - [Andrew, Andy]
6
+ - [Anne, Ann]
4
7
  - [Anthony, Tony]
5
8
  - [Benjamin, Ben]
6
- - [Catherine, Cathy, Cath]
7
- - [Daniel, Danny, Dan]
9
+ - [Catherine, Cathy, Cath, Cate, Katherine, Kathy, Kath, Kate]
10
+ - [Charlie, Charles]
11
+ - [Chris, Christopher]
12
+ - [Daniel, Danial, Danny, Dan]
8
13
  - [David, Dave]
9
14
  - [Deborah, Debbie]
10
15
  - [Des, Desmond]
11
- - [Edward, Eddie, Eddy, Ed]
12
- - [Frederick, Fred]
13
- - [Frederic, Fred]
16
+ - [Douglas, Dougie]
17
+ - [Eamonn, Eamon]
18
+ - [Edward, Eddie, Eddy, Ed, Ned]
19
+ - [Eric, Erick, Erik]
20
+ - [Frederick, Frederic, Fred]
14
21
  - [Gerald, Gerry]
15
- - [Gerard, Gerry]
16
- - [James, Jim, Jimmy]
22
+ - [Gerhard, Gerard, Ger, Gerry]
23
+ - [James, Jim, Jimmy, Jamie]
24
+ - [Joanna, Joan, Joanne]
25
+ - [Joe, Joseph]
17
26
  - [John, Johnny]
18
27
  - [Jonathan, Jon]
19
- - [Kenneth, Ken, Kenny]
20
28
  - [Lyubomir, Lubomir]
21
- - [Michael, Mike, Mick, Mikey]
22
- - [Nic, Nick, Nicolas]
29
+ - [Kenneth, Ken, Kenny]
30
+ - [Michael, Mike, Mick, Micky, Mickie, Mikey, Micheal]
31
+ - [Muthu, Muthukumaran]
32
+ - [Nicholas, Nick, Nicolas]
23
33
  - [Nicola, Nickie, Nicky]
24
- - [Patrick, Pat]
25
- - [Patricia, Patty, Pat]
34
+ - [Patrick, Pat, Paddy, Padraig, Padraic, Padhraig, Padhraic]
35
+ - [Patricia, Paddy, Patty, Pat]
26
36
  - [Peter, Pete]
27
- - [Philip, Phil]
28
- - [Phillip, Phil]
37
+ - [Philip, Phillip, Phil]
38
+ - [Philippe, Phillippe, Phil]
39
+ - [Raymond, Ray]
29
40
  - [Rick, Ricky]
30
41
  - [Robert, Bob, Bobby]
31
- - [Samual, Sam]
32
- - [Samuel, Sam]
33
- - [Stephen, Steve]
34
- - [Steven, Steve]
42
+ - [Rodney, Rod]
43
+ - [Samual, Sam, Samuel]
44
+ - [Stef, Stefan, Stephan, Stefen, Stephen]
45
+ - [Steffy, Stefanie, Stephanie, Stefenie, Stephenie]
46
+ - [Stephen, Steve, Steven]
35
47
  - [Terence, Terry]
36
48
  - [Thomas, Tom, Tommy]
37
49
  - [William, Will, Willy, Willie, Bill]
50
+ - [Sean, John, !ruby/regexp /^Bradley$/]
@@ -1 +1,7 @@
1
- --- []
1
+ ---
2
+ - [Ffrench, French]
3
+ - [Murchadha, Murphy]
4
+ - [Quinn, Benjamin, !ruby/regexp /^(Debbie|Deborah)$/]
5
+ - [Astaneh Lopez, Lopez, !ruby/regexp /^Alex$/]
6
+ - [Gardenes Santiago, Gardenes, !ruby/regexp /^Manuel$/]
7
+ - ["O'Siochru", King, !ruby/regexp /^Mairead$/]
@@ -1,42 +1,11 @@
1
1
  ---
2
- - [Abdul, Abul]
3
- - [Alexander, Alex]
4
- - [Anandagopal, Ananda]
5
- - [Andrew, Andy]
6
- - [Anne, Ann]
7
- - [Anthony, Tony]
8
- - [Benjamin, Ben]
9
- - [Catherine, Cathy, Cath]
10
- - [Daniel, Danial, Danny, Dan]
11
- - [David, Dave]
12
2
  - [Deborah, Debbie]
13
- - [Des, Desmond]
14
- - [Eamonn, Eamon]
15
- - [Edward, Eddie, Eddy, Ed]
16
- - [Eric, Erick, Erik]
17
- - [Frederick, Frederic, Fred]
18
- - [Gerald, Gerry]
19
- - [Gerhard, Gerard, Ger, Gerry]
20
- - [James, Jim, Jimmy]
21
- - [Joanna, Joan, Joanne]
3
+ - [Demeter, Ceres]
22
4
  - [John, Johnny]
23
- - [Jonathan, Jon]
24
- - [Kenneth, Ken, Kenny]
25
5
  - [Lyubomir, Lubomir]
26
- - [Michael, Mike, Mick, Micky, Mickie, Mikey]
27
- - [Nicholas, Nick, Nicolas]
28
- - [Nicola, Nickie, Nicky]
6
+ - [Michael, Mike]
29
7
  - [Patrick, Pat, Paddy, Padraig, Padraic, Padhraig, Padhraic]
30
- - [Patricia, Paddy, Patty, Pat]
31
- - [Peter, Pete]
32
8
  - [Philippe, Philip, Phillippe, Phillip]
33
- - [Rick, Ricky]
34
- - [Robert, Bob, Bobby]
35
- - [Samual, Sam, Samuel]
36
- - [Stef, Stefan, Stephan, Stefen, Stephen]
37
- - [Steffy, Stefanie, Stephanie, Stefenie, Stephenie]
38
9
  - [Stephen, Steve, Steven]
39
- - [Terence, Terry]
40
- - [Thomas, Tom, Tommy]
41
10
  - [William, Will, Willy, Willie, Bill]
42
11
  - [Sean, John, !ruby/regexp /^Bradley$/]
@@ -1,4 +1,5 @@
1
1
  ---
2
+ - [Ares, Mars]
2
3
  - [Ffrench, French]
3
4
  - [Murchadha, Murphy]
4
5
  - [Quinn, Benjamin, !ruby/regexp /^(Debbie|Deborah)$/]
data/lib/icu_name/name.rb CHANGED
@@ -13,8 +13,8 @@ module ICU
13
13
 
14
14
  # Construct a new name from one or two strings or any objects that have a to_s method.
15
15
  def initialize(name1='', name2='')
16
- @name1 = Util.to_utf8(name1.to_s)
17
- @name2 = Util.to_utf8(name2.to_s)
16
+ @name1 = Util::String.to_utf8(name1.to_s)
17
+ @name2 = Util::String.to_utf8(name2.to_s)
18
18
  originalize
19
19
  canonicalize
20
20
  @first.freeze
@@ -69,10 +69,10 @@ module ICU
69
69
  match_first(first(opts), other.first(opts)) && match_last(last(opts), other.last(opts))
70
70
  end
71
71
 
72
- # Load a set of first or last name alternatives. If the YAML file name is absent,
73
- # the default set is loaded. <tt>type</tt> should be <tt>:first</tt> or <tt>:last</tt>.
74
- def self.load_alternatives(type, file=nil)
75
- compile_alts(check_type(type), file, true)
72
+ # Load a set of first or last name alternatives. If no data is absent, a default set will be loaded.
73
+ # <tt>type</tt> should be <tt>:first</tt> or <tt>:last</tt>.
74
+ def self.load_alternatives(type, data=nil)
75
+ compile_alts(check_type(type), data, true)
76
76
  end
77
77
 
78
78
  # Show first name or last name alternatives.
@@ -93,7 +93,7 @@ module ICU
93
93
  # Transliterate characters to ASCII.
94
94
  def transliterate(str, chars='US-ASCII')
95
95
  if chars.match(/ASCII/i)
96
- Util.transliterate(str)
96
+ Util::String.transliterate(str)
97
97
  else
98
98
  str.dup
99
99
  end
@@ -139,12 +139,12 @@ module ICU
139
139
  name.gsub!(/\s*-\s*/, '-')
140
140
  name.gsub!(/'+/, "'")
141
141
  name.strip!
142
- name = Util.downcase(name)
142
+ name = Util::String.downcase(name)
143
143
  name.split(/\s+/).map do |n|
144
144
  n.sub!(/^-+/, '')
145
145
  n.sub!(/-+$/, '')
146
146
  n.split(/-/).map do |p|
147
- Util.capitalize(p)
147
+ Util::String.capitalize(p)
148
148
  end.join('-')
149
149
  end.join(' ')
150
150
  end
@@ -156,11 +156,11 @@ module ICU
156
156
 
157
157
  # Apply final touches to finish canonicalising a last name.
158
158
  def finish_last(names)
159
- names.gsub!(/\b([A-Z\u{c0}-\u{de}]')([a-z\u{e0}-\u{ff}])/) { |m| $1 + Util.upcase($2) }
160
- names.gsub!(/\b(Mc)([a-z\u{e0}-\u{ff}])/) { |m| $1 + Util.upcase($2) }
159
+ names.gsub!(/\b([A-Z\u{c0}-\u{de}]')([a-z\u{e0}-\u{ff}])/) { |m| $1 + Util::String.upcase($2) }
160
+ names.gsub!(/\b(Mc)([a-z\u{e0}-\u{ff}])/) { |m| $1 + Util::String.upcase($2) }
161
161
  names.gsub!(/\bMac([a-z\u{e0}-\u{ff}])/) do |m|
162
162
  letter = $1 # capitalize after "Mac" only if the original clearly indicates it
163
- upper = Util.upcase(letter)
163
+ upper = Util::String.upcase(letter)
164
164
  'Mac'.concat(@original.match(/\bMac#{upper}/) ? upper : letter)
165
165
  end
166
166
  names.gsub!(/\bO ([A-Z\u{c0}-\u{de}])/) { |m| "O'" + $1 } # O Kelly => "O'Kelly"
data/lib/icu_name/util.rb CHANGED
@@ -2,51 +2,77 @@
2
2
 
3
3
  module ICU
4
4
  module Util
5
- LOWER_CHARS = "àáâãäåæçèéêëìíîïñòóôõöøùúûüýþ"
6
- UPPER_CHARS = "ÀÁÂÃÄÅÆÇÈÉÊËÌÍÎÏÑÒÓÔÕÖØÙÚÛÜÝÞ"
7
- ACCENTED_CHARS = "ÀÁÂÃÄÅÈÉÊËÌÍÎÏÑÒÓÔÕÖÙÚÛÜÝàáâãäåèéêëìíîïñòóôõöùúûüý"
8
- UNACCENTED_CHARS = "AAAAAAEEEEIIIINOOOOOUUUUYaaaaaaeeeeiiiinooooouuuuy"
5
+
6
+ # For converting strings in various ways.
7
+ module String
8
+ LOWER_CHARS = "àáâãäåæçèéêëìíîïñòóôõöøùúûüýþ"
9
+ UPPER_CHARS = "ÀÁÂÃÄÅÆÇÈÉÊËÌÍÎÏÑÒÓÔÕÖØÙÚÛÜÝÞ"
10
+ ACCENTED_CHARS = "ÀÁÂÃÄÅÈÉÊËÌÍÎÏÑÒÓÔÕÖÙÚÛÜÝàáâãäåèéêëìíîïñòóôõöùúûüý"
11
+ UNACCENTED_CHARS = "AAAAAAEEEEIIIINOOOOOUUUUYaaaaaaeeeeiiiinooooouuuuy"
9
12
 
10
- # Decide if a string is valid UTF-8 or not, returning true or false.
11
- def self.is_utf8(str)
12
- dup = str.dup
13
- dup.force_encoding("UTF-8")
14
- dup.valid_encoding?
15
- end
13
+ # Decide if a string is valid UTF-8 or not, returning true or false.
14
+ def self.is_utf8(str)
15
+ dup = str.dup
16
+ dup.force_encoding("UTF-8")
17
+ dup.valid_encoding?
18
+ end
16
19
 
17
- # Try to convert any string to UTF-8.
18
- def self.to_utf8(str)
19
- utf8 = is_utf8(str)
20
- dup = str.dup
21
- return dup.force_encoding("UTF-8") if utf8
22
- dup.force_encoding("Windows-1252") if dup.encoding.name.match(/^(ASCII-8BIT|UTF-8)$/)
23
- dup.encode("UTF-8")
24
- end
20
+ # Try to convert any string to UTF-8.
21
+ def self.to_utf8(str)
22
+ utf8 = is_utf8(str)
23
+ dup = str.dup
24
+ return dup.force_encoding("UTF-8") if utf8
25
+ dup.force_encoding("Windows-1252") if dup.encoding.name.match(/^(ASCII-8BIT|UTF-8)$/)
26
+ dup.encode("UTF-8")
27
+ end
25
28
 
26
- # Upcase a UTF-8 string that might contain accented characters.
27
- def self.upcase(str)
28
- str = str.upcase
29
- return str if str.ascii_only?
30
- str.tr(LOWER_CHARS, UPPER_CHARS)
31
- end
29
+ # Upcase a UTF-8 string that might contain accented characters.
30
+ def self.upcase(str)
31
+ str = str.upcase
32
+ return str if str.ascii_only?
33
+ str.tr(LOWER_CHARS, UPPER_CHARS)
34
+ end
32
35
 
33
- # Downcase a UTF-8 string that might contain accented characters.
34
- def self.downcase(str)
35
- str = str.downcase
36
- return str if str.ascii_only?
37
- str.tr(UPPER_CHARS, LOWER_CHARS)
38
- end
36
+ # Downcase a UTF-8 string that might contain accented characters.
37
+ def self.downcase(str)
38
+ str = str.downcase
39
+ return str if str.ascii_only?
40
+ str.tr(UPPER_CHARS, LOWER_CHARS)
41
+ end
39
42
 
40
- # Capilalize a UTF-8 string that might contain accented characters.
41
- def self.capitalize(str)
42
- return str.capitalize if str.ascii_only? || !str.match(/\A(.)(.*)\z/)
43
- upcase($1) + downcase($2)
43
+ # Capilalize a UTF-8 string that might contain accented characters.
44
+ def self.capitalize(str)
45
+ return str.capitalize if str.ascii_only? || !str.match(/\A(.)(.*)\z/)
46
+ upcase($1) + downcase($2)
47
+ end
48
+
49
+ # Transliterate Latin-1 accented characters to ASCII.
50
+ def self.transliterate(str)
51
+ return str.dup if str.ascii_only?
52
+ str.tr(ACCENTED_CHARS, UNACCENTED_CHARS)
53
+ end
44
54
  end
45
55
 
46
- # Transliterate Latin-1 accented characters to ASCII.
47
- def self.transliterate(str)
48
- return str.dup if str.ascii_only?
49
- str.tr(ACCENTED_CHARS, UNACCENTED_CHARS)
56
+ # For generating SQL queries relating to alternative first or last names.
57
+ module AlternativeNames
58
+ def last_name_like(last, first)
59
+ ICU::Name.new(first, last).alternatives(:last).push(last).map do |nam|
60
+ "last_name LIKE '%#{quote_str(nam)}%'"
61
+ end.sort.join(" OR ")
62
+ end
63
+
64
+ def first_name_like(first, last)
65
+ ICU::Name.new(first, last).alternatives(:first).push(first).map do |nam|
66
+ "first_name LIKE '%#{quote_str(nam)}%'"
67
+ end.sort.join(" OR ")
68
+ end
69
+
70
+ private
71
+
72
+ # Same as Rails version (ActiveRecord::ConnectionAdapters::Quoting).
73
+ def quote_str(s)
74
+ s.gsub(/\\/, '\&\&').gsub(/'/, "''")
75
+ end
50
76
  end
51
77
  end
52
78
  end
@@ -2,6 +2,6 @@
2
2
 
3
3
  module ICU
4
4
  class Name
5
- VERSION = "1.1.1"
5
+ VERSION = "1.2.0"
6
6
  end
7
7
  end
data/spec/name_spec.rb CHANGED
@@ -3,7 +3,8 @@ require File.expand_path(File.dirname(__FILE__) + '/spec_helper')
3
3
 
4
4
  module ICU
5
5
  describe Name do
6
- def load_alt_test(*types)
6
+ def load_alt_test(reset, *types)
7
+ Name.reset_alternatives if reset
7
8
  types.each do |type|
8
9
  file = File.expand_path(File.dirname(__FILE__) + "/../config/test_#{type}_alts.yaml")
9
10
  data = File.open(file) { |fd| YAML.load(fd) }
@@ -351,13 +352,13 @@ module ICU
351
352
  Name.new('Gerard', 'Orr').match('Gerald', 'Orr').should be_false
352
353
  end
353
354
 
354
- it "should by default be cautious about misspellings" do
355
- Name.new('Steven', 'Brady').match('Stephen', 'Brady').should be_false
356
- Name.new('Philip', 'Short').match('Phillip', 'Short').should be_false
355
+ it "should handle some common misspellings" do
356
+ Name.new('Steven', 'Brady').match('Stephen', 'Brady').should be_true
357
+ Name.new('Philip', 'Short').match('Phillip', 'Short').should be_true
357
358
  end
358
359
 
359
- it "should by default have no conditional matches" do
360
- Name.new('Sean', 'Bradley').match('John', 'Bradley').should be_false
360
+ it "should have some conditional matches" do
361
+ Name.new('Sean', 'Bradley').match('John', 'Bradley').should be_true
361
362
  end
362
363
 
363
364
  it "should not mix up nick names" do
@@ -379,9 +380,9 @@ module ICU
379
380
  Name.new('Darko', 'Polimac').match('Darko', 'Polimc').should be_false
380
381
  end
381
382
 
382
- it "should by defaut have no conditional matches" do
383
- Name.new('Debbie', 'Quinn').match('Debbie', 'Benjamin').should be_false
384
- Name.new('Mairead', "O'Siochru").match('Mairead', 'King').should be_false
383
+ it "should have some conditional matches" do
384
+ Name.new('Debbie', 'Quinn').match('Debbie', 'Benjamin').should be_true
385
+ Name.new('Mairead', "O'Siochru").match('Mairead', 'King').should be_true
385
386
  end
386
387
  end
387
388
 
@@ -404,7 +405,11 @@ module ICU
404
405
 
405
406
  context "configuring new first name alternatives" do
406
407
  before(:all) do
407
- load_alt_test(:first)
408
+ load_alt_test(true, :first)
409
+ end
410
+
411
+ after(:all) do
412
+ Name.reset_alternatives
408
413
  end
409
414
 
410
415
  it "should match some spelling errors" do
@@ -421,7 +426,11 @@ module ICU
421
426
 
422
427
  context "configuring new last name alternatives" do
423
428
  before(:all) do
424
- load_alt_test(:last)
429
+ load_alt_test(true, :last)
430
+ end
431
+
432
+ after(:all) do
433
+ Name.reset_alternatives
425
434
  end
426
435
 
427
436
  it "should match some spelling errors" do
@@ -444,7 +453,11 @@ module ICU
444
453
 
445
454
  context "configuring new first and new last name alternatives" do
446
455
  before(:all) do
447
- load_alt_test(:first, :last)
456
+ load_alt_test(true, :first, :last)
457
+ end
458
+
459
+ after(:all) do
460
+ Name.reset_alternatives
448
461
  end
449
462
 
450
463
  it "should allow some awesome matches" do
@@ -455,16 +468,20 @@ module ICU
455
468
 
456
469
  context "reverting to the default configuration" do
457
470
  before(:all) do
458
- load_alt_test(:first, :last)
471
+ load_alt_test(true, :first, :last)
459
472
  end
460
473
 
461
- it "should not match so boldly after reverting" do
462
- Name.new('french, steven').match('Stephen', 'Ffrench').should be_true
474
+ after(:all) do
475
+ Name.reset_alternatives
476
+ end
477
+
478
+ it "should not match after reverting" do
479
+ Name.new('avril, demeter').match('Ceres', 'Avril').should be_true
463
480
  Name.load_alternatives(:first)
464
- Name.new('Patrick', 'Murphy').match('Padraic', 'Murchadha').should be_false
465
- Name.new('Patrick', 'Murphy').match('Patrick', 'Murchadha').should be_true
481
+ Name.new('avril, demeter').match('Ceres', 'Avril').should be_false
482
+ Name.new('Patrick', 'Ares').match('Patrick', 'Mars').should be_true
466
483
  Name.load_alternatives(:last)
467
- Name.new('Patrick', 'Murphy').match('Patrick', 'Murchadha').should be_false
484
+ Name.new('Patrick', 'Ares').match('Patrick', 'Mars').should be_false
468
485
  end
469
486
  end
470
487
 
@@ -472,35 +489,39 @@ module ICU
472
489
  it "should show common nicknames" do
473
490
  Name.new('William', 'Ffrench').alternatives(:first).should =~ %w{Bill Willy Willie Will}
474
491
  Name.new('Bill', 'Ffrench').alternatives(:first).should =~ %w{William Willy Will Willie}
475
- Name.new('Steven', 'Ffrench').alternatives(:first).should =~ %w{Steve}
476
- Name.new('Stephen', 'Ffrench').alternatives(:first).should =~ %w{Steve}
477
- Name.new('Michael Stephen', 'Ffrench').alternatives(:first).should =~ %w{Steve Mike Mick Mikey}
478
- Name.new('Stephen M.', 'Ffrench').alternatives(:first).should =~ %w{Steve}
492
+ Name.new('Steven', 'Ffrench').alternatives(:first).should =~ %w{Steve Stephen}
493
+ Name.new('Stephen', 'Ffrench').alternatives(:first).should =~ %w{Stef Stefan Stefen Stephan Steve Steven}
494
+ Name.new('Michael Stephen', 'Ffrench').alternatives(:first).should =~ %w{Micheal Mick Mickie Micky Mike Mikey Stef Stefan Stefen Stephan Steve Steven}
495
+ Name.new('Stephen M.', 'Ffrench').alternatives(:first).should =~ %w{Stef Stefan Stefen Stephan Steve Steven}
496
+ Name.new('Sean', 'Bradley').alternatives(:first).should =~ %w{John}
479
497
  Name.new('S.', 'Ffrench').alternatives(:first).should =~ []
480
- Name.new('Sean', 'Bradley').alternatives(:first).should =~ []
481
498
  end
482
499
 
483
500
  it "should have automatic last name alternatives for apostrophes to cater for FIDE's habits" do
484
- Name.new('Mairead', "O'Siochru").alternatives(:last).should =~ ["O`Siochru"]
485
- Name.new('Erwin E.', "L`Ami").alternatives(:last).should =~ ["L`Ami"]
501
+ Name.new('Mairead', "O'Siochru").alternatives(:last).should =~ %w{King O`Siochru}
502
+ Name.new('Erwin E.', "L`Ami").alternatives(:last).should =~ %w{L`Ami}
486
503
  end
487
504
 
488
- it "should not have any last name alternatives" do
489
- Name.new('William', 'Ffrench').alternatives(:last).should =~ []
490
- Name.new('Oissine', 'Murphy').alternatives(:last).should =~ []
491
- Name.new('Debbie', 'Quinn').alternatives(:last).should =~ []
505
+ it "should not have some last name alternatives" do
506
+ Name.new('William', 'Ffrench').alternatives(:last).should =~ %w{French}
507
+ Name.new('Oissine', 'Murphy').alternatives(:last).should =~ %w{Murchadha}
508
+ Name.new('Debbie', 'Quinn').alternatives(:last).should =~ %w{Benjamin}
492
509
  end
493
510
  end
494
511
 
495
512
  context "name alternatives with more adventurous configuration" do
496
513
  before(:all) do
497
- load_alt_test(:first, :last)
514
+ load_alt_test(true, :first, :last)
515
+ end
516
+
517
+ after(:all) do
518
+ Name.reset_alternatives
498
519
  end
499
520
 
500
- it "should show additional nicknames" do
521
+ it "should show different nicknames" do
501
522
  Name.new('Steven', 'Ffrench').alternatives(:first).should =~ %w{Stephen Steve}
502
- Name.new('Stephen', 'Ffrench').alternatives(:first).should =~ %w{Stef Stefan Stefen Stephan Steve Steven}
503
- Name.new('Stephen Mike', 'Ffrench').alternatives(:first).should =~ %w{Michael Mick Mickie Micky Mikey Stef Stefan Stefen Stephan Steve Steven}
523
+ Name.new('Stephen', 'Ffrench').alternatives(:first).should =~ %w{Steve Steven}
524
+ Name.new('Stephen Mike', 'Ffrench').alternatives(:first).should =~ %w{Michael Steve Steven}
504
525
  Name.new('Sean', 'Bradley').alternatives(:first).should =~ %w{John}
505
526
  Name.new('Sean', 'McDonagh').alternatives(:first).should =~ []
506
527
  Name.new('John', 'Bradley').alternatives(:first).should =~ %w{Sean Johnny}
@@ -521,6 +542,10 @@ module ICU
521
542
  Name.reset_alternatives
522
543
  end
523
544
 
545
+ after(:all) do
546
+ Name.reset_alternatives
547
+ end
548
+
524
549
  it "should be no more than necessary" do
525
550
  alt_compilations(:first).should == 0
526
551
  alt_compilations(:last).should == 0
@@ -530,16 +555,10 @@ module ICU
530
555
  Name.new('Debbie', 'Quinn').match('Deborah', 'Benjamin')
531
556
  alt_compilations(:first).should == 1
532
557
  alt_compilations(:last).should == 1
533
- load_alt_test(:first)
558
+ load_alt_test(false, :first)
534
559
  alt_compilations(:first).should == 2
535
560
  alt_compilations(:last).should == 1
536
- load_alt_test(:last)
537
- alt_compilations(:first).should == 2
538
- alt_compilations(:last).should == 2
539
- Name.new('William', 'Ffrench').match('Bill', 'French')
540
- Name.new('Debbie', 'Quinn').match('Deborah', 'Benjamin')
541
- Name.new('Mark', 'Orr').alternatives(:first)
542
- Name.new('Mark', 'Orr').alternatives(:last)
561
+ load_alt_test(false, :last)
543
562
  alt_compilations(:first).should == 2
544
563
  alt_compilations(:last).should == 2
545
564
  end
data/spec/util_spec.rb CHANGED
@@ -2,63 +2,86 @@
2
2
  require File.expand_path(File.dirname(__FILE__) + '/spec_helper')
3
3
 
4
4
  module ICU
5
- describe Util do
6
- context "#is_utf8" do
7
- it "recognises some encodings as a special case of UTF-8" do
8
- expect(Util.is_utf8("Resume".encode("US-ASCII"))).to be_true
9
- expect(Util.is_utf8("Resume".encode("ASCII-8BIT"))).to be_true
10
- expect(Util.is_utf8("Resume".encode("BINARY"))).to be_true
11
- end
5
+ module Util
6
+ describe String do
7
+ context "#is_utf8" do
8
+ it "recognises some encodings as a special case of UTF-8" do
9
+ expect(String.is_utf8("Resume".encode("US-ASCII"))).to be_true
10
+ expect(String.is_utf8("Resume".encode("ASCII-8BIT"))).to be_true
11
+ expect(String.is_utf8("Resume".encode("BINARY"))).to be_true
12
+ end
13
+
14
+ it "recognises UTF-8" do
15
+ expect(String.is_utf8("Résumé")).to be_true
16
+ expect(String.is_utf8("δog")).to be_true
17
+ end
12
18
 
13
- it "recognises UTF-8" do
14
- expect(Util.is_utf8("Résumé")).to be_true
15
- expect(Util.is_utf8("δog")).to be_true
19
+ it "should recognize other encodings as not being UTF-8" do
20
+ expect(String.is_utf8("Résumé".encode("ISO-8859-1"))).to be_false
21
+ expect(String.is_utf8("€50".encode("Windows-1252"))).to be_false
22
+ expect(String.is_utf8("ひらがな".encode("Shift_JIS"))).to be_false
23
+ expect(String.is_utf8("\xa3")).to be_false
24
+ end
16
25
  end
17
26
 
18
- it "should recognize other encodings as not being UTF-8" do
19
- expect(Util.is_utf8("Résumé".encode("ISO-8859-1"))).to be_false
20
- expect(Util.is_utf8("€50".encode("Windows-1252"))).to be_false
21
- expect(Util.is_utf8("ひらがな".encode("Shift_JIS"))).to be_false
22
- expect(Util.is_utf8("\xa3")).to be_false
27
+ context "#to_utf8" do
28
+ it "converts to UTF-8" do
29
+ expect(String.to_utf8("Resume")).to eq "Resume"
30
+ expect(String.to_utf8("Resume".force_encoding("US-ASCII")).encoding.name).to eq "UTF-8"
31
+ expect(String.to_utf8("Résumé".encode("ISO-8859-1"))).to eq "Résumé"
32
+ expect(String.to_utf8("Résumé".encode("Windows-1252"))).to eq "Résumé"
33
+ expect(String.to_utf8("€50".encode("Windows-1252"))).to eq "€50"
34
+ expect(String.to_utf8("\xa350".force_encoding("ASCII-8BIT"))).to eq "£50"
35
+ expect(String.to_utf8("\xa350")).to eq "£50"
36
+ expect(String.to_utf8("ひらがな".encode("Shift_JIS"))).to eq "ひらがな"
37
+ end
23
38
  end
24
- end
25
39
 
26
- context "#to_utf8" do
27
- it "converts to UTF-8" do
28
- expect(Util.to_utf8("Resume")).to eq "Resume"
29
- expect(Util.to_utf8("Resume".force_encoding("US-ASCII")).encoding.name).to eq "UTF-8"
30
- expect(Util.to_utf8("Résumé".encode("ISO-8859-1"))).to eq "Résumé"
31
- expect(Util.to_utf8("Résumé".encode("Windows-1252"))).to eq "Résumé"
32
- expect(Util.to_utf8("€50".encode("Windows-1252"))).to eq "€50"
33
- expect(Util.to_utf8("\xa350".force_encoding("ASCII-8BIT"))).to eq "£50"
34
- expect(Util.to_utf8("\xa350")).to eq "£50"
35
- expect(Util.to_utf8("ひらがな".encode("Shift_JIS"))).to eq "ひらがな"
40
+ context "#downcase" do
41
+ it "downcases characters in the Latin-1 range" do
42
+ expect(String.downcase("Eric")).to eq "eric"
43
+ expect(String.downcase("Éric")).to eq "éric"
44
+ expect(String.downcase("ÀÁÂÃÄÅÆÇÈÉÊËÌÍÎÏÑÒÓÔÕÖØÙÚÛÜÝÞ")).to eq "àáâãäåæçèéêëìíîïñòóôõöøùúûüýþ"
45
+ end
36
46
  end
37
- end
38
47
 
39
- context "#downcase" do
40
- it "downcases characters in the Latin-1 range" do
41
- expect(Util.downcase("Eric")).to eq "eric"
42
- expect(Util.downcase("Éric")).to eq "éric"
43
- expect(Util.downcase("ÀÁÂÃÄÅÆÇÈÉÊËÌÍÎÏÑÒÓÔÕÖØÙÚÛÜÝÞ")).to eq "àáâãäåæçèéêëìíîïñòóôõöøùúûüýþ"
48
+ context "#upcase" do
49
+ it "upcases characters in the Latin-1 range" do
50
+ expect(String.upcase("Gearoidin")).to eq "GEAROIDIN"
51
+ expect(String.upcase("Gearóidín")).to eq "GEARÓIDÍN"
52
+ expect(String.upcase("àáâãäåæçèéêëìíîïñòóôõöøùúûüýþ")).to eq "ÀÁÂÃÄÅÆÇÈÉÊËÌÍÎÏÑÒÓÔÕÖØÙÚÛÜÝÞ"
53
+ end
44
54
  end
45
- end
46
55
 
47
- context "#upcase" do
48
- it "upcases characters in the Latin-1 range" do
49
- expect(Util.upcase("Gearoidin")).to eq "GEAROIDIN"
50
- expect(Util.upcase("Gearóidín")).to eq "GEARÓIDÍN"
51
- expect(Util.upcase("àáâãäåæçèéêëìíîïñòóôõöøùúûüýþ")).to eq "ÀÁÂÃÄÅÆÇÈÉÊËÌÍÎÏÑÒÓÔÕÖØÙÚÛÜÝÞ"
56
+ context "#capitalize" do
57
+ it "capitalizes strings that might contain accented characters" do
58
+ expect(String.capitalize("gearoidin")).to eq "Gearoidin"
59
+ expect(String.capitalize("GEAROIDIN")).to eq "Gearoidin"
60
+ expect(String.capitalize("gEAróiDÍn")).to eq "Gearóidín"
61
+ expect(String.capitalize("ériC")).to eq "Éric"
62
+ expect(String.capitalize("ÉRIc")).to eq "Éric"
63
+ end
52
64
  end
53
65
  end
66
+
67
+ describe AlternativeNames do
68
+ context "extends" do
69
+ class Dummy
70
+ extend AlternativeNames
71
+ end
72
+
73
+ it "#last_name_like" do
74
+ expect(Dummy.last_name_like("Murphy", "Oissine")).to eq "last_name LIKE '%Murchadha%' OR last_name LIKE '%Murphy%'"
75
+ expect(Dummy.last_name_like("O'Connor", "Jonathan")).to eq "last_name LIKE '%O''Connor%' OR last_name LIKE '%O`Connor%'"
76
+ expect(Dummy.last_name_like("Orr", "Mark")).to eq "last_name LIKE '%Orr%'"
77
+ expect(Dummy.last_name_like("", "Mark")).to eq "last_name LIKE '%%'"
78
+ end
54
79
 
55
- context "#capitalize" do
56
- it "capitalizes strings that might contain accented characters" do
57
- expect(Util.capitalize("gearoidin")).to eq "Gearoidin"
58
- expect(Util.capitalize("GEAROIDIN")).to eq "Gearoidin"
59
- expect(Util.capitalize("gEAróiDÍn")).to eq "Gearóidín"
60
- expect(Util.capitalize("ériC")).to eq "Éric"
61
- expect(Util.capitalize("ÉRIc")).to eq "Éric"
80
+ it "#first_name_like" do
81
+ expect(Dummy.first_name_like("sean", "bradley")).to eq "first_name LIKE '%John%' OR first_name LIKE '%sean%'"
82
+ expect(Dummy.first_name_like("Jonathan", "O'Connor")).to eq "first_name LIKE '%Jon%' OR first_name LIKE '%Jonathan%'"
83
+ expect(Dummy.first_name_like("", "O'Connor")).to eq "first_name LIKE '%%'"
84
+ end
62
85
  end
63
86
  end
64
87
  end
metadata CHANGED
@@ -1,7 +1,7 @@
1
1
  --- !ruby/object:Gem::Specification
2
2
  name: icu_name
3
3
  version: !ruby/object:Gem::Version
4
- version: 1.1.1
4
+ version: 1.2.0
5
5
  prerelease:
6
6
  platform: ruby
7
7
  authors:
@@ -9,7 +9,7 @@ authors:
9
9
  autorequire:
10
10
  bindir: bin
11
11
  cert_chain: []
12
- date: 2013-07-22 00:00:00.000000000 Z
12
+ date: 2013-10-23 00:00:00.000000000 Z
13
13
  dependencies:
14
14
  - !ruby/object:Gem::Dependency
15
15
  name: bundler
@@ -112,7 +112,7 @@ required_ruby_version: !ruby/object:Gem::Requirement
112
112
  version: '0'
113
113
  segments:
114
114
  - 0
115
- hash: -3751186445395685587
115
+ hash: 1642061693049790720
116
116
  required_rubygems_version: !ruby/object:Gem::Requirement
117
117
  none: false
118
118
  requirements: