name_formatter 0.1.0
- checksums.yaml +7 -0
- data/CHANGELOG.md +15 -0
- data/LICENSE +21 -0
- data/README.md +135 -0
- data/lib/name_formatter/version.rb +3 -0
- data/lib/name_formatter.rb +159 -0
- metadata +97 -0
checksums.yaml
ADDED
@@ -0,0 +1,7 @@
+---
+SHA256:
+  metadata.gz: 6037e2a676981c2ef2026b3387a0e21a7e36db146449d51ba4710068e316473e
+  data.tar.gz: cbc2fd01074904d6d1225f0901b16adc941aa41a2c64f6f6b49e743a17b8c3fd
+SHA512:
+  metadata.gz: a748dfe1cb4cb20fae52eb78a741281ba946f153b38acd35e520d2c5b858104f59701566dd7cedbaa2bdbbcec7201ac858b05463d9d09a5285481324cf510057
+  data.tar.gz: 266a1de96fd71ba93d85677eb694d6ebcbcf23744ac1c2d6030b20fde0ab3ff0a6c7ca784ca857810960abb13bca41866c8f9da77fbfcce3f53069291774c854
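The checksums file lists SHA256 and SHA512 digests for `metadata.gz` and `data.tar.gz`. A minimal sketch of how such digests could be recomputed and compared with Ruby's standard `digest` library follows. The helper names `package_digests` and `digest_matches?` are hypothetical and are not part of RubyGems; the file path is whatever local copy you have unpacked.

```ruby
require "digest"

# Hypothetical helper: returns both digests for a file, keyed the same way
# as the top-level entries in checksums.yaml.
def package_digests(path)
  {
    "SHA256" => Digest::SHA256.file(path).hexdigest,
    "SHA512" => Digest::SHA512.file(path).hexdigest
  }
end

# Hypothetical helper: compares a recomputed digest with the expected hex
# string copied from checksums.yaml (case and surrounding whitespace ignored).
def digest_matches?(path, algorithm, expected_hex)
  package_digests(path).fetch(algorithm) == expected_hex.strip.downcase
end
```

Under these assumptions, `digest_matches?("data.tar.gz", "SHA256", expected)` would return `true` only when the local file is byte-for-byte the one the digest was computed from.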
data/CHANGELOG.md
ADDED
@@ -0,0 +1,15 @@
+# Changelog
+
+All notable changes to this project will be documented in this file.
+
+The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.1.0/),
+and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).
+
+## [0.1.0] - 2024-06-28
+Initial release
+
+### Added
+
+- Support for parsing complex names with prefixes, suffixes, particles and beyond.
+
+[0.1.0]: https://github.com/kylewelsby/name_formatter/releases/tag/v0.1.0
data/LICENSE
ADDED
@@ -0,0 +1,21 @@
+The MIT License (MIT)
+
+Copyright © 2024 Kyle Welsby, https://mekyle.com <kyle@mekyle.com>
+
+Permission is hereby granted, free of charge, to any person obtaining a copy
+of this software and associated documentation files (the “Software”), to deal
+in the Software without restriction, including without limitation the rights
+to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+copies of the Software, and to permit persons to whom the Software is
+furnished to do so, subject to the following conditions:
+
+The above copyright notice and this permission notice shall be included in
+all copies or substantial portions of the Software.
+
+THE SOFTWARE IS PROVIDED “AS IS”, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
+THE SOFTWARE.
data/README.md
ADDED
@@ -0,0 +1,135 @@
+![Status Badge](https://github.com/kylewelsby/name_formatter/actions/workflows/main.yml/badge.svg)
+
+# NameFormatter
+
+NameFormatter is a Ruby gem that provides robust name parsing and formatting capabilities. It handles a wide variety of name formats, including personal names from different cultures, company names, and names with prefixes and suffixes.
+
+## ✨ Features
+
+- Handles personal names from various cultures (Western, Spanish, German, etc.)
+- Supports company names and legal entities
+- Correctly formats prefixes, suffixes, and particles (e.g., "van", "de", "von")
+- Preserves capitalization for names like "McDonald" or "DeVito"
+- Unicode-aware, handling names with non-ASCII characters
+
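The particle and "McDonald" features listed above can be illustrated with a much smaller, self-contained sketch. This is not the gem's implementation: the particle list `SKETCH_PARTICLES` and the helpers `capitalize_surname_part` and `capitalize_surname` are hypothetical, and the rules are deliberately simplified (a particle is lower-cased only when another word follows it, and only the `Mc` pattern is handled).

```ruby
# Hypothetical, simplified particle list; the gem's own rules are broader.
SKETCH_PARTICLES = %w[de del der van von la y].freeze

# Formats one whitespace-separated piece of a surname.
def capitalize_surname_part(part, last:)
  lower = part.downcase
  if SKETCH_PARTICLES.include?(lower) && !last
    lower                                # particles stay lower-case mid-name
  elsif lower.start_with?("mc") && lower.length > 2
    "Mc" + lower[2..-1].capitalize       # "mcdonald" becomes "McDonald"
  else
    lower.capitalize                     # plain title-casing for everything else
  end
end

# Formats a whole surname, telling each piece whether it is the final word.
def capitalize_surname(name)
  parts = name.split
  parts.each_with_index.map do |part, index|
    capitalize_surname_part(part, last: index == parts.size - 1)
  end.join(" ")
end

puts capitalize_surname("MCDONALD")      # McDonald
puts capitalize_surname("VAN DER BERG")  # van der Berg
```

Because `String#downcase` and `String#capitalize` have been Unicode-aware since Ruby 2.4, the same sketch also title-cases accented input without extra work.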
+## 🎲 Installation
+
+Add this line to your application's Gemfile:
+
+```ruby
+gem 'name_formatter'
+```
+
+And then execute:
+
+```bash
+bundle install
+```
+
+Or install it yourself with:
+
+```bash
+gem install name_formatter
+```
+
+## 🛠️ Usage
+
+```ruby
+require 'name_formatter'
+
+formatter = NameFormatter.new
+
+# Format a name
+formatted = formatter.format("JOHN DOE")
+puts formatted # Output: "John Doe"
+
+# Parse and format a name
+parsed = formatter.parse_formatted("Dr. Jane Smith Jr.")
+puts parsed
+# Output: {
+#   prefix: "Dr.",
+#   first_name: "Jane",
+#   last_name: "Smith",
+#   suffix: "Jr."
+# }
+
+# Parse a name without formatting it
+parsed = formatter.parse("Dr. Jane Smith Jr.")
+puts parsed
+# Output: {
+#   prefix: "Dr.",
+#   first_name: "Jane",
+#   last_name: "Smith",
+#   suffix: "Jr."
+# }
+
+# Handle complex names
+puts formatter.format("MARÍA DEL CARMEN MARTÍNEZ-VILLASEÑOR")
+# Output: "María del Carmen Martínez-Villaseñor"
+
+# Handle company names
+puts formatter.format("ACME CORPORATION, INC.")
+# Output: "Acme Corporation, Inc."
+```
+
+
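In the Usage examples, `parse_formatted` returns a hash while `format` returns a single string. According to `lib/name_formatter.rb` in this package, `format` is the hash's non-nil values joined with spaces. The sketch below reproduces only that joining step on a hand-written hash, so it runs without the gem installed; `join_name_parts` is a hypothetical name used for illustration.

```ruby
# Hypothetical helper mirroring the joining step of NameFormatter#format:
# drop nil fields, then join what remains with single spaces.
def join_name_parts(parsed)
  parsed.values.compact.join(" ")
end

person = { prefix: "Dr.", first_name: "Jane", last_name: "Smith", suffix: "Jr." }
company = { prefix: nil, first_name: nil, last_name: "Acme Corporation,", suffix: "Inc." }

puts join_name_parts(person)   # Dr. Jane Smith Jr.
puts join_name_parts(company)  # Acme Corporation, Inc.
```

The order of the output follows the hash's insertion order (prefix, first name, last name, suffix), which is why the keys are always built in that sequence.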
+## 👨‍💻 Development
+
+1. Run the tests to ensure everything is set up correctly:
+
+```bash
+ruby -Ilib test/*.rb
+```
+
+2. To open an interactive prompt for experimenting with the code, run:
+
+```bash
+irb -Ilib -rname_formatter
+```
+
+Remember to add tests for new features or bug fixes.
+
+## 🤝 Contributing
+
+Bug reports and pull requests are welcome on GitHub at [github.com/kylewelsby/name_formatter](https://github.com/kylewelsby/name_formatter).
+
+## 🚀 Releasing
+
+To release a new version:
+
+1. Update the version number in `lib/name_formatter/version.rb`
+2. Update `CHANGELOG.md`
+3. Commit changes
+4. Create a new tag with `git tag -a vX.X.X -m "Release X.X.X"`
+5. Push the tag with `git push origin vX.X.X`
+
+GitHub Actions will automatically build and publish the gem to RubyGems.org when the tag is pushed.
+
+
+### If the release fails
+
+If the tests fail during the release process:
+
+1. Fix the failing tests
+2. Commit your changes
+3. Update the tag:
+```
+git tag -fa vX.X.X -m "Update release X.X.X"
+```
+4. Force-push the updated tag:
+```
+git push origin vX.X.X --force
+```
+
+This will trigger the release process again with the updated code.
+
+Force-pushing a tag can cause problems for anyone who has already pulled it, so use this approach only for releases that have not been widely distributed yet.
+
+For more significant changes, it is usually better to increment the version number (e.g., from 0.1.0 to 0.1.1) and create a new tag instead of updating the existing one.
+
+## 🎓 License
+
+This gem is available as open source under the terms of the MIT License.
+
+https://kylewelsby.mit-license.org
+
data/lib/name_formatter.rb
ADDED
@@ -0,0 +1,159 @@
+require "name_formatter/version"
+class NameFormatter
+  VERSION = NameFormatterModule::VERSION
+  PREFIXES = Set.new(["Mr", "Mrs", "Ms", "Miss", "The Hon", "Rev", "Dr", "Fr", "Pres", "Prof", "Msgr", "Sen", "Gov", "Rep", "Amb"]).freeze
+  SUFFIXES = Set.new([
+    "Esq", "Jr", "Sr", "III", "II", "I", "V", "IV", "MD", "DC", "DO", "DVM", "LLD", "VM", "DDS", "Ret", "CPA", "JD", "PhD",
+    "LLC", "Inc", "Corp", "Ltd", "Co", "LLC", "PLC", "GmbH", "AG", "SA", "SARL", "SRL", "BV", "CV", "NV", "SE", "SC", "SL", "SLL", "SLLC", "SCS", "SCA", "SCRL", "SCA"
+  ]).freeze
+
+  NAME_REGEX = /^(?<prefix>(#{PREFIXES.join("|")}?)\.?)?\s*(?<first_name>[\w-]+)\s+(?<last_name>[\w\s'-]+)\s*(?<suffix>(#{SUFFIXES.join("|")})\.?)?$/ix
+
+  COMPANY_SUFFIX_REGEX = /^(.+)\s+(Inc\.?|Corp\.?|Ltd\.?|LLC|LLP|LP|Limited|Corporation|Company)$/i
+  FAMILY_BUSINESS_REGEX = /^(.+)\s+(?:and|&)\s+Sons$/i
+  MULTIPLE_FAMILY_NAMES_REGEX = /^[\w']+,\s+[\w']+\s+(?:and|&)\s+[\w']+$/i
+  DOUBLE_BARRELLED_NAME_REGEX = /^[\w']+-[\w']+$/
+  LAW_FIRM_REGEX = /^(?:[A-Z][a-z]+\s+){2,}(?:LLP|LLC|PC|PLLC)?$/
+  COMPANY_CO_REGEX = /^(.+)\s+(?:Company|Co\.)$/i
+  GROUP_HOLDINGS_REGEX = /^(.+)\s+(?:Group|Holdings)$/i
+  GEOGRAPHIC_COMPANY_REGEX = /^(.+)\s+of\s+[A-Z][a-z]+$/i
+  TRADING_ENTERPRISES_REGEX = /^(.+)\s+(?:Trading|Enterprises)$/i
+
+  DUNAME_REGEX = /^Du[b](?=[aeiou])/i
+  DENAME_REGEX = /^De[bfghjlmnpvw][aeioulr](?!(?:sik|a)(?:[-,])?$)/i
+
+  MCNAME_REGEX = /^Mc[a-z]+/i
+  MACNAME_REGEX = /^Mac(?:[aà][bdilmnors]|b[eh]|c[aeioruò]|d[h]|e[aò]|f[hiru]|g[ahilouy]|i[alo]|l[aeiouù]|m[hiua]|n[aeèiìo]|p[h]|r[aiìou]|s[hipu]|t[hiu]|u[airs])+/i
+  PARTICLE_REGEX = /^(de[rsl]|d[aiu]|v[oa]n|te[nr]|la|les|y|and|zu|dell[ao])$/i
+
+  def parse(name)
+    return parse_company_name(name) if company_name?(name)
+
+    parts = name.strip.split(/\s+/)
+
+    prefix = extract_prefix_or_suffix(parts, PREFIXES)
+    suffix = extract_prefix_or_suffix(parts.reverse, SUFFIXES)
+
+    if parts.size > 1
+      first_name = parts.shift
+      last_name = parts.reject { |part| part == suffix }.join(" ")
+      last_name = nil if last_name.empty?
+    elsif prefix && parts.size == 1
+      first_name = nil
+      last_name = parts.first
+    else
+      first_name = parts.first
+      last_name = nil
+    end
+
+    {
+      prefix: prefix,
+      first_name: first_name,
+      last_name: last_name,
+      suffix: suffix
+    }
+  end
+
+  def parse_formatted(name)
+    parsed = parse(name)
+    {
+      prefix: format_prefix(parsed[:prefix]),
+      first_name: format_first_name(parsed[:first_name]),
+      last_name: format_last_name(parsed[:last_name]),
+      suffix: format_suffix(parsed[:suffix])
+    }
+  end
+
+  def format(name)
+    parse_formatted(name).values.compact.join(" ")
+  end
+
+  private
+
+  def company_name?(name)
+    [COMPANY_SUFFIX_REGEX, FAMILY_BUSINESS_REGEX, MULTIPLE_FAMILY_NAMES_REGEX,
+      LAW_FIRM_REGEX, COMPANY_CO_REGEX, GROUP_HOLDINGS_REGEX,
+      GEOGRAPHIC_COMPANY_REGEX, TRADING_ENTERPRISES_REGEX, DOUBLE_BARRELLED_NAME_REGEX].any? { |regex| name.match?(regex) }
+  end
+
+  def parse_company_name(name)
+    parts = name.strip.split(/\s+/)
+    suffix = extract_prefix_or_suffix(parts.reverse, SUFFIXES)
+    {
+      prefix: nil,
+      first_name: nil,
+      last_name: parts.reject { |part| part == suffix }.join(" "),
+      suffix: suffix
+    }
+  end
+
+  def extract_prefix_or_suffix(parts, list)
+    list.each do |item|
+      item_parts = item.split
+      if item_parts.length <= parts.length
+        potential_match = parts.take(item_parts.length).join(" ").delete(".")
+        return parts.shift(item_parts.length).join(" ") if item.casecmp?(potential_match)
+      end
+    end
+    nil
+  end
+
+  def format_first_name(name)
+    return if name.nil?
+    parts = name.split(/\s+|(?<=-)/)
+    parts.map do |part|
+      format_first_name_part(part)
+    end.join(" ").gsub("- ", "-")
+  end
+
+  def format_first_name_part(part)
+    case part.downcase
+    when /^[od]'\w+/i
+      part[0..1].capitalize + part[2..].capitalize
+    else
+      part.capitalize
+    end
+  end
+
+  def format_last_name(name)
+    return if name.nil?
+    parts = name.split(/\s+|(?<=-)/)
+    formatted_parts = []
+    parts.each_with_index do |part, index|
+      next_part = parts[index + 1]
+      formatted_parts << format_last_name_part(part, next_part)
+    end
+    formatted_parts.join(" ").gsub("- ", "-")
+  end
+
+  def format_last_name_part(part, next_part)
+    case part.downcase
+    when /^(v[ao]n|te|ter|de)$/i
+      next_part&.match?(/der/i) ? part.downcase : part.capitalize
+    when PARTICLE_REGEX
+      next_part ? part.downcase : part.capitalize
+    when /^dell'\w+/i
+      part[0..4].capitalize + part[5..].capitalize
+    when MCNAME_REGEX, DUNAME_REGEX, DENAME_REGEX, /^[od]'\w+/i
+      part[0..1].capitalize + part[2..].capitalize
+    when /^von[r]+/i, MACNAME_REGEX
+      part[0..2].capitalize + part[3..].capitalize
+    else
+      part.capitalize
+    end
+  end
+
+  def format_prefix(part)
+    return if part.nil?
+    part.split.map(&:capitalize).join(" ")
+  end
+
+  def format_suffix(part)
+    return if part.nil?
+    matched = SUFFIXES.find { |suffix| suffix.casecmp?(part.gsub(/\.$/, "")) }
+    if matched
+      return matched + (part.end_with?(".") ? "." : "")
+    end
+    part.end_with?(".") ? part.capitalize : part.upcase
+  end
+end
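One detail of `parse` that is easy to miss: `extract_prefix_or_suffix` removes the matched words from the array it is given (via `shift`), but for suffixes it is called with `parts.reverse`, which is a new array. The suffix therefore stays in `parts` and is filtered out later with `reject`. The sketch below restates the matching step in isolation to show that behaviour; `take_leading_title` and `SKETCH_TITLES` are hypothetical names, and the title list is a shortened stand-in for the gem's `PREFIXES` and `SUFFIXES` sets.

```ruby
# Hypothetical, shortened stand-in for the gem's prefix list.
SKETCH_TITLES = ["Mr", "Dr", "The Hon"].freeze

# Returns the leading words of `parts` that match one of `titles`
# (ignoring case and periods) and removes them from `parts`; nil otherwise.
def take_leading_title(parts, titles)
  titles.each do |title|
    width = title.split.length
    next if width > parts.length

    candidate = parts.take(width).join(" ").delete(".")
    return parts.shift(width).join(" ") if title.casecmp?(candidate)
  end
  nil
end

# Prefix case: the caller's array is mutated.
parts = %w[The Hon. Jane Smith]
prefix = take_leading_title(parts, SKETCH_TITLES)
puts prefix.inspect  # "The Hon."
puts parts.inspect   # ["Jane", "Smith"]

# Suffix case: matching runs on a reversed copy, so the original keeps "Jr.".
parts = %w[Jane Smith Jr.]
suffix = take_leading_title(parts.reverse, ["Jr"])
puts suffix.inspect  # "Jr."
puts parts.inspect   # ["Jane", "Smith", "Jr."]
```

Matching on a reversed copy keeps the word order of `parts` intact, at the cost of the later `reject { |part| part == suffix }` pass, which would also drop any earlier word identical to the suffix.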
metadata
ADDED
@@ -0,0 +1,97 @@
+--- !ruby/object:Gem::Specification
+name: name_formatter
+version: !ruby/object:Gem::Version
+  version: 0.1.0
+platform: ruby
+authors:
+- Kyle Welsby
+autorequire:
+bindir: bin
+cert_chain: []
+date: 2024-06-28 00:00:00.000000000 Z
+dependencies:
+- !ruby/object:Gem::Dependency
+  name: bundler
+  requirement: !ruby/object:Gem::Requirement
+    requirements:
+    - - "~>"
+      - !ruby/object:Gem::Version
+        version: '2.0'
+  type: :development
+  prerelease: false
+  version_requirements: !ruby/object:Gem::Requirement
+    requirements:
+    - - "~>"
+      - !ruby/object:Gem::Version
+        version: '2.0'
+- !ruby/object:Gem::Dependency
+  name: minitest
+  requirement: !ruby/object:Gem::Requirement
+    requirements:
+    - - "~>"
+      - !ruby/object:Gem::Version
+        version: '5.0'
+  type: :development
+  prerelease: false
+  version_requirements: !ruby/object:Gem::Requirement
+    requirements:
+    - - "~>"
+      - !ruby/object:Gem::Version
+        version: '5.0'
+- !ruby/object:Gem::Dependency
+  name: faker
+  requirement: !ruby/object:Gem::Requirement
+    requirements:
+    - - "~>"
+      - !ruby/object:Gem::Version
+        version: '3.4'
+  type: :development
+  prerelease: false
+  version_requirements: !ruby/object:Gem::Requirement
+    requirements:
+    - - "~>"
+      - !ruby/object:Gem::Version
+        version: '3.4'
+description: |-
+  NameFormatter provides advanced name parsing and formatting capabilities,
+  handling personal names from various cultures, company names, and names
+  with prefixes and suffixes. It correctly formats particles, preserves
+  appropriate capitalization, and is Unicode-aware.
+email:
+- kyle@mekyle.com
+executables: []
+extensions: []
+extra_rdoc_files: []
+files:
+- CHANGELOG.md
+- LICENSE
+- README.md
+- lib/name_formatter.rb
+- lib/name_formatter/version.rb
+homepage: https://github.com/kylewelsby/name_formatter
+licenses:
+- MIT
+metadata:
+  allowed_push_host: https://rubygems.org
+  changelog_uri: https://github.com/kylewelsby/name_formatter/blob/main/CHANGELOG.md
+  funding_uri: https://github.com/sponsors/kylewelsby
+post_install_message:
+rdoc_options: []
+require_paths:
+- lib
+required_ruby_version: !ruby/object:Gem::Requirement
+  requirements:
+  - - ">="
+    - !ruby/object:Gem::Version
+      version: 2.5.0
+required_rubygems_version: !ruby/object:Gem::Requirement
+  requirements:
+  - - ">="
+    - !ruby/object:Gem::Version
+      version: '0'
+requirements: []
+rubygems_version: 3.5.11
+signing_key:
+specification_version: 4
+summary: A robust name parsing and formatting library for Ruby
+test_files: []
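The version constraints in this metadata (`~> 2.0`, `~> 5.0`, `~> 3.4`, `>= 2.5.0`) can be checked against concrete versions with RubyGems' own `Gem::Requirement` and `Gem::Version` classes. The sketch below copies the constraints from the metadata by hand; the constant names and the `satisfies?` helper are hypothetical and exist only for this illustration.

```ruby
require "rubygems"

# Constraints transcribed from the gem metadata above.
SKETCH_DEV_REQUIREMENTS = {
  "bundler"  => Gem::Requirement.new("~> 2.0"),
  "minitest" => Gem::Requirement.new("~> 5.0"),
  "faker"    => Gem::Requirement.new("~> 3.4")
}.freeze
SKETCH_RUBY_REQUIREMENT = Gem::Requirement.new(">= 2.5.0")

# Hypothetical helper: true when the version string meets the requirement.
def satisfies?(requirement, version)
  requirement.satisfied_by?(Gem::Version.new(version))
end

puts satisfies?(SKETCH_DEV_REQUIREMENTS["bundler"], "2.5.3")  # true
puts satisfies?(SKETCH_DEV_REQUIREMENTS["bundler"], "3.0.0")  # false
puts satisfies?(SKETCH_DEV_REQUIREMENTS["faker"], "3.3.0")    # false
puts satisfies?(SKETCH_RUBY_REQUIREMENT, "2.4.10")            # false
```

The pessimistic operator `~> 2.0` accepts any 2.x release but not 3.0, while `~> 3.4` accepts 3.4 and later 3.x releases but not 3.3 or 4.0.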