email_address 0.0.3 → 0.1.0
- checksums.yaml +4 -4
- data/.travis.yml +10 -0
- data/Gemfile +0 -1
- data/README.md +451 -197
- data/Rakefile +4 -9
- data/email_address.gemspec +9 -5
- data/lib/email_address.rb +55 -24
- data/lib/email_address/active_record_validator.rb +5 -5
- data/lib/email_address/address.rb +152 -72
- data/lib/email_address/canonical_email_address_type.rb +46 -0
- data/lib/email_address/config.rb +148 -64
- data/lib/email_address/email_address_type.rb +15 -31
- data/lib/email_address/exchanger.rb +31 -34
- data/lib/email_address/host.rb +327 -51
- data/lib/email_address/local.rb +304 -52
- data/lib/email_address/version.rb +1 -1
- data/test/activerecord/test_ar.rb +22 -0
- data/test/activerecord/user.rb +71 -0
- data/test/email_address/test_address.rb +53 -27
- data/test/email_address/test_config.rb +23 -8
- data/test/email_address/test_exchanger.rb +22 -10
- data/test/email_address/test_host.rb +47 -6
- data/test/email_address/test_local.rb +80 -16
- data/test/test_email_address.rb +38 -4
- data/test/test_helper.rb +7 -5
- metadata +68 -34
- data/lib/email_address/domain_matcher.rb +0 -98
- data/lib/email_address/domain_parser.rb +0 -69
- data/lib/email_address/matcher.rb +0 -119
- data/lib/email_address/validator.rb +0 -141
- data/test/email_address/test_domain_matcher.rb +0 -21
- data/test/email_address/test_domain_parser.rb +0 -29
- data/test/email_address/test_matcher.rb +0 -44
- data/test/email_address/test_validator.rb +0 -16
checksums.yaml
CHANGED
@@ -1,7 +1,7 @@
|
|
1
1
|
---
|
2
2
|
SHA1:
|
3
|
-
metadata.gz:
|
4
|
-
data.tar.gz:
|
3
|
+
metadata.gz: 570217e88251b8510966cf63bd1a90a7fa79b458
|
4
|
+
data.tar.gz: c23a8a65b94f8a5a6de14f221e8db7f16d0df6e1
|
5
5
|
SHA512:
|
6
|
-
metadata.gz:
|
7
|
-
data.tar.gz:
|
6
|
+
metadata.gz: d3e7f8dd2a92753b889bc4d8734961fd12626a40d28a06a52ce7e5c9c2691a5a2c62a8a05fc061741d54ec3c3576e1a6bde782b60e1647c3d4770bce95080c52
|
7
|
+
data.tar.gz: 9c05fb4ace40a99b51bf5c6646b0cc1d398a857e9e2f3074e3b05418c0562610e51bc7b0cef55ca7be4946b74de53bbff3b8fa69e121c8f064c5b08fd97e854d
|
data/.travis.yml
ADDED
data/Gemfile
CHANGED
data/README.md
CHANGED
@@ -1,156 +1,140 @@
|
|
1
1
|
# Email Address
|
2
2
|
|
3
3
|
[](http://rubygems.org/gems/email_address)
|
4
|
+
[](https://travis-ci.org/afair/email_address)
|
5
|
+
[](https://codeclimate.com/github/afair/email_address)
|
4
6
|
|
5
|
-
The
|
6
|
-
|
7
|
-
accepted as a "best practice" and which should not be supported (in the
|
8
|
-
name of sanity).
|
9
|
-
|
10
|
-
This library provides:
|
11
|
-
|
12
|
-
* Email Address Validation
|
13
|
-
* Converting between email address forms
|
14
|
-
* **Original:** From the user or data source
|
15
|
-
* **Normalized:** A standardized format for identification
|
16
|
-
* **Canonical:** A format used to identify a unique user
|
17
|
-
* **Redacted:** A format used to store an email address privately
|
18
|
-
* **Reference:** Digest formats for sharing addresses without exposing
|
19
|
-
them.
|
20
|
-
* Matching addresses to Email/Internet Service Providers. Per-provider
|
21
|
-
rules for:
|
22
|
-
* Validation
|
23
|
-
* Address Tag formats
|
24
|
-
* Canonicalization
|
25
|
-
* Unicode Support
|
26
|
-
|
27
|
-
## Email Addresses: The Good Parts
|
28
|
-
|
29
|
-
Email Addresses are split into two parts: the `local` and `host` part,
|
30
|
-
separated by the `@` symbol, or of the generalized format:
|
31
|
-
|
32
|
-
mailbox+tag@subdomain.domain.tld
|
33
|
-
|
34
|
-
The **Mailbox** usually identifies the user, role account, or application.
|
35
|
-
A **Tag** is any suffix for the mailbox useful for separating and filtering
|
36
|
-
incoming email. It is usually preceded by a '+' or other character. Tags are
|
37
|
-
not always available for a given ESP or MTA.
|
38
|
-
|
39
|
-
Local Parts should consist of lower-case 7-bit ASCII alphanumeric and these characters:
|
40
|
-
`-+'.,` It should start with and end with an alphanumeric character and
|
41
|
-
no more than one special character should appear together.
|
42
|
-
|
43
|
-
Host parts contain a lower-case version of any standard domain name.
|
44
|
-
International Domain Names are allowed, and can be converted to
|
45
|
-
[Punycode](http://en.wikipedia.org/wiki/Punycode),
|
46
|
-
an encoding system of Unicode strings into the 7-bit ASCII character set.
|
47
|
-
Domain names should be configured with MX records in DNS to receive
|
48
|
-
email, though this is sometimes mis-configured and the A record can be
|
49
|
-
used as a backup.
|
50
|
-
|
51
|
-
This is the subset of the RFC Email Address specification that should be
|
52
|
-
used.
|
53
|
-
|
54
|
-
## Email Addresses: The Bad Parts
|
55
|
-
|
56
|
-
Email addresses are defined and redefined in a series of RFC standards.
|
57
|
-
Conforming to the full standards is not recommended for easily
|
58
|
-
identifying and supporting email addresses. Among these specification,
|
59
|
-
we reject are:
|
7
|
+
The `email_address` gem provides a Ruby language library for working
|
8
|
+
with email addresses.
|
60
9
|
|
61
|
-
|
62
|
-
|
63
|
-
|
64
|
-
|
65
|
-
* IP and IPv6 addresses as hosts: `mailbox@[127.0.0.1]`
|
66
|
-
* Non-ASCII (7-bit) characters in the local part: `Pelé@example.com`
|
67
|
-
* Validation by regular expressions like:
|
68
|
-
```
|
69
|
-
(?:[a-z0-9!#$%&'*+/=?^_`{|}~-]+(?:\.[a-z0-9!#$%&'*+/=?^_`{|}~-]+)*
|
70
|
-
| "(?:[\x01-\x08\x0b\x0c\x0e-\x1f\x21\x23-\x5b\x5d-\x7f]
|
71
|
-
| \\[\x01-\x09\x0b\x0c\x0e-\x7f])*")
|
72
|
-
@ (?:(?:[a-z0-9](?:[a-z0-9-]*[a-z0-9])?\.)+[a-z0-9](?:[a-z0-9-]*[a-z0-9])?
|
73
|
-
| \[(?:(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.){3}
|
74
|
-
(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?|[a-z0-9-]*[a-z0-9]:
|
75
|
-
(?:[\x01-\x08\x0b\x0c\x0e-\x1f\x21-\x5a\x53-\x7f]
|
76
|
-
| \\[\x01-\x09\x0b\x0c\x0e-\x7f])+)
|
77
|
-
\])
|
78
|
-
```
|
79
|
-
|
80
|
-
## Internationalization
|
10
|
+
By default, it validates against conventional usage,
|
11
|
+
the format preferred for user email addresses.
|
12
|
+
It can be configured to validate against RFC "Standard" formats,
|
13
|
+
common email service provider formats, and perform DNS validation.
|
81
14
|
|
82
|
-
|
83
|
-
|
84
|
-
|
85
|
-
|
86
|
-
encoded databases and upgraded to the new email standards.
|
15
|
+
Using `email_address` to validate user email addresses results in
|
16
|
+
fewer "false positives" due to typing errors and gibberish data.
|
17
|
+
It validates syntax more strictly for popular email providers,
|
18
|
+
and can deal with gmail's "optional dots" in addresses.
|
87
19
|
|
88
|
-
|
89
|
-
|
90
|
-
locale. Because of this, internationalized local parts are not yet
|
91
|
-
supported by default. They are more likely to be erroneous.
|
20
|
+
It provides Active Record (Rails) extensions, including an
|
21
|
+
address validator and attributes API custom datatypes.
|
92
22
|
|
93
|
-
|
94
|
-
|
95
|
-
|
23
|
+
*Note:* Version 0.1.0 contains significant API and internal changes over the 0.0.3
|
24
|
+
version. If you have been using the 0.0.x series of the gem, you may
|
25
|
+
want to continue with your current version.
|
96
26
|
|
97
|
-
|
98
|
-
|
99
|
-
* The **original** email address is of the format given by the user.
|
100
|
-
* The **Normalized** address has:
|
101
|
-
* Lower-case the local and domain part
|
102
|
-
* Tags are kept as they are important for the user
|
103
|
-
* Remove comments and any "bad parts"
|
104
|
-
* This format is what should be used to identify the account.
|
105
|
-
* The **Canonical** form is used to uniquely identify the mailbox.
|
106
|
-
* Domains stored as punycode for IDN
|
107
|
-
* Address Tags removed
|
108
|
-
* Special characters removed (dots in gmail addresses are not
|
109
|
-
significant)
|
110
|
-
* Lower cased and "bad parts" removed
|
111
|
-
* Useful for locating a user who forgets registering with a tag or
|
112
|
-
with a "Bad part" in the email address.
|
113
|
-
* The **Redacted** format is used to store email address fingerprints
|
114
|
-
instead of the actual addresses:
|
115
|
-
* Format: sha1(canonical_address)@domain
|
116
|
-
* Given an email address, the record can be found
|
117
|
-
* Useful for treating email addresses as sensitive data and
|
118
|
-
complying with requests to remove the address from your database and
|
119
|
-
still maintain the state of the account.
|
120
|
-
* The **Reference** form allows you to publicly share an address without
|
121
|
-
revealing the actual address.
|
122
|
-
* Can be the MD5 or SHA1 of the normalized or canonical address
|
123
|
-
* Useful for "do not email" lists
|
124
|
-
* Useful for cookies that do not reveal the actual account
|
125
|
-
|
126
|
-
## Treating Email Addresses as Sensitive Data
|
27
|
+
Requires Ruby 2.0 or later.
|
127
28
|
|
128
|
-
|
129
|
-
becoming more important as a personal identifier on the internet.
|
130
|
-
Increasingly, we should treat email addresses as sensitive data. If your
|
131
|
-
site/database becomes compromised by hackers, these email addresses can
|
132
|
-
be stolen and used to spam your users and to try to gain access to their
|
133
|
-
accounts. You should not be storing passwords in plain text; perhaps you
|
134
|
-
don't need to store email addresses un-encoded either.
|
29
|
+
## Background
|
135
30
|
|
136
|
-
|
137
|
-
|
138
|
-
|
139
|
-
the user-supplied one and look up the record. Store the original address
|
140
|
-
in the session for the user, which goes away when the user logs out.
|
31
|
+
The email address specification is complex and often not what you want
|
32
|
+
when working with personal email addresses in applications. This library
|
33
|
+
introduces terms to distinguish types of email addresses.
|
141
34
|
|
142
|
-
|
143
|
-
|
144
|
-
|
145
|
-
access. Given the original email address again, the redacted account can
|
146
|
-
be identified if necessary.
|
35
|
+
* *Normal* - The edited form of any input email address. Typically, it
|
36
|
+
is lower-cased and minor "fixes" can be performed, depending on the
|
37
|
+
configurations and email address provider.
|
147
38
|
|
148
|
-
|
149
|
-
|
39
|
+
CKENT@DAILYPLANET.NEWS => ckent@dailyplanet.news
|
40
|
+
|
41
|
+
* *Conventional* - Most personal account addresses are in this basic
|
42
|
+
format, one or more "words" separated by a single simple punctuation
|
43
|
+
character. It consists of a mailbox (user name or role account) and
|
44
|
+
an optional address "tag" assigned by the user.
|
45
|
+
|
46
|
+
miles.o'brien@ncc-1701-d.ufp
|
47
|
+
|
48
|
+
* *Relaxed* - A less strict form of Conventional, same character set,
|
49
|
+
must begin and end with an alpha-numeric character, but order within
|
50
|
+
is not enforced.
|
51
|
+
|
52
|
+
aasdf-34-.z@example.com
|
53
|
+
|
54
|
+
* *Standard* - The RFC-Compliant syntax of an email address. This is
|
55
|
+
useful when working with software-generated addresses or handling
|
56
|
+
existing email addresses, but otherwise not useful for personal
|
57
|
+
addresses.
|
58
|
+
|
59
|
+
madness!."()<>[]:,;@\\\"!#$%&'*+-/=?^_`{}| ~.a(comment )"@example.org
|
150
60
|
|
151
|
-
|
61
|
+
* *Canonical* - A unique account address, lower-cased, without the
|
62
|
+
tag, and with irrelevant characters stripped.
|
152
63
|
|
153
|
-
|
64
|
+
clark.kent+scoops@gmail.com => clarkkent@gmail.com
|
65
|
+
|
66
|
+
* *Reference* - The MD5 of the Canonical format, used to share account
|
67
|
+
references without exposing the private email address directly.
|
68
|
+
|
69
|
+
Clark.Kent+scoops@gmail.com => c5be3597c391169a5ad2870f9ca51901
|
70
|
+
|
71
|
+
* *Redacted* - A form of the email address where it is replaced by
|
72
|
+
a SHA1-based version to remove the original address from the
|
73
|
+
database, or to store the address privately, yet still keep it
|
74
|
+
accessible at query time by converting the queried address to
|
75
|
+
the redacted form.
|
76
|
+
|
77
|
+
Clark.Kent+scoops@gmail.com => {bea3f3560a757f8142d38d212a931237b218eb5e}@gmail.com
|
78
|
+
|
79
|
+
* *Munged* - An obfuscated version of the email address suitable for
|
80
|
+
publishing on the internet, where email address harvesting
|
81
|
+
could occur.
|
82
|
+
|
83
|
+
Clark.Kent+scoops@gmail.com => cl*****@gm*****
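The address forms listed above can be sketched with the Ruby standard library alone. This is a minimal illustration, not the gem's implementation: the helper names are hypothetical, Gmail-style rules are assumed ("+" starts the tag, dots in the mailbox are ignored), and the digests are computed over the canonical address without any application secret.

```ruby
require "digest/md5"
require "digest/sha1"

# Hypothetical helpers, not the gem's API. Gmail-style rules are assumed.
def normal_form(address)
  address.strip.downcase
end

def canonical_form(address)
  local, host = normal_form(address).split("@", 2)
  mailbox = local.split("+", 2).first.delete(".") # drop the tag, then the dots
  "#{mailbox}@#{host}"
end

def reference_form(address)
  Digest::MD5.hexdigest(canonical_form(address))
end

def redacted_form(address)
  canonical = canonical_form(address)
  host = canonical.split("@", 2).last
  "{#{Digest::SHA1.hexdigest(canonical)}}@#{host}"
end

def munged_form(address, filler = "*****")
  # Keep the first two characters of each side of the "@".
  normal_form(address).split("@", 2).map { |part| part[0, 2] + filler }.join("@")
end

address = "Clark.Kent+scoops@gmail.com"
puts normal_form(address)    # clark.kent+scoops@gmail.com
puts canonical_form(address) # clarkkent@gmail.com
puts munged_form(address)    # cl*****@gm*****
```

The reference and redacted forms are hex digests (32 and 40 characters respectively), so two inputs that share a canonical form always produce the same value.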
|
84
|
+
|
85
|
+
Other terms:
|
86
|
+
|
87
|
+
* *Local* - The left-hand side of the "@", representing the user,
|
88
|
+
mailbox, or role, and an optional "tag".
|
89
|
+
|
90
|
+
mailbox+tag@example.com; Local part: mailbox+tag
|
91
|
+
|
92
|
+
* *Mailbox* - The destination user account or role account.
|
93
|
+
* *Tag* - A parameter added after the mailbox, usually after the
|
94
|
+
"+" symbol, set by the user for mail filtering and sub-accounts.
|
95
|
+
Not all mail systems support this.
|
96
|
+
* *Host* (sometimes called *Domain*) - The right-hand side of the "@"
|
97
|
+
indicating the domain or host name server to deliver the email to.
|
98
|
+
If missing, "localhost" is assumed, or if not a fully-qualified
|
99
|
+
domain name, it is assumed to be another computer on the same network, but
|
100
|
+
this is increasingly rare.
|
101
|
+
* *Provider* - The Email Service Provider (ESP) providing the email
|
102
|
+
service. Each provider may have its own email address validation
|
103
|
+
and canonicalization rules.
|
104
|
+
* *Punycode* - A host name with Unicode characters (International
|
105
|
+
Domain Name or IDN) needs conversion to this ASCII-encoded format
|
106
|
+
for DNS lookup.
|
107
|
+
|
108
|
+
"HIRO@こんにちは世界.com" => "hiro@xn--28j2a3ar1pp75ovm7c.com"
|
109
|
+
|
110
|
+
Wikipedia has a great article on
|
111
|
+
[Email Addresses](https://en.wikipedia.org/wiki/Email_address),
|
112
|
+
much more readable than the section within
|
113
|
+
[RFC 5322](https://tools.ietf.org/html/rfc5322#section-3.4)
|
114
|
+
|
115
|
+
## Avoiding the Bad Parts of RFC Specification
|
116
|
+
|
117
|
+
Following the RFC specification sounds like a good idea, until you
|
118
|
+
learn about all the madness contained therein. This library can
|
119
|
+
validate the RFC syntax, but this is never useful, especially when
|
120
|
+
validating user email address submissions. By default, it validates
|
121
|
+
to the *conventional* format.
|
122
|
+
|
123
|
+
Here are a few parts of the RFC specification you should avoid:
|
124
|
+
|
125
|
+
* Case-sensitive local parts: `First.Last@example.com`
|
126
|
+
* Spaces and Special Characters: `"():;<>@[\\]`
|
127
|
+
* Quoting and Escaping Requirements: `"first \"nickname\" last"@example.com`
|
128
|
+
* Comment Parts: `(comment)mailbox@example.com`
|
129
|
+
* IP and IPv6 addresses as hosts: `mailbox@[127.0.0.1]`
|
130
|
+
* Non-ASCII (7-bit) characters in the local part: `Pelé@example.com`
|
131
|
+
* Validation by voodoo regular expressions
|
132
|
+
* Gmail allows ".." in addresses since they are not meaningful, but
|
133
|
+
the standard does not.
|
134
|
+
|
135
|
+
## Installation With Rails or Bundler
|
136
|
+
|
137
|
+
If you are using Rails or a project with Bundler, add this line to your application's Gemfile:
|
154
138
|
|
155
139
|
gem 'email_address'
|
156
140
|
|
@@ -158,100 +142,361 @@ And then execute:
|
|
158
142
|
|
159
143
|
$ bundle
|
160
144
|
|
161
|
-
|
145
|
+
## Installation Without Bundler
|
146
|
+
|
147
|
+
If you are not using Bundler, you need to install the gem yourself.
|
162
148
|
|
163
149
|
$ gem install email_address
|
164
150
|
|
151
|
+
Require the gem inside your script.
|
152
|
+
|
153
|
+
require 'rubygems'
|
154
|
+
require 'email_address'
|
155
|
+
|
165
156
|
## Usage
|
166
157
|
|
167
|
-
|
168
|
-
|
158
|
+
Use `EmailAddress` to do transformations and validations. You can also
|
159
|
+
instantiate an object to inspect the address.
|
169
160
|
|
170
|
-
|
171
|
-
|
172
|
-
email.canonical #=> "user@example.com"
|
173
|
-
email.redact #=> "63a710569261a24b3766275b7000ce8d7b32e2f7@example.com"
|
174
|
-
email.sha1 #=> "63a710569261a24b3766275b7000ce8d7b32e2f7"
|
175
|
-
email.md5 #=> "dea073fb289e438a6d69c5384113454c"
|
161
|
+
These top-level helpers return edited email addresses and validation
|
162
|
+
checks.
|
176
163
|
|
177
|
-
|
178
|
-
|
179
|
-
|
180
|
-
|
164
|
+
address = "Clark.Kent+scoops@gmail.com"
|
165
|
+
EmailAddress.valid?(address) #=> true
|
166
|
+
EmailAddress.normal(address) #=> "clark.kent+scoops@gmail.com"
|
167
|
+
EmailAddress.canonical(address) #=> "clarkkent@gmail.com"
|
168
|
+
EmailAddress.reference(address) #=> "c5be3597c391169a5ad2870f9ca51901"
|
169
|
+
EmailAddress.redact(address) #=> "{bea3f3560a757f8142d38d212a931237b218eb5e}@gmail.com"
|
170
|
+
EmailAddress.munge(address) #=> "cl*****@gm*****"
|
171
|
+
EmailAddress.matches?(address, 'google') #=> 'google' (true)
|
172
|
+
EmailAddress.error("#bad@example.com") #=> "Invalid Mailbox"
|
181
173
|
|
182
|
-
|
174
|
+
Or you can create an instance of the email address to work with it.
|
175
|
+
|
176
|
+
email = EmailAddress.new(address) #=> #<EmailAddress::Address:0x007fe6ee150540 ...>
|
177
|
+
email.normal #=> "clark.kent+scoops@gmail.com"
|
178
|
+
email.canonical #=> "clarkkent@gmail.com"
|
179
|
+
email.original #=> "Clark.Kent+scoops@gmail.com"
|
180
|
+
email.valid? #=> true
|
181
|
+
|
182
|
+
Here are some other methods that are available.
|
183
|
+
|
184
|
+
email.redact #=> "{bea3f3560a757f8142d38d212a931237b218eb5e}@gmail.com"
|
185
|
+
email.sha1 #=> "bea3f3560a757f8142d38d212a931237b218eb5e"
|
186
|
+
email.md5 #=> "c5be3597c391169a5ad2870f9ca51901"
|
187
|
+
email.host_name #=> "gmail.com"
|
183
188
|
email.provider #=> :google
|
184
|
-
email.
|
189
|
+
email.mailbox #=> "clark.kent"
|
190
|
+
email.tag #=> "scoops"
|
191
|
+
|
192
|
+
email.host.exchanger.first[:ip] #=> "2a00:1450:400b:c02::1a"
|
193
|
+
email.host.txt_hash #=> {:v=>"spf1", :redirect=>"_spf.google.com"}
|
194
|
+
|
195
|
+
EmailAddress.normal("HIRO@こんにちは世界.com")
|
196
|
+
#=> "hiro@xn--28j2a3ar1pp75ovm7c.com"
|
197
|
+
EmailAddress.normal("hiro@xn--28j2a3ar1pp75ovm7c.com", host_encoding: :unicode)
|
198
|
+
#=> "hiro@こんにちは世界.com"
|
199
|
+
|
200
|
+
#### Rails Validator
|
201
|
+
|
202
|
+
For Rails' ActiveRecord classes, EmailAddress provides an ActiveRecordValidator.
|
203
|
+
Specify your email address attributes with `field: :user_email`, or
|
204
|
+
`fields: [:email1, :email2]`. If neither is given, it assumes the
|
205
|
+
`email` or `email_address` attribute.
|
206
|
+
|
207
|
+
class User < ActiveRecord::Base
|
208
|
+
validates_with EmailAddress::ActiveRecordValidator, field: :email
|
209
|
+
end
|
210
|
+
|
211
|
+
#### Rails Email Address Type Attribute
|
212
|
+
|
213
|
+
Initial support is provided for the Active Record 5.0 attributes API.
|
214
|
+
|
215
|
+
First, you need to register the type in
|
216
|
+
`config/initializers/email_address.rb` along with any global
|
217
|
+
configurations you want.
|
218
|
+
|
219
|
+
ActiveRecord::Type.register(:email_address, EmailAddress::EmailAddressType)
|
220
|
+
ActiveRecord::Type.register(:canonical_email_address,
|
221
|
+
EmailAddress::CanonicalEmailAddressType)
|
222
|
+
|
223
|
+
Assume the Users table contains the columns "email" and "canonical_email".
|
224
|
+
We want to normalize the address in "email" and store the canonical/unique
|
225
|
+
version in "canonical_email". This code will set the canonical_email when
|
226
|
+
the email attribute is assigned. With the canonical_email column,
|
227
|
+
we can look up the User, even if the given email address didn't exactly
|
228
|
+
match the registered version.
|
229
|
+
|
230
|
+
class User < ApplicationRecord
|
231
|
+
attribute :email, :email_address
|
232
|
+
attribute :canonical_email, :canonical_email_address
|
185
233
|
|
186
|
-
|
187
|
-
|
188
|
-
original formatting, case, and tag information.
|
234
|
+
validates_with EmailAddress::ActiveRecordValidator,
|
235
|
+
fields: %i(email canonical_email)
|
189
236
|
|
190
|
-
|
237
|
+
def email=(email_address)
|
238
|
+
self[:canonical_email] = email_address
|
239
|
+
self[:email] = email_address
|
240
|
+
end
|
191
241
|
|
192
|
-
|
193
|
-
|
242
|
+
def self.find_by_email(email)
|
243
|
+
user = self.find_by(email: EmailAddress.normal(email))
|
244
|
+
user ||= self.find_by(canonical_email: EmailAddress.canonical(email))
|
245
|
+
user ||= self.find_by(canonical_email: EmailAddress.redact(email))
|
246
|
+
user
|
247
|
+
end
|
194
248
|
|
195
|
-
|
249
|
+
def redact!
|
250
|
+
self[:canonical_email] = EmailAddress.redact(self.canonical_email)
|
251
|
+
self[:email] = self[:canonical_email]
|
252
|
+
end
|
253
|
+
end
|
254
|
+
|
255
|
+
Here is how the User model works:
|
256
|
+
|
257
|
+
user = User.create(email:"Pat.Smith+registrations@gmail.com")
|
258
|
+
user.email #=> "pat.smith+registrations@gmail.com"
|
259
|
+
user.canonical_email #=> "patsmith@gmail.com"
|
260
|
+
User.find_by_email("PAT.SMITH@GMAIL.COM")
|
261
|
+
#=> #<User email="pat.smith+registrations@gmail.com">
|
262
|
+
|
263
|
+
|
264
|
+
The `find_by_email` method looks up a given email address by the
|
265
|
+
normalized form (lower case), then by the canonical form, then finally
|
266
|
+
by the redacted form.
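The lookup order described above (normalized, then canonical, then redacted) can be sketched without Rails. Everything here is an assumption made for illustration: the helper names are hypothetical stand-ins for the gem's helpers, Gmail-style canonical rules are used, and an array of hashes plays the role of the users table.

```ruby
require "digest/sha1"

# Hypothetical stand-ins for the gem's helpers (Gmail-style rules assumed).
def normal_email(address)
  address.strip.downcase
end

def canonical_email(address)
  local, host = normal_email(address).split("@", 2)
  "#{local.split('+', 2).first.delete('.')}@#{host}"
end

def redacted_email(address)
  canonical = canonical_email(address)
  "{#{Digest::SHA1.hexdigest(canonical)}}@#{canonical.split('@', 2).last}"
end

# Try each form in turn and return the first matching record, or nil.
def find_user(users, email)
  users.find { |u| u[:email] == normal_email(email) } ||
    users.find { |u| u[:canonical_email] == canonical_email(email) } ||
    users.find { |u| u[:canonical_email] == redacted_email(email) }
end

# In-memory stand-in for the users table.
users = [
  { email: "pat.smith+registrations@gmail.com", canonical_email: "patsmith@gmail.com" }
]

user = find_user(users, "PAT.SMITH@GMAIL.COM")
puts user[:email] # pat.smith+registrations@gmail.com
```

The normalized lookup misses here because the stored address carries a tag, so the canonical comparison is what finds the record.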
|
267
|
+
|
268
|
+
#### Validation
|
196
269
|
|
197
|
-
|
198
|
-
|
199
|
-
|
200
|
-
|
270
|
+
The only true validation is to send a message to the email address and
|
271
|
+
have the user (or process) verify it has been received. Syntax checks
|
272
|
+
help prevent erroneous input. Even sent messages can be silently
|
273
|
+
dropped, or bounced back after acceptance. Conditions such as a
|
274
|
+
"Mailbox Full" can mean the email address is known, but abandoned.
|
275
|
+
|
276
|
+
There are different levels of validations you can perform. By default, it will
|
277
|
+
validate to the "Provider" (if known), or "Conventional" format defined as the
|
278
|
+
"default" provider. You may pass a list of parameters to select
|
279
|
+
which syntax and network validations to perform.
|
280
|
+
|
281
|
+
#### Comparison
|
201
282
|
|
202
283
|
You can compare email addresses:
|
203
284
|
|
204
|
-
e1 = EmailAddress.new("
|
205
|
-
|
206
|
-
e2 = EmailAddress.new("FirstLast+tag@Gmail.com")
|
207
|
-
e3.to_s #=> "firstlast+tag@gmail.com"
|
285
|
+
e1 = EmailAddress.new("Clark.Kent@Gmail.com")
|
286
|
+
e2 = EmailAddress.new("clark.kent+Superman@Gmail.com")
|
208
287
|
e3 = EmailAddress.new(e2.redact)
|
209
|
-
|
288
|
+
e1.to_s #=> "clark.kent@gmail.com"
|
289
|
+
e2.to_s #=> "clark.kent+superman@gmail.com"
|
290
|
+
e3.to_s #=> "{bea3f3560a757f8142d38d212a931237b218eb5e}@gmail.com"
|
210
291
|
|
211
292
|
e1 == e2 #=> false (Matches by normalized address)
|
212
293
|
e1.same_as?(e2) #=> true (Matches as canonical address)
|
213
294
|
e1.same_as?(e3) #=> true (Matches as redacted address)
|
214
295
|
e1 > e2 #=> true (Compares using normalized address; "@" sorts after "+")
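One way to get these two kinds of comparison is to base the ordering operators on the normalized string and keep a separate predicate for canonical equality. The class below is a hypothetical sketch under that assumption, with Gmail-style canonical rules; it is not the gem's `Address` class.

```ruby
# Hypothetical value object; the gem's Address class is richer.
class SketchAddress
  include Comparable
  attr_reader :normal

  def initialize(address)
    @normal = address.strip.downcase
  end

  # Gmail-style canonical form assumed: drop the "+tag" and the dots.
  def canonical
    local, host = normal.split("@", 2)
    "#{local.split('+', 2).first.delete('.')}@#{host}"
  end

  # Comparable derives ==, <, > and friends from this string comparison.
  def <=>(other)
    normal <=> other.normal
  end

  # Canonical equality: the same mailbox, whatever the tag or dots.
  def same_as?(other)
    canonical == other.canonical
  end
end

a = SketchAddress.new("First.Last@Gmail.com")
b = SketchAddress.new("FirstLast+tag@Gmail.com")
puts a == b         # false
puts a.same_as?(b)  # true
puts a < b          # true ("." sorts before "l")
```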
|
215
296
|
|
216
|
-
|
297
|
+
#### Matching
|
217
298
|
|
218
|
-
|
299
|
+
Matching addresses by simple patterns:
|
219
300
|
|
220
|
-
|
221
|
-
|
222
|
-
|
223
|
-
|
224
|
-
|
225
|
-
|
226
|
-
|
227
|
-
|
301
|
+
* Top-Level-Domain: .org
|
302
|
+
* Domain Name: example.com
|
303
|
+
* Registration Name: hotmail. (matches any TLD)
|
304
|
+
* Domain Glob: *.exampl?.com
|
305
|
+
* Provider Name: google
|
306
|
+
* Mailbox Name or Glob: user00*@
|
307
|
+
* Address or Glob: postmaster@domain*.com
|
308
|
+
* Provider or Registration: msn
|
228
309
|
|
229
|
-
|
310
|
+
Usage:
|
230
311
|
|
231
|
-
|
312
|
+
e = EmailAddress.new("Clark.Kent@Gmail.com")
|
313
|
+
e.matches?("gmail.com") #=> true
|
314
|
+
e.matches?("google") #=> true
|
315
|
+
e.matches?(".org") #=> false
|
316
|
+
e.matches?("g*com") #=> true
|
317
|
+
e.matches?("gmail.") #=> true
|
318
|
+
e.matches?("*kent*@") #=> true
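The pattern kinds above can be approximated with `File.fnmatch?` from the standard library. This is a sketch under stated assumptions, not the gem's matcher: the function name is hypothetical, and provider names such as `google` are left out because they need a provider table.

```ruby
# Hypothetical matcher; provider-name rules are deliberately not handled.
def rule_matches?(address, rule)
  local, host = address.downcase.split("@", 2)
  rule = rule.downcase
  if rule.end_with?("@")        # mailbox name or glob: "user00*@"
    File.fnmatch?(rule.chomp("@"), local)
  elsif rule.include?("@")      # full address or glob
    File.fnmatch?(rule, "#{local}@#{host}")
  elsif rule.start_with?(".")   # top-level domain: ".org"
    host.end_with?(rule)
  elsif rule.end_with?(".")     # registration name, any TLD: "hotmail."
    host.start_with?(rule) || host.include?(".#{rule}")
  else                          # domain name or domain glob
    File.fnmatch?(rule, host)
  end
end

address = "Clark.Kent@Gmail.com"
puts rule_matches?(address, "gmail.com") # true
puts rule_matches?(address, ".org")      # false
puts rule_matches?(address, "g*com")     # true
puts rule_matches?(address, "gmail.")    # true
puts rule_matches?(address, "*kent*@")   # true
```

Glob syntax here is whatever `File.fnmatch?` accepts, so characters such as `[` in a rule would be read as glob metacharacters.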
|
232
319
|
|
233
|
-
|
320
|
+
### Configuration
|
234
321
|
|
235
|
-
|
322
|
+
You can pass an options hash on the `.new()` and helper class methods to
|
323
|
+
control how the library treats that address. These can also be
|
324
|
+
configured during initialization by provider and default (see below).
|
236
325
|
|
237
|
-
|
238
|
-
|
239
|
-
* Registration names matching without the TLD. 'yahoo' matches:
|
240
|
-
* "www.yahoo.com" (with Subdomains)
|
241
|
-
* "yahoo.ca" (any TLD)
|
242
|
-
* "yahoo.co.jp" (2-char TLD with 2-char Second-level)
|
243
|
-
* But _may_ also match non-Yahoo domain names (yahoo.xxx)
|
326
|
+
EmailAddress.new("clark.kent@gmail.com",
|
327
|
+
dns_lookup: :off, host_encoding: :unicode)
|
244
328
|
|
245
|
-
|
329
|
+
Globally, you can change and query configuration options:
|
246
330
|
|
247
|
-
|
331
|
+
EmailAddress::Config.setting(:dns_lookup, :mx)
|
332
|
+
EmailAddress::Config.setting(:dns_lookup) #=> :mx
|
248
333
|
|
249
|
-
|
250
|
-
|
251
|
-
|
252
|
-
|
334
|
+
Or set multiple settings at once:
|
335
|
+
|
336
|
+
EmailAddress::Config.configure(local_downcase:false, dns_lookup: :off)
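The three access patterns above (set one setting, read one setting, set several at once) can be pictured as a small settings store. The module below is purely illustrative: its name, defaults, and the `for_call` helper are assumptions, and the gem's `Config` may be organized differently.

```ruby
# Hypothetical settings store; not the gem's Config implementation.
module SketchConfig
  @settings = { dns_lookup: :mx, local_downcase: true }

  # With a value: store it and return it. Without: return the current value.
  def self.setting(name, *value)
    @settings[name] = value.first unless value.empty?
    @settings[name]
  end

  # Merge several settings at once.
  def self.configure(options = {})
    @settings.merge!(options)
  end

  # Per-call options override the globals without changing them.
  def self.for_call(options = {})
    @settings.merge(options)
  end
end

SketchConfig.setting(:dns_lookup, :a)
puts SketchConfig.setting(:dns_lookup).inspect                  # :a
SketchConfig.configure(local_downcase: false, dns_lookup: :off)
puts SketchConfig.for_call(dns_lookup: :mx)[:dns_lookup].inspect # :mx
puts SketchConfig.setting(:dns_lookup).inspect                  # :off
```

Keeping per-call options in a merged copy is what lets an options hash passed to one address leave the global defaults untouched.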
|
337
|
+
|
338
|
+
You can add special rules by domain or provider. It takes the options
|
339
|
+
above and adds the :domain_match and :exchanger_match rules.
|
340
|
+
|
341
|
+
EmailAddress.define_provider('google',
|
342
|
+
domain_match: %w(gmail.com googlemail.com),
|
343
|
+
exchanger_match: %w(google.com), # Requires dns_lookup==:mx
|
344
|
+
local_size: 5..64,
|
345
|
+
mailbox_canonical: ->(m) {m.gsub('.','')})
|
346
|
+
|
347
|
+
The library ships with the most common set of provider rules. It is not meant
|
348
|
+
to house a database of all providers, but a separate `email_address-providers`
|
349
|
+
gem may be created to hold this data for those who need more complete rules.
|
350
|
+
|
351
|
+
Personal and Corporate email systems are not intended for either solution.
|
352
|
+
Any of these email systems may be configured locally.
|
353
|
+
|
354
|
+
Pre-configured email address providers include: Google (gmail), AOL, MSN
|
355
|
+
(hotmail, live, outlook), and Yahoo. Any address not matching one of
|
356
|
+
those patterns uses the "default" provider rule set. Exchanger matches
|
357
|
+
are compared against the Mail Exchanger (SMTP receivers) hosts defined in
|
358
|
+
DNS. Specifying an exchanger pattern requires a DNS MX lookup.
|
359
|
+
|
360
|
+
For a Rails application, create an initializer file with your default
|
361
|
+
configuration options:
|
362
|
+
|
363
|
+
# ./config/initializers/email_address.rb
|
364
|
+
EmailAddress::Config.setting( local_format: :relaxed )
|
365
|
+
EmailAddress::Config.provider(:github,
|
366
|
+
host_match: %w(github.com), local_format: :standard)
|
367
|
+
|
368
|
+
### Available Configuration Settings
|
369
|
+
|
370
|
+
* dns_lookup: Enables DNS lookup for validation by
|
371
|
+
* :mx - DNS MX Record lookup
|
372
|
+
* :a - DNS A Record lookup (as some domains incorrectly don't specify an MX record)
|
373
|
+
* :off - Do not perform DNS lookup (Test mode, network unavailable)
|
374
|
+
|
375
|
+
* sha1_secret -
|
376
|
+
This application-level secret is appended to the email_address to compute
|
377
|
+
the SHA1 Digest, making it unique to your application so it can't easily be
|
378
|
+
discovered by comparing against a known list of email/sha1 pairs.
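The effect of the secret can be shown with `Digest::SHA1` directly. This sketch assumes the secret is simply appended to the canonical address before hashing, as the description suggests; the function name and the example secret are hypothetical.

```ruby
require "digest/sha1"

# Assumption: digest = SHA1(canonical_address + secret).
def secret_digest(canonical_address, secret = "")
  Digest::SHA1.hexdigest(canonical_address + secret)
end

plain  = secret_digest("clarkkent@gmail.com")
salted = secret_digest("clarkkent@gmail.com", "my-app-secret")
puts plain == salted # false
```

Someone holding a precomputed list of plain email/SHA1 pairs can match `plain` but not `salted` unless they also know the secret.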
|
379
|
+
|
380
|
+
* munge_string - "*****", the string to replace into munged addresses.
|
381
|
+
|
382
|
+
For local part configuration:
|
253
383
|
|
254
|
-
|
384
|
+
* local_downcase: true.
|
385
|
+
Downcase the local part. You probably want this for uniqueness.
|
386
|
+
The RFC says the local part is case sensitive; that's a bad part.
|
387
|
+
|
388
|
+
* local_fix: true.
|
389
|
+
Make simple fixes when available: remove spaces, condense repeated punctuation.
|
390
|
+
|
391
|
+
* local_encoding: :ascii, :unicode,
|
392
|
+
Enable Unicode in local part. Most mail systems do not yet support this.
|
393
|
+
You probably want to stay with ASCII for now.
|
394
|
+
|
395
|
+
* local_parse: nil, ->(local) { [mailbox, tag, comment] }
|
396
|
+
Specify an optional lambda/Proc to parse the local part. It should return an
|
397
|
+
array (tuple) of mailbox, tag, and comment.
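As an illustration of the shape such a parser could take, the sketch below handles a hypothetical mail system that uses "-" as its tag separator and allows a trailing "(comment)". The method name and the separator choice are assumptions, not defaults of the gem.

```ruby
# Hypothetical parser: "-" separates the tag, "(...)" at the end is a comment.
# Returns [mailbox, tag, comment]; missing parts are nil.
def parse_dash_tagged_local(local)
  comment = local[/\((.*)\)\z/, 1]
  body    = local.sub(/\(.*\)\z/, "")
  mailbox, tag = body.split("-", 2)
  [mailbox, tag, comment]
end

# The setting expects a lambda/Proc, so wrap the method.
local_parse = ->(local) { parse_dash_tagged_local(local) }

p local_parse.call("miles-lists(work)") # ["miles", "lists", "work"]
p local_parse.call("miles")             # ["miles", nil, nil]
```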
|
398
|
+
|
399
|
+
* local_format:
|
400
|
+
* :conventional - word ( punctuation{1} word )*
|
401
|
+
* :relaxed - alphanum ( allowed_characters)* alphanum
|
402
|
+
* :standard - RFC Compliant email addresses (anything goes!)
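The first two grammars translate fairly directly into regular expressions. The character set used below (`-+'.,` as punctuation, lower-case ASCII letters and digits as word characters) is an assumption for illustration; the gem's own patterns may allow more. The method names are hypothetical.

```ruby
# :conventional, sketched as word ( punctuation{1} word )*
def conventional_local?(local)
  local.match?(/\A[a-z0-9]+(?:[-+'.,][a-z0-9]+)*\z/)
end

# :relaxed, sketched as alphanum ( allowed_characters )* alphanum
def relaxed_local?(local)
  local.match?(/\A[a-z0-9](?:[-a-z0-9+'.,]*[a-z0-9])?\z/)
end

puts conventional_local?("miles.o'brien") # true
puts conventional_local?("aasdf-34-.z")   # false (two punctuation marks in a row)
puts relaxed_local?("aasdf-34-.z")        # true
puts relaxed_local?("-bad")               # false (must start with an alphanumeric)
```

The only difference between the two is whether punctuation may repeat, which is why the relaxed form accepts the `-.` run that the conventional form rejects.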
|
403
|
+
|
404
|
+
* local_size: 1..64,
|
405
|
+
A Range specifying the allowed size for mailbox + tags + comment
|
406
|
+
|
407
|
+
* tag_separator: nil, character (+)
|
408
|
+
Nil, or a character used to split the tag from the mailbox
|
409
|
+
|
410
|
+
For the mailbox (AKA account, role), without the tag
|
411
|
+
* mailbox_size: 1..64
|
412
|
+
A Range specifying the allowed size for mailbox
|
413
|
+
|
414
|
+
* mailbox_canonical: nil, ->(mailbox) { mailbox }
|
415
|
+
An optional lambda/Proc taking a mailbox name, returning a canonical
|
416
|
+
version of it. (E.G.: gmail removes '.' characters)
|
417
|
+
|
418
|
+
* mailbox_validator: nil, ->(mailbox) { true }
|
419
|
+
An optional lambda/Proc taking a mailbox name, returning true or false.
|
420
|
+
|
421
|
+
* host_encoding: :punycode, :unicode,
|
422
|
+
How to treat International Domain Names (IDN). Note that most mail and
|
423
|
+
DNS systems do not support unicode, so punycode needs to be passed.
|
424
|
+
:punycode Convert Unicode names to punycode representation
|
425
|
+
:unicode Keep Unicode names as is.
|
426
|
+
|
427
|
+
* host_validation:
|
428
|
+
:mx Ensure host is configured with DNS MX records
|
429
|
+
:a Ensure host is known to DNS (A Record)
|
430
|
+
:syntax Validate by syntax only, no Network verification
|
431
|
+
:connect Attempt host connection (not implemented, BAD!)
|
432
|
+
|
433
|
+
* host_size: 1..253,
|
434
|
+
A range specifying the size limit of the host part.
|
435
|
+
|
436
|
+
* host_allow_ip: false,
|
437
|
+
Allow IP address format in host: [127.0.0.1], [IPv6:::1]
|
438
|
+
|
439
|
+
* address_validation: :parts, :smtp, ->(address) { true }
|
440
|
+
Address validation policy
|
441
|
+
:parts Validate local and host.
|
442
|
+
:smtp Validate via SMTP (not implemented, BAD!)
|
443
|
+
A lambda/Proc taking the address string, returning true or false
|
444
|
+
|
445
|
+
* address_size: 3..254,
|
446
|
+
A range specifying the size limit of the complete address
|
447
|
+
|
448
|
+
* address_local: false,
|
449
|
+
Allow localhost, no domain, or local subdomains.
|
450
|
+
|
451
|
+
For provider rules to match to domain names and Exchanger hosts
|
452
|
+
The value is an array of match tokens.
|
453
|
+
* host_match: %w(.org example.com hotmail. user*@ sub.*.com)
|
454
|
+
* exchanger_match: %w(google.com 127.0.0.1 10.9.8.0/24 ::1/64)
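The exchanger tokens mix host names, single IP addresses, and CIDR blocks. One way to evaluate such a list, assuming the exchanger's host name and IP address are already known from an MX lookup, is to try each token as an `IPAddr` first and fall back to a host-name comparison. The function name is hypothetical and this is not the gem's implementation.

```ruby
require "ipaddr"

# Hypothetical helper: true when the exchanger matches any token.
def exchanger_match?(exchanger_host, exchanger_ip, tokens)
  ip = IPAddr.new(exchanger_ip)
  tokens.any? do |token|
    begin
      IPAddr.new(token).include?(ip) # IP address or CIDR token
    rescue IPAddr::Error
      # Not an IP: treat the token as a host name or parent domain.
      exchanger_host == token || exchanger_host.end_with?(".#{token}")
    end
  end
end

tokens = %w(google.com 127.0.0.1 10.9.8.0/24 ::1/64)
puts exchanger_match?("aspmx.l.google.com", "10.1.2.3", tokens) # true (host name)
puts exchanger_match?("mx.example.net", "10.9.8.77", tokens)    # true (CIDR block)
puts exchanger_match?("mx.example.net", "192.0.2.10", tokens)   # false
```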
|
455
|
+
|
456
|
+
## Notes
|
457
|
+
|
458
|
+
#### Internationalization
|
459
|
+
|
460
|
+
The industry is moving to support Unicode characters in the local part
|
461
|
+
of the email address. Currently, SMTP supports only 7-bit ASCII; a
|
462
|
+
new `SMTPUTF8` standard is available, but not yet widely implemented.
|
463
|
+
To work properly, global Email systems must be converted to UTF-8
|
464
|
+
encoded databases and upgraded to the new email standards.
|
465
|
+
|
466
|
+
The problem with i18n email addresses is that, outside of the
|
467
|
+
given locale, they are hard to enter on keyboards set up for another
|
468
|
+
locale. Because of this, internationalized local parts are not yet
|
469
|
+
supported by default. They are more likely to be erroneous.
|
470
|
+
|
471
|
+
Proper personal identity can still be provided using
|
472
|
+
[MIME Encoded-Words](http://en.wikipedia.org/wiki/MIME#Encoded-Word)
|
473
|
+
in Email headers.
|
474
|
+
|
475
|
+
|
476
|
+
#### Email Addresses as Sensitive Data
|
477
|
+
|
478
|
+
Like Social Security and Credit Card Numbers, email addresses are
|
479
|
+
becoming more important as a personal identifier on the internet.
|
480
|
+
Increasingly, we should treat email addresses as sensitive data. If your
|
481
|
+
site/database becomes compromised by hackers, these email addresses can
|
482
|
+
be stolen and used to spam your users and to try to gain access to their
|
483
|
+
accounts. You should not be storing passwords in plain text; perhaps you
|
484
|
+
don't need to store email addresses un-encoded either.
|
485
|
+
|
486
|
+
Consider this: upon registration, store the redacted email address for
|
487
|
+
the user, and of course, the salted, encrypted password.
|
488
|
+
When the user logs in, compute the redacted email address from
|
489
|
+
the user-supplied one and look up the record. Store the original address
|
490
|
+
in the session for the user, which goes away when the user logs out.
|
491
|
+
|
492
|
+
Sometimes, users demand you strike their information from the database.
|
493
|
+
Instead of deleting their account, you can "redact" their email
|
494
|
+
address, retaining the state of the account to prevent future
|
495
|
+
access. Given the original email address again, the redacted account can
|
496
|
+
be identified if necessary.
|
497
|
+
|
498
|
+
Because of these use cases, the **redact** method on the email address
|
499
|
+
instance has been provided.
|
255
500
|
|
256
501
|
## Contributing
|
257
502
|
|
@@ -260,3 +505,12 @@ See `lib/email_address/config.rb` for more options.
|
|
260
505
|
3. Commit your changes (`git commit -am 'Add some feature'`)
|
261
506
|
4. Push to the branch (`git push origin my-new-feature`)
|
262
507
|
5. Create new Pull Request
|
508
|
+
|
509
|
+
#### Project
|
510
|
+
|
511
|
+
This project lives at [https://github.com/afair/email_address/](https://github.com/afair/email_address/)
|
512
|
+
|
513
|
+
#### Authors
|
514
|
+
|
515
|
+
* [Allen Fair](https://github.com/afair) ([@allenfair](https://twitter.com/allenfair)):
|
516
|
+
I've worked with email-based applications and email addresses since 1999.
|