email_address 0.0.3 → 0.1.0
- checksums.yaml +4 -4
- data/.travis.yml +10 -0
- data/Gemfile +0 -1
- data/README.md +451 -197
- data/Rakefile +4 -9
- data/email_address.gemspec +9 -5
- data/lib/email_address.rb +55 -24
- data/lib/email_address/active_record_validator.rb +5 -5
- data/lib/email_address/address.rb +152 -72
- data/lib/email_address/canonical_email_address_type.rb +46 -0
- data/lib/email_address/config.rb +148 -64
- data/lib/email_address/email_address_type.rb +15 -31
- data/lib/email_address/exchanger.rb +31 -34
- data/lib/email_address/host.rb +327 -51
- data/lib/email_address/local.rb +304 -52
- data/lib/email_address/version.rb +1 -1
- data/test/activerecord/test_ar.rb +22 -0
- data/test/activerecord/user.rb +71 -0
- data/test/email_address/test_address.rb +53 -27
- data/test/email_address/test_config.rb +23 -8
- data/test/email_address/test_exchanger.rb +22 -10
- data/test/email_address/test_host.rb +47 -6
- data/test/email_address/test_local.rb +80 -16
- data/test/test_email_address.rb +38 -4
- data/test/test_helper.rb +7 -5
- metadata +68 -34
- data/lib/email_address/domain_matcher.rb +0 -98
- data/lib/email_address/domain_parser.rb +0 -69
- data/lib/email_address/matcher.rb +0 -119
- data/lib/email_address/validator.rb +0 -141
- data/test/email_address/test_domain_matcher.rb +0 -21
- data/test/email_address/test_domain_parser.rb +0 -29
- data/test/email_address/test_matcher.rb +0 -44
- data/test/email_address/test_validator.rb +0 -16
checksums.yaml
CHANGED
@@ -1,7 +1,7 @@
|
|
1
1
|
---
|
2
2
|
SHA1:
|
3
|
-
metadata.gz:
|
4
|
-
data.tar.gz:
|
3
|
+
metadata.gz: 570217e88251b8510966cf63bd1a90a7fa79b458
|
4
|
+
data.tar.gz: c23a8a65b94f8a5a6de14f221e8db7f16d0df6e1
|
5
5
|
SHA512:
|
6
|
-
metadata.gz:
|
7
|
-
data.tar.gz:
|
6
|
+
metadata.gz: d3e7f8dd2a92753b889bc4d8734961fd12626a40d28a06a52ce7e5c9c2691a5a2c62a8a05fc061741d54ec3c3576e1a6bde782b60e1647c3d4770bce95080c52
|
7
|
+
data.tar.gz: 9c05fb4ace40a99b51bf5c6646b0cc1d398a857e9e2f3074e3b05418c0562610e51bc7b0cef55ca7be4946b74de53bbff3b8fa69e121c8f064c5b08fd97e854d
|
data/.travis.yml
ADDED
data/Gemfile
CHANGED
data/README.md
CHANGED
@@ -1,156 +1,140 @@
|
|
1
1
|
# Email Address
|
2
2
|
|
3
3
|
[![Gem Version](https://badge.fury.io/rb/email_address.svg)](http://rubygems.org/gems/email_address)
|
4
|
+
[![Build Status](https://travis-ci.org/afair/email_address.svg?branch=v0.1)](https://travis-ci.org/afair/email_address)
|
5
|
+
[![Code Climate](https://codeclimate.com/github/afair/email_address/badges/gpa.svg)](https://codeclimate.com/github/afair/email_address)
|
4
6
|
|
5
|
-
The
|
6
|
-
|
7
|
-
accepted as a "best practice" and which should not be supported (in the
|
8
|
-
name of sanity).
|
9
|
-
|
10
|
-
This library provides:
|
11
|
-
|
12
|
-
* Email Address Validation
|
13
|
-
* Converting between email address forms
|
14
|
-
* **Original:** From the user or data source
|
15
|
-
* **Normalized:** A standardized format for identification
|
16
|
-
* **Canonical:** A format used to identify a unique user
|
17
|
-
* **Redacted:** A format used to store an email address privately
|
18
|
-
* **Reference:** Digest formats for sharing addresses without exposing
|
19
|
-
them.
|
20
|
-
* Matching addresses to Email/Internet Service Providers. Per-provider
|
21
|
-
rules for:
|
22
|
-
* Validation
|
23
|
-
* Address Tag formats
|
24
|
-
* Canonicalization
|
25
|
-
* Unicode Support
|
26
|
-
|
27
|
-
## Email Addresses: The Good Parts
|
28
|
-
|
29
|
-
Email Addresses are split into two parts: the `local` and `host` part,
|
30
|
-
separated by the `@` symbol, or of the generalized format:
|
31
|
-
|
32
|
-
mailbox+tag@subdomain.domain.tld
|
33
|
-
|
34
|
-
The **Mailbox** usually identifies the user, role account, or application.
|
35
|
-
A **Tag** is any suffix for the mailbox useful for separating and filtering
|
36
|
-
incoming email. It is usually preceded by a '+' or other character. Tags are
|
37
|
-
not always available for a given ESP or MTA.
|
38
|
-
|
39
|
-
Local Parts should consist of lower-case 7-bit ASCII alphanumeric and these characters:
|
40
|
-
`-+'.,` It should start with and end with an alphanumeric character and
|
41
|
-
no more than one special character should appear together.
|
42
|
-
|
43
|
-
Host parts contain a lower-case version of any standard domain name.
|
44
|
-
International Domain Names are allowed, and can be converted to
|
45
|
-
[Punycode](http://en.wikipedia.org/wiki/Punycode),
|
46
|
-
an encoding system of Unicode strings into the 7-bit ASCII character set.
|
47
|
-
Domain names should be configured with MX records in DNS to receive
|
48
|
-
email, though this is sometimes mis-configured and the A record can be
|
49
|
-
used as a backup.
|
50
|
-
|
51
|
-
This is the subset of the RFC Email Address specification that should be
|
52
|
-
used.
|
53
|
-
|
54
|
-
## Email Addresses: The Bad Parts
|
55
|
-
|
56
|
-
Email addresses are defined and redefined in a series of RFC standards.
|
57
|
-
Conforming to the full standards is not recommended for easily
|
58
|
-
identifying and supporting email addresses. Among these specification,
|
59
|
-
we reject are:
|
7
|
+
The `email_address` gem provides a Ruby language library for working
|
8
|
+
with email addresses.
|
60
9
|
|
61
|
-
|
62
|
-
|
63
|
-
|
64
|
-
|
65
|
-
* IP and IPv6 addresses as hosts: `mailbox@[127.0.0.1]`
|
66
|
-
* Non-ASCII (7-bit) characters in the local part: `Pelé@example.com`
|
67
|
-
* Validation by regular expressions like:
|
68
|
-
```
|
69
|
-
(?:[a-z0-9!#$%&'*+/=?^_`{|}~-]+(?:\.[a-z0-9!#$%&'*+/=?^_`{|}~-]+)*
|
70
|
-
| "(?:[\x01-\x08\x0b\x0c\x0e-\x1f\x21\x23-\x5b\x5d-\x7f]
|
71
|
-
| \\[\x01-\x09\x0b\x0c\x0e-\x7f])*")
|
72
|
-
@ (?:(?:[a-z0-9](?:[a-z0-9-]*[a-z0-9])?\.)+[a-z0-9](?:[a-z0-9-]*[a-z0-9])?
|
73
|
-
| \[(?:(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.){3}
|
74
|
-
(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?|[a-z0-9-]*[a-z0-9]:
|
75
|
-
(?:[\x01-\x08\x0b\x0c\x0e-\x1f\x21-\x5a\x53-\x7f]
|
76
|
-
| \\[\x01-\x09\x0b\x0c\x0e-\x7f])+)
|
77
|
-
\])
|
78
|
-
```
|
79
|
-
|
80
|
-
## Internationalization
|
10
|
+
By default, it validates against conventional usage,
|
11
|
+
the format preferred for user email addresses.
|
12
|
+
It can be configured to validate against RFC "Standard" formats,
|
13
|
+
common email service provider formats, and perform DNS validation.
|
81
14
|
|
82
|
-
|
83
|
-
|
84
|
-
|
85
|
-
|
86
|
-
encoded databases and upgraded to the new email standards.
|
15
|
+
Using `email_address` to validate user email addresses results in
|
16
|
+
fewer "false positives" due to typing errors and gibberish data.
|
17
|
+
It validates syntax more strictly for popular email providers,
|
18
|
+
and can deal with gmail's "optional dots" in addresses.
|
87
19
|
|
88
|
-
|
89
|
-
|
90
|
-
locale. Because of this, internationalized local parts are not yet
|
91
|
-
supported by default. They are more likely to be erroneous.
|
20
|
+
It provides Active Record (Rails) extensions, including an
|
21
|
+
address validator and attributes API custom datatypes.
|
92
22
|
|
93
|
-
|
94
|
-
|
95
|
-
|
23
|
+
*Note:* Version 0.1.0 contains significant API and internal changes over the 0.0.3
|
24
|
+
version. If you have been using the 0.0.x series of the gem, you may
|
25
|
+
want to continue using your current version.
|
96
26
|
|
97
|
-
|
98
|
-
|
99
|
-
* The **original** email address is of the format given by the user.
|
100
|
-
* The **Normalized** address has:
|
101
|
-
* Lower-case the local and domain part
|
102
|
-
* Tags are kept as they are important for the user
|
103
|
-
* Remove comments and any "bad parts"
|
104
|
-
* This format is what should be used to identify the account.
|
105
|
-
* The **Canonical** form is used to uniquely identify the mailbox.
|
106
|
-
* Domains stored as punycode for IDN
|
107
|
-
* Address Tags removed
|
108
|
-
* Special characters removed (dots in gmail addresses are not
|
109
|
-
significant)
|
110
|
-
* Lower cased and "bad parts" removed
|
111
|
-
* Useful for locating a user who forgets registering with a tag or
|
112
|
-
with a "Bad part" in the email address.
|
113
|
-
* The **Redacted** format is used to store email address fingerprints
|
114
|
-
instead of the actual addresses:
|
115
|
-
* Format: sha1(canonical_address)@domain
|
116
|
-
* Given an email address, the record can be found
|
117
|
-
* Useful for treating email addresses as sensitive data and
|
118
|
-
complying with requests to remove the address from your database and
|
119
|
-
still maintain the state of the account.
|
120
|
-
* The **Reference** form allows you to publicly share an address without
|
121
|
-
revealing the actual address.
|
122
|
-
* Can be the MD5 or SHA1 of the normalized or canonical address
|
123
|
-
* Useful for "do not email" lists
|
124
|
-
* Useful for cookies that do not reveal the actual account
|
125
|
-
|
126
|
-
## Treating Email Addresses as Sensitive Data
|
27
|
+
Requires Ruby 2.0 or later.
|
127
28
|
|
128
|
-
|
129
|
-
becoming more important as a personal identifier on the internet.
|
130
|
-
Increasingly, we should treat email addresses as sensitive data. If your
|
131
|
-
site/database becomes compromised by hackers, these email addresses can
|
132
|
-
be stolen and used to spam your users and to try to gain access to their
|
133
|
-
accounts. You should not be storing passwords in plain text; perhaps you
|
134
|
-
don't need to store email addresses un-encoded either.
|
29
|
+
## Background
|
135
30
|
|
136
|
-
|
137
|
-
|
138
|
-
|
139
|
-
the user-supplied one and look up the record. Store the original address
|
140
|
-
in the session for the user, which goes away when the user logs out.
|
31
|
+
The email address specification is complex and often not what you want
|
32
|
+
when working with personal email addresses in applications. This library
|
33
|
+
introduces terms to distinguish types of email addresses.
|
141
34
|
|
142
|
-
|
143
|
-
|
144
|
-
|
145
|
-
access. Given the original email address again, the redacted account can
|
146
|
-
be identified if necessary.
|
35
|
+
* *Normal* - The edited form of any input email address. Typically, it
|
36
|
+
is lower-cased and minor "fixes" can be performed, depending on the
|
37
|
+
configurations and email address provider.
|
147
38
|
|
148
|
-
|
149
|
-
|
39
|
+
CKENT@DAILYPLANET.NEWS => ckent@dailyplanet.news
|
40
|
+
|
41
|
+
* *Conventional* - Most personal account addresses are in this basic
|
42
|
+
format, one or more "words" separated by a single simple punctuation
|
43
|
+
character. It consists of a mailbox (user name or role account) and
|
44
|
+
an optional address "tag" assigned by the user.
|
45
|
+
|
46
|
+
miles.o'brien@ncc-1701-d.ufp
|
47
|
+
|
48
|
+
* *Relaxed* - A less strict form of Conventional, same character set,
|
49
|
+
must begin and end with an alpha-numeric character, but order within
|
50
|
+
is not enforced.
|
51
|
+
|
52
|
+
aasdf-34-.z@example.com
|
53
|
+
|
54
|
+
* *Standard* - The RFC-Compliant syntax of an email address. This is
|
55
|
+
useful when working with software-generated addresses or handling
|
56
|
+
existing email addresses, but otherwise not useful for personal
|
57
|
+
addresses.
|
58
|
+
|
59
|
+
madness!."()<>[]:,;@\\\"!#$%&'*+-/=?^_`{}| ~.a(comment )"@example.org
|
150
60
|
|
151
|
-
|
61
|
+
* *Canonical* - A unique account address, lower-cased, without the
|
62
|
+
tag, and with irrelevant characters stripped.
|
152
63
|
|
153
|
-
|
64
|
+
clark.kent+scoops@gmail.com => clarkkent@gmail.com
|
65
|
+
|
66
|
+
* *Reference* - The MD5 of the Canonical format, used to share account
|
67
|
+
references without exposing the private email address directly.
|
68
|
+
|
69
|
+
Clark.Kent+scoops@gmail.com => c5be3597c391169a5ad2870f9ca51901
|
70
|
+
|
71
|
+
* *Redacted* - A form of the email address where it is replaced by
|
72
|
+
a SHA1-based version to remove the original address from the
|
73
|
+
database, or to store the address privately, yet still keep it
|
74
|
+
accessible at query time by converting the queried address to
|
75
|
+
the redacted form.
|
76
|
+
|
77
|
+
Clark.Kent+scoops@gmail.com => {bea3f3560a757f8142d38d212a931237b218eb5e}@gmail.com
|
78
|
+
|
79
|
+
* *Munged* - An obfuscated version of the email address suitable for
|
80
|
+
publishing on the internet, where email address harvesting
|
81
|
+
could occur.
|
82
|
+
|
83
|
+
Clark.Kent+scoops@gmail.com => cl\*\*\*\*\*@gm\*\*\*\*\*
|
84
|
+
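The address forms above can be sketched in plain Ruby with only the standard library. This is a minimal illustration, not the gem's implementation: the `sketch_*` helper names are hypothetical, the tag separator is assumed to be "+", and the dot-stripping rule is the Gmail convention described above applied to every host.

```ruby
require "digest"

# Hypothetical helpers for illustration only; they are not part of the gem.

# Normal form: trimmed and lower-cased.
def sketch_normal(address)
  address.strip.downcase
end

# Canonical form: drop the "+tag" and strip dots from the mailbox
# (assumes Gmail-style rules for every host).
def sketch_canonical(address)
  local, host = sketch_normal(address).split("@", 2)
  mailbox = local.split("+", 2).first.to_s.delete(".")
  "#{mailbox}@#{host}"
end

# Reference form: MD5 of the canonical address.
def sketch_reference(address)
  Digest::MD5.hexdigest(sketch_canonical(address))
end

# Redacted form: {SHA1 of the canonical address}@host.
def sketch_redact(address)
  host = sketch_normal(address).split("@", 2).last
  "{#{Digest::SHA1.hexdigest(sketch_canonical(address))}}@#{host}"
end

# Munged form: keep two characters of each side, replace the rest.
def sketch_munge(address, filler = "*****")
  sketch_normal(address).split("@", 2).map { |part| part[0, 2] + filler }.join("@")
end

puts sketch_canonical("Clark.Kent+scoops@gmail.com")
puts sketch_munge("Clark.Kent+scoops@gmail.com")
```

The digests printed by this sketch need not match the ones shown in this README, since the gem may mix in an application secret or apply provider rules that are not modeled here.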
|
85
|
+
Other terms:
|
86
|
+
|
87
|
+
* *Local* - The left-hand side of the "@", representing the user,
|
88
|
+
mailbox, or role, and an optional "tag".
|
89
|
+
|
90
|
+
mailbox+tag@example.com; Local part: mailbox+tag
|
91
|
+
|
92
|
+
* *Mailbox* - The destination user account or role account.
|
93
|
+
* *Tag* - A parameter added after the mailbox, usually after the
|
94
|
+
"+" symbol, set by the user for mail filtering and sub-accounts.
|
95
|
+
Not all mail systems support this.
|
96
|
+
* *Host* (sometimes called *Domain*) - The right-hand side of the "@"
|
97
|
+
indicating the domain or host name server to deliver the email to.
|
98
|
+
If missing, "localhost" is assumed, or if not a fully-qualified
|
99
|
+
domain name, it is assumed to be another computer on the same network, but
|
100
|
+
this is increasingly rare.
|
101
|
+
* *Provider* - The Email Service Provider (ESP) providing the email
|
102
|
+
service. Each provider may have its own email address validation
|
103
|
+
and canonicalization rules.
|
104
|
+
* *Punycode* - A host name with Unicode characters (International
|
105
|
+
Domain Name or IDN) needs conversion to this ASCII-encoded format
|
106
|
+
for DNS lookup.
|
107
|
+
|
108
|
+
"HIRO@こんにちは世界.com" => "hiro@xn--28j2a3ar1pp75ovm7c.com"
|
109
|
+
|
110
|
+
Wikipedia has a great article on
|
111
|
+
[Email Addresses](https://en.wikipedia.org/wiki/Email_address),
|
112
|
+
much more readable than the section within
|
113
|
+
[RFC 5322](https://tools.ietf.org/html/rfc5322#section-3.4)
|
114
|
+
|
115
|
+
## Avoiding the Bad Parts of RFC Specification
|
116
|
+
|
117
|
+
Following the RFC specification sounds like a good idea, until you
|
118
|
+
learn about all the madness contained therein. This library can
|
119
|
+
validate the RFC syntax, but this is rarely useful, especially when
|
120
|
+
validating user email address submissions. By default, it validates
|
121
|
+
to the *conventional* format.
|
122
|
+
|
123
|
+
Here are a few parts of the RFC specification you should avoid:
|
124
|
+
|
125
|
+
* Case-sensitive local parts: `First.Last@example.com`
|
126
|
+
* Spaces and Special Characters: `"():;<>@[\\]`
|
127
|
+
* Quoting and Escaping Requirements: `"first \"nickname\" last"@example.com`
|
128
|
+
* Comment Parts: `(comment)mailbox@example.com`
|
129
|
+
* IP and IPv6 addresses as hosts: `mailbox@[127.0.0.1]`
|
130
|
+
* Non-ASCII (7-bit) characters in the local part: `Pelé@example.com`
|
131
|
+
* Validation by voodoo regular expressions
|
132
|
+
* Gmail allows ".." in addresses since they are not meaningful, but
|
133
|
+
the standard does not.
|
134
|
+
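To make the difference between the *conventional* and *relaxed* local-part formats concrete, here is a small sketch using two regular expressions. The patterns, the constant names, and the punctuation set (`.`, `-`, `+`, `'`, `_`) are assumptions chosen for illustration; the gem's own rules may differ, and this sketch handles ASCII only.

```ruby
# Assumed punctuation set: . - + ' _  (illustrative, not the gem's definition)

# Conventional: words separated by exactly one punctuation character.
SKETCH_CONVENTIONAL = /\A[a-z0-9]+(?:[.\-+'_][a-z0-9]+)*\z/i

# Relaxed: same characters in any order, but alphanumeric at both ends.
SKETCH_RELAXED = /\A[a-z0-9](?:[a-z0-9.\-+'_]*[a-z0-9])?\z/i

# Returns the strictest format the local part satisfies.
def sketch_local_format(local)
  return :conventional if local =~ SKETCH_CONVENTIONAL
  return :relaxed      if local =~ SKETCH_RELAXED
  :other
end

p sketch_local_format("miles.o'brien")
p sketch_local_format("aasdf-34-.z")
p sketch_local_format("#bad")
```

Anything that falls through to `:other` here would need the full RFC ("standard") rules, or is simply invalid.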
|
135
|
+
## Installation With Rails or Bundler
|
136
|
+
|
137
|
+
If you are using Rails or a project with Bundler, add this line to your application's Gemfile:
|
154
138
|
|
155
139
|
gem 'email_address'
|
156
140
|
|
@@ -158,100 +142,361 @@ And then execute:
|
|
158
142
|
|
159
143
|
$ bundle
|
160
144
|
|
161
|
-
|
145
|
+
## Installation Without Bundler
|
146
|
+
|
147
|
+
If you are not using Bundler, you need to install the gem yourself.
|
162
148
|
|
163
149
|
$ gem install email_address
|
164
150
|
|
151
|
+
Require the gem inside your script.
|
152
|
+
|
153
|
+
require 'rubygems'
|
154
|
+
require 'email_address'
|
155
|
+
|
165
156
|
## Usage
|
166
157
|
|
167
|
-
|
168
|
-
|
158
|
+
Use `EmailAddress` to do transformations and validations. You can also
|
159
|
+
instantiate an object to inspect the address.
|
169
160
|
|
170
|
-
|
171
|
-
|
172
|
-
email.canonical #=> "user@example.com"
|
173
|
-
email.redact #=> "63a710569261a24b3766275b7000ce8d7b32e2f7@example.com"
|
174
|
-
email.sha1 #=> "63a710569261a24b3766275b7000ce8d7b32e2f7"
|
175
|
-
email.md5 #=> "dea073fb289e438a6d69c5384113454c"
|
161
|
+
These top-level helpers return edited email addresses and validation
|
162
|
+
checks.
|
176
163
|
|
177
|
-
|
178
|
-
|
179
|
-
|
180
|
-
|
164
|
+
address = "Clark.Kent+scoops@gmail.com"
|
165
|
+
EmailAddress.valid?(address) #=> true
|
166
|
+
EmailAddress.normal(address) #=> "clark.kent+scoops@gmail.com"
|
167
|
+
EmailAddress.canonical(address) #=> "clarkkent@gmail.com"
|
168
|
+
EmailAddress.reference(address) #=> "c5be3597c391169a5ad2870f9ca51901"
|
169
|
+
EmailAddress.redact(address) #=> "{bea3f3560a757f8142d38d212a931237b218eb5e}@gmail.com"
|
170
|
+
EmailAddress.munge(address) #=> "cl*****@gm*****"
|
171
|
+
EmailAddress.matches?(address, 'google') #=> 'google' (true)
|
172
|
+
EmailAddress.error("#bad@example.com") #=> "Invalid Mailbox"
|
181
173
|
|
182
|
-
|
174
|
+
Or you can create an instance of the email address to work with it.
|
175
|
+
|
176
|
+
email = EmailAddress.new(address) #=> #<EmailAddress::Address:0x007fe6ee150540 ...>
|
177
|
+
email.normal #=> "clark.kent+scoops@gmail.com"
|
178
|
+
email.canonical #=> "clarkkent@gmail.com"
|
179
|
+
email.original #=> "Clark.Kent+scoops@gmail.com"
|
180
|
+
email.valid? #=> true
|
181
|
+
|
182
|
+
Here are some other methods that are available.
|
183
|
+
|
184
|
+
email.redact #=> "{bea3f3560a757f8142d38d212a931237b218eb5e}@gmail.com"
|
185
|
+
email.sha1 #=> "bea3f3560a757f8142d38d212a931237b218eb5e"
|
186
|
+
email.md5 #=> "c5be3597c391169a5ad2870f9ca51901"
|
187
|
+
email.host_name #=> "gmail.com"
|
183
188
|
email.provider #=> :google
|
184
|
-
email.
|
189
|
+
email.mailbox #=> "clark.kent"
|
190
|
+
email.tag #=> "scoops"
|
191
|
+
|
192
|
+
email.host.exchanger.first[:ip] #=> "2a00:1450:400b:c02::1a"
|
193
|
+
email.host.txt_hash #=> {:v=>"spf1", :redirect=>"\_spf.google.com"}
|
194
|
+
|
195
|
+
EmailAddress.normal("HIRO@こんにちは世界.com")
|
196
|
+
#=> "hiro@xn--28j2a3ar1pp75ovm7c.com"
|
197
|
+
EmailAddress.normal("hiro@xn--28j2a3ar1pp75ovm7c.com", host_encoding: :unicode)
|
198
|
+
#=> "hiro@こんにちは世界.com"
|
199
|
+
|
200
|
+
#### Rails Validator
|
201
|
+
|
202
|
+
For Rails' ActiveRecord classes, EmailAddress provides an ActiveRecordValidator.
|
203
|
+
Specify your email address attributes with `field: :user_email`, or
|
204
|
+
`fields: [:email1, :email2]`. If neither is given, it assumes the
|
205
|
+
`email` or `email_address` attribute.
|
206
|
+
|
207
|
+
class User < ActiveRecord::Base
|
208
|
+
validates_with EmailAddress::ActiveRecordValidator, field: :email
|
209
|
+
end
|
210
|
+
|
211
|
+
#### Rails Email Address Type Attribute
|
212
|
+
|
213
|
+
Initial support is provided for Active Record 5.0 attributes API.
|
214
|
+
|
215
|
+
First, you need to register the type in
|
216
|
+
`config/initializers/email_address.rb` along with any global
|
217
|
+
configurations you want.
|
218
|
+
|
219
|
+
ActiveRecord::Type.register(:email_address, EmailAddress::EmailAddressType)
|
220
|
+
ActiveRecord::Type.register(:canonical_email_address,
|
221
|
+
EmailAddress::CanonicalEmailAddressType)
|
222
|
+
|
223
|
+
Assume the Users table contains the columns "email" and "canonical_email".
|
224
|
+
We want to normalize the address in "email" and store the canonical/unique
|
225
|
+
version in "canonical_email". This code will set the canonical_email when
|
226
|
+
the email attribute is assigned. With the canonical_email column,
|
227
|
+
we can look up the User, even if the given email address didn't exactly
|
228
|
+
match the registered version.
|
229
|
+
|
230
|
+
class User < ApplicationRecord
|
231
|
+
attribute :email, :email_address
|
232
|
+
attribute :canonical_email, :canonical_email_address
|
185
233
|
|
186
|
-
|
187
|
-
|
188
|
-
original formatting, case, and tag information.
|
234
|
+
validates_with EmailAddress::ActiveRecordValidator,
|
235
|
+
fields: %i(email canonical_email)
|
189
236
|
|
190
|
-
|
237
|
+
def email=(email_address)
|
238
|
+
self[:canonical_email] = email_address
|
239
|
+
self[:email] = email_address
|
240
|
+
end
|
191
241
|
|
192
|
-
|
193
|
-
|
242
|
+
def self.find_by_email(email)
|
243
|
+
user = self.find_by(email: EmailAddress.normal(email))
|
244
|
+
user ||= self.find_by(canonical_email: EmailAddress.canonical(email))
|
245
|
+
user ||= self.find_by(canonical_email: EmailAddress.redact(email))
|
246
|
+
user
|
247
|
+
end
|
194
248
|
|
195
|
-
|
249
|
+
def redact!
|
250
|
+
self[:canonical_email] = EmailAddress.redact(self.canonical_email)
|
251
|
+
self[:email] = self[:canonical_email]
|
252
|
+
end
|
253
|
+
end
|
254
|
+
|
255
|
+
Here is how the User model works:
|
256
|
+
|
257
|
+
user = User.create(email:"Pat.Smith+registrations@gmail.com")
|
258
|
+
user.email #=> "pat.smith+registrations@gmail.com"
|
259
|
+
user.canonical_email #=> "patsmith@gmail.com"
|
260
|
+
User.find_by_email("PAT.SMITH@GMAIL.COM")
|
261
|
+
#=> #<User email="pat.smith+registrations@gmail.com">
|
262
|
+
|
263
|
+
|
264
|
+
The `find_by_email` method looks up a given email address by the
|
265
|
+
normalized form (lower case), then by the canonical form, then finally
|
266
|
+
by the redacted form.
|
267
|
+
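The lookup order can be tried outside of Rails with an in-memory list of rows. This is only a sketch under stated assumptions: `LookupSketch` and its methods are hypothetical stand-ins for the normal, canonical, and redacted forms (Gmail-style rules, "+" tag separator), and an array of hashes stands in for the Users table.

```ruby
require "digest"

# Hypothetical stand-in for the model; not the gem's or Rails' API.
module LookupSketch
  module_function

  def normal(address)
    address.strip.downcase
  end

  def canonical(address)
    local, host = normal(address).split("@", 2)
    "#{local.split('+', 2).first.to_s.delete('.')}@#{host}"
  end

  def redact(address)
    host = normal(address).split("@", 2).last
    "{#{Digest::SHA1.hexdigest(canonical(address))}}@#{host}"
  end

  # rows: array of hashes with :email and :canonical_email keys.
  # Try the normal form, then the canonical form, then the redacted form.
  def find_by_email(rows, address)
    rows.find { |row| row[:email] == normal(address) } ||
      rows.find { |row| row[:canonical_email] == canonical(address) } ||
      rows.find { |row| row[:canonical_email] == redact(address) }
  end
end

users = [{ email: "pat.smith+registrations@gmail.com",
           canonical_email: "patsmith@gmail.com" }]
p LookupSketch.find_by_email(users, "PAT.SMITH@GMAIL.COM")
```

The third step only finds rows whose `canonical_email` already holds the redacted value, which is what the `redact!` method in the model above stores.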
|
268
|
+
#### Validation
|
196
269
|
|
197
|
-
|
198
|
-
|
199
|
-
|
200
|
-
|
270
|
+
The only true validation is to send a message to the email address and
|
271
|
+
have the user (or process) verify it has been received. Syntax checks
|
272
|
+
help prevent erroneous input. Even sent messages can be silently
|
273
|
+
dropped, or bounced back after acceptance. Conditions such as a
|
274
|
+
"Mailbox Full" can mean the email address is known, but abandoned.
|
275
|
+
|
276
|
+
There are different levels of validations you can perform. By default, it will
|
277
|
+
validate to the "Provider" (if known), or "Conventional" format defined as the
|
278
|
+
"default" provider. You may pass a list of parameters to select
|
279
|
+
which syntax and network validations to perform.
|
280
|
+
|
281
|
+
#### Comparison
|
201
282
|
|
202
283
|
You can compare email addresses:
|
203
284
|
|
204
|
-
e1 = EmailAddress.new("
|
205
|
-
|
206
|
-
e2 = EmailAddress.new("FirstLast+tag@Gmail.com")
|
207
|
-
e3.to_s #=> "firstlast+tag@gmail.com"
|
285
|
+
e1 = EmailAddress.new("Clark.Kent@Gmail.com")
|
286
|
+
e2 = EmailAddress.new("clark.kent+Superman@Gmail.com")
|
208
287
|
e3 = EmailAddress.new(e2.redact)
|
209
|
-
|
288
|
+
e1.to_s #=> "clark.kent@gmail.com"
|
289
|
+
e2.to_s #=> "clark.kent+superman@gmail.com"
|
290
|
+
e3.to_s #=> "{bea3f3560a757f8142d38d212a931237b218eb5e}@gmail.com"
|
210
291
|
|
211
292
|
e1 == e2 #=> false (Matches by normalized address)
|
212
293
|
e1.same_as?(e2) #=> true (Matches as canonical address)
|
213
294
|
e1.same_as?(e3) #=> true (Matches as redacted address)
|
214
295
|
e1 < e2 #=> true (Compares using normalized address)
|
215
296
|
|
216
|
-
|
297
|
+
#### Matching
|
217
298
|
|
218
|
-
|
299
|
+
Matching addresses by simple patterns:
|
219
300
|
|
220
|
-
|
221
|
-
|
222
|
-
|
223
|
-
|
224
|
-
|
225
|
-
|
226
|
-
|
227
|
-
|
301
|
+
* Top-Level-Domain: .org
|
302
|
+
* Domain Name: example.com
|
303
|
+
* Registration Name: hotmail. (matches any TLD)
|
304
|
+
* Domain Glob: *.exampl?.com
|
305
|
+
* Provider Name: google
|
306
|
+
* Mailbox Name or Glob: user00*@
|
307
|
+
* Address or Glob: postmaster@domain*.com
|
308
|
+
* Provider or Registration: msn
|
228
309
|
|
229
|
-
|
310
|
+
Usage:
|
230
311
|
|
231
|
-
|
312
|
+
e = EmailAddress.new("Clark.Kent@Gmail.com")
|
313
|
+
e.matches?("gmail.com") #=> true
|
314
|
+
e.matches?("google") #=> true
|
315
|
+
e.matches?(".org") #=> false
|
316
|
+
e.matches?("g*com") #=> true
|
317
|
+
e.matches?("gmail.") #=> true
|
318
|
+
e.matches?("*kent*@") #=> true
|
232
319
|
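Several of the pattern kinds listed above can be approximated with Ruby's built-in `File.fnmatch?` glob matcher. The helper below is a hypothetical sketch, not the gem's matcher: it covers TLD, registration name, domain glob, mailbox glob, and address glob patterns, and it does not know about provider names such as `google`.

```ruby
# Hypothetical matcher for illustration; provider names are not handled.
def sketch_matches?(address, pattern)
  local, host = address.downcase.split("@", 2)
  pattern = pattern.downcase
  flags = File::FNM_DOTMATCH # let "*" match a leading "." as well

  if pattern.end_with?("@")        # mailbox name or glob, e.g. "user00*@"
    File.fnmatch?(pattern.chomp("@"), local, flags)
  elsif pattern.include?("@")      # full address or glob
    File.fnmatch?(pattern, "#{local}@#{host}", flags)
  elsif pattern.start_with?(".")   # top-level domain, e.g. ".org"
    host.end_with?(pattern)
  elsif pattern.end_with?(".")     # registration name, e.g. "hotmail."
    host.start_with?(pattern) || host.include?(".#{pattern}")
  else                             # domain name or domain glob
    File.fnmatch?(pattern, host, flags)
  end
end

p sketch_matches?("Clark.Kent@Gmail.com", "gmail.")
p sketch_matches?("Clark.Kent@Gmail.com", ".org")
```

Checking the trailing "@" first keeps a mailbox glob from being mistaken for a full-address glob.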
|
233
|
-
|
320
|
+
### Configuration
|
234
321
|
|
235
|
-
|
322
|
+
You can pass an options hash on the `.new()` and helper class methods to
|
323
|
+
control how the library treats that address. These can also be
|
324
|
+
configured during initialization by provider and default (see below).
|
236
325
|
|
237
|
-
|
238
|
-
|
239
|
-
* Registration names matching without the TLD. 'yahoo' matches:
|
240
|
-
* "www.yahoo.com" (with Subdomains)
|
241
|
-
* "yahoo.ca" (any TLD)
|
242
|
-
* "yahoo.co.jp" (2-char TLD with 2-char Second-level)
|
243
|
-
* But _may_ also match non-Yahoo domain names (yahoo.xxx)
|
326
|
+
EmailAddress.new("clark.kent@gmail.com",
|
327
|
+
dns_lookup: :off, host_encoding: :unicode)
|
244
328
|
|
245
|
-
|
329
|
+
Globally, you can change and query configuration options:
|
246
330
|
|
247
|
-
|
331
|
+
EmailAddress::Config.setting(:dns_lookup, :mx)
|
332
|
+
EmailAddress::Config.setting(:dns_lookup) #=> :mx
|
248
333
|
|
249
|
-
|
250
|
-
|
251
|
-
|
252
|
-
|
334
|
+
Or set multiple settings at once:
|
335
|
+
|
336
|
+
EmailAddress::Config.configure(local_downcase:false, dns_lookup: :off)
|
337
|
+
|
338
|
+
You can add special rules by domain or provider. It takes the options
|
339
|
+
above and adds the :domain_match and :exchanger_match rules.
|
340
|
+
|
341
|
+
EmailAddress.define_provider('google',
|
342
|
+
domain_match: %w(gmail.com googlemail.com),
|
343
|
+
exchanger_match: %w(google.com), # Requires dns_lookup==:mx
|
344
|
+
local_size: 5..64,
|
345
|
+
mailbox_canonical: ->(m) {m.gsub('.','')})
|
346
|
+
|
347
|
+
The library ships with the most common set of provider rules. It is not meant
|
348
|
+
to house a database of all providers, but a separate `email_address-providers`
|
349
|
+
gem may be created to hold this data for those who need more complete rules.
|
350
|
+
|
351
|
+
Personal and Corporate email systems are not intended for either solution.
|
352
|
+
Any of these email systems may be configured locally.
|
353
|
+
|
354
|
+
Pre-configured email address providers include: Google (gmail), AOL, MSN
|
355
|
+
(hotmail, live, outlook), and Yahoo. Any address not matching one of
|
356
|
+
those patterns uses the "default" provider rule set. Exchanger matches
|
357
|
+
compare against the Mail Exchanger (SMTP receiver) hosts defined in
|
358
|
+
DNS. Specifying an exchanger pattern requires a DNS MX lookup.
|
359
|
+
|
360
|
+
For a Rails application, create an initializer file with your default
|
361
|
+
configuration options:
|
362
|
+
|
363
|
+
# ./config/initializers/email_address.rb
|
364
|
+
EmailAddress::Config.setting( local_format: :relaxed )
|
365
|
+
EmailAddress::Config.provider(:github,
|
366
|
+
host_match: %w(github.com), local_format: :standard)
|
367
|
+
|
368
|
+
### Available Configuration Settings
|
369
|
+
|
370
|
+
* dns_lookup: Enables DNS lookup for validation by
|
371
|
+
* :mx - DNS MX Record lookup
|
372
|
+
* :a - DNS A Record lookup (as some domains incorrectly don't specify an MX)
|
373
|
+
* :off - Do not perform DNS lookup (Test mode, network unavailable)
|
374
|
+
|
375
|
+
* sha1_secret -
|
376
|
+
This application-level secret is appended to the email_address to compute
|
377
|
+
the SHA1 Digest, making it unique to your application so it can't easily be
|
378
|
+
discovered by comparing against a known list of email/sha1 pairs.
|
379
|
+
|
380
|
+
* munge_string - "*****", the string to replace into munged addresses.
|
381
|
+
|
382
|
+
For local part configuration:
|
253
383
|
|
254
|
-
|
384
|
+
* local_downcase: true.
|
385
|
+
Downcase the local part. You probably want this for uniqueness.
|
386
|
+
RFC says the local part is case sensitive; that's a bad part.
|
387
|
+
|
388
|
+
* local_fix: true.
|
389
|
+
Make simple fixes when available, remove spaces, condense multiple punctuations
|
390
|
+
|
391
|
+
* local_encoding: :ascii, :unicode,
|
392
|
+
Enable Unicode in local part. Most mail systems do not yet support this.
|
393
|
+
You probably want to stay with ASCII for now.
|
394
|
+
|
395
|
+
* local_parse: nil, ->(local) { [mailbox, tag, comment] }
|
396
|
+
Specify an optional lambda/Proc to parse the local part. It should return an
|
397
|
+
array (tuple) of mailbox, tag, and comment.
|
398
|
+
|
399
|
+
* local_format:
|
400
|
+
* :conventional - word ( punctuation{1} word )*
|
401
|
+
* :relaxed - alphanum ( allowed_characters)* alphanum
|
402
|
+
* :standard - RFC Compliant email addresses (anything goes!)
|
403
|
+
|
404
|
+
* local_size: 1..64,
|
405
|
+
A Range specifying the allowed size for mailbox + tags + comment
|
406
|
+
|
407
|
+
* tag_separator: nil, character (+)
|
408
|
+
Nil, or a character used to split the tag from the mailbox
|
409
|
+
|
410
|
+
For the mailbox (AKA account, role), without the tag
|
411
|
+
* mailbox_size: 1..64
|
412
|
+
A Range specifying the allowed size for mailbox
|
413
|
+
|
414
|
+
* mailbox_canonical: nil, ->(mailbox) { mailbox }
|
415
|
+
An optional lambda/Proc taking a mailbox name, returning a canonical
|
416
|
+
version of it. (E.G.: gmail removes '.' characters)
|
417
|
+
|
418
|
+
* mailbox_validator: nil, ->(mailbox) { true }
|
419
|
+
An optional lambda/Proc taking a mailbox name, returning true or false.
|
420
|
+
|
421
|
+
* host_encoding: :punycode, :unicode,
|
422
|
+
How to treat International Domain Names (IDN). Note that most mail and
|
423
|
+
DNS systems do not support unicode, so punycode needs to be passed.
|
424
|
+
:punycode Convert Unicode names to punycode representation
|
425
|
+
:unicode Keep Unicode names as is.
|
426
|
+
|
427
|
+
* host_validation:
|
428
|
+
:mx Ensure host is configured with DNS MX records
|
429
|
+
:a Ensure host is known to DNS (A Record)
|
430
|
+
:syntax Validate by syntax only, no Network verification
|
431
|
+
:connect Attempt host connection (not implemented, BAD!)
|
432
|
+
|
433
|
+
* host_size: 1..253,
|
434
|
+
A range specifying the size limit of the host part,
|
435
|
+
|
436
|
+
* host_allow_ip: false,
|
437
|
+
Allow IP address format in host: [127.0.0.1], [IPv6:::1]
|
438
|
+
|
439
|
+
* address_validation: :parts, :smtp, ->(address) { true }
|
440
|
+
Address validation policy
|
441
|
+
:parts Validate local and host.
|
442
|
+
:smtp Validate via SMTP (not implemented, BAD!)
|
443
|
+
A lambda/Proc taking the address string, returning true or false
|
444
|
+
|
445
|
+
* address_size: 3..254,
|
446
|
+
A range specifying the size limit of the complete address
|
447
|
+
|
448
|
+
* address_local: false,
|
449
|
+
Allow localhost, no domain, or local subdomains.
|
450
|
+
|
451
|
+
For provider rules to match to domain names and Exchanger hosts
|
452
|
+
The value is an array of match tokens.
|
453
|
+
* host_match: %w(.org example.com hotmail. user*@ sub.*.com)
|
454
|
+
* exchanger_match: %w(google.com 127.0.0.1 10.9.8.0/24 ::1/64)
|
455
|
+
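The `exchanger_match` tokens mix host names with IP addresses and CIDR ranges. A rough way to evaluate such a list uses Ruby's standard `ipaddr` library. The helper name and its arguments are hypothetical, and the mail exchanger's host name and IP are passed in directly rather than looked up in DNS.

```ruby
require "ipaddr"

# Hypothetical helper: does a mail exchanger (host name plus IP) match
# any token? Tokens may be host names, IP addresses, or CIDR ranges.
def sketch_exchanger_match?(host, ip, tokens)
  address = IPAddr.new(ip)
  tokens.any? do |token|
    begin
      # IP and CIDR tokens parse as IPAddr; a bare IP is a range of one.
      IPAddr.new(token).include?(address)
    rescue ArgumentError
      # Not an IP or CIDR token: compare it as a host name suffix.
      host == token || host.end_with?(".#{token}")
    end
  end
end

tokens = %w(google.com 127.0.0.1 10.9.8.0/24 ::1/64)
p sketch_exchanger_match?("mx.example.net", "10.9.8.7", tokens)
```

Rescuing `ArgumentError` works because `IPAddr::InvalidAddressError` is a subclass of it, so a token like `google.com` falls through to the host-name comparison.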
|
456
|
+
## Notes
|
457
|
+
|
458
|
+
#### Internationalization
|
459
|
+
|
460
|
+
The industry is moving to support Unicode characters in the local part
|
461
|
+
of the email address. Currently, SMTP supports only 7-bit ASCII; a
|
462
|
+
new `SMTPUTF8` standard is available, but not yet widely implemented.
|
463
|
+
To work properly, global Email systems must be converted to UTF-8
|
464
|
+
encoded databases and upgraded to the new email standards.
|
465
|
+
|
466
|
+
The problem with i18n email addresses is that, outside of the
|
467
|
+
given locale, addresses are hard to enter on keyboards for another
|
468
|
+
locale. Because of this, internationalized local parts are not yet
|
469
|
+
supported by default. They are more likely to be erroneous.
|
470
|
+
|
471
|
+
Proper personal identity can still be provided using
|
472
|
+
[MIME Encoded-Words](http://en.wikipedia.org/wiki/MIME#Encoded-Word)
|
473
|
+
in Email headers.
|
474
|
+
|
475
|
+
|
476
|
+
#### Email Addresses as Sensitive Data
|
477
|
+
|
478
|
+
Like Social Security and Credit Card Numbers, email addresses are
|
479
|
+
becoming more important as a personal identifier on the internet.
|
480
|
+
Increasingly, we should treat email addresses as sensitive data. If your
|
481
|
+
site/database becomes compromised by hackers, these email addresses can
|
482
|
+
be stolen and used to spam your users and to try to gain access to their
|
483
|
+
accounts. You should not be storing passwords in plain text; perhaps you
|
484
|
+
don't need to store email addresses un-encoded either.
|
485
|
+
|
486
|
+
Consider this: upon registration, store the redacted email address for
|
487
|
+
the user, and of course, the salted, encrypted password.
|
488
|
+
When the user logs in, compute the redacted email address from
|
489
|
+
the user-supplied one and look up the record. Store the original address
|
490
|
+
in the session for the user, which goes away when the user logs out.
|
491
|
+
|
492
|
+
Sometimes, users demand you strike their information from the database.
|
493
|
+
Instead of deleting their account, you can "redact" their email
|
494
|
+
address, retaining the state of the account to prevent future
|
495
|
+
access. Given the original email address again, the redacted account can
|
496
|
+
be identified if necessary.
|
497
|
+
|
498
|
+
Because of these use cases, the **redact** method on the email address
|
499
|
+
instance has been provided.
|
255
500
|
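The effect of an application-level secret on the SHA1 digest (see the `sha1_secret` setting) can be illustrated with the standard `digest` library. This sketch assumes the secret is simply appended to the lower-cased address before hashing; `sketch_secret_sha1` is a hypothetical name and the gem's exact input to the digest may differ.

```ruby
require "digest"

# Hypothetical helper: SHA1 of the lower-cased address with an
# application secret appended (an assumption about the scheme).
def sketch_secret_sha1(address, secret = "")
  Digest::SHA1.hexdigest(address.strip.downcase + secret)
end

plain  = sketch_secret_sha1("clarkkent@gmail.com")
salted = sketch_secret_sha1("clarkkent@gmail.com", "my-app-secret")

# A digest made with the secret no longer matches a precomputed
# email/sha1 pair for the bare address.
puts plain == salted
```

Because the same address and secret always give the same digest, a stored redacted address can still be found again at query time.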
|
256
501
|
## Contributing
|
257
502
|
|
@@ -260,3 +505,12 @@ See `lib/email_address/config.rb` for more options.
|
|
260
505
|
3. Commit your changes (`git commit -am 'Add some feature'`)
|
261
506
|
4. Push to the branch (`git push origin my-new-feature`)
|
262
507
|
5. Create new Pull Request
|
508
|
+
|
509
|
+
#### Project
|
510
|
+
|
511
|
+
This project lives at [https://github.com/afair/email_address/](https://github.com/afair/email_address/)
|
512
|
+
|
513
|
+
#### Authors
|
514
|
+
|
515
|
+
* [Allen Fair](https://github.com/afair) ([@allenfair](https://twitter.com/allenfair)):
|
516
|
+
I've worked with email-based applications and email addresses since 1999.
|