email_address 0.0.2 → 0.0.3

checksums.yaml CHANGED
@@ -1,7 +1,7 @@
1
1
  ---
2
2
  SHA1:
3
- metadata.gz: da957eceefc70a7dc4a21a1d1f44296e36783021
4
- data.tar.gz: fffc1230d4c3080ab56229e34604838543b659e6
3
+ metadata.gz: 77e8fb26631e827d41da7eee015141495316f430
4
+ data.tar.gz: 0fa66926ca1dbbdc13154aae789850eec7e1eb75
5
5
  SHA512:
6
- metadata.gz: e638d9efbba9834cd1857150db91821d82e6bbbbf95ee694bc30287d5c9f69893d87928ee49b044207e9503accf4eefc354295b29e3d5ec44ad6ee5ab2918653
7
- data.tar.gz: 90749c6bdbc84c2586208eefb04f8eb208812124d3253170d46471b64df4e23c5ff24bc6967c426285e8cd838f71bcd1fb3b910a7a82690ab3a91da5665e9e40
6
+ metadata.gz: 5d41c4ca234699c636811187718fc7b73fe52ff55256ebbeb3e32e518d3330c016a72c3185f9632be0cb4d612064e09408517f893927ddd23db9518d0c219066
7
+ data.tar.gz: 93c7fec757b3a635b39aea3bdc6a259ad5c69b0b5369d4ea48e41cde6009b406a8c2eec94a0bc1e42ed837b220a240d971079135e58455a57d0db5ad1e2df3dd
data/README.md CHANGED
@@ -1,31 +1,152 @@
1
1
  # Email Address
2
2
 
3
- The EmailAddress gem is an _opinionated_ email address handler and
4
- validator. It does not use RFC standards because they make things worse.
5
- Email addresses should conform to a few practices that are not
6
- RFC-Compliant.
7
-
8
- Specifically, local parts (left side of the @):
9
-
10
- * Should not be case sensitive.
11
- * Should not contain spaces or anything that would cause quoting.
12
- * Should not allow Unicode. Addressable items like this need to be
13
- entered from any keyboard, such as the US ASCII character set. (Domain
14
- names too, but that can be handled with Punycode.)
15
- * Should not have comments. Neither should domains.
16
- * Should not allow unusual symbols (not usually in names and standard
17
- punctuation).
18
- * Should not be verified by SMTP connections if possible.
19
- * Should have spaces stripped automatically if enabled
20
- * Should be of a reasonable length to identify the recipient.
21
- * Should be human readable and writable.
22
- * Should continue allowing for tagging.
23
- * Should provide mechanism for handling bounce backs and VERP.
24
- * Should be easily normalized and corrected.
25
- * Should be canonicalized to identify duplicates if necessary.
26
- * Should be able to be stored as a digest for privacy proctections.
27
-
28
- If you're on board, let's go!
3
+ [![Gem Version](https://badge.fury.io/rb/email_address.svg)](http://rubygems.org/gems/email_address)
4
+
5
+ The EmailAddress gem provides a structured datatype for email addresses
6
+ and pushes for an _opinionated_ model for which RFC patterns should be
7
+ accepted as a "best practice" and which should not be supported (in the
8
+ name of sanity).
9
+
10
+ This library provides:
11
+
12
+ * Email Address Validation
13
+ * Converting between email address forms
14
+ * **Original:** From the user or data source
15
+ * **Normalized:** A standardized format for identification
16
+ * **Canonical:** A format used to identify a unique user
17
+ * **Redacted:** A format used to store an email address privately
18
+ * **Reference:** Digest formats for sharing addresses without exposing
19
+ them.
20
+ * Matching addresses to Email/Internet Service Providers. Per-provider
21
+ rules for:
22
+ * Validation
23
+ * Address Tag formats
24
+ * Canonicalization
25
+ * Unicode Support
26
+
27
+ ## Email Addresses: The Good Parts
28
+
29
+ Email Addresses are split into two parts: the `local` and `host` part,
30
+ separated by the `@` symbol, or of the generalized format:
31
+
32
+ mailbox+tag@subdomain.domain.tld
33
+
34
+ The **Mailbox** usually identifies the user, role account, or application.
35
+ A **Tag** is any suffix for the mailbox useful for separating and filtering
36
+ incoming email. It is usually preceded by a '+' or other character. Tags are
37
+ not always available for a given ESP or MTA.
38
+
39
+ Local Parts should consist of lower-case 7-bit ASCII alphanumeric and these characters:
40
+ `-+'.,` It should start with and end with an alphanumeric character and
41
+ no more than one special character should appear together.
42
+
43
+ Host parts contain a lower-case version of any standard domain name.
44
+ International Domain Names are allowed, and can be converted to
45
+ [Punycode](http://en.wikipedia.org/wiki/Punycode),
46
+ an encoding system of Unicode strings into the 7-bit ASCII character set.
47
+ Domain names should be configured with MX records in DNS to receive
48
+ email, though this is sometimes mis-configured and the A record can be
49
+ used as a backup.
50
+
51
+ This is the subset of the RFC Email Address specification that should be
52
+ used.
53
+
54
+ ## Email Addresses: The Bad Parts
55
+
56
+ Email addresses are defined and redefined in a series of RFC standards.
57
+ Conforming to the full standards is not recommended for easily
58
+ identifying and supporting email addresses. Among these specifications,
59
+ the parts we reject are:
60
+
61
+ * Case-sensitive local parts: `First.Last@example.com`
62
+ * Spaces and Special Characters: `"():;<>@[\\]`
63
+ * Quoting and Escaping Requirements: `"first \"nickname\" last"@example.com`
64
+ * Comment Parts: `(comment)mailbox@example.com`
65
+ * IP and IPv6 addresses as hosts: `mailbox@[127.0.0.1]`
66
+ * Non-ASCII (7-bit) characters in the local part: `Pelé@example.com`
67
+ * Validation by regular expressions like:
68
+ ```
69
+ (?:[a-z0-9!#$%&'*+/=?^_`{|}~-]+(?:\.[a-z0-9!#$%&'*+/=?^_`{|}~-]+)*
70
+ | "(?:[\x01-\x08\x0b\x0c\x0e-\x1f\x21\x23-\x5b\x5d-\x7f]
71
+ | \\[\x01-\x09\x0b\x0c\x0e-\x7f])*")
72
+ @ (?:(?:[a-z0-9](?:[a-z0-9-]*[a-z0-9])?\.)+[a-z0-9](?:[a-z0-9-]*[a-z0-9])?
73
+ | \[(?:(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.){3}
74
+ (?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?|[a-z0-9-]*[a-z0-9]:
75
+ (?:[\x01-\x08\x0b\x0c\x0e-\x1f\x21-\x5a\x53-\x7f]
76
+ | \\[\x01-\x09\x0b\x0c\x0e-\x7f])+)
77
+ \])
78
+ ```
79
+
80
+ ## Internationalization
81
+
82
+ The industry is moving to support Unicode characters in the local part
83
+ of the email address. Currently, SMTP supports only 7-bit ASCII; a
84
+ new `SMTPUTF8` standard is available but not yet widely implemented.
85
+ To work properly, global Email systems must be converted to UTF-8
86
+ encoded databases and upgraded to the new email standards.
87
+
88
+ The problem with i18n email addresses is that, outside of the
89
+ given locale, they are hard to enter on keyboards built for another
90
+ locale. Because of this, internationalized local parts are not yet
91
+ supported by default. They are more likely to be erroneous.
92
+
93
+ Proper personal identity can still be provided using
94
+ [MIME Encoded-Words](http://en.wikipedia.org/wiki/MIME#Encoded-Word)
95
+ in Email headers.
96
+
97
+ ## Email Address Forms
98
+
99
+ * The **original** email address is of the format given by the user.
100
+ * The **Normalized** address has:
101
+ * Lower-case the local and domain part
102
+ * Tags are kept as they are important for the user
103
+ * Remove comments and any "bad parts"
104
+ * This format is what should be used to identify the account.
105
+ * The **Canonical** form is used to uniquely identify the mailbox.
106
+ * Domains stored as punycode for IDN
107
+ * Address Tags removed
108
+ * Special characters removed (dots in gmail addresses are not
109
+ significant)
110
+ * Lower cased and "bad parts" removed
111
+ * Useful for locating a user who forgets registering with a tag or
112
+ with a "Bad part" in the email address.
113
+ * The **Redacted** format is used to store email address fingerprints
114
+ instead of the actual addresses:
115
+ * Format: sha1(canonical_address)@domain
116
+ * Given an email address, the record can be found
117
+ * Useful for treating email addresses as sensitive data and
118
+ complying with requests to remove the address from your database and
119
+ still maintain the state of the account.
120
+ * The **Reference** form allows you to publicly share an address without
121
+ revealing the actual address.
122
+ * Can be the MD5 or SHA1 of the normalized or canonical address
123
+ * Useful for "do not email" lists
124
+ * Useful for cookies that do not reveal the actual account
125
+
126
+ ## Treating Email Addresses as Sensitive Data
127
+
128
+ Like Social Security and Credit Card Numbers, email addresses are
129
+ becoming more important as a personal identifier on the internet.
130
+ Increasingly, we should treat email addresses as sensitive data. If your
131
+ site/database becomes compromised by hackers, these email addresses can
132
+ be stolen and used to spam your users and to try to gain access to their
133
+ accounts. You should not be storing passwords in plain text; perhaps you
134
+ don't need to store email addresses un-encoded either.
135
+
136
+ Consider this: upon registration, store the redacted email address for
137
+ the user, and of course, the salted, encrypted password.
138
+ When the user logs in, compute the redacted email address from
139
+ the user-supplied one and look up the record. Store the original address
140
+ in the session for the user, which goes away when the user logs out.
141
+
142
+ Sometimes, users demand you strike their information from the database.
143
+ Instead of deleting their account, you can "redact" their email
144
+ address, retaining the state of the account to prevent future
145
+ access. Given the original email address again, the redacted account can
146
+ be identified if necessary.
147
+
148
+ Because of these use cases, the **redact** method on the email address
149
+ instance has been provided.
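
The address forms described in the README section above can be sketched without the gem. This is a minimal, dependency-free illustration and not the gem's implementation: `address_forms` and its `strip_dots` option are hypothetical names, tags are assumed to follow a `+` separator, and Punycode conversion and removal of "bad parts" are left out.

```ruby
require "digest/sha1"
require "digest/md5"

# Hypothetical helper, not part of the gem: derives the address forms
# named in the README from a raw address string.
def address_forms(original, tag_separator: "+", strip_dots: false)
  local, host = original.strip.split("@", 2)
  local = local.downcase
  host  = host.downcase

  # Normalized: lower-cased, tag kept.
  normalized = "#{local}@#{host}"

  # Canonical: tag removed, plus provider-specific cleanup
  # (assumption: dots are only stripped when the caller asks for it).
  mailbox = local.split(tag_separator, 2).first
  mailbox = mailbox.delete(".") if strip_dots
  canonical = "#{mailbox}@#{host}"

  {
    original:   original,
    normalized: normalized,
    canonical:  canonical,
    # Redacted: sha1(canonical_address)@domain, as the README describes.
    redacted:   "#{Digest::SHA1.hexdigest(canonical)}@#{host}",
    # Reference: a bare digest of the canonical address.
    reference:  Digest::MD5.hexdigest(canonical)
  }
end

forms = address_forms("First.Last+Tag@Gmail.Com", strip_dots: true)
puts forms[:normalized]  # first.last+tag@gmail.com
puts forms[:canonical]   # firstlast@gmail.com
```

The redacted and reference values depend only on the canonical form, so two spellings of the same mailbox produce the same fingerprint.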
29
150
 
30
151
  ## Installation
31
152
 
@@ -49,6 +170,9 @@ EmailAddress:
49
170
  email = EmailAddress.new("USER+tag@EXAMPLE.com")
50
171
  email.normalize #=> "user+tag@example.com"
51
172
  email.canonical #=> "user@example.com"
173
+ email.redact #=> "63a710569261a24b3766275b7000ce8d7b32e2f7@example.com"
174
+ email.sha1 #=> "63a710569261a24b3766275b7000ce8d7b32e2f7"
175
+ email.md5 #=> "dea073fb289e438a6d69c5384113454c"
52
176
 
53
177
  Email Service Provider (ESP) specific edits can be created to provide
54
178
  validations and canonical manipulations. A few are given out of the box.
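
The registration and login flow that the README proposes for redacted addresses might look roughly like the sketch below. `RedactedUserStore` is a hypothetical in-memory stand-in for a database table, and its `redact` helper uses a simplified canonical form (lower-case, `+tag` dropped) rather than the gem's provider rules.

```ruby
require "digest/sha1"

# Hypothetical in-memory store keyed by the redacted address, so the
# plain address is never persisted.
class RedactedUserStore
  def initialize
    @records = {}
  end

  # Simplified redaction: SHA1 of "mailbox@host", kept at the host.
  def self.redact(address)
    local, host = address.strip.downcase.split("@", 2)
    mailbox = local.split("+", 2).first
    "#{Digest::SHA1.hexdigest("#{mailbox}@#{host}")}@#{host}"
  end

  def register(address, password_digest)
    @records[self.class.redact(address)] = { password_digest: password_digest }
  end

  # Recompute the redacted form from the user-supplied address to look
  # the record up again at login.
  def find(address)
    @records[self.class.redact(address)]
  end
end

store = RedactedUserStore.new
store.register("User+news@Example.com", "stored-password-digest")
puts store.find("user@example.com").inspect
```

Because the lookup key is derived from the canonical address, a user who registered with a tag can still be found when they log in without it.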
@@ -56,7 +180,7 @@ Providers can be defined bu email domain match rules, or by match rules
56
180
  for the MX host names using domains or CIDR addresses.
57
181
 
58
182
  email = EmailAddress.new("First.Last+Tag@Gmail.Com")
59
- email.provider #=> :gmail
183
+ email.provider #=> :google
60
184
  email.canonical #=> "firstlast@gmail.com"
61
185
 
62
186
  Storing the canonical address with the request address (don't remove
@@ -71,32 +195,36 @@ You can inspect the MX (Mail Exchanger) records
71
195
  You can see if it validates as an opinionated address:
72
196
 
73
197
  email.valid? # Reasonably valid?
198
+ email.errors #=> [:mx]
74
199
  email.valid_host? # Host name is defined in DNS
75
200
  email.strict? # Strictly valid?
76
201
 
77
- Email addresses should be able to be "archived" or stored in a digest
78
- format. This allows you to safely keep a record of the address and still
79
- protect the account's privacy after it has been closed. Given an address
80
- for inquiry, it can still look up a closed account.
202
+ You can compare email addresses:
81
203
 
82
- email.md5 #=> "dea073fb289e438a6d69c5384113454c"
83
- email.archive #=> "554d32017ab3a7fcf51c88ffce078689003bc521@gmail.com"
204
+ e1 = EmailAddress.new("First.Last@Gmail.com")
205
+ e1.to_s #=> "first.last@gmail.com"
206
+ e2 = EmailAddress.new("FirstLast+tag@Gmail.com")
207
+ e2.to_s #=> "firstlast+tag@gmail.com"
208
+ e3 = EmailAddress.new(e2.redact)
209
+ e3.to_s #=> "554d32017ab3a7fcf51c88ffce078689003bc521@gmail.com"
84
210
 
211
+ e1 == e2 #=> false (Matches by normalized address)
212
+ e1.same_as?(e2) #=> true (Matches as canonical address)
213
+ e1.same_as?(e3) #=> true (Matches as redacted address)
214
+ e1 < e2 #=> true (Compares using normalized address)
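
The comparison rules in the example above (equality and ordering on the normalized form, `same_as?` on the canonical form) can be reproduced with Ruby's `Comparable`. `SketchAddress` below is a hypothetical stand-in, not the gem's class; it hard-codes a Gmail-only dot rule and omits redacted matching.

```ruby
# Hypothetical stand-in for the gem's address class, reduced to the
# comparison logic.
class SketchAddress
  include Comparable
  attr_reader :normalized, :canonical

  def initialize(address)
    local, host = address.strip.downcase.split("@", 2)
    @normalized = "#{local}@#{host}"
    mailbox = local.split("+", 2).first
    # Assumption: only gmail.com treats dots as insignificant.
    mailbox = mailbox.delete(".") if host == "gmail.com"
    @canonical = "#{mailbox}@#{host}"
  end

  # Comparable derives ==, <, > and friends from this method, so all of
  # them compare the normalized form.
  def <=>(other)
    normalized <=> other.normalized
  end

  # Canonical match: tags and insignificant dots are ignored.
  def same_as?(other)
    canonical == other.canonical
  end
end

e1 = SketchAddress.new("First.Last@Gmail.com")
e2 = SketchAddress.new("FirstLast+tag@Gmail.com")
puts e1 == e2         # false
puts e1.same_as?(e2)  # true
puts e1 < e2          # true
```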
85
215
 
86
- ## Email Address Parts
216
+ ## Host Inspection
87
217
 
88
- The Local and Domain Parsing routines divvy the email address into these
89
- parts from `(comment)mailbox+tag@subdomain.domain.tld`
218
+ The `EmailAddress::Host` can be used to inspect the email domain.
90
219
 
91
- * Local - Everything to the left of the "@"
92
- * Mailbox - The controlling mailbox name
93
- * Tag - Anything after the mailbox and tag separator character (usually "+")
94
- * Comment - To be removed from the normalized form
95
- * Host Name - Everything to the right of the "@"
96
- * Subdomains - of the host name, if any.
97
- * Domain Name - host name without subdomains, with TLD
98
- * TLD - the rightmost word or set of 2-character domains ("co.uk")
99
- * Registration Name - host name without subdomain or TLD
220
+ ```ruby
221
+ e1 = EmailAddress.new("First.Last@Gmail.com")
222
+ e1.host.name #=> "gmail.com"
223
+ e1.host.exchanger.mxers #=> [["alt4.gmail-smtp-in.l.google.com", "2a00:1450:400c:c01::1b", 30],...]
224
+ e1.host.exchanger.mx_ips #=> ["2a00:1450:400c:c01::1b", ...]
225
+ e1.host.matches?('.com') #=> true
226
+ e1.host.txt #=> "v=spf1 redirect=_spf.google.com"
227
+ ```
100
228
 
101
229
  ## Domain Matching
102
230
 
data/Rakefile CHANGED
@@ -3,3 +3,21 @@ require "bundler/gem_tasks"
3
3
  task :default do
4
4
  sh "ruby test/email_address/*"
5
5
  end
6
+ require "bundler/gem_tasks"
7
+ require "bundler/setup"
8
+ require 'rake/testtask'
9
+
10
+ task :default => :test
11
+
12
+ desc "Run the Test Suite, toot suite"
13
+ task :test do
14
+ sh "find test -name 'test*rb' -exec ruby {} \\;"
15
+ end
16
+
17
+ desc "Open an IRB Console with the gem loaded"
18
+ task :console do
19
+ sh "bundle exec irb -Ilib -I . -r email_address"
20
+ #require 'irb'
21
+ #ARGV.clear
22
+ #IRB.start
23
+ end
data/email_address.gemspec CHANGED
@@ -20,6 +20,7 @@ validator.}
20
20
  spec.require_paths = ["lib"]
21
21
 
22
22
  spec.add_development_dependency "bundler", "~> 1.3"
23
+ spec.add_development_dependency "activemodel", "~> 4.2"
23
24
  spec.add_development_dependency "rake"
24
25
  spec.add_dependency "simpleidn"
25
26
  spec.add_dependency "netaddr"
data/lib/email_address.rb CHANGED
@@ -5,13 +5,38 @@ require "email_address/domain_parser"
5
5
  require "email_address/exchanger"
6
6
  require "email_address/host"
7
7
  require "email_address/local"
8
+ require "email_address/matcher"
8
9
  require "email_address/validator"
9
10
  require "email_address/version"
11
+ require "email_address/active_record_validator" if defined?(ActiveModel)
10
12
 
11
13
  module EmailAddress
12
14
 
13
- def self.new(address)
14
- EmailAddress::Address.new(address)
15
+ # Creates an instance of this email address.
16
+ # This is a shortcut for EmailAddress::Address.new
17
+ def self.new(email_address)
18
+ EmailAddress::Address.new(email_address)
15
19
  end
16
20
 
21
+ # Given an email address, this returns true if the email validates, false otherwise
22
+ def self.valid?(email_address, options={})
23
+ self.new(email_address).valid?(options)
24
+ end
25
+
26
+ # Shortcut to normalize the given email address
27
+ def self.normal(email_address)
28
+ EmailAddress::Address.new(email_address).normalize
29
+ end
30
+
31
+ def self.new_normal(email_address)
32
+ EmailAddress::Address.new(EmailAddress::Address.new(email_address).normalize)
33
+ end
34
+
35
+ def self.canonical(email_address)
36
+ EmailAddress::Address.new(email_address).canonical
37
+ end
38
+
39
+ def self.new_canonical(email_address)
40
+ EmailAddress::Address.new(EmailAddress::Address.new(email_address).canonical)
41
+ end
17
42
  end
data/lib/email_address/active_record_validator.rb ADDED
@@ -0,0 +1,42 @@
1
+ module EmailAddress
2
+
3
+ # ActiveRecord validator class for validating an email
4
+ # address with this library.
5
+ # Note the initialization happens once per process.
6
+ #
7
+ # Usage:
8
+ # validates_with EmailAddress::ActiveRecordValidator, field: :name
9
+ #
10
+ # Options:
11
+ # field: :email,
12
+ # fields: [:email1, :email2]
13
+ # Default field:
14
+ # :email or :email_address (first found)
15
+ #
16
+ class ActiveRecordValidator < ActiveModel::Validator
17
+
18
+ def initialize(options={})
19
+ @opt = options
20
+ end
21
+
22
+ def validate(r)
23
+ if @opt[:fields]
24
+ @opt[:fields].each {|f| validate_email(r, f) }
25
+ elsif @opt[:field]
26
+ validate_email(r, @opt[:field])
27
+ elsif r.respond_to? :email
28
+ validate_email(r, :email)
29
+ elsif r.respond_to? :email_address
30
+ validate_email(r, :email_address)
31
+ end
32
+ end
33
+
34
+ def validate_email(r,f)
35
+ return if r[f].nil?
36
+ e = EmailAddress.new(r[f])
37
+ r.errors[f] << "Email Address Not Valid" unless e.valid?
38
+ end
39
+
40
+ end
41
+
42
+ end
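
The order in which the validator above picks its fields (`:fields`, then `:field`, then `:email`, then `:email_address`) can be exercised without ActiveModel. The sketch below is an assumption-heavy stand-in: `SketchRecord`, `fields_to_validate` and `validate_record` are hypothetical names, and the validity check is a placeholder where the gem would call `EmailAddress.new(value).valid?`.

```ruby
# Hypothetical stand-in for a model: a Struct with an errors hash.
SketchRecord = Struct.new(:email, :backup_email) do
  def errors
    @errors ||= Hash.new { |hash, key| hash[key] = [] }
  end
end

# Chooses the fields to check in the same order as the validator.
def fields_to_validate(record, options = {})
  if options[:fields]
    options[:fields]
  elsif options[:field]
    [options[:field]]
  elsif record.respond_to?(:email)
    [:email]
  elsif record.respond_to?(:email_address)
    [:email_address]
  else
    []
  end
end

# Placeholder check (assumption): a value is "valid" if it contains "@".
def validate_record(record, options = {})
  fields_to_validate(record, options).each do |field|
    value = record[field]
    next if value.nil? # nil values are skipped, as in the validator
    record.errors[field] << "Email Address Not Valid" unless value.include?("@")
  end
  record.errors
end

record = SketchRecord.new("user@example.com", "not-an-address")
puts validate_record(record, fields: [:email, :backup_email]).inspect
```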
data/lib/email_address/address.rb CHANGED
@@ -3,72 +3,159 @@ require 'digest/md5'
3
3
 
4
4
  module EmailAddress
5
5
  class Address
6
+ include Comparable
6
7
 
7
- def initialize(address)
8
- @address = address
9
- parse
10
- end
11
-
12
- def parse
13
- (_, local, host) = @address.match(/\A(.+)@(.+)/).to_a
14
- @host = EmailAddress::Host.new(host)
15
- @local = EmailAddress::Local.new(local, @host.provider)
8
+ # Given an email address of the form "local@hostname", this sets up the
9
+ # instance, and initializes the address to the "normalized" format of the
10
+ # address. The original string is available in the #original method.
11
+ def initialize(email_address)
12
+ @original = email_address
13
+ (local, host) = email_address.split('@', 2)
14
+ @host = EmailAddress::Host.new(host)
15
+ @local = EmailAddress::Local.new(local||@original, @host)
16
16
  end
17
17
 
18
+ # Returns the EmailAddress::Host to inspect the host name of the address
18
19
  def host
19
20
  @host
20
21
  end
21
22
 
23
+ # Returns the EmailAddress::Local to inspect the data to the left of the @
24
+ # Use the #left method to access the full string
22
25
  def local
23
26
  @local
24
27
  end
25
28
 
29
+ # Everything to the left of the @ in the address, called the local part.
30
+ def left
31
+ local.to_s
32
+ end
33
+
34
+ # Returns the mailbox portion of the local part, with no tags. Usually, this
35
+ # can be considered the user account or role account names. Some systems
36
+ # employ dynamic email addresses which don't have the same meaning.
26
37
  def mailbox
27
38
  @local.mailbox
28
39
  end
29
40
 
41
+ # Returns the host name, the part to the right of the @ sign.
30
42
  def host_name
31
43
  @host.host_name
32
44
  end
45
+ alias :right :host_name
33
46
 
47
+ # Returns the tag part of the local address, or nil if not given.
34
48
  def tag
35
49
  @local.tag
36
50
  end
37
51
 
52
+ # Returns any comments parsed from the local part of the email address.
53
+ # This is retained for inspection after construction, even if it is
54
+ # removed from the normalized email address.
38
55
  def comment
39
56
  @local.comment
40
57
  end
41
58
 
59
+ # Returns the ESP (Email Service Provider) or ISP name derived
60
+ # using the provider configuration rules.
42
61
  def provider
43
62
  @host.provider
44
63
  end
45
64
 
65
+ # Returns the string representation of the normalized email address.
46
66
  def to_s
47
67
  normalize
48
68
  end
49
69
 
50
- def normalize
70
+ # The original email address in the request (unmodified).
71
+ def original
72
+ @original
73
+ end
74
+
75
+ # Returns the normalized email address according to the provider
76
+ # and system normalization rules. Usually this downcases the address,
77
+ # removes spaces and comments, but includes any tags.
78
+ def normal
51
79
  [@local.normalize, @host.normalize].join('@')
52
80
  end
81
+ alias :normalize :normal
53
82
 
83
+ # Returns the canonical email address according to the provider
84
+ # uniqueness rules. Usually, this downcases the address, removes
85
+ # spaces and comments and tags, and any extraneous part of the address
86
+ # not considered a unique account by the provider.
54
87
  def canonical
55
88
  [@local.canonical, @host.canonical].join('@')
56
89
  end
90
+ alias :uniq :canonical
91
+ alias :canonicalize :canonical
57
92
 
93
+ # Returns an MD5 of the canonical address form. Some cross-system integrations
94
+ # use the email address MD5 instead of the actual address to refer to the
95
+ # same shared user identity without exposing the actual address when it
96
+ # is not known in common.
58
97
  def md5
59
98
  Digest::MD5.hexdigest(canonical)
60
99
  end
61
100
 
101
+ def canonical_md5
102
+ Digest::MD5.hexdigest(self.canonical)
103
+ end
104
+
105
+ # This returns the SHA1 digest (in a hex string) of the canonical email
106
+ # address. See #md5 for more background.
62
107
  def sha1
63
108
  Digest::SHA1.hexdigest(canonical)
64
109
  end
65
110
 
66
- def archive
111
+ # Equal matches the normalized version of each address. Use #same_as? to check
112
+ # for a match on canonical or redacted versions of addresses.
113
+ def ==(other_email)
114
+ normalize == other_email.normalize
115
+ end
116
+ alias :eql? :==
117
+ alias :equal? :==
118
+
119
+ # Returns true if this address matches another by canonical form, or if
120
+ # one of the two is the redacted form of the other.
121
+ def same_as?(other_email)
122
+ canonical == other_email.canonical ||
123
+ redact == other_email.canonical || canonical == other_email.redact
124
+ end
125
+ alias :include? :same_as?
126
+
127
+ # Return the <=> or CMP comparison operator result (-1, 0, +1) on the comparison
128
+ # of this address with another, using the normalized form.
129
+ def <=>(other_email)
130
+ normalize <=> other_email.normalize
131
+ end
132
+
133
+ # Redact the address for storage. To protect the user's privacy,
134
+ # use this when you don't want to store a real email, only a fingerprint.
135
+ # Given the original address, you can match the original with this method.
136
+ # This returns the SHA1 of the canonical address (no tags, no gmail dots)
137
+ # at the original host. The host is part of the digest part, but also
138
+ # retained for verification and domain maintenance.
139
+ def redact
67
140
  [sha1, @host.canonical].join('@')
68
141
  end
69
142
 
143
+ def redacted?
144
+ @local.to_s =~ /\A[0-9a-f]{40}\z/ ? true : false
145
+ end
146
+
147
+ # Returns true if this address is considered valid according to the format
148
+ # configured for its provider. It tests the normalized form.
70
149
  def valid?(options={})
71
150
  EmailAddress::Validator.validate(self, options)
72
151
  end
152
+
153
+ # Returns an array of error messages generated from the validation process via
154
+ # the #valid? method.
155
+ def errors(options={})
156
+ v = EmailAddress::Validator.new(self, options)
157
+ v.valid?
158
+ v.errors
159
+ end
73
160
  end
74
161
  end
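
The validator that `valid?` delegates to is not part of this diff, so its rules are not shown here. As a rough illustration only, the local-part shape described in the README's "Good Parts" section (lower-case ASCII alphanumerics, single `-+'.,` separators, alphanumeric at both ends) could be checked with a pattern like the one below. `LOCAL_PART` and `good_local_part?` are hypothetical names, and this is not claimed to match the gem's actual validation.

```ruby
# Assumed pattern: runs of lower-case alphanumerics separated by exactly
# one of the allowed special characters.
LOCAL_PART = /\A[a-z0-9]+(?:[-+'.,][a-z0-9]+)*\z/

def good_local_part?(local)
  LOCAL_PART.match?(local)
end

puts good_local_part?("first.last+tag")  # true
puts good_local_part?("first..last")     # false (two specials together)
puts good_local_part?(".first")          # false (starts with a special)
puts good_local_part?("First.Last")      # false (upper case)
```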