email_address 0.0.2 → 0.0.3

checksums.yaml CHANGED
@@ -1,7 +1,7 @@
1
1
  ---
2
2
  SHA1:
3
- metadata.gz: da957eceefc70a7dc4a21a1d1f44296e36783021
4
- data.tar.gz: fffc1230d4c3080ab56229e34604838543b659e6
3
+ metadata.gz: 77e8fb26631e827d41da7eee015141495316f430
4
+ data.tar.gz: 0fa66926ca1dbbdc13154aae789850eec7e1eb75
5
5
  SHA512:
6
- metadata.gz: e638d9efbba9834cd1857150db91821d82e6bbbbf95ee694bc30287d5c9f69893d87928ee49b044207e9503accf4eefc354295b29e3d5ec44ad6ee5ab2918653
7
- data.tar.gz: 90749c6bdbc84c2586208eefb04f8eb208812124d3253170d46471b64df4e23c5ff24bc6967c426285e8cd838f71bcd1fb3b910a7a82690ab3a91da5665e9e40
6
+ metadata.gz: 5d41c4ca234699c636811187718fc7b73fe52ff55256ebbeb3e32e518d3330c016a72c3185f9632be0cb4d612064e09408517f893927ddd23db9518d0c219066
7
+ data.tar.gz: 93c7fec757b3a635b39aea3bdc6a259ad5c69b0b5369d4ea48e41cde6009b406a8c2eec94a0bc1e42ed837b220a240d971079135e58455a57d0db5ad1e2df3dd
data/README.md CHANGED
@@ -1,31 +1,152 @@
1
1
  # Email Address
2
2
 
3
- The EmailAddress gem is an _opinionated_ email address handler and
4
- validator. It does not use RFC standards because they make things worse.
5
- Email addresses should conform to a few practices that are not
6
- RFC-Compliant.
7
-
8
- Specifically, local parts (left side of the @):
9
-
10
- * Should not be case sensitive.
11
- * Should not contain spaces or anything that would cause quoting.
12
- * Should not allow Unicode. Addressable items like this need to be
13
- entered from any keyboard, such as the US ASCII character set. (Domain
14
- names too, but that can be handled with Punycode.)
15
- * Should not have comments. Neither should domains.
16
- * Should not allow unusual symbols (not usually in names and standard
17
- punctuation).
18
- * Should not be verified by SMTP connections if possible.
19
- * Should have spaces stripped automatically if enabled
20
- * Should be of a reasonable length to identify the recipient.
21
- * Should be human readable and writable.
22
- * Should continue allowing for tagging.
23
- * Should provide mechanism for handling bounce backs and VERP.
24
- * Should be easily normalized and corrected.
25
- * Should be canonicalized to identify duplicates if necessary.
26
- * Should be able to be stored as a digest for privacy protections.
27
-
28
- If you're on board, let's go!
3
+ [![Gem Version](https://badge.fury.io/rb/email_address.svg)](http://rubygems.org/gems/email_address)
4
+
5
+ The EmailAddress gem provides a structured datatype for email addresses
6
+ and pushes for an _opinionated_ model for which RFC patterns should be
7
+ accepted as a "best practice" and which should not be supported (in the
8
+ name of sanity).
9
+
10
+ This library provides:
11
+
12
+ * Email Address Validation
13
+ * Converting between email address forms
14
+ * **Original:** From the user or data source
15
+ * **Normalized:** A standardized format for identification
16
+ * **Canonical:** A format used to identify a unique user
17
+ * **Redacted:** A format used to store an email address privately
18
+ * **Reference:** Digest formats for sharing addresses without exposing
19
+ them.
20
+ * Matching addresses to Email/Internet Service Providers. Per-provider
21
+ rules for:
22
+ * Validation
23
+ * Address Tag formats
24
+ * Canonicalization
25
+ * Unicode Support
26
+
27
+ ## Email Addresses: The Good Parts
28
+
29
+ Email addresses are split into two parts, the `local` part and the `host` part,
30
+ separated by the `@` symbol. The generalized format is:
31
+
32
+ mailbox+tag@subdomain.domain.tld
33
+
34
+ The **Mailbox** usually identifies the user, role account, or application.
35
+ A **Tag** is any suffix for the mailbox useful for separating and filtering
36
+ incoming email. It is usually preceded by a '+' or other character. Tags are
37
+ not always available for a given ESP or MTA.
38
+
39
+ Local parts should consist of lower-case 7-bit ASCII alphanumerics and these characters:
40
+ `-+'.,`. A local part should start and end with an alphanumeric character, and
41
+ special characters should not appear next to each other.
42
+
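As a rough illustration of the local-part rules in the preceding paragraph, the check below expresses them as a single regular expression. It is a sketch of those stated rules only: the method name `plausible_local?` is hypothetical and not part of the gem, whose provider-specific rules may differ.

```ruby
# Hypothetical sketch of the README's local-part rules (not the gem's API):
# runs of lower-case ASCII alphanumerics, optionally joined by a single
# special character from the set -+'.,
def plausible_local?(local)
  /\A[a-z0-9]+(?:[-+'.,][a-z0-9]+)*\z/.match?(local)
end

puts plausible_local?("first.last+tag")  # true
puts plausible_local?("first..last")     # false: two special characters together
puts plausible_local?(".first")          # false: must start with an alphanumeric
puts plausible_local?("First.Last")      # false: upper case is not accepted
```

Anchoring with `\A` and `\z` rather than `^` and `$` keeps a trailing newline or a multi-line string from slipping through.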
43
+ Host parts contain a lower-case version of any standard domain name.
44
+ International Domain Names are allowed, and can be converted to
45
+ [Punycode](http://en.wikipedia.org/wiki/Punycode),
46
+ an encoding system of Unicode strings into the 7-bit ASCII character set.
47
+ Domain names should be configured with MX records in DNS to receive
48
+ email, though this is sometimes mis-configured and the A record can be
49
+ used as a backup.
50
+
51
+ This is the subset of the RFC Email Address specification that should be
52
+ used.
53
+
54
+ ## Email Addresses: The Bad Parts
55
+
56
+ Email addresses are defined and redefined in a series of RFC standards.
57
+ Conforming to the full standards is not recommended for easily
58
+ identifying and supporting email addresses. Among these specifications,
59
+ the parts we reject are:
60
+
61
+ * Case-sensitive local parts: `First.Last@example.com`
62
+ * Spaces and Special Characters: `"():;<>@[\\]`
63
+ * Quoting and Escaping Requirements: `"first \"nickname\" last"@example.com`
64
+ * Comment Parts: `(comment)mailbox@example.com`
65
+ * IP and IPv6 addresses as hosts: `mailbox@[127.0.0.1]`
66
+ * Non-ASCII (7-bit) characters in the local part: `Pelé@example.com`
67
+ * Validation by regular expressions like:
68
+ ```
69
+ (?:[a-z0-9!#$%&'*+/=?^_`{|}~-]+(?:\.[a-z0-9!#$%&'*+/=?^_`{|}~-]+)*
70
+ | "(?:[\x01-\x08\x0b\x0c\x0e-\x1f\x21\x23-\x5b\x5d-\x7f]
71
+ | \\[\x01-\x09\x0b\x0c\x0e-\x7f])*")
72
+ @ (?:(?:[a-z0-9](?:[a-z0-9-]*[a-z0-9])?\.)+[a-z0-9](?:[a-z0-9-]*[a-z0-9])?
73
+ | \[(?:(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.){3}
74
+ (?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?|[a-z0-9-]*[a-z0-9]:
75
+ (?:[\x01-\x08\x0b\x0c\x0e-\x1f\x21-\x5a\x53-\x7f]
76
+ | \\[\x01-\x09\x0b\x0c\x0e-\x7f])+)
77
+ \])
78
+ ```
79
+
80
+ ## Internationalization
81
+
82
+ The industry is moving to support Unicode characters in the local part
83
+ of the email address. Currently, SMTP supports only 7-bit ASCII; a
84
+ new `SMTPUTF8` standard is available but not yet widely implemented.
85
+ To work properly, global Email systems must be converted to UTF-8
86
+ encoded databases and upgraded to the new email standards.
87
+
88
+ The problem with i18n email addresses is that, outside of the
89
+ given locale, addresses become hard to enter on keyboards for another
90
+ locale. Because of this, internationalized local parts are not yet
91
+ supported by default. They are more likely to be erroneous.
92
+
93
+ Proper personal identity can still be provided using
94
+ [MIME Encoded-Words](http://en.wikipedia.org/wiki/MIME#Encoded-Word)
95
+ in Email headers.
96
+
97
+ ## Email Address Forms
98
+
99
+ * The **original** email address is of the format given by the user.
100
+ * The **Normalized** address has:
101
+ * Local and domain parts lower-cased
102
+ * Tags kept, as they are important for the user
103
+ * Comments and any "bad parts" removed
104
+ * This format is what should be used to identify the account.
105
+ * The **Canonical** form is used to uniquely identify the mailbox.
106
+ * Domains stored as punycode for IDN
107
+ * Address Tags removed
108
+ * Special characters removed (dots in gmail addresses are not
109
+ significant)
110
+ * Lower cased and "bad parts" removed
111
+ * Useful for locating a user who forgot that they registered with a tag or
112
+ with a "bad part" in the email address.
113
+ * The **Redacted** format is used to store email address fingerprints
114
+ instead of the actual addresses:
115
+ * Format: sha1(canonical_address)@domain
116
+ * Given an email address, the record can be found
117
+ * Useful for treating email addresses as sensitive data and
118
+ complying with requests to remove the address from your database and
119
+ still maintain the state of the account.
120
+ * The **Reference** form allows you to publicly share an address without
121
+ revealing the actual address.
122
+ * Can be the MD5 or SHA1 of the normalized or canonical address
123
+ * Useful for "do not email" lists
124
+ * Useful for cookies that do not reveal the actual account
125
+
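To make the difference between the normalized and canonical forms listed above concrete, here is a small stand-alone sketch. The helper names `normalized` and `canonical` are hypothetical, and the tag and dot handling is a simplified Gmail-style assumption rather than the gem's per-provider rules.

```ruby
# Hypothetical helpers illustrating two of the forms listed above.
# Normalized: lower-cased, tag kept.
def normalized(address)
  address.strip.downcase
end

# Canonical: tag removed, and dots removed for gmail.com only
# (an assumption standing in for the gem's provider configuration).
def canonical(address)
  local, host = normalized(address).split("@", 2)
  mailbox = local.split("+", 2).first
  mailbox = mailbox.delete(".") if host == "gmail.com"
  "#{mailbox}@#{host}"
end

original = "First.Last+Tag@Gmail.Com"
puts normalized(original)  # first.last+tag@gmail.com
puts canonical(original)   # firstlast@gmail.com
```

The normalized form is what the user would recognize as their address; the canonical form is what two differently written addresses for the same mailbox collapse to.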
126
+ ## Treating Email Addresses as Sensitive Data
127
+
128
+ Like Social Security and Credit Card Numbers, email addresses are
129
+ becoming more important as a personal identifier on the internet.
130
+ Increasingly, we should treat email addresses as sensitive data. If your
131
+ site/database becomes compromised by hackers, these email addresses can
132
+ be stolen and used to spam your users and to try to gain access to their
133
+ accounts. You should not be storing passwords in plain text; perhaps you
134
+ don't need to store email addresses un-encoded either.
135
+
136
+ Consider this: upon registration, store the redacted email address for
137
+ the user, and of course, the salted, hashed password.
138
+ When the user logs in, compute the redacted email address from
139
+ the user-supplied one and look up the record. Store the original address
140
+ in the session for the user, which goes away when the user logs out.
141
+
142
+ Sometimes, users demand you strike their information from the database.
143
+ Instead of deleting their account, you can "redact" their email
144
+ address, retaining the state of the account to prevent future
145
+ access. Given the original email address again, the redacted account can
146
+ be identified if necessary.
147
+
148
+ Because of these use cases, the **redact** method on the email address
149
+ instance has been provided.
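The registration and login flow described in this section could look roughly like the following. This is a sketch using only the standard library: the `redact` helper here is a hypothetical stand-in that follows the `sha1(canonical_address)@domain` format from the README with a simplified canonical rule (downcase, drop the `+tag`), and the hash stands in for a real users table.

```ruby
require "digest"

# Hypothetical stand-in for the gem's redact method:
# sha1(canonical_address)@domain, with a simplified canonical rule.
def redact(address)
  local, host = address.strip.downcase.split("@", 2)
  canonical = "#{local.split('+', 2).first}@#{host}"
  "#{Digest::SHA1.hexdigest(canonical)}@#{host}"
end

# In-memory stand-in for a users table keyed by the redacted address.
users = {}

# Registration: store only the fingerprint, never the address itself.
users[redact("User+signup@Example.com")] = { plan: "free" }

# Login: recompute the fingerprint from what the user typed and look it up.
account = users[redact("user@example.com")]
puts account.inspect
```

Because the digest is computed over the canonical form, a user who registered with a tag can still be found when they later log in without it.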
29
150
 
30
151
  ## Installation
31
152
 
@@ -49,6 +170,9 @@ EmailAddress:
49
170
  email = EmailAddress.new("USER+tag@EXAMPLE.com")
50
171
  email.normalize #=> "user+tag@example.com"
51
172
  email.canonical #=> "user@example.com"
173
+ email.redact #=> "63a710569261a24b3766275b7000ce8d7b32e2f7@example.com"
174
+ email.sha1 #=> "63a710569261a24b3766275b7000ce8d7b32e2f7"
175
+ email.md5 #=> "dea073fb289e438a6d69c5384113454c"
52
176
 
53
177
  Email Service Provider (ESP) specific edits can be created to provide
54
178
  validations and canonical manipulations. A few are given out of the box.
@@ -56,7 +180,7 @@ Providers can be defined by email domain match rules, or by match rules
56
180
  for the MX host names using domains or CIDR addresses.
57
181
 
58
182
  email = EmailAddress.new("First.Last+Tag@Gmail.Com")
59
- email.provider #=> :gmail
183
+ email.provider #=> :google
60
184
  email.canonical #=> "firstlast@gmail.com"
61
185
 
62
186
  Storing the canonical address with the request address (don't remove
@@ -71,32 +195,36 @@ You can inspect the MX (Mail Exchanger) records
71
195
  You can see if it validates as an opinionated address:
72
196
 
73
197
  email.valid? # Reasonably valid?
198
+ email.errors #=> [:mx]
74
199
  email.valid_host? # Host name is defined in DNS
75
200
  email.strict? # Strictly valid?
76
201
 
77
- Email addresses should be able to be "archived" or stored in a digest
78
- format. This allows you to safely keep a record of the address and still
79
- protect the account's privacy after it has been closed. Given an address
80
- for inquiry, it can still look up a closed account.
202
+ You can compare email addresses:
81
203
 
82
- email.md5 #=> "dea073fb289e438a6d69c5384113454c"
83
- email.archive #=> "554d32017ab3a7fcf51c88ffce078689003bc521@gmail.com"
204
+ e1 = EmailAddress.new("First.Last@Gmail.com")
205
+ e1.to_s #=> "first.last@gmail.com"
206
+ e2 = EmailAddress.new("FirstLast+tag@Gmail.com")
207
+ e2.to_s #=> "firstlast+tag@gmail.com"
208
+ e3 = EmailAddress.new(e2.redact)
209
+ e3.to_s #=> "554d32017ab3a7fcf51c88ffce078689003bc521@gmail.com"
84
210
 
211
+ e1 == e2 #=> false (Matches by normalized address)
212
+ e1.same_as?(e2) #=> true (Matches as canonical address)
213
+ e1.same_as?(e3) #=> true (Matches as redacted address)
214
+ e1 < e2 #=> true (Compares using normalized address)
85
215
 
86
- ## Email Address Parts
216
+ ## Host Inspection
87
217
 
88
- The Local and Domain Parsing routines divvy the email address into these
89
- parts from `(comment)mailbox+tag@subdomain.domain.tld`
218
+ The `EmailAddress::Host` can be used to inspect the email domain.
90
219
 
91
- * Local - Everything to the left of the "@"
92
- * Mailbox - The controlling mailbox name
93
- * Tag - Anything after the mailbox and tag separator character (usually "+")
94
- * Comment - To be removed from the normalized form
95
- * Host Name - Everything to the right of the "@"
96
- * Subdomains - of the host name, if any.
97
- * Domain Name - host name without subdomains, with TLD
98
- * TLD - the rightmost word or set of 2-character domains ("co.uk")
99
- * Registration Name - host name without subdomain or TLD
220
+ ```ruby
221
+ e1 = EmailAddress.new("First.Last@Gmail.com")
222
+ e1.host.name #=> "gmail.com"
223
+ e1.host.exchanger.mxers #=> [["alt4.gmail-smtp-in.l.google.com", "2a00:1450:400c:c01::1b", 30],...]
224
+ e1.host.exchanger.mx_ips #=> ["2a00:1450:400c:c01::1b", ...]
225
+ e1.host.matches?('.com') #=> true
226
+ e1.host.txt #=> "v=spf1 redirect=_spf.google.com"
227
+ ```
100
228
 
101
229
  ## Domain Matching
102
230
 
data/Rakefile CHANGED
@@ -3,3 +3,21 @@ require "bundler/gem_tasks"
3
3
  task :default do
4
4
  sh "ruby test/email_address/*"
5
5
  end
6
+ require "bundler/gem_tasks"
7
+ require "bundler/setup"
8
+ require 'rake/testtask'
9
+
10
+ task :default => :test
11
+
12
+ desc "Run the Test Suite, toot suite"
13
+ task :test do
14
+ sh "find test -name 'test*rb' -exec ruby {} \\;"
15
+ end
16
+
17
+ desc "Open an IRB console with the gem loaded"
18
+ task :console do
19
+ sh "bundle exec irb -Ilib -I . -r email_address"
20
+ #require 'irb'
21
+ #ARGV.clear
22
+ #IRB.start
23
+ end
@@ -20,6 +20,7 @@ validator.}
20
20
  spec.require_paths = ["lib"]
21
21
 
22
22
  spec.add_development_dependency "bundler", "~> 1.3"
23
+ spec.add_development_dependency "activemodel", "~> 4.2"
23
24
  spec.add_development_dependency "rake"
24
25
  spec.add_dependency "simpleidn"
25
26
  spec.add_dependency "netaddr"
data/lib/email_address.rb CHANGED
@@ -5,13 +5,38 @@ require "email_address/domain_parser"
5
5
  require "email_address/exchanger"
6
6
  require "email_address/host"
7
7
  require "email_address/local"
8
+ require "email_address/matcher"
8
9
  require "email_address/validator"
9
10
  require "email_address/version"
11
+ require "email_address/active_record_validator" if defined?(ActiveModel)
10
12
 
11
13
  module EmailAddress
12
14
 
13
- def self.new(address)
14
- EmailAddress::Address.new(address)
15
+ # Creates an instance of this email address.
16
+ # This is a short-cut to EmailAddress::Address.new
17
+ def self.new(email_address)
18
+ EmailAddress::Address.new(email_address)
15
19
  end
16
20
 
21
+ # Given an email address, this returns true if the email validates, false otherwise
22
+ def self.valid?(email_address, options={})
23
+ self.new(email_address).valid?(options)
24
+ end
25
+
26
+ # Shortcut to normalize the given email address
27
+ def self.normal(email_address)
28
+ EmailAddress::Address.new(email_address).normalize
29
+ end
30
+
31
+ def self.new_normal(email_address)
32
+ EmailAddress::Address.new(EmailAddress::Address.new(email_address).normalize)
33
+ end
34
+
35
+ def self.canonical(email_address)
36
+ EmailAddress::Address.new(email_address).canonical
37
+ end
38
+
39
+ def self.new_canonical(email_address)
40
+ EmailAddress::Address.new(EmailAddress::Address.new(email_address).canonical)
41
+ end
17
42
  end
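The module-level shortcuts in this file rely on a small Ruby idiom: a module can define its own `new` as a factory for an inner class. The stand-alone sketch below shows the pattern with a hypothetical `DemoEmail` module; the `valid?` check is deliberately naive and is not the gem's validator.

```ruby
# Hypothetical stand-in for the pattern above: a module-level `new` that
# builds an inner class, plus a one-call validity shortcut.
module DemoEmail
  class Address
    attr_reader :original

    def initialize(address)
      @original = address
    end

    def normalize
      @original.strip.downcase
    end

    # Deliberately naive check, only for this sketch.
    def valid?
      normalize.match?(/\A[^@\s]+@[^@\s]+\z/)
    end
  end

  # A module has no `new` of its own, so defining one as a factory is safe.
  def self.new(address)
    Address.new(address)
  end

  def self.valid?(address)
    new(address).valid?
  end
end

puts DemoEmail.new(" User@Example.COM ").normalize  # user@example.com
puts DemoEmail.valid?("not-an-address")              # false
```

Callers get a short constructor (`DemoEmail.new`) while the implementation stays in a regular class that can be subclassed or tested directly.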
@@ -0,0 +1,42 @@
1
+ module EmailAddress
2
+
3
+ # ActiveRecord validator class for validating an email
4
+ # address with this library.
5
+ # Note the initialization happens once per process.
6
+ #
7
+ # Usage:
8
+ # validates_with EmailAddress::ActiveRecordValidator, field: :name
9
+ #
10
+ # Options:
11
+ # field: :email,
12
+ # fields: [:email1, :email2]
13
+ # Default field:
14
+ # :email or :email_address (first found)
15
+ #
16
+ class ActiveRecordValidator < ActiveModel::Validator
17
+
18
+ def initialize(options={})
19
+ @opt = options
20
+ end
21
+
22
+ def validate(r)
23
+ if @opt[:fields]
24
+ @opt[:fields].each {|f| validate_email(r, f) }
25
+ elsif @opt[:field]
26
+ validate_email(r, @opt[:field])
27
+ elsif r.respond_to? :email
28
+ validate_email(r, :email)
29
+ elsif r.respond_to? :email_address
30
+ validate_email(r, :email_address)
31
+ end
32
+ end
33
+
34
+ def validate_email(r,f)
35
+ return if r[f].nil?
36
+ e = EmailAddress.new(r[f])
37
+ r.errors[f] << "Email Address Not Valid" unless e.valid?
38
+ end
39
+
40
+ end
41
+
42
+ end
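The order in which the validator above picks the field to check (`fields`, then `field`, then `email`, then `email_address`) can be isolated into a small helper for illustration. The function `fields_to_validate` is hypothetical, and plain `Struct` objects stand in for model records so the sketch runs without ActiveModel.

```ruby
# Hypothetical helper mirroring the validator's lookup order. `record` only
# needs to answer respond_to?, so a Struct stands in for a model here.
def fields_to_validate(options, record)
  if options[:fields]
    Array(options[:fields])
  elsif options[:field]
    [options[:field]]
  elsif record.respond_to?(:email)
    [:email]
  elsif record.respond_to?(:email_address)
    [:email_address]
  else
    []
  end
end

contact = Struct.new(:email_address).new("a@example.com")
signup  = Struct.new(:email, :backup_email).new("b@example.com", nil)

p fields_to_validate({}, contact)                                # [:email_address]
p fields_to_validate({}, signup)                                 # [:email]
p fields_to_validate({ field: :backup_email }, signup)           # [:backup_email]
p fields_to_validate({ fields: [:email, :backup_email] }, signup) # [:email, :backup_email]
```

Explicit options win over the defaults, so a model with both an `email` column and a differently named address column can still be validated on the right field.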
@@ -3,72 +3,159 @@ require 'digest/md5'
3
3
 
4
4
  module EmailAddress
5
5
  class Address
6
+ include Comparable
6
7
 
7
- def initialize(address)
8
- @address = address
9
- parse
10
- end
11
-
12
- def parse
13
- (_, local, host) = @address.match(/\A(.+)@(.+)/).to_a
14
- @host = EmailAddress::Host.new(host)
15
- @local = EmailAddress::Local.new(local, @host.provider)
8
+ # Given an email address of the form "local@hostname", this sets up the
9
+ # instance, and initializes the address to the "normalized" format of the
10
+ # address. The original string is available in the #original method.
11
+ def initialize(email_address)
12
+ @original = email_address
13
+ (local, host) = email_address.split('@', 2)
14
+ @host = EmailAddress::Host.new(host)
15
+ @local = EmailAddress::Local.new(local||@original, @host)
16
16
  end
17
17
 
18
+ # Returns the EmailAddress::Host to inspect the host name of the address
18
19
  def host
19
20
  @host
20
21
  end
21
22
 
23
+ # Returns the EmailAddress::Local to inspect the data to the left of the @
24
+ # Use the #left method to access the full string
22
25
  def local
23
26
  @local
24
27
  end
25
28
 
29
+ # Everything to the left of the @ in the address, called the local part.
30
+ def left
31
+ local.to_s
32
+ end
33
+
34
+ # Returns the mailbox portion of the local part, with no tags. Usually, this
35
+ # can be considered the user account or role account names. Some systems
36
+ # employ dynamic email addresses which don't have the same meaning.
26
37
  def mailbox
27
38
  @local.mailbox
28
39
  end
29
40
 
41
+ # Returns the host name, the part to the right of the @ sign.
30
42
  def host_name
31
43
  @host.host_name
32
44
  end
45
+ alias :right :host_name
33
46
 
47
+ # Returns the tag part of the local address, or nil if not given.
34
48
  def tag
35
49
  @local.tag
36
50
  end
37
51
 
52
+ # Returns any comments parsed from the local part of the email address.
53
+ # This is retained for inspection after construction, even if it is
54
+ # removed from the normalized email address.
38
55
  def comment
39
56
  @local.comment
40
57
  end
41
58
 
59
+ # Returns the ESP (Email Service Provider) or ISP name derived
60
+ # using the provider configuration rules.
42
61
  def provider
43
62
  @host.provider
44
63
  end
45
64
 
65
+ # Returns the string representation of the normalized email address.
46
66
  def to_s
47
67
  normalize
48
68
  end
49
69
 
50
- def normalize
70
+ # The original email address in the request (unmodified).
71
+ def original
72
+ @original
73
+ end
74
+
75
+ # Returns the normalized email address according to the provider
76
+ # and system normalization rules. Usually this downcases the address,
77
+ # removes spaces and comments, but includes any tags.
78
+ def normal
51
79
  [@local.normalize, @host.normalize].join('@')
52
80
  end
81
+ alias :normalize :normal
53
82
 
83
+ # Returns the canonical email address according to the provider
84
+ # uniqueness rules. Usually, this downcases the address, removes
85
+ # spaces and comments and tags, and any extraneous part of the address
86
+ # not considered a unique account by the provider.
54
87
  def canonical
55
88
  [@local.canonical, @host.canonical].join('@')
56
89
  end
90
+ alias :uniq :canonical
91
+ alias :canonicalize :canonical
57
92
 
93
+ # Returns an MD5 of the canonical address form. Some cross-system integrations
94
+ # use the email address MD5 instead of the actual address to refer to the
95
+ # same shared user identity without exposing the actual address when it
96
+ # is not known in common.
58
97
  def md5
59
98
  Digest::MD5.hexdigest(canonical)
60
99
  end
61
100
 
101
+ def canonical_md5
102
+ Digest::MD5.hexdigest(self.canonical)
103
+ end
104
+
105
+ # This returns the SHA1 digest (in a hex string) of the canonical email
106
+ # address. See #md5 for more background.
62
107
  def sha1
63
108
  Digest::SHA1.hexdigest(canonical)
64
109
  end
65
110
 
66
- def archive
111
+ # Equal matches the normalized version of each address. Use #same_as? to check
112
+ # for a match on the canonical or redacted versions of the addresses.
113
+ def ==(other_email)
114
+ normalize == other_email.normalize
115
+ end
116
+ alias :eql? :==
117
+ alias :equal? :==
118
+
119
+ # Returns true if this address identifies the same mailbox as another,
120
+ # matching on the canonical or redacted forms.
121
+ def same_as?(other_email)
122
+ canonical == other_email.canonical ||
123
+ redact == other_email.canonical || canonical == other_email.redact
124
+ end
125
+ alias :include? :same_as?
126
+
127
+ # Return the <=> or CMP comparison operator result (-1, 0, +1) on the comparison
128
+ # of this address with another, using the normalized form.
129
+ def <=>(other_email)
130
+ normalize <=> other_email.normalize
131
+ end
132
+
133
+ # Redact the address for storage. To protect the user's privacy,
134
+ # use this when you don't want to store a real email, only a fingerprint.
135
+ # Given the original address, you can match the original with this method.
136
+ # This returns the SHA1 of the canonical address (no tags, no gmail dots)
137
+ # at the original host. The host is part of the digest part, but also
138
+ # retained for verification and domain maintenance.
139
+ def redact
67
140
  [sha1, @host.canonical].join('@')
68
141
  end
69
142
 
143
+ def redacted?
144
+ @local.to_s =~ /\A[0-9a-f]{40}\z/ ? true : false
145
+ end
146
+
147
+ # Returns true if this address is considered valid according to the format
148
+ # configured for its provider. It tests the normalized form.
70
149
  def valid?(options={})
71
150
  EmailAddress::Validator.validate(self, options)
72
151
  end
152
+
153
+ # Returns an array of error messages generated from the validation process via
154
+ # the #valid? method.
155
+ def errors(options={})
156
+ v = EmailAddress::Validator.new(self, options)
157
+ v.valid?
158
+ v.errors
159
+ end
73
160
  end
74
161
  end
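The comparison behavior added to `Address` in this file (ordering and `==` on the normalized form, `same_as?` on the canonical or redacted form) can be seen in isolation in the sketch below. `ComparableAddress` is a hypothetical stand-in class; its canonical rule (drop the tag and the dots) is a simplified Gmail-style assumption, not the gem's provider logic.

```ruby
require "digest"

# Hypothetical stand-in showing Comparable on the normalized form and a
# looser same_as? match on the canonical or redacted form.
class ComparableAddress
  include Comparable
  attr_reader :normal

  def initialize(address)
    @normal = address.strip.downcase
  end

  # Simplified canonical rule (assumption): drop "+tag" and mailbox dots.
  def canonical
    local, host = normal.split("@", 2)
    "#{local.split('+', 2).first.delete('.')}@#{host}"
  end

  def redact
    host = normal.split("@", 2).last
    "#{Digest::SHA1.hexdigest(canonical)}@#{host}"
  end

  # Comparable derives <, >, == and friends from this.
  def <=>(other)
    normal <=> other.normal
  end

  def same_as?(other)
    canonical == other.canonical ||
      redact == other.canonical ||
      canonical == other.redact
  end
end

e1 = ComparableAddress.new("First.Last@Gmail.com")
e2 = ComparableAddress.new("FirstLast+tag@Gmail.com")
e3 = ComparableAddress.new(e2.redact)

puts e1 == e2          # false
puts e1.same_as?(e2)   # true
puts e1.same_as?(e3)   # true
puts e1 < e2           # true
```

One design point worth weighing in the diff above: it aliases `equal?` to `==`, whereas Ruby's core documentation states that `equal?` is reserved for object identity and should not be overridden by subclasses.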