email_reply_parser_ffcrm 0.5.0

data/LICENSE ADDED
@@ -0,0 +1,22 @@
1
+ The MIT License
2
+
3
+ Copyright (c) GitHub
4
+
5
+ Permission is hereby granted, free of charge, to any person obtaining a copy
6
+ of this software and associated documentation files (the "Software"), to deal
7
+ in the Software without restriction, including without limitation the rights
8
+ to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
9
+ copies of the Software, and to permit persons to whom the Software is
10
+ furnished to do so, subject to the following conditions:
11
+
12
+ The above copyright notice and this permission notice shall be included in
13
+ all copies or substantial portions of the Software.
14
+
15
+ THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
16
+ IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
17
+ FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
18
+ AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
19
+ LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
20
+ OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
21
+ THE SOFTWARE.
22
+
@@ -0,0 +1,103 @@
1
+ # Email Reply Parser
2
+
3
+ EmailReplyParser is a small library to parse plain text email content.
4
+ See the rocco-documented source code for specifics on how it works.
5
+
6
+ This is what GitHub uses to display comments that were created from
7
+ email replies. This code is being open sourced in an effort to
8
+ crowdsource the quality of our email representation.
9
+
10
+ See more at the [Rocco docs][rocco].
11
+
12
+ [rocco]: http://help.github.com/code/email_reply_parser/
13
+
14
+ ## Problem?
15
+
16
+ If you have a question about the behavior and formatting of email replies on GitHub, check out [support][support]. If you have a specific issue regarding this library, then hit up the [Issues][issues].
17
+
18
+ [support]: http://support.github.com/
19
+ [issues]: https://github.com/github/email_reply_parser/issues
20
+
21
+ ## Installation
22
+
23
+ Get it from [GitHub][github] or `gem install email_reply_parser`. Run `rake` to run the tests.
24
+
25
+ [github]: https://github.com/github/email_reply_parser
26
+
27
+ ## Contribute
28
+
29
+ If you'd like to hack on EmailReplyParser, start by forking the repo on GitHub:
30
+
31
+ https://github.com/github/email_reply_parser
32
+
33
+ The best way to get your changes merged back into core is as follows:
34
+
35
+ * Clone down your fork
36
+ * Create a thoughtfully named topic branch to contain your change
37
+ * Hack away
38
+ * Add tests and make sure everything still passes by running rake
39
+ * If you are adding new functionality, document it in the README
40
+ * Do not change the version number, I will do that on my end
41
+ * If necessary, rebase your commits into logical chunks, without errors
42
+ * Push the branch up to GitHub
43
+ * Send a pull request to the `github/email_reply_parser` project.
44
+
45
+ ## Known Issues
46
+
47
+ ### Quoted Headers
48
+
49
+ Quoted headers aren't picked up if there's an extra line break:
50
+
51
+ On <date>, <author> wrote:
52
+
53
+ > blah
54
+
55
+ Also, they're not picked up if the email client breaks it up into
56
+ multiple lines. GMail breaks up any lines over 80 characters for you.
57
+
58
+ On <date>, <author>
59
+ wrote:
60
+ > blah
61
+
62
+ Not to mention that we're searching for "on" and "wrote". It won't work
63
+ with other languages.
64
+
65
+ Possible solution: Remove "reply@reply.github.com" lines...
66
+
67
+ ### Weird Signatures
68
+
69
+ Lines starting with `-` or `_` sometimes mark the beginning of
70
+ signatures:
71
+
72
+ Hello
73
+
74
+ --
75
+ Rick
76
+
77
+ Not everyone follows this convention:
78
+
79
+ Hello
80
+
81
+ Mr Rick Olson
82
+ Galactic President Superstar Mc Awesomeville
83
+ GitHub
84
+
85
+ **********************DISCLAIMER***********************************
86
+ * Note: blah blah blah *
87
+ **********************DISCLAIMER***********************************
88
+
89
+
90
+
91
+ ### Strange Quoting
92
+
93
+ Apparently, prefixing lines with `>` isn't universal either:
94
+
95
+ Hello
96
+
97
+ --
98
+ Rick
99
+
100
+ ________________________________________
101
+ From: Bob [reply@reply.github.com]
102
+ Sent: Monday, March 14, 2011 6:16 PM
103
+ To: Rick
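
The multi-line quoted header issue listed under Known Issues can be illustrated with a small standalone sketch. It does not load the gem: `collapse_reply_header` is a hypothetical helper name, and its regular expression is an assumption modeled on the `On <date>, <author> wrote:` convention, not necessarily the pattern the library uses.

```ruby
# Hypothetical helper (not part of email_reply_parser): joins an
# "On <date>, <author> wrote:" header that a mail client wrapped onto
# several lines back into a single line.
def collapse_reply_header(text)
  # With /m, "." also matches newlines; "+?" stops at the first "wrote:"
  # that ends a line. String#sub returns a copy, so the caller's string
  # is left untouched.
  text.sub(/^On\s.+?wrote:$/m) { |header| header.gsub("\n", " ") }
end

wrapped = "I get proper rendering as well.\n\n" \
          "On Dec 16, 2011, at 12:47 PM, Corey Donohoe\n" \
          "<reply@reply.github.com>\n" \
          "wrote:\n\n" \
          "> Was this caching related or fixed already?"

puts collapse_reply_header(wrapped)
```

One limitation of this sketch: a body line that happens to start with "On " before the real header would be treated as the start of the header, so the match could span more text than intended. It is also English-only, like the issue described above.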
@@ -0,0 +1,135 @@
1
+ require 'rubygems'
2
+ require 'rake'
3
+ require 'date'
4
+
5
+ #############################################################################
6
+ #
7
+ # Helper functions
8
+ #
9
+ #############################################################################
10
+
11
+ def name
12
+ @name ||= "email_reply_parser"
13
+ end
14
+
15
+ def version
16
+ line = File.read("lib/#{name}.rb")[/^\s*VERSION\s*=\s*.*/]
17
+ line.match(/.*VERSION\s*=\s*['"](.*)['"]/)[1]
18
+ end
19
+
20
+ def date
21
+ Date.today.to_s
22
+ end
23
+
24
+ def rubyforge_project
25
+ name
26
+ end
27
+
28
+ def gemspec_file
29
+ "#{name}_ffcrm.gemspec"
30
+ end
31
+
32
+ def gem_file
33
+ "#{name}-#{version}.gem"
34
+ end
35
+
36
+ def replace_header(head, header_name)
37
+ head.sub!(/(\.#{header_name}\s*= ').*'/) { "#{$1}#{send(header_name)}'"}
38
+ end
39
+
40
+ #############################################################################
41
+ #
42
+ # Standard tasks
43
+ #
44
+ #############################################################################
45
+
46
+ task :default => :test
47
+
48
+ require 'rake/testtask'
49
+ Rake::TestTask.new(:test) do |test|
50
+ test.libs << 'lib' << 'test'
51
+ test.pattern = 'test/**/*_test.rb'
52
+ test.verbose = true
53
+ end
54
+
55
+ desc "Open an irb session preloaded with this library"
56
+ task :console do
57
+ sh "irb -rubygems -r ./lib/#{name}.rb"
58
+ end
59
+
60
+ #############################################################################
61
+ #
62
+ # Custom tasks (add your own tasks here)
63
+ #
64
+ #############################################################################
65
+
66
+
67
+
68
+ #############################################################################
69
+ #
70
+ # Packaging tasks
71
+ #
72
+ #############################################################################
73
+
74
+ desc "Create tag v#{version} and build and push #{gem_file} to Rubygems"
75
+ task :release => :build do
76
+ unless `git branch` =~ /^\* master$/
77
+ puts "You must be on the master branch to release!"
78
+ exit!
79
+ end
80
+ sh "git commit --allow-empty -a -m 'Release #{version}'"
81
+ sh "git tag v#{version}"
82
+ sh "git push origin master"
83
+ sh "git push origin v#{version}"
84
+ sh "gem push pkg/#{name}-#{version}.gem"
85
+ end
86
+
87
+ desc "Build #{gem_file} into the pkg directory"
88
+ task :build => :gemspec do
89
+ sh "mkdir -p pkg"
90
+ sh "gem build #{gemspec_file}"
91
+ sh "mv #{gem_file} pkg"
92
+ end
93
+
94
+ desc "Generate #{gemspec_file}"
95
+ task :gemspec => :validate do
96
+ # read spec file and split out manifest section
97
+ spec = File.read(gemspec_file)
98
+ head, manifest, tail = spec.split(" # = MANIFEST =\n")
99
+
100
+ # replace name version and date
101
+ replace_header(head, :name)
102
+ replace_header(head, :version)
103
+ replace_header(head, :date)
104
+ #comment this out if your rubyforge_project has a different name
105
+ replace_header(head, :rubyforge_project)
106
+
107
+ # determine file list from git ls-files
108
+ files = `git ls-files`.
109
+ split("\n").
110
+ sort.
111
+ reject { |file| file =~ /^\./ }.
112
+ reject { |file| file =~ /^(rdoc|pkg)/ }.
113
+ map { |file| " #{file}" }.
114
+ join("\n")
115
+
116
+ # piece file back together and write
117
+ manifest = " s.files = %w[\n#{files}\n ]\n"
118
+ spec = [head, manifest, tail].join(" # = MANIFEST =\n")
119
+ File.open(gemspec_file, 'w') { |io| io.write(spec) }
120
+ puts "Updated #{gemspec_file}"
121
+ end
122
+
123
+ desc "Validate #{gemspec_file}"
124
+ task :validate do
125
+ libfiles = Dir['lib/*'] - ["lib/#{name}.rb", "lib/#{name}"]
126
+ unless libfiles.empty?
127
+ puts "Directory `lib` should only contain a `#{name}.rb` file and `#{name}` dir."
128
+ exit!
129
+ end
130
+ unless Dir['VERSION*'].empty?
131
+ puts "A `VERSION` file at root level violates Gem best practices."
132
+ exit!
133
+ end
134
+ end
135
+
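
The `version` helper in the Rakefile above reads the version number straight out of the library source. A minimal sketch of that lookup, working on a string instead of a file so it can be run in isolation, might look like this. `extract_version` is a hypothetical name, and the `nil` return for a missing constant is an assumption added here; the Rakefile itself would raise in that case.

```ruby
# Hypothetical helper: pulls the quoted value of a VERSION constant out of
# Ruby source text, mirroring the two-step match used by the Rakefile.
def extract_version(source)
  # Step 1: grab the first line that assigns VERSION.
  line = source[/^\s*VERSION\s*=\s*.*/]
  return nil unless line

  # Step 2: capture whatever sits between the quotes.
  match = line.match(/VERSION\s*=\s*['"](.*)['"]/)
  match && match[1]
end

sample = <<~RUBY
  class EmailReplyParser
    VERSION = "0.5.0"
  end
RUBY

puts extract_version(sample)
```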
@@ -0,0 +1,88 @@
1
+ ## This is the rakegem gemspec template. Make sure you read and understand
2
+ ## all of the comments. Some sections require modification, and others can
3
+ ## be deleted if you don't need them. Once you understand the contents of
4
+ ## this file, feel free to delete any comments that begin with two hash marks.
5
+ ## You can find comprehensive Gem::Specification documentation, at
6
+ ## http://docs.rubygems.org/read/chapter/20
7
+ Gem::Specification.new do |s|
8
+ s.specification_version = 2 if s.respond_to? :specification_version=
9
+ s.required_rubygems_version = Gem::Requirement.new(">= 0") if s.respond_to? :required_rubygems_version=
10
+ s.rubygems_version = '1.3.5'
11
+
12
+ ## Leave these as is; they will be modified for you by the rake gemspec task.
13
+ ## If your rubyforge_project name is different, then edit it and comment out
14
+ ## the sub! line in the Rakefile
15
+ s.name = 'email_reply_parser_ffcrm'
16
+ s.version = '0.5.0'
17
+ s.date = '2012-05-09'
18
+
19
+ ## Make sure your summary is short. The description may be as long
20
+ ## as you like.
21
+ s.summary = "Short description used in Gem listings."
22
+ s.description = "Long description. Maybe copied from the README."
23
+
24
+ ## List the primary authors. If there are a bunch of authors, it's probably
25
+ ## better to set the email to an email list or something. If you don't have
26
+ ## a custom homepage, consider using your GitHub URL or the like.
27
+ s.authors = ["Rick Olson"]
28
+ s.email = 'technoweenie@gmail.com'
29
+ s.homepage = 'http://github.com/github/email_reply_parser'
30
+
31
+ ## This gets added to the $LOAD_PATH so that 'lib/NAME.rb' can be required as
32
+ ## require 'NAME.rb' or '/lib/NAME/file.rb' can be required as require 'NAME/file.rb'
33
+ s.require_paths = %w[lib]
34
+
35
+ ## This section is only necessary if you have C extensions.
36
+ #s.require_paths << 'ext'
37
+ #s.extensions = %w[ext/extconf.rb]
38
+
39
+ ## If your gem includes any executables, list them here.
40
+ #s.executables = ["name"]
41
+ #s.default_executable = 'name'
42
+
43
+ ## Specify any RDoc options here. You'll want to add your README and
44
+ ## LICENSE files to the extra_rdoc_files list.
45
+ s.rdoc_options = ["--charset=UTF-8"]
46
+ s.extra_rdoc_files = %w[README.md LICENSE]
47
+
48
+ ## List your runtime dependencies here. Runtime dependencies are those
49
+ ## that are needed for an end user to actually USE your code.
50
+ #s.add_dependency('DEPNAME', [">= 1.1.0", "< 2.0.0"])
51
+
52
+ ## List your development dependencies here. Development dependencies are
53
+ ## those that are only needed during development
54
+ #s.add_development_dependency('DEVDEPNAME', [">= 1.1.0", "< 2.0.0"])
55
+
56
+ ## Leave this section as-is. It will be automatically generated from the
57
+ ## contents of your Git repository via the gemspec task. DO NOT REMOVE
58
+ ## THE MANIFEST COMMENTS, they are used as delimiters by the task.
59
+ # = MANIFEST =
60
+ s.files = %w[
61
+ LICENSE
62
+ README.md
63
+ Rakefile
64
+ email_reply_parser_ffcrm.gemspec
65
+ lib/email_reply_parser.rb
66
+ test/email_reply_parser_test.rb
67
+ test/emails/correct_sig.txt
68
+ test/emails/email_1_1.txt
69
+ test/emails/email_1_2.txt
70
+ test/emails/email_1_3.txt
71
+ test/emails/email_1_4.txt
72
+ test/emails/email_1_5.txt
73
+ test/emails/email_1_6.txt
74
+ test/emails/email_2_1.txt
75
+ test/emails/email_2_2.txt
76
+ test/emails/email_BlackBerry.txt
77
+ test/emails/email_bullets.txt
78
+ test/emails/email_iPhone.txt
79
+ test/emails/email_multi_word_sent_from_my_mobile_device.txt
80
+ test/emails/email_sent_from_my_not_signature.txt
81
+ ]
82
+ # = MANIFEST =
83
+
84
+ ## Test files will be grabbed from the file list. Make sure the path glob
85
+ ## matches what you actually use.
86
+ s.test_files = s.files.select { |path| path =~ /^test\/.*_test\.rb/ }
87
+ end
88
+
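
The `# = MANIFEST =` comments in the gemspec act as delimiters: the `gemspec` rake task splits the file on them, regenerates the `s.files` list, and joins the pieces back together. The following standalone sketch shows that round trip on an in-memory string. `rewrite_manifest` is a hypothetical name, the file list is passed in instead of coming from `git ls-files`, and the two-space indentation of the marker is an assumption about the original layout.

```ruby
# Assumed marker text, including indentation and trailing newline.
MARKER = "  # = MANIFEST =\n"

# Hypothetical helper: replaces the section between the two markers with a
# freshly generated s.files list and returns the new gemspec text.
def rewrite_manifest(spec, files)
  head, _old_manifest, tail = spec.split(MARKER)
  listing  = files.sort.map { |file| "    #{file}" }.join("\n")
  manifest = "  s.files = %w[\n#{listing}\n  ]\n"
  [head, manifest, tail].join(MARKER)
end

spec = "Gem::Specification.new do |s|\n" \
       "  # = MANIFEST =\n" \
       "  s.files = %w[\n    old.rb\n  ]\n" \
       "  # = MANIFEST =\n" \
       "end\n"

puts rewrite_manifest(spec, ["lib/email_reply_parser.rb", "LICENSE"])
```

Because the markers are re-inserted by the `join`, the output can be fed through the same function again, which is why the template warns against removing them.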
@@ -0,0 +1,265 @@
1
+ require 'strscan'
2
+
3
+ # EmailReplyParser is a small library to parse plain text email content. The
4
+ # goal is to identify which fragments are quoted, part of a signature, or
5
+ # original body content. We want to support both top and bottom posters, so
6
+ # no simple "REPLY ABOVE HERE" content is used.
7
+ #
8
+ # Beyond RFC 5322 (which is handled by the [Ruby mail gem][mail]), there aren't
9
+ # any real standards for how emails are created. This attempts to parse out
10
+ # common conventions for things like replies:
11
+ #
12
+ # this is some text
13
+ #
14
+ # On <date>, <author> wrote:
15
+ # > blah blah
16
+ # > blah blah
17
+ #
18
+ # ... and signatures:
19
+ #
20
+ # this is some text
21
+ #
22
+ # --
23
+ # Bob
24
+ # http://homepage.com/~bob
25
+ #
26
+ # Each of these is parsed into Fragment objects.
27
+ #
28
+ # EmailReplyParser also attempts to figure out which of these blocks should
29
+ # be hidden from users.
30
+ #
31
+ # [mail]: https://github.com/mikel/mail
32
+ class EmailReplyParser
33
+ VERSION = "0.5.0"
34
+
35
+ # Public: Splits an email body into a list of Fragments.
36
+ #
37
+ # text - A String email body.
38
+ #
39
+ # Returns an Email instance.
40
+ def self.read(text)
41
+ Email.new.read(text)
42
+ end
43
+
44
+ # Public: Get the text of the visible portions of the given email body.
45
+ #
46
+ # text - A String email body.
47
+ #
48
+ # Returns a String.
49
+ def self.parse_reply(text)
50
+ self.read(text).visible_text
51
+ end
52
+
53
+ ### Emails
54
+
55
+ # An Email instance represents a parsed body String.
56
+ class Email
57
+ # Emails have an Array of Fragments.
58
+ attr_reader :fragments
59
+
60
+ def initialize
61
+ @fragments = []
62
+ end
63
+
64
+ # Public: Gets the combined text of the visible fragments of the email body.
65
+ #
66
+ # Returns a String.
67
+ def visible_text
68
+ fragments.select{|f| !f.hidden?}.map{|f| f.to_s}.join("\n").rstrip
69
+ end
70
+
71
+ # Splits the given text into a list of Fragments. This is roughly done by
72
+ # reversing the text and parsing from the bottom to the top. This way we
73
+ # can check for 'On <date>, <author> wrote:' lines above quoted blocks.
74
+ #
75
+ # text - A String email body.
76
+ #
77
+ # Returns this same Email instance.
78
+ def read(text)
79
+ # Check for multi-line reply headers. Some clients break up
80
+ # the "On DATE, NAME <EMAIL> wrote:" line into multiple lines.
81
+ if text =~ /^(On(.+)wrote:)$/m
82
+ # Remove all new lines from the reply header.
83
+ text = text.gsub($1, $1.gsub("\n", " "))
84
+ end
85
+
86
+ # Some users may reply directly above a line of underscores.
87
+ # In order to ensure that these fragments are split correctly,
88
+ # make sure that all lines of underscores are preceded by
89
+ # at least two newline characters.
90
+ text = text.gsub(/([^\n])(?=\n_{7}_+)$/m, "\\1\n")
91
+
92
+ # The text is reversed initially due to the way we check for hidden
93
+ # fragments.
94
+ text = text.reverse
95
+
96
+ # This determines if any 'visible' Fragment has been found. Once any
97
+ # visible Fragment is found, stop looking for hidden ones.
98
+ @found_visible = false
99
+
100
+ # This instance variable points to the current Fragment. If the matched
101
+ # line fits, it should be added to this Fragment. Otherwise, finish it
102
+ # and start a new Fragment.
103
+ @fragment = nil
104
+
105
+ # Use the StringScanner to pull out each line of the email content.
106
+ @scanner = StringScanner.new(text)
107
+ while line = @scanner.scan_until(/\n/)
108
+ scan_line(line)
109
+ end
110
+
111
+ # Be sure to parse the last line of the email.
112
+ if (last_line = @scanner.rest.to_s).size > 0
113
+ scan_line(last_line)
114
+ end
115
+
116
+ # Finish up the final fragment. Finishing a fragment will detect any
117
+ # attributes (hidden, signature, reply), and join each line into a
118
+ # string.
119
+ finish_fragment
120
+
121
+ @scanner = @fragment = nil
122
+
123
+ # Now that parsing is done, reverse the order.
124
+ @fragments.reverse!
125
+ self
126
+ end
127
+
128
+ private
129
+ EMPTY = "".freeze
130
+ SIG_REGEX = /(--|__|\w-$)|(^(\w+\s*){1,3} #{"Sent from my".reverse}$)/
131
+
132
+ ### Line-by-Line Parsing
133
+
134
+ # Scans the given line of text and figures out which fragment it belongs
135
+ # to.
136
+ #
137
+ # line - A String line of text from the email.
138
+ #
139
+ # Returns nothing.
140
+ def scan_line(line)
141
+ line.chomp!("\n")
142
+ line.lstrip! unless line =~ SIG_REGEX
143
+
144
+ # We're looking for leading `>`'s to see if this line is part of a
145
+ # quoted Fragment.
146
+ is_quoted = !!(line =~ /(>+)$/)
147
+
148
+ # Mark the current Fragment as a signature if the current line is empty
149
+ # and the Fragment starts with a common signature indicator.
150
+ if @fragment && line == EMPTY
151
+ if @fragment.lines.last =~ SIG_REGEX
152
+ @fragment.signature = true
153
+ finish_fragment
154
+ end
155
+ end
156
+
157
+ # If the line matches the current fragment, add it. Note that a common
158
+ # reply header also counts as part of the quoted Fragment, even though
159
+ # it doesn't start with `>`.
160
+ if @fragment &&
161
+ ((@fragment.quoted? == is_quoted) ||
162
+ (@fragment.quoted? && (quote_header?(line) || line == EMPTY)))
163
+ @fragment.lines << line
164
+
165
+ # Otherwise, finish the fragment and start a new one.
166
+ else
167
+ finish_fragment
168
+ @fragment = Fragment.new(is_quoted, line)
169
+ end
170
+ end
171
+
172
+ # Detects if a given line is a header above a quoted area. It is only
173
+ # checked for lines preceding quoted regions.
174
+ #
175
+ # line - A String line of text from the email.
176
+ #
177
+ # Returns true if the line is a valid header, or false.
178
+ def quote_header?(line)
179
+ line =~ /^:etorw.*nO$/
180
+ end
181
+
182
+ # Builds the fragment string and reverses it, after all lines have been
183
+ # added. It also checks to see if this Fragment is hidden. The hidden
184
+ # Fragment check reads from the bottom to the top.
185
+ #
186
+ # Any quoted Fragments or signature Fragments are marked hidden if they
187
+ # are below any visible Fragments. Visible Fragments are expected to
188
+ # contain original content by the author. If they are below a quoted
189
+ # Fragment, then the Fragment should be visible to give context to the
190
+ # reply.
191
+ #
192
+ # some original text (visible)
193
+ #
194
+ # > do you have any two's? (quoted, visible)
195
+ #
196
+ # Go fish! (visible)
197
+ #
198
+ # > --
199
+ # > Player 1 (quoted, hidden)
200
+ #
201
+ # --
202
+ # Player 2 (signature, hidden)
203
+ #
204
+ def finish_fragment
205
+ if @fragment
206
+ @fragment.finish
207
+ if !@found_visible
208
+ if @fragment.quoted? || @fragment.signature? ||
209
+ @fragment.to_s.strip == EMPTY
210
+ @fragment.hidden = true
211
+ else
212
+ @found_visible = true
213
+ end
214
+ end
215
+ @fragments << @fragment
216
+ end
217
+ @fragment = nil
218
+ end
219
+ end
220
+
221
+ ### Fragments
222
+
223
+ # Represents a group of paragraphs in the email sharing common attributes.
224
+ # Paragraphs should get their own fragment if they are a quoted area or a
225
+ # signature.
226
+ class Fragment < Struct.new(:quoted, :signature, :hidden)
227
+ # This is an Array of String lines of content. Since the content is
228
+ # reversed, this array is backwards, and contains reversed strings.
229
+ attr_reader :lines,
230
+
231
+ # This is reserved for the joined String that is built when this Fragment
232
+ # is finished.
233
+ :content
234
+
235
+ def initialize(quoted, first_line)
236
+ self.signature = self.hidden = false
237
+ self.quoted = quoted
238
+ @lines = [first_line]
239
+ @content = nil
240
+ @lines.compact!
241
+ end
242
+
243
+ alias quoted? quoted
244
+ alias signature? signature
245
+ alias hidden? hidden
246
+
247
+ # Builds the string content by joining the lines and reversing them.
248
+ #
249
+ # Returns nothing.
250
+ def finish
251
+ @content = @lines.join("\n")
252
+ @lines = nil
253
+ @content.reverse!
254
+ end
255
+
256
+ def to_s
257
+ @content
258
+ end
259
+
260
+ def inspect
261
+ to_s.inspect
262
+ end
263
+ end
264
+ end
265
+
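
The visibility rule documented above `finish_fragment` (quoted, signature, and blank fragments are hidden only while no visible fragment has been seen yet, scanning from the bottom) can be sketched without the parser. `Piece` and `mark_hidden` are hypothetical names used only for this illustration; the real library applies the rule incrementally while it scans the reversed text.

```ruby
# Hypothetical stand-in for the library's Fragment; only the fields needed
# to show the visibility rule are modeled.
Piece = Struct.new(:text, :quoted, :signature, :hidden)

# Walks the pieces from the bottom of the email upward. Quoted, signature,
# and blank pieces are hidden until the first piece of original content is
# reached; everything above that point stays visible.
def mark_hidden(pieces)
  found_visible = false
  pieces.reverse_each do |piece|
    break if found_visible
    if piece.quoted || piece.signature || piece.text.strip.empty?
      piece.hidden = true
    else
      found_visible = true
    end
  end
  pieces
end

pieces = [
  Piece.new("some original text",       false, false, false),
  Piece.new("> do you have any two's?", true,  false, false),
  Piece.new("Go fish!",                 false, false, false),
  Piece.new("> --\n> Player 1",         true,  false, false),
  Piece.new("--\nPlayer 2",             false, true,  false)
]

mark_hidden(pieces).each do |piece|
  puts "#{piece.hidden ? 'hidden ' : 'visible'} #{piece.text.inspect}"
end
```

The quoted question in the middle stays visible because original content ("Go fish!") appears below it, which matches the example in the source comment.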
@@ -0,0 +1,154 @@
1
+ require 'rubygems'
2
+ require 'test/unit'
3
+ require 'pathname'
4
+ require 'pp'
5
+
6
+ dir = Pathname.new File.expand_path(File.dirname(__FILE__))
7
+ require dir + '..' + 'lib' + 'email_reply_parser'
8
+
9
+ EMAIL_FIXTURE_PATH = dir + 'emails'
10
+
11
+ class EmailReplyParserTest < Test::Unit::TestCase
12
+ def test_reads_simple_body
13
+ reply = email(:email_1_1)
14
+ assert_equal 3, reply.fragments.size
15
+
16
+ assert reply.fragments.none? { |f| f.quoted? }
17
+ assert_equal [false, true, true],
18
+ reply.fragments.map { |f| f.signature? }
19
+ assert_equal [false, true, true],
20
+ reply.fragments.map { |f| f.hidden? }
21
+
22
+ assert_equal "Hi folks
23
+
24
+ What is the best way to clear a Riak bucket of all key, values after
25
+ running a test?
26
+ I am currently using the Java HTTP API.\n", reply.fragments[0].to_s
27
+
28
+ assert_equal "-Abhishek Kona\n\n", reply.fragments[1].to_s
29
+ end
30
+
31
+ def test_reads_top_post
32
+ reply = email(:email_1_3)
33
+ assert_equal 5, reply.fragments.size
34
+
35
+ assert_equal [false, false, true, false, false],
36
+ reply.fragments.map { |f| f.quoted? }
37
+ assert_equal [false, true, true, true, true],
38
+ reply.fragments.map { |f| f.hidden? }
39
+ assert_equal [false, true, false, false, true],
40
+ reply.fragments.map { |f| f.signature? }
41
+
42
+ assert_match /^Oh thanks.\n\nHaving/, reply.fragments[0].to_s
43
+ assert_match /^-A/, reply.fragments[1].to_s
44
+ assert_match /^On [^\:]+\:/, reply.fragments[2].to_s
45
+ assert_match /^_/, reply.fragments[4].to_s
46
+ end
47
+
48
+ def test_reads_bottom_post
49
+ reply = email(:email_1_2)
50
+ assert_equal 6, reply.fragments.size
51
+
52
+ assert_equal [false, true, false, true, false, false],
53
+ reply.fragments.map { |f| f.quoted? }
54
+ assert_equal [false, false, false, false, false, true],
55
+ reply.fragments.map { |f| f.signature? }
56
+ assert_equal [false, false, false, true, true, true],
57
+ reply.fragments.map { |f| f.hidden? }
58
+
59
+ assert_equal "Hi,", reply.fragments[0].to_s
60
+ assert_match /^On [^\:]+\:/, reply.fragments[1].to_s
61
+ assert_match /^You can list/, reply.fragments[2].to_s
62
+ assert_match /^> /, reply.fragments[3].to_s
63
+ assert_match /^_/, reply.fragments[5].to_s
64
+ end
65
+
66
+ def test_recognizes_date_string_above_quote
67
+ reply = email :email_1_4
68
+
69
+ assert_match /^Awesome/, reply.fragments[0].to_s
70
+ assert_match /^On/, reply.fragments[1].to_s
71
+ assert_match /Loader/, reply.fragments[1].to_s
72
+ end
73
+
74
+ def test_a_complex_body_with_only_one_fragment
75
+ reply = email :email_1_5
76
+
77
+ assert_equal 1, reply.fragments.size
78
+ end
79
+
80
+ def test_reads_email_with_correct_signature
81
+ reply = email :correct_sig
82
+
83
+ assert_equal 2, reply.fragments.size
84
+ assert_equal [false, false], reply.fragments.map { |f| f.quoted? }
85
+ assert_equal [false, true], reply.fragments.map { |f| f.signature? }
86
+ assert_equal [false, true], reply.fragments.map { |f| f.hidden? }
87
+ assert_match /^-- \nrick/, reply.fragments[1].to_s
88
+ end
89
+
90
+ def test_deals_with_multiline_reply_headers
91
+ reply = email :email_1_6
92
+
93
+ assert_match /^I get/, reply.fragments[0].to_s
94
+ assert_match /^On/, reply.fragments[1].to_s
95
+ assert_match /Was this/, reply.fragments[1].to_s
96
+ end
97
+
98
+ def test_does_not_modify_input_string
99
+ original = "The Quick Brown Fox Jumps Over The Lazy Dog"
100
+ EmailReplyParser.read original
101
+ assert_equal "The Quick Brown Fox Jumps Over The Lazy Dog", original
102
+ end
103
+
104
+ def test_returns_only_the_visible_fragments_as_a_string
105
+ reply = email(:email_2_1)
106
+ assert_equal reply.fragments.select{|r| !r.hidden?}.map{|r| r.to_s}.join("\n").rstrip, reply.visible_text
107
+ end
108
+
109
+ def test_parse_out_just_top_for_outlook_reply
110
+ body = IO.read EMAIL_FIXTURE_PATH.join("email_2_1.txt").to_s
111
+ assert_equal "Outlook with a reply", EmailReplyParser.parse_reply(body)
112
+ end
113
+
114
+ def test_parse_out_just_top_for_outlook_with_reply_directly_above_line
115
+ body = IO.read EMAIL_FIXTURE_PATH.join("email_2_2.txt").to_s
116
+ assert_equal "Outlook with a reply directly above line", EmailReplyParser.parse_reply(body)
117
+ end
118
+
119
+ def test_parse_out_sent_from_iPhone
120
+ body = IO.read EMAIL_FIXTURE_PATH.join("email_iPhone.txt").to_s
121
+ assert_equal "Here is another email", EmailReplyParser.parse_reply(body)
122
+ end
123
+
124
+ def test_parse_out_sent_from_BlackBerry
125
+ body = IO.read EMAIL_FIXTURE_PATH.join("email_BlackBerry.txt").to_s
126
+ assert_equal "Here is another email", EmailReplyParser.parse_reply(body)
127
+ end
128
+
129
+ def test_parse_out_send_from_multiword_mobile_device
130
+ body = IO.read EMAIL_FIXTURE_PATH.join("email_multi_word_sent_from_my_mobile_device.txt").to_s
131
+ assert_equal "Here is another email", EmailReplyParser.parse_reply(body)
132
+ end
133
+
134
+ def test_do_not_parse_out_send_from_in_regular_sentence
135
+ body = IO.read EMAIL_FIXTURE_PATH.join("email_sent_from_my_not_signature.txt").to_s
136
+ assert_equal "Here is another email\n\nSent from my desk, is much easier then my mobile phone.", EmailReplyParser.parse_reply(body)
137
+ end
138
+
139
+ def test_retains_bullets
140
+ body = IO.read EMAIL_FIXTURE_PATH.join("email_bullets.txt").to_s
141
+ assert_equal "test 2 this should list second\n\nand have spaces\n\nand retain this formatting\n\n\n - how about bullets\n - and another",
142
+ EmailReplyParser.parse_reply(body)
143
+ end
144
+
145
+ def test_parse_reply
146
+ body = IO.read EMAIL_FIXTURE_PATH.join("email_1_2.txt").to_s
147
+ assert_equal EmailReplyParser.read(body).visible_text, EmailReplyParser.parse_reply(body)
148
+ end
149
+
150
+ def email(name)
151
+ body = IO.read EMAIL_FIXTURE_PATH.join("#{name}.txt").to_s
152
+ EmailReplyParser.read body
153
+ end
154
+ end
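
The "Sent from my ..." tests above rely on `SIG_REGEX`, which the parser applies to reversed lines. The sketch below copies that pattern from the library source and wraps it in a hypothetical `signature_line?` helper that reverses an ordinary line first, so the fixture strings can be checked directly. The helper is an illustration only; the parser itself also requires a following blank line before it marks a fragment as a signature.

```ruby
# Pattern copied from lib/email_reply_parser.rb; it expects a reversed line.
SIG_REGEX = /(--|__|\w-$)|(^(\w+\s*){1,3} #{"Sent from my".reverse}$)/

# Hypothetical helper: reverses a normal line and tests it against SIG_REGEX.
def signature_line?(line)
  !!(line.reverse =~ SIG_REGEX)
end

[
  "Sent from my iPhone",
  "Sent from my Verizon Wireless BlackBerry",
  "Sent from my desk, is much easier then my mobile phone.",
  "-Abhishek Kona"
].each do |line|
  puts "#{signature_line?(line)} #{line.inspect}"
end
```

The second alternative allows at most three words after "Sent from my" and nothing else on the line, which is why the full sentence about the desk is not treated as a signature.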
@@ -0,0 +1,4 @@
1
+ this is an email with a correct -- signature.
2
+
3
+ --
4
+ rick
@@ -0,0 +1,13 @@
1
+ Hi folks
2
+
3
+ What is the best way to clear a Riak bucket of all key, values after
4
+ running a test?
5
+ I am currently using the Java HTTP API.
6
+
7
+ -Abhishek Kona
8
+
9
+
10
+ _______________________________________________
11
+ riak-users mailing list
12
+ riak-users@lists.basho.com
13
+ http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
@@ -0,0 +1,51 @@
1
+ Hi,
2
+ On Tue, 2011-03-01 at 18:02 +0530, Abhishek Kona wrote:
3
+ > Hi folks
4
+ >
5
+ > What is the best way to clear a Riak bucket of all key, values after
6
+ > running a test?
7
+ > I am currently using the Java HTTP API.
8
+
9
+ You can list the keys for the bucket and call delete for each. Or if you
10
+ put the keys (and kept track of them in your test) you can delete them
11
+ one at a time (without incurring the cost of calling list first.)
12
+
13
+ Something like:
14
+
15
+ String bucket = "my_bucket";
16
+ BucketResponse bucketResponse = riakClient.listBucket(bucket);
17
+ RiakBucketInfo bucketInfo = bucketResponse.getBucketInfo();
18
+
19
+ for(String key : bucketInfo.getKeys()) {
20
+ riakClient.delete(bucket, key);
21
+ }
22
+
23
+
24
+ would do it.
25
+
26
+ See also
27
+
28
+ http://wiki.basho.com/REST-API.html#Bucket-operations
29
+
30
+ which says
31
+
32
+ "At the moment there is no straightforward way to delete an entire
33
+ Bucket. There is, however, an open ticket for the feature. To delete all
34
+ the keys in a bucket, you’ll need to delete them all individually."
35
+
36
+ >
37
+ > -Abhishek Kona
38
+ >
39
+ >
40
+ > _______________________________________________
41
+ > riak-users mailing list
42
+ > riak-users@lists.basho.com
43
+ > http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
44
+
45
+
46
+
47
+
48
+ _______________________________________________
49
+ riak-users mailing list
50
+ riak-users@lists.basho.com
51
+ http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
@@ -0,0 +1,55 @@
1
+ Oh thanks.
2
+
3
+ Having the function would be great.
4
+
5
+ -Abhishek Kona
6
+
7
+ On 01/03/11 7:07 PM, Russell Brown wrote:
8
+ > Hi,
9
+ > On Tue, 2011-03-01 at 18:02 +0530, Abhishek Kona wrote:
10
+ >> Hi folks
11
+ >>
12
+ >> What is the best way to clear a Riak bucket of all key, values after
13
+ >> running a test?
14
+ >> I am currently using the Java HTTP API.
15
+ > You can list the keys for the bucket and call delete for each. Or if you
16
+ > put the keys (and kept track of them in your test) you can delete them
17
+ > one at a time (without incurring the cost of calling list first.)
18
+ >
19
+ > Something like:
20
+ >
21
+ > String bucket = "my_bucket";
22
+ > BucketResponse bucketResponse = riakClient.listBucket(bucket);
23
+ > RiakBucketInfo bucketInfo = bucketResponse.getBucketInfo();
24
+ >
25
+ > for(String key : bucketInfo.getKeys()) {
26
+ > riakClient.delete(bucket, key);
27
+ > }
28
+ >
29
+ >
30
+ > would do it.
31
+ >
32
+ > See also
33
+ >
34
+ > http://wiki.basho.com/REST-API.html#Bucket-operations
35
+ >
36
+ > which says
37
+ >
38
+ > "At the moment there is no straightforward way to delete an entire
39
+ > Bucket. There is, however, an open ticket for the feature. To delete all
40
+ > the keys in a bucket, you’ll need to delete them all individually."
41
+ >
42
+ >> -Abhishek Kona
43
+ >>
44
+ >>
45
+ >> _______________________________________________
46
+ >> riak-users mailing list
47
+ >> riak-users@lists.basho.com
48
+ >> http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
49
+ >
50
+
51
+
52
+ _______________________________________________
53
+ riak-users mailing list
54
+ riak-users@lists.basho.com
55
+ http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
@@ -0,0 +1,5 @@
1
+ Awesome! I haven't had another problem with it.
2
+
3
+ On Aug 22, 2011, at 7:37 PM, defunkt<reply@reply.github.com> wrote:
4
+
5
+ > Loader seems to be working well.
@@ -0,0 +1,15 @@
1
+ One: Here's what I've got.
2
+
3
+ - This would be the first bullet point that wraps to the second line
4
+ to the next
5
+ - This is the second bullet point and it doesn't wrap
6
+ - This is the third bullet point and I'm having trouble coming up with enough
7
+ to say
8
+ - This is the fourth bullet point
9
+
10
+ Two:
11
+ - Here is another bullet point
12
+ - And another one
13
+
14
+ This is a paragraph that talks about a bunch of stuff. It goes on and on
15
+ for a while.
@@ -0,0 +1,15 @@
1
+ I get proper rendering as well.
2
+
3
+ Sent from a magnificent torch of pixels
4
+
5
+ On Dec 16, 2011, at 12:47 PM, Corey Donohoe
6
+ <reply@reply.github.com>
7
+ wrote:
8
+
9
+ > Was this caching related or fixed already? I get proper rendering here.
10
+ >
11
+ > ![](https://img.skitch.com/20111216-m9munqjsy112yqap5cjee5wr6c.jpg)
12
+ >
13
+ > ---
14
+ > Reply to this email directly or view it on GitHub:
15
+ > https://github.com/github/github/issues/2278#issuecomment-3182418
@@ -0,0 +1,25 @@
1
+ Outlook with a reply
2
+
3
+
4
+ ------------------------------
5
+
6
+ *From:* Google Apps Sync Team [mailto:mail-noreply@google.com]
7
+ *Sent:* Thursday, February 09, 2012 1:36 PM
8
+ *To:* jow@xxxx.com
9
+ *Subject:* Google Apps Sync was updated!
10
+
11
+
12
+
13
+ Dear Google Apps Sync user,
14
+
15
+ Google Apps Sync for Microsoft Outlook® was recently updated. Your computer
16
+ now has the latest version (version 2.5). This release includes bug fixes
17
+ to improve product reliability. For more information about these and other
18
+ changes, please see the help article here:
19
+
20
+ http://www.google.com/support/a/bin/answer.py?answer=153463
21
+
22
+ Sincerely,
23
+
24
+ The Google Apps Sync Team.
25
+
@@ -0,0 +1,10 @@
1
+ Outlook with a reply directly above line
2
+ ________________________________________
3
+ From: CRM Comments [crm-comment@example.com]
4
+ Sent: Friday, 23 March 2012 5:08 p.m.
5
+ To: John S. Greene
6
+ Subject: [contact:106] John Greene
7
+
8
+ A new comment has been added to the Contact named 'John Greene':
9
+
10
+ I am replying to a comment.
@@ -0,0 +1,3 @@
1
+ Here is another email
2
+
3
+ Sent from my BlackBerry
@@ -0,0 +1,22 @@
1
+ test 2 this should list second
2
+
3
+ and have spaces
4
+
5
+ and retain this formatting
6
+
7
+
8
+ - how about bullets
9
+ - and another
10
+
11
+
12
+ On Fri, Feb 24, 2012 at 10:19 AM, <examples@email.goalengine.com> wrote:
13
+
14
+ > Give us an example of how you applied what they learned to achieve
15
+ > something in your organization
16
+
17
+
18
+
19
+
20
+ --
21
+
22
+ *Joe Smith | Director, Product Management*
@@ -0,0 +1,3 @@
1
+ Here is another email
2
+
3
+ Sent from my iPhone
@@ -0,0 +1,3 @@
1
+ Here is another email
2
+
3
+ Sent from my Verizon Wireless BlackBerry
@@ -0,0 +1,3 @@
1
+ Here is another email
2
+
3
+ Sent from my desk, is much easier then my mobile phone.
metadata ADDED
@@ -0,0 +1,68 @@
1
+ --- !ruby/object:Gem::Specification
2
+ name: email_reply_parser_ffcrm
3
+ version: !ruby/object:Gem::Version
4
+ version: 0.5.0
5
+ prerelease:
6
+ platform: ruby
7
+ authors:
8
+ - Rick Olson
9
+ autorequire:
10
+ bindir: bin
11
+ cert_chain: []
12
+ date: 2012-05-09 00:00:00.000000000 Z
13
+ dependencies: []
14
+ description: Long description. Maybe copied from the README.
15
+ email: technoweenie@gmail.com
16
+ executables: []
17
+ extensions: []
18
+ extra_rdoc_files:
19
+ - README.md
20
+ - LICENSE
21
+ files:
22
+ - LICENSE
23
+ - README.md
24
+ - Rakefile
25
+ - email_reply_parser_ffcrm.gemspec
26
+ - lib/email_reply_parser.rb
27
+ - test/email_reply_parser_test.rb
28
+ - test/emails/correct_sig.txt
29
+ - test/emails/email_1_1.txt
30
+ - test/emails/email_1_2.txt
31
+ - test/emails/email_1_3.txt
32
+ - test/emails/email_1_4.txt
33
+ - test/emails/email_1_5.txt
34
+ - test/emails/email_1_6.txt
35
+ - test/emails/email_2_1.txt
36
+ - test/emails/email_2_2.txt
37
+ - test/emails/email_BlackBerry.txt
38
+ - test/emails/email_bullets.txt
39
+ - test/emails/email_iPhone.txt
40
+ - test/emails/email_multi_word_sent_from_my_mobile_device.txt
41
+ - test/emails/email_sent_from_my_not_signature.txt
42
+ homepage: http://github.com/github/email_reply_parser
43
+ licenses: []
44
+ post_install_message:
45
+ rdoc_options:
46
+ - --charset=UTF-8
47
+ require_paths:
48
+ - lib
49
+ required_ruby_version: !ruby/object:Gem::Requirement
50
+ none: false
51
+ requirements:
52
+ - - ! '>='
53
+ - !ruby/object:Gem::Version
54
+ version: '0'
55
+ required_rubygems_version: !ruby/object:Gem::Requirement
56
+ none: false
57
+ requirements:
58
+ - - ! '>='
59
+ - !ruby/object:Gem::Version
60
+ version: '0'
61
+ requirements: []
62
+ rubyforge_project:
63
+ rubygems_version: 1.8.24
64
+ signing_key:
65
+ specification_version: 2
66
+ summary: Short description used in Gem listings.
67
+ test_files:
68
+ - test/email_reply_parser_test.rb