extended_email_reply_parser 0.4.0 → 0.5.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (7)

checksums.yaml +4 -4
data/CHANGELOG.md +5 -0
data/README.md +9 -0
data/lib/extended_email_reply_parser/mail/message.rb +4 -0
data/lib/extended_email_reply_parser/version.rb +1 -1
data/lib/extended_email_reply_parser.rb +20 -1
metadata +2 -2

checksums.yaml CHANGED

@@ -1,7 +1,7 @@
 ---
 SHA1:
-  metadata.gz: 945dbd97e90b02bd2f9ea5b0005867fb3dfa9fa3
-  data.tar.gz: 261767d15481a9c51862d623446f33da47681ed7
+  metadata.gz: 1343c2a228d59390f28c51ba8c2e5283f380e6fa
+  data.tar.gz: e2efc52cb6c93b4b6a8e271d47a0a0b894fbba61
 SHA512:
-  metadata.gz: 330300aae38701945c25aba3eee8301c270a9d92bb369616f60b07ed4ff7f11c85ca61f13c0e033b468317ce8b6a7d1fee9315fc94cd35a2c9dd1bc855c268de
-  data.tar.gz: 5075415426c2c38c202e98213ed5cce1389dab8a17989e74e0e37e951c42f56d70a6d8558b8e27447480494675dc1509490d3f927fb00c88908cf9f20824d9f9
+  metadata.gz: 851e7e03fbbf44ec84f40269f61d86e09393cba5df7cef4e5d1ce364379d3ee13a0ea9237585c480e499d93c7429c7212e305b155f61044c7c5e9f687ed92dea
+  data.tar.gz: d4d987194cec11ea53ba3d7779db48accdfcebee35f531fc61428e2c8dfb05e5bde52158e5e4dfa0fc353bf59d4384fa3922b9830aa60543a137c80b8e444b9f

data/CHANGELOG.md CHANGED

@@ -10,6 +10,11 @@ This project adheres to [Semantic Versioning](http://semver.org/).
 ### Removed
 ### Fixed
+## ExtendedEmailReplyParser 0.5.0 (2016-11-15)
+### Added
+- Adding `ExtendedEmailReplyParser.extract_text_or_html` which falls back to the html part if the text part is missing.
+- `ExtendedEmailReplyParser.parse` uses `extract_text_or_html`. This is good, because html-only mails do not just return `nil`. But, be aware that `ExtendedEmailReplyParser.parse` may include html tags this way. If this causes you trouble, use `ExtendedEmailReplyParser.parse(ExtendedEmailReplyParser.extract_text(message_or_path))` instead, which only uses the text part. We might correct the behavior in the future in order to strip the html tags during the parsing.
 ## ExtendedEmailReplyParser 0.4.0 (2016-11-14)
 ### Added
 - `Mail::Message#extract_html` extracts the html part as a counterpart for `Mail::Message#extract_text`. This is useful when an email has no text part.
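
The 0.5.0 entry warns that `ExtendedEmailReplyParser.parse` may now return html tags for html-only mails. As a rough illustration of what stripping those tags could look like on the caller's side, here is a minimal sketch. The helper name `strip_html_tags` is hypothetical and not part of the gem, and the regex approach is deliberately naive: it does not decode entities or handle `<script>`/`<style>` content.

```ruby
# Hypothetical helper, not part of extended_email_reply_parser.
# Replaces anything that looks like an html tag with a space,
# then collapses runs of whitespace and trims the ends.
def strip_html_tags(html)
  html.gsub(/<[^>]*>/, " ").gsub(/\s+/, " ").strip
end

puts strip_html_tags("<p>Hello <b>world</b></p>")
```

For anything beyond simple markup, a real html parser is likely the safer choice.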

data/README.md CHANGED

@@ -54,6 +54,15 @@ ExtendedEmailReplyParser.read("/path/to/email.eml")  # => Mail::Message
 ExtendedEmailReplyParser.read("/path/to/email.eml").extract_text  # => String
 ```
+**Optional: How to handle html-only emails**: There are emails that do not have a text part but only an html part. For those emails, `ExtendedEmailReplyParser.parse` uses the html part. But, to give you more control over how to handle those situations, `ExtendedEmailReplyParser.extract_text` returns `nil` for those emails. If you want your text extraction to fall back to the html part if the text part is missing, use this:
+```ruby
+ExtendedEmailReplyParser.extract_text_or_html message
+ExtendedEmailReplyParser.extract_text_or_html '/path/to/email.eml'
+```
+The `Mail::Message` object itself is extended to support `message.extract_text`, `message.extract_html_body_content` as well as `message.extract_text_or_html`.
 ### Writing a parser
 The parsing system allows you to add your own parser to the parsing chain. Just define a class inheriting from `ExtendedEmailReplyParser::Parsers::Base` and implement a `parse` method. The text before parsing is accessed via `text`.

data/lib/extended_email_reply_parser/mail/message.rb CHANGED

@@ -25,6 +25,10 @@ module Mail
       (self.html_part || (self if self.content_type.include?('text/html'))).try(:body_in_utf8)
     end
+    def extract_text_or_html
+      extract_text || extract_html_body_content
+    end
     def extract_html_body_content
       # http://stackoverflow.com/a/356376/2066546
       extract_html.match(/(.*<\s*body[^>]*>)(.*)(<\s*\/\s*body\s*\>.+)/m)[2] || extract_html
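
The fallback added in this hunk can be tried without the `mail` gem. The sketch below uses a hypothetical `FakeMessage` struct in place of `Mail::Message`; only the `extract_text || extract_html_body_content` line mirrors the diff. The body extraction is an assumption-laden variant: it guards against a missing html part and against html without a `<body>` element, cases in which calling `[2]` on a `nil` match result would raise `NoMethodError`.

```ruby
# Hypothetical stand-in for Mail::Message; only the two parts matter here.
FakeMessage = Struct.new(:text_part, :html_part) do
  def extract_text
    text_part
  end

  def extract_html
    html_part
  end

  # Nil-safe variant (an assumption, not the gem's code): returns nil
  # without an html part, and the whole html string without a <body>.
  def extract_html_body_content
    html = extract_html
    return nil if html.nil?

    match = html.match(/<\s*body[^>]*>(.*)<\s*\/\s*body\s*>/m)
    match ? match[1] : html
  end

  # Same expression as the method added in 0.5.0.
  def extract_text_or_html
    extract_text || extract_html_body_content
  end
end

html = "<html><body><p>Hello</p></body></html>"
text_mail = FakeMessage.new("Hello", html)
html_only = FakeMessage.new(nil, html)

puts text_mail.extract_text_or_html
puts html_only.extract_text_or_html
```

The text part wins whenever it is present; the html body is used only when `extract_text` returns `nil`.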

data/lib/extended_email_reply_parser/version.rb CHANGED

@@ -1,3 +1,3 @@
 module ExtendedEmailReplyParser
-  VERSION = "0.4.0"
+  VERSION = "0.5.0"
 end

data/lib/extended_email_reply_parser.rb CHANGED

@@ -58,6 +58,25 @@ module ExtendedEmailReplyParser
     end
   end
+  # Extract the body text from the given Mail::Message.
+  # If there is no text part, extract the content of the body tag
+  # of the html part.
+  #
+  #     ExtendedEmailReplyParser.extract_text_or_html message
+  #     ExtendedEmailReplyParser.extract_text_or_html '/path/to/email.eml'
+  #
+  # This is the same as:
+  #
+  #     message.extract_text_or_html
+  #
+  def self.extract_text_or_html(message_or_path)
+    if message_or_path.kind_of? Mail::Message
+      message_or_path.extract_text_or_html
+    elsif message_or_path.kind_of? String and File.file? message_or_path
+      Mail.read(message_or_path).extract_text_or_html
+    end
+  end
   # This parses the given object, i.e. removes quoted replies etc.
   #
   # Examples:
@@ -81,7 +100,7 @@ module ExtendedEmailReplyParser
   end
   def self.parse_message(message)
-    self.parse_text(message.extract_text || message.extract_html_body_content)
+    self.parse_text(message.extract_text_or_html)
   end
   def self.parse_text(text)
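
One detail of the new module-level `extract_text_or_html` is easy to miss: the `if`/`elsif` has no `else` branch, so any argument that is neither a message nor the path of an existing file yields `nil`. The sketch below reproduces that dispatch with stand-ins so it runs without the `mail` gem. `StubMessage` and `read_stub` are hypothetical replacements for `Mail::Message` and `Mail.read`.

```ruby
require "tempfile"

# Hypothetical stand-in for Mail::Message.
StubMessage = Struct.new(:body) do
  def extract_text_or_html
    body
  end
end

# Hypothetical stand-in for Mail.read: wraps the raw file content.
def read_stub(path)
  StubMessage.new(File.read(path))
end

# Same dispatch shape as in the diff: message object, or path to a file.
# Anything else falls through and returns nil.
def extract_text_or_html(message_or_path)
  if message_or_path.is_a?(StubMessage)
    message_or_path.extract_text_or_html
  elsif message_or_path.is_a?(String) && File.file?(message_or_path)
    read_stub(message_or_path).extract_text_or_html
  end
end

p extract_text_or_html(StubMessage.new("from message"))
p extract_text_or_html("/no/such/file.eml")

Tempfile.create("mail") do |file|
  file.write("from file")
  file.flush
  p extract_text_or_html(file.path)
end
```

Callers that pass user-supplied paths may want to check for `nil` before handing the result to a parser.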

metadata CHANGED

@@ -1,14 +1,14 @@
 --- !ruby/object:Gem::Specification
 name: extended_email_reply_parser
 version: !ruby/object:Gem::Version
-  version: 0.4.0
+  version: 0.5.0
 platform: ruby
 authors:
 - Sebastian Fiedlschuster
 autorequire:
 bindir: bin
 cert_chain: []
-date: 2016-11-14 00:00:00.000000000 Z
+date: 2016-11-15 00:00:00.000000000 Z
 dependencies:
 - !ruby/object:Gem::Dependency
   name: bundler