extended_email_reply_parser 0.4.0 → 0.5.0

checksums.yaml CHANGED
@@ -1,7 +1,7 @@
  ---
  SHA1:
- metadata.gz: 945dbd97e90b02bd2f9ea5b0005867fb3dfa9fa3
- data.tar.gz: 261767d15481a9c51862d623446f33da47681ed7
+ metadata.gz: 1343c2a228d59390f28c51ba8c2e5283f380e6fa
+ data.tar.gz: e2efc52cb6c93b4b6a8e271d47a0a0b894fbba61
  SHA512:
- metadata.gz: 330300aae38701945c25aba3eee8301c270a9d92bb369616f60b07ed4ff7f11c85ca61f13c0e033b468317ce8b6a7d1fee9315fc94cd35a2c9dd1bc855c268de
- data.tar.gz: 5075415426c2c38c202e98213ed5cce1389dab8a17989e74e0e37e951c42f56d70a6d8558b8e27447480494675dc1509490d3f927fb00c88908cf9f20824d9f9
+ metadata.gz: 851e7e03fbbf44ec84f40269f61d86e09393cba5df7cef4e5d1ce364379d3ee13a0ea9237585c480e499d93c7429c7212e305b155f61044c7c5e9f687ed92dea
+ data.tar.gz: d4d987194cec11ea53ba3d7779db48accdfcebee35f531fc61428e2c8dfb05e5bde52158e5e4dfa0fc353bf59d4384fa3922b9830aa60543a137c80b8e444b9f
data/CHANGELOG.md CHANGED
@@ -10,6 +10,11 @@ This project adheres to [Semantic Versioning](http://semver.org/).
  ### Removed
  ### Fixed
 
+ ## ExtendedEmailReplyParser 0.5.0 (2016-11-15)
+ ### Added
+ - Adding `ExtendedEmailReplyParser.extract_text_or_html` which falls back to the html part if the text part is missing.
+ - `ExtendedEmailReplyParser.parse` uses `extract_text_or_html`. This is good, because html-only mails do not just return `nil`. But, be aware that `ExtendedEmailReplyParser.parse` may include html tags this way. If this causes you trouble, use `ExtendedEmailReplyParser.parse(ExtendedEmailReplyParser.extract_text(message_or_path))` instead, which only uses the text part. We might correct the behavior in the future in order to strip the html tags during the parsing.
+
  ## ExtendedEmailReplyParser 0.4.0 (2016-11-14)
  ### Added
  - `Mail::Message#extract_html` extracts the html part as a counterpart for `Mail::Message#extract_text`. This is useful when an email has no text part.
data/README.md CHANGED
@@ -54,6 +54,15 @@ ExtendedEmailReplyParser.read("/path/to/email.eml") # => Mail::Message
  ExtendedEmailReplyParser.read("/path/to/email.eml").extract_text # => String
  ```
 
+ **Optional: How to handle html-only emails**: There are emails that do not have a text part but only an html part. For those emails, `ExtendedEmailReplyParser.parse` uses the html part. But, to give you more control over how to handle those situations, `ExtendedEmailReplyParser.extract_text` returns `nil` for those emails. If you want your text extraction to fall back to the html part if the text part is missing, use this:
+
+ ```ruby
+ ExtendedEmailReplyParser.extract_text_or_html message
+ ExtendedEmailReplyParser.extract_text_or_html '/path/to/email.eml'
+ ```
+
+ The `Mail::Message` object itself is extended to support `message.extract_text`, `message.extract_html_body_content` as well as `message.extract_text_or_html`.
+
  ### Writing a parser
 
  The parsing system allows you to add your own parser to the parsing chain. Just define a class inheriting from `ExtendedEmailReplyParser::Parsers::Base` and implement a `parse` method. The text before parsing is accessed via `text`.
@@ -25,6 +25,10 @@ module Mail
  (self.html_part || (self if self.content_type.include?('text/html'))).try(:body_in_utf8)
  end
 
+ def extract_text_or_html
+ extract_text || extract_html_body_content
+ end
+
  def extract_html_body_content
  # http://stackoverflow.com/a/356376/2066546
  extract_html.match(/(.*<\s*body[^>]*>)(.*)(<\s*\/\s*body\s*\>.+)/m)[2] || extract_html
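Note on the hunk above: `extract_text_or_html` chains the two extractors with `||`, so the html body is used only when `extract_text` returns `nil`. The following is a minimal standalone sketch of that fallback, not the gem's code. `FakeMessage` is a hypothetical stand-in for `Mail::Message`, and its `extract_html_body_content` is a simplified, nil-safe variant written for illustration only.

```ruby
# Hypothetical stand-in for Mail::Message (assumption: only the text and
# html parts matter for this illustration).
FakeMessage = Struct.new(:text, :html) do
  # Returns the text part, or nil if the message has none.
  def extract_text
    text
  end

  # Returns the content between <body> and </body>. Falls back to the whole
  # html string if there is no body tag, and to nil if there is no html.
  def extract_html_body_content
    return nil if html.nil?
    match = html.match(/<\s*body[^>]*>(.*)<\s*\/\s*body\s*>/m)
    match ? match[1] : html
  end

  # Same shape as the method added in this release: text first, html second.
  def extract_text_or_html
    extract_text || extract_html_body_content
  end
end

text_mail = FakeMessage.new("Hello", "<html><body><p>Hello</p></body></html>")
html_only = FakeMessage.new(nil, "<html><body><p>Hello</p></body></html>")

text_mail.extract_text_or_html  # the text part wins when present
html_only.extract_text_or_html  # falls back to the html body content
```

Because an empty string is truthy in Ruby, a message whose text part is `""` would not fall back to the html part in this sketch.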
@@ -1,3 +1,3 @@
  module ExtendedEmailReplyParser
- VERSION = "0.4.0"
+ VERSION = "0.5.0"
  end
@@ -58,6 +58,25 @@ module ExtendedEmailReplyParser
  end
  end
 
+ # Extract the body text from the given Mail::Message.
+ # If there is no text part, extract the content of the body tag
+ # of the html part.
+ #
+ # ExtendedEmailReplyParser.extract_text_or_html message
+ # ExtendedEmailReplyParser.extract_text_or_html '/path/to/email.eml'
+ #
+ # This is the same as:
+ #
+ # message.extract_text_or_html
+ #
+ def self.extract_text_or_html(message_or_path)
+ if message_or_path.kind_of? Mail::Message
+ message_or_path.extract_text_or_html
+ elsif message_or_path.kind_of? String and File.file? message_or_path
+ Mail.read(message_or_path).extract_text_or_html
+ end
+ end
+
  # This parses the given object, i.e. removes quoted replies etc.
  #
  # Examples:
@@ -81,7 +100,7 @@ module ExtendedEmailReplyParser
  end
 
  def self.parse_message(message)
- self.parse_text(message.extract_text || message.extract_html_body_content)
+ self.parse_text(message.extract_text_or_html)
  end
 
  def self.parse_text(text)
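Note on the module-level `extract_text_or_html` added in this release: it accepts either a message object or a path, and its `if`/`elsif` has no `else` branch, so any other input yields `nil`. Here is a minimal standalone sketch of that dispatch, assuming nothing about the `mail` gem. `StubMessage`, `read_stub_message`, and `text_or_html_from` are hypothetical names used for illustration only.

```ruby
# Hypothetical stand-in for Mail::Message.
StubMessage = Struct.new(:body) do
  def extract_text_or_html
    body
  end
end

# Hypothetical stand-in for Mail.read: wraps the raw file content.
def read_stub_message(path)
  StubMessage.new(File.read(path))
end

# Dispatches on the argument type, mirroring the if/elsif in the hunk above.
# Returns nil for anything that is neither a message nor an existing file path.
def text_or_html_from(message_or_path)
  case message_or_path
  when StubMessage
    message_or_path.extract_text_or_html
  when String
    read_stub_message(message_or_path).extract_text_or_html if File.file?(message_or_path)
  end
end
```

Callers that pass a path may want to check for `nil`, since under this scheme a mistyped path is indistinguishable from a message without text or html content.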
metadata CHANGED
@@ -1,14 +1,14 @@
  --- !ruby/object:Gem::Specification
  name: extended_email_reply_parser
  version: !ruby/object:Gem::Version
- version: 0.4.0
+ version: 0.5.0
  platform: ruby
  authors:
  - Sebastian Fiedlschuster
  autorequire:
  bindir: bin
  cert_chain: []
- date: 2016-11-14 00:00:00.000000000 Z
+ date: 2016-11-15 00:00:00.000000000 Z
  dependencies:
  - !ruby/object:Gem::Dependency
  name: bundler