extended_email_reply_parser 0.4.0 → 0.5.0

checksums.yaml CHANGED
@@ -1,7 +1,7 @@
  ---
  SHA1:
-   metadata.gz: 945dbd97e90b02bd2f9ea5b0005867fb3dfa9fa3
-   data.tar.gz: 261767d15481a9c51862d623446f33da47681ed7
+   metadata.gz: 1343c2a228d59390f28c51ba8c2e5283f380e6fa
+   data.tar.gz: e2efc52cb6c93b4b6a8e271d47a0a0b894fbba61
  SHA512:
-   metadata.gz: 330300aae38701945c25aba3eee8301c270a9d92bb369616f60b07ed4ff7f11c85ca61f13c0e033b468317ce8b6a7d1fee9315fc94cd35a2c9dd1bc855c268de
-   data.tar.gz: 5075415426c2c38c202e98213ed5cce1389dab8a17989e74e0e37e951c42f56d70a6d8558b8e27447480494675dc1509490d3f927fb00c88908cf9f20824d9f9
+   metadata.gz: 851e7e03fbbf44ec84f40269f61d86e09393cba5df7cef4e5d1ce364379d3ee13a0ea9237585c480e499d93c7429c7212e305b155f61044c7c5e9f687ed92dea
+   data.tar.gz: d4d987194cec11ea53ba3d7779db48accdfcebee35f531fc61428e2c8dfb05e5bde52158e5e4dfa0fc353bf59d4384fa3922b9830aa60543a137c80b8e444b9f
data/CHANGELOG.md CHANGED
@@ -10,6 +10,11 @@ This project adheres to [Semantic Versioning](http://semver.org/).
  ### Removed
  ### Fixed

+ ## ExtendedEmailReplyParser 0.5.0 (2016-11-15)
+ ### Added
+ - Added `ExtendedEmailReplyParser.extract_text_or_html`, which falls back to the html part if the text part is missing.
+ - `ExtendedEmailReplyParser.parse` now uses `extract_text_or_html`, so html-only mails no longer just return `nil`. Be aware that `ExtendedEmailReplyParser.parse` may include html tags as a result. If this causes you trouble, use `ExtendedEmailReplyParser.parse(ExtendedEmailReplyParser.extract_text(message_or_path))` instead, which only uses the text part. We might correct this behavior in the future so that html tags are stripped during parsing.
+
  ## ExtendedEmailReplyParser 0.4.0 (2016-11-14)
  ### Added
  - `Mail::Message#extract_html` extracts the html part as a counterpart for `Mail::Message#extract_text`. This is useful when an email has no text part.
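The 0.5.0 changelog entry warns that `parse` may now return html markup for html-only mails. A caller who needs plain text could strip tags after parsing. The following is a minimal sketch, assuming a hypothetical helper `strip_html_tags` that is not part of the gem. The regex is naive: it does not decode entities and does not drop the contents of `<script>` or `<style>` elements.

```ruby
# Hypothetical helper, NOT part of extended_email_reply_parser.
# Removes anything that looks like an html tag and trims the result.
def strip_html_tags(text)
  return nil if text.nil?
  text.gsub(/<[^>]*>/, "").strip
end

puts strip_html_tags("<p>Hello <b>world</b></p>")
```

For anything beyond simple markup, a real html parser is likely the safer choice.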
data/README.md CHANGED
@@ -54,6 +54,15 @@ ExtendedEmailReplyParser.read("/path/to/email.eml") # => Mail::Message
  ExtendedEmailReplyParser.read("/path/to/email.eml").extract_text # => String
  ```

+ **Optional: How to handle html-only emails**: Some emails have no text part, only an html part. For those emails, `ExtendedEmailReplyParser.parse` uses the html part. To give you more control over such cases, `ExtendedEmailReplyParser.extract_text` returns `nil` for those emails. If you want your text extraction to fall back to the html part when the text part is missing, use this:
+
+ ```ruby
+ ExtendedEmailReplyParser.extract_text_or_html message
+ ExtendedEmailReplyParser.extract_text_or_html '/path/to/email.eml'
+ ```
+
+ The `Mail::Message` object itself is extended to support `message.extract_text`, `message.extract_html_body_content` as well as `message.extract_text_or_html`.
+
  ### Writing a parser

  The parsing system allows you to add your own parser to the parsing chain. Just define a class inheriting from `ExtendedEmailReplyParser::Parsers::Base` and implement a `parse` method. The text before parsing is accessed via `text`.
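The fallback described in the README hunk above can be illustrated without the `mail` gem. This is a sketch only: `FakeMessage` is a hypothetical stand-in for `Mail::Message`, and its `extract_text` and `extract_html_body_content` simply return stored strings instead of reading real mail parts.

```ruby
# Hypothetical stand-in for Mail::Message, for illustration only.
FakeMessage = Struct.new(:text_part, :html_part) do
  def extract_text
    text_part
  end

  def extract_html_body_content
    html_part
  end

  # Same shape as the method added in 0.5.0: prefer text, else html.
  def extract_text_or_html
    extract_text || extract_html_body_content
  end
end

text_and_html = FakeMessage.new("Hi there", "<p>Hi there</p>")
html_only     = FakeMessage.new(nil, "<p>Hi there</p>")

puts text_and_html.extract_text_or_html # uses the text part
puts html_only.extract_text.inspect     # no text part available
puts html_only.extract_text_or_html     # falls back to the html part
```

Because the fallback relies on `||`, it only triggers when the text extraction returns `nil` (or `false`); an empty string is truthy in Ruby and would be returned as-is.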
@@ -25,6 +25,10 @@ module Mail
    (self.html_part || (self if self.content_type.include?('text/html'))).try(:body_in_utf8)
  end

+ def extract_text_or_html
+   extract_text || extract_html_body_content
+ end
+
  def extract_html_body_content
    # http://stackoverflow.com/a/356376/2066546
    extract_html.match(/(.*<\s*body[^>]*>)(.*)(<\s*\/\s*body\s*\>.+)/m)[2] || extract_html
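One thing to watch in the context line above: `String#match` returns `nil` when the pattern does not match (for example, when the html has no `<body>` element, or nothing follows `</body>`), and calling `[2]` on `nil` raises `NoMethodError`, so the `|| extract_html` fallback would not be reached in that case. A nil-safe variant might look like the following sketch. `html_body_content` and `BODY_PATTERN` are hypothetical names, and the simplified pattern is an assumption, not the gem's implementation.

```ruby
# Hypothetical standalone variant, NOT the gem's implementation.
# Captures everything between the opening and closing body tags.
BODY_PATTERN = /<\s*body[^>]*>(.*)<\s*\/\s*body\s*>/mi

def html_body_content(html)
  match = html.match(BODY_PATTERN)
  # Fall back to the full html when there is no body element.
  match ? match[1] : html
end

puts html_body_content("<html><body><p>Hi</p></body></html>")
puts html_body_content("<p>No body tag here</p>")
```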
@@ -1,3 +1,3 @@
  module ExtendedEmailReplyParser
-   VERSION = "0.4.0"
+   VERSION = "0.5.0"
  end
@@ -58,6 +58,25 @@ module ExtendedEmailReplyParser
    end
  end

+ # Extract the body text from the given Mail::Message.
+ # If there is no text part, extract the content of the body tag
+ # of the html part.
+ #
+ #   ExtendedEmailReplyParser.extract_text_or_html message
+ #   ExtendedEmailReplyParser.extract_text_or_html '/path/to/email.eml'
+ #
+ # This is the same as:
+ #
+ #   message.extract_text_or_html
+ #
+ def self.extract_text_or_html(message_or_path)
+   if message_or_path.kind_of? Mail::Message
+     message_or_path.extract_text_or_html
+   elsif message_or_path.kind_of? String and File.file? message_or_path
+     Mail.read(message_or_path).extract_text_or_html
+   end
+ end
+
  # This parses the given object, i.e. removes quoted replies etc.
  #
  # Examples:
@@ -81,7 +100,7 @@ module ExtendedEmailReplyParser
  end

  def self.parse_message(message)
-   self.parse_text(message.extract_text || message.extract_html_body_content)
+   self.parse_text(message.extract_text_or_html)
  end

  def self.parse_text(text)
metadata CHANGED
@@ -1,14 +1,14 @@
  --- !ruby/object:Gem::Specification
  name: extended_email_reply_parser
  version: !ruby/object:Gem::Version
-   version: 0.4.0
+   version: 0.5.0
  platform: ruby
  authors:
  - Sebastian Fiedlschuster
  autorequire:
  bindir: bin
  cert_chain: []
- date: 2016-11-14 00:00:00.000000000 Z
+ date: 2016-11-15 00:00:00.000000000 Z
  dependencies:
  - !ruby/object:Gem::Dependency
    name: bundler