ruby-msg-nx 0.3.0

checksums.yaml ADDED
@@ -0,0 +1,7 @@
1
+ ---
2
+ SHA256:
3
+ metadata.gz: ecb72e2d9f9b305ba77303b4c31252fe421758e4b89e1d00b8249bcda268aaea
4
+ data.tar.gz: 248766c857e0b52131ab2bbaf0ed346468a856cb315abe85d76f5a3929d3fb1a
5
+ SHA512:
6
+ metadata.gz: 00a1d0b03495afe9e2183ba3804c0c6235e57c6e9840abd23140ea4a36f5a1472ea5eeb4d99cd16f420fa801de4255f284a6c5d25842f80d93b56a2ef612d3a4
7
+ data.tar.gz: 48c8ca8a4be497c011a2529eb6a1460ee24d7c8a3ba87939f40576598d989335aa79e8460cdaefaa4cfa8b1902e7392343661846e4ad040616f2c3c5e1de3680
data/COPYING ADDED
@@ -0,0 +1,20 @@
1
+ Copyright (c) 2007-2014 Charles Lowe
2
+
3
+ Permission is hereby granted, free of charge, to any person obtaining a copy
4
+ of this software and associated documentation files (the "Software"), to deal
5
+ in the Software without restriction, including without limitation the rights
6
+ to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
7
+ copies of the Software, and to permit persons to whom the Software is
8
+ furnished to do so, subject to the following conditions:
9
+
10
+ The above copyright notice and this permission notice shall be included in
11
+ all copies or substantial portions of the Software.
12
+
13
+ THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
14
+ IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
15
+ FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
16
+ AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
17
+ LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
18
+ OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
19
+ THE SOFTWARE.
20
+
data/ChangeLog ADDED
@@ -0,0 +1,108 @@
1
+ == 0.1.0 / 2021-12-22
2
+
3
+ - Rebooting.
4
+
5
+ == 1.5.2 / 2014-08-20
6
+
7
+ - Move mime.rb file to avoid conflicts with mime_types gem (github #7,
8
+ blerins).
9
+ - Minor fix to mapitool for ruby >= 1.9.
10
+ - Always require mapi/convert (indirect fix for missed step in README,
11
+ github #6).
12
+ - Various minor cleanups.
13
+
14
+ == 1.5.1 / 2012-07-03
15
+
16
+ - Fix handling of different body types (issue #14). Was breaking on
17
+ files without RTF content since 8933c26e, and also failing on files
18
+ where PR_BODY_HTML was a string rather than a stream.
19
+ - Move classes from RTF into Mapi::RTF (github #4).
20
+
21
+ == 1.5.0 / 2011-05-18
22
+
23
+ - Fixes for ruby 1.9.
24
+ - Move Mime into the Mapi module namespace (crowbot).
25
+ - Use ascii regex flag to avoid unicode probs (crowbot).
26
+
27
+ == 1.4.0 / 2008-10-12
28
+
29
+ - Initial simple msg test case.
30
+ - Update TODO, stripping out all the redundant ole stuff.
31
+ - Fix property set guids to use the new Ole::Types::Clsid type.
32
+ - Add block form of Msg.open
33
+ - Fix file requires for running tests individually.
34
+ - Update pst RangesIO subclasses for changes in ruby-ole.
35
+ - Merge initial pst reading code (converted from libpst).
36
+ - Pretty big pst refactoring, adding initial outlook 2003 pst support.
37
+ - Flesh out move to mapi to clean up the way pst hijacks the msg
38
+ classes currently.
39
+ - Add a ChangeLog :).
40
+ - Update README, by converting Home.wiki with wiki2rdoc converter.
41
+ - Separate out generic mapi object code from msg code, and separate out
42
+ conversion code.
43
+ - Add decent set of Mapi and Msg unit tests, approaching ~55% code coverage,
44
+ not including pst.
45
+ - Add TMail note conversion alternative, to eventually allow removal of
46
+ custom Mime class.
47
+ - Expose experimental pst support through renamed mapitool program.
48
+
49
+ == 1.3.1 / 2007-08-21
50
+
51
+ - Add fix for issue #2, and #4.
52
+ - Move ole code to ruby-ole project, and depend on it.
53
+
54
+ == 1.2.17 / 2007-05-13
55
+
56
+ (This was last release before splitting out ruby-ole. subsequent bug fix
57
+ point releases 1-3 were made directly on the gem, not reflected in the
58
+ repository, though the fixes were also forward-ported.)
59
+
60
+ - Update Ole::Storage backend, finalising api for split to separate
61
+ library.
62
+
63
+ == 1.2.16 / 2007-04-28
64
+
65
+ - Some minor fixes to msg parser.
66
+ - Extending RTF and body conversion support.
67
+ - Initial look at possible wmf conversion for embedded images.
68
+ - Add initial cli converter tool
69
+ - Add rdoc to ole/storage, and msg/properties
70
+ - Add streaming IO support to Ole::Storage, and use it in Msg::Properties
71
+ - Updates to test cases
72
+ - Add README, and update TODO
73
+ - Convert rtf support tools in c to small ruby class.
74
+ - Merge preliminary write support for Ole::Storage, as well as preliminary
75
+ filesystem api.
76
+
77
+ == 1.2.13 / 2007-01-22
78
+
79
+ - Nested msg support
80
+
81
+ == 1.2.10 / 2007-01-21
82
+
83
+ - Add initial vcard support.
84
+ - Implement a named properties map, for vcard conversion.
85
+ - Add orderedhash to Mime for keeping header order
86
+ - Fix line endings in lib/mime
87
+ - First released version
88
+
89
+ == <= 1.2.9 / 2007-01-11..2007-01-19
90
+
91
+ (Haven't bothered to note exact versions and dates - nothing here was released.
92
+ can look at history of lib/msg.rb to see exact VERSION at each commit.)
93
+
94
+ - Merged most of the named property work.
95
+ - Added some test files.
96
+ - Update svn:ignore, to exclude test messages and ole files which I can't
97
+ release. Need to get some clean files for use in test cases.
98
+ Also excluding source to the mapitags files for the moment.
99
+ A lot of it is not redistributable
100
+ - Added a converter to extract embedded html in rtf. Downloaded somewhere,
101
+ source unknown.
102
+ - Minor fix to ole/storage.rb, after new OleDir#type behaviour
103
+ - Imported support.rb, replacing previously required std.rb
104
+ - Added initial support for parsing times in Msg::Properties.
105
+ - Imported some rtf decompression code and minor updates.
106
+ - Cleaned up the ole class a bit
107
+ - Fixed OleDir#data method using sb_blocks map (see POLE).
108
+
data/Home.md ADDED
@@ -0,0 +1,133 @@
1
+ # Introduction
2
+
3
+ Generally, the goal of the project is to enable the conversion of
4
+ msg and pst files into standards based formats, without reliance on
5
+ outlook, or any platform dependencies. In fact it's currently **pure
6
+ ruby**, so it should be easy to get running.
7
+
8
+ It is targeted at people who want to migrate their PIM data from outlook,
9
+ converting msg and pst files into rfc2822 emails, vCard contacts,
10
+ iCalendar appointments etc. However, it also aims to be a fairly complete
11
+ mapi message store manipulation library, providing a sane model for
12
+ (currently read-only) access to msg and pst files (message stores).
13
+
14
+ I am happy to accept patches, give commit bits etc.
15
+
16
+ Please let me know how it works for you, any feedback would be welcomed.
17
+
18
+ # Features
19
+
20
+ Broad features of the project:
21
+
22
+ * Can be used as a general mapi library, where conversion to and working
23
+ on a standard format doesn't make sense.
24
+
25
+ * Supports conversion of messages to standard formats, like rfc2822
26
+ emails, vCard, etc.
27
+
28
+ * Well commented, and easily extended.
29
+
30
+ * Basic RTF converter, for providing a readable body when only RTF
31
+ exists (needs work)
32
+
33
+ * RTF decompression support included, as well as HTML extraction from
34
+ RTF where appropriate (both in pure ruby, see `lib/mapi/rtf.rb`)
35
+
36
+ * Support for mapping property codes to symbolic names, with many
37
+ included.
38
+
39
+ Features of the msg format message store:
40
+
41
+ * Most key .msg structures are understood, and only the parsing
42
+ code should require minor tweaks. Most of the remaining work is in achieving
43
+ high-fidelity conversion to standard formats (see `TODO`).
44
+
45
+ * Supports both types of property storage (large ones in `substg`
46
+ files, and small ones in the `properties` file).
47
+
48
+ * Complete support for named properties in different GUID namespaces.
49
+
50
+ * Initial support for handling embedded ole files, converting nested
51
+ .msg files to message/rfc822 attachments, and serializing others
52
+ as ole file attachments (allows you to view embedded excel for example).
53
+
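
As a rough illustration of the `substg` storage mentioned above, the sketch below splits a stream name into a property code and a property type. It assumes the naming convention from the published .msg format description (`__substg1.0_` followed by four hex digits of property code and four of property type), which is not spelled out in this README. The helper `parse_substg_name` and the `PROPERTY_TYPES` table are hypothetical and are not the gem's API.

```ruby
# Hypothetical helper (not part of ruby-msg): split a substg stream name
# into its property code and property type.
# Assumed layout: "__substg1.0_" + 4 hex digits (code) + 4 hex digits (type).
SUBSTG_NAME = /\A__substg1\.0_([0-9A-F]{4})([0-9A-F]{4})/i

# A few well-known MAPI property types; anything else is returned as a number.
PROPERTY_TYPES = {
  0x001E => :string8, # PT_STRING8
  0x001F => :unicode, # PT_UNICODE
  0x0102 => :binary   # PT_BINARY
}.freeze

def parse_substg_name(name)
  match = SUBSTG_NAME.match(name)
  return nil unless match

  type = match[2].hex
  { code: match[1].hex, type: PROPERTY_TYPES.fetch(type, type) }
end

# 0x0037 is the subject property; 0x001F marks a unicode string.
p parse_substg_name('__substg1.0_0037001F')
```

Names that do not follow the assumed pattern, such as the `properties` stream itself, simply return `nil` here.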
54
+ Features of the pst format message store:
55
+
56
+ * Handles both Outlook 1997 & 2003 format pst files, either unencrypted
57
+ or with "compressible" encryption.
58
+
59
+ * Understanding of the file format is still very superficial.
60
+
61
+ # Usage
62
+
63
+ At the command line, it is simple to convert individual msg or pst
64
+ files to .eml, or to convert a batch to an mbox format file. See mapitool
65
+ help for details:
66
+
67
+ ```sh
68
+ mapitool -si some_email.msg > some_email.eml
69
+ mapitool -s *.msg > mbox
70
+ ```
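
The combined mbox output can be pictured with a small, self-contained sketch. It mirrors the quoting that `bin/mapitool` applies when messages are appended to one file: a `From ` separator line is written first, and every message line that starts with `From ` (optionally preceded by `>` characters) gets one more `>`. The helper name `append_to_mbox` and its keyword arguments are hypothetical and not part of the gem.

```ruby
require 'time'
require 'stringio'

# Hypothetical helper (not part of ruby-msg): append one serialised message
# to an mbox stream, quoting "From " lines in the same way bin/mapitool does.
def append_to_mbox(io, message_text, sender: 'mapitool@localhost', time: Time.now)
  # Separator line that starts each message in the mbox file.
  io.puts "From #{sender} #{time.rfc2822}"
  message_text.each_line do |line|
    # Add one '>' to any line that could be mistaken for a separator,
    # including lines that were already quoted.
    io.print '>' if line =~ /\A>*From /
    io.print line
  end
end

out = StringIO.new
append_to_mbox(out, "Subject: hi\n\nFrom here on it is body text.\n")
puts out.string
```

Because already-quoted lines also gain a `>`, a reader can undo the quoting by stripping exactly one `>` from each such line.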
71
+
72
+ There is also a fairly complete and easy-to-use high-level library
73
+ interface:
74
+
75
+ ```ruby
76
+ require 'mapi/msg'
77
+
78
+ msg = Mapi::Msg.open filename
79
+
80
+ # access to the 3 main data stores, if you want to poke around in the msg
81
+ # internals
82
+ msg.recipients
83
+ # => [#<Recipient:'\'Marley, Bob\' <bob.marley@gmail.com>'>]
84
+ msg.attachments
85
+ # => [#<Attachment filename='blah1.tif'>, #<Attachment filename='blah2.tif'>]
86
+ msg.properties
87
+ # => #<Properties ... normalized_subject='Testing' ...
88
+ # creation_time=#<DateTime: 2454042.45074714,0,2299161> ...>
89
+ ```
90
+
91
+ To completely abstract away all msg peculiarities, convert the msg
92
+ to a mime object. The message as a whole, and some of its main parts
93
+ support conversion to mime objects.
94
+
95
+ ```ruby
96
+ msg.attachments.first.to_mime
97
+ # => #<Mime content_type='application/octet-stream'>
98
+ mime = msg.to_mime
99
+ puts mime.to_tree
100
+ # =>
101
+ - #<Mime content_type='multipart/mixed'>
102
+ |- #<Mime content_type='multipart/alternative'>
103
+ | |- #<Mime content_type='text/plain'>
104
+ | \- #<Mime content_type='text/html'>
105
+ |- #<Mime content_type='application/octet-stream'>
106
+ \- #<Mime content_type='application/octet-stream'>
107
+
108
+ # convert mime object to serialised form,
109
+ # inclusive of attachments etc. (not ideal in memory, but it's WIP).
110
+ puts mime.to_s
111
+ ```
112
+
113
+ # Thanks
114
+
115
+ * The initial implementation of parsing msg files was based primarily
116
+ on [msgconvert.pl](http://www.matijs.net/software/msgconv/).
117
+
118
+ * The basis for the Outlook 97 pst file support was the source of `libpst`.
119
+
120
+ * The code for rtf decompression was implemented by inspecting the
121
+ algorithm used in the `JTNEF` project.
122
+
123
+ # Other
124
+
125
+ For more information, see
126
+
127
+ * [TODO](https://github.com/aquasync/ruby-msg/wiki/TODO)
128
+
129
+ * [MsgDetails](https://github.com/aquasync/ruby-msg/wiki/MsgDetails)
130
+
131
+ * [PstDetails](https://github.com/aquasync/ruby-msg/wiki/PstDetails)
132
+
133
+ * [OleDetails](https://github.com/aquasync/ruby-msg/wiki/OleDetails)
data/Rakefile ADDED
@@ -0,0 +1,52 @@
1
+ require 'rubygems'
2
+ require 'rake/testtask'
3
+
4
+ require 'rbconfig'
5
+ require 'fileutils'
6
+
7
+ spec = eval File.read('ruby-msg.gemspec')
8
+
9
+ task :default => [:test]
10
+
11
+ Rake::TestTask.new do |t|
12
+ t.test_files = FileList["test/test_*.rb"]
13
+ t.warning = false
14
+ t.verbose = true
15
+ end
16
+
17
+ begin
18
+ Rake::TestTask.new(:coverage) do |t|
19
+ t.test_files = FileList["test/test_*.rb"]
20
+ t.warning = false
21
+ t.verbose = true
22
+ t.ruby_opts = ['-rsimplecov -e "SimpleCov.start; load(ARGV.shift)"']
23
+ end
24
+ rescue LoadError
25
+ # SimpleCov not available
26
+ end
27
+
28
+ begin
29
+ require 'rdoc/task'
30
+ RDoc::Task.new do |t|
31
+ t.rdoc_dir = 'doc'
32
+ t.rdoc_files.include 'lib/**/*.rb'
33
+ t.rdoc_files.include 'README', 'ChangeLog'
34
+ t.title = "#{spec.name} documentation"
35
+ t.options += %w[--line-numbers --inline-source --tab-width 2]
36
+ t.main = 'README'
37
+ end
38
+ rescue LoadError
39
+ # RDoc not available or too old (<2.4.2)
40
+ end
41
+
42
+ begin
43
+ require 'rubygems/package_task'
44
+ Gem::PackageTask.new(spec) do |t|
45
+ t.need_tar = true
46
+ t.need_zip = false
47
+ t.package_dir = 'build'
48
+ end
49
+ rescue LoadError
50
+ # RubyGems too old (<1.3.2)
51
+ end
52
+
data/bin/mapitool ADDED
@@ -0,0 +1,204 @@
1
+ #! /usr/bin/ruby
2
+
3
+ $:.unshift File.dirname(__FILE__) + '/../lib'
4
+
5
+ require 'optparse'
6
+ require 'rubygems'
7
+ require 'mapi/msg'
8
+ require 'mapi/pst'
9
+ require 'mapi/helper'
10
+ require 'time'
11
+
12
+ class Mapitool
13
+ attr_reader :files
14
+ attr_reader :opts
15
+
16
+ #FILTER_NON_FILE_NAME = /[^A-Za-z0-9.()\[\]{}-]/
17
+ FILTER_NON_FILE_NAME = /[\x00-\x1f"\/\\\?\*\|\<\>\:]/
18
+
19
+ def initialize files, opts
20
+ @files, @opts = files, opts
21
+ seen_pst = false
22
+ raise ArgumentError, 'Must specify 1 or more input files.' if files.empty?
23
+ @helper = Mapi::Helper.new opts[:ansi_encoding], opts[:to_unicode]
24
+ files.map! do |f|
25
+ ext = File.extname(f.downcase)[1..-1]
26
+ raise ArgumentError, 'Unsupported file type - %s' % f unless ext =~ /^(msg|pst|ost)$/
27
+ raise ArgumentError, 'Experimental pst support not enabled' if /^(pst|ost)$/.match(ext) and !opts[:enable_pst]
28
+ [ext.to_sym, f]
29
+ end
30
+ if dir = opts[:output_dir]
31
+ Dir.mkdir(dir) unless File.directory?(dir)
32
+ end
33
+ end
34
+
35
+ def each_message(&block)
36
+ files.each do |format, filename|
37
+ if [:pst, :ost].include? format
38
+ if filter_path = opts[:filter_path]
39
+ filter_path = filter_path.tr("\\", '/').gsub(/\/+/, '/').sub(/^\//, '').sub(/\/$/, '')
40
+ end
41
+ open filename do |io|
42
+ pst = Mapi::Pst.new io, @helper
43
+ pst.each do |message|
44
+ next unless message.type == :message
45
+ if filter_path
46
+ next unless message.path =~ /^#{Regexp.quote filter_path}(\/|$)/i
47
+ end
48
+ yield message
49
+ end
50
+ end
51
+ else
52
+ Mapi::Msg.open filename, nil, @helper, &block
53
+ end
54
+ end
55
+ end
56
+
57
+ def run
58
+ each_message(&method(:process_message))
59
+ end
60
+
61
+ def make_unique filename
62
+ @map ||= {}
63
+ return @map[filename] if !opts[:individual] and @map[filename]
64
+ try = filename
65
+ i = 1
66
+ try = filename.gsub(/(\.[^.]+)$/, ".#{i += 1}\\1") while File.exist?(try)
67
+ @map[filename] = try
68
+ try
69
+ end
70
+
71
+ def process_message message
72
+ # TODO make this more informative
73
+ mime_type = message.mime_type
74
+ return unless pair = Mapi::Message::CONVERSION_MAP[mime_type]
75
+
76
+ combined_map = {
77
+ 'eml' => 'Mail.mbox',
78
+ 'vcf' => 'Contacts.vcf',
79
+ 'txt' => 'Posts.txt'
80
+ }
81
+
82
+ # TODO handle merged mode, pst, etc etc...
83
+ case message
84
+ when Mapi::Msg
85
+ if opts[:individual]
86
+ filename = message.root.ole.io.path.gsub(/msg$/i, pair.last)
87
+ else
88
+ filename = combined_map[pair.last] or raise NotImplementedError
89
+ end
90
+ when Mapi::Pst::Item
91
+ if opts[:individual]
92
+ filename = "#{message.subject.tr ' ', '_'}.#{pair.last}".gsub(FILTER_NON_FILE_NAME, '_')
93
+ else
94
+ filename = combined_map[pair.last] or raise NotImplementedError
95
+ filename = (message.path.tr(' /', '_.').gsub(FILTER_NON_FILE_NAME, '_') + '.' + File.extname(filename)).squeeze('.')
96
+ end
97
+ dir = File.dirname(message.instance_variable_get(:@node).pst.io.path)
98
+ filename = File.join dir, filename
99
+ else
100
+ raise "Unsupported message class - #{message.class}"
101
+ end
102
+
103
+ if dir = opts[:output_dir]
104
+ filename = File.join dir, File.basename(filename)
105
+ end
106
+
107
+ filename = make_unique filename
108
+
109
+ write_message = proc do |f|
110
+ data = message.send(pair.first).to_s
111
+ if !opts[:individual] and pair.last == 'eml'
112
+ # we do the append > style mbox quoting (mboxrd, I think it's called), as it
113
+ # is the only one that can be robustly un-quoted. evolution doesn't use this!
114
+ f.puts "From mapitool@localhost #{Time.now.rfc2822}"
115
+ #munge_headers mime, opts
116
+ data.lines.each do |line|
117
+ if line =~ /^>*From /o
118
+ f.print '>' + line
119
+ else
120
+ f.print line
121
+ end
122
+ end
123
+ else
124
+ f.write data
125
+ end
126
+ end
127
+
128
+ if opts[:stdout]
129
+ write_message[STDOUT]
130
+ else
131
+ # Use binary mode. Otherwise, on Windows, newline translation would mangle existing "\r\n" line endings.
132
+ open filename, 'ab', &write_message
133
+ end
134
+ end
135
+
136
+ def munge_headers mime, opts
137
+ opts[:header_defaults].each do |s|
138
+ key, val = s.match(/(.*?):\s+(.*)/)[1..-1]
139
+ mime.headers[key] = [val] if mime.headers[key].empty?
140
+ end
141
+ end
142
+ end
143
+
144
+ def mapitool
145
+ opts = {:verbose => false, :action => :convert, :header_defaults => []}
146
+ op = OptionParser.new do |op|
147
+ op.banner = "Usage: mapitool [options] [files]"
148
+ #op.separator ''
149
+ #op.on('-c', '--convert', 'Convert input files (default)') { opts[:action] = :convert }
150
+ op.separator ''
151
+ op.on('-o', '--output-dir DIR', 'Put all output files in DIR') { |d| opts[:output_dir] = d }
152
+ op.on('-i', '--[no-]individual', 'Do not combine converted files') { |i| opts[:individual] = i }
153
+ op.on('-s', '--stdout', 'Write all data to stdout') { opts[:stdout] = true }
154
+ op.on('-f', '--filter-path PATH', 'Only process pst items in PATH') { |path| opts[:filter_path] = path }
155
+ op.on( '--enable-pst', 'Turn on experimental PST support') { opts[:enable_pst] = true }
156
+ op.on('-e', '--ansi-encoding CHARSET', 'Use this charset for non-Unicode text') { |charset| opts[:ansi_encoding] = charset }
157
+ op.on('-u', '--to-unicode', 'Convert ansi text to unicode') { opts[:to_unicode] = true }
158
+ #op.on('-d', '--header-default STR', 'Provide a default value for top level mail header') { |hd| opts[:header_defaults] << hd }
159
+ # --enable-pst
160
+ op.separator ''
161
+ op.on('-v', '--[no-]verbose', 'Run verbosely') { |v| opts[:verbose] = v }
162
+ op.on_tail('-h', '--help', 'Show this message') { puts op; exit }
163
+ end
164
+
165
+ files = op.parse ARGV
166
+
167
+ # for windows. see issue #2
168
+ STDOUT.binmode
169
+
170
+ Mapi::Log.level = Ole::Log.level = opts[:verbose] ? Logger::WARN : Logger::FATAL
171
+
172
+ tool = begin
173
+ Mapitool.new(files, opts)
174
+ rescue ArgumentError
175
+ puts $!
176
+ puts op
177
+ exit 1
178
+ end
179
+
180
+ tool.run
181
+ end
182
+
183
+ mapitool
184
+
185
+ __END__
186
+
187
+ mapitool [options] [files]
188
+
189
+ files is a list of *.msg & *.pst files.
190
+
191
+ one of the options should be some sort of path filter to apply to pst items.
192
+
193
+ --filter-path=
194
+ --filter-type=eml,vcf
195
+
196
+ with that out of the way, the entire list of files can be converted into a
197
+ list of items (with meta data about the source).
198
+
199
+ --convert
200
+ --[no-]separate one output file per item or combined output
201
+ --stdout
202
+ --output-dir=.
203
+
204
+
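
The path filter described in the notes above can be sketched in isolation. The snippet below reproduces the normalisation and matching steps used in `Mapitool#each_message`, pulled out into two hypothetical helpers (`normalize_filter_path` and `path_matches?`) that do not exist in the gem. It assumes that pst item paths are slash-separated with no leading slash, which the source does not state explicitly.

```ruby
# Hypothetical helpers (not part of ruby-msg) mirroring the filter-path
# handling in Mapitool#each_message.

# Turn backslashes into slashes, collapse repeated slashes, and strip one
# leading and one trailing slash.
def normalize_filter_path(path)
  path.tr('\\', '/').gsub(%r{/+}, '/').sub(%r{\A/}, '').sub(%r{/\z}, '')
end

# True when item_path is the filter folder itself or lies below it.
# The comparison is case-insensitive, as in mapitool.
def path_matches?(item_path, filter_path)
  !!(item_path =~ %r{\A#{Regexp.quote(filter_path)}(/|\z)}i)
end

filter = normalize_filter_path('\\Inbox\\\\Work/')
puts filter
puts path_matches?('Inbox/Work/Reports', filter)
puts path_matches?('Inbox/Workshop', filter)
```

Requiring a slash or the end of the string after the filter keeps a sibling folder with a longer name from matching by prefix alone.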