ruby-msg 1.3.1 → 1.4.0
- data/README +108 -113
- data/Rakefile +42 -28
- data/bin/mapitool +195 -0
- data/lib/mapi.rb +109 -0
- data/lib/mapi/convert.rb +61 -0
- data/lib/mapi/convert/contact.rb +142 -0
- data/lib/mapi/convert/note-mime.rb +274 -0
- data/lib/mapi/convert/note-tmail.rb +287 -0
- data/lib/mapi/msg.rb +440 -0
- data/lib/mapi/property_set.rb +269 -0
- data/lib/mapi/pst.rb +1806 -0
- data/lib/mapi/rtf.rb +169 -0
- data/lib/mapi/types.rb +51 -0
- data/lib/rtf.rb +0 -9
- data/test/test_convert_contact.rb +60 -0
- data/test/test_convert_note.rb +66 -0
- data/test/test_mime.rb +4 -2
- data/test/test_msg.rb +29 -0
- data/test/test_property_set.rb +116 -0
- data/test/test_types.rb +17 -0
- metadata +78 -48
- data/bin/msgtool +0 -65
- data/lib/msg.rb +0 -522
- data/lib/msg/properties.rb +0 -532
- data/lib/msg/rtf.rb +0 -236
data/README
CHANGED
@@ -1,121 +1,116 @@
|
|
1
|
-
|
1
|
+
= Introduction
|
2
2
|
|
3
|
-
|
3
|
+
Generally, the goal of the project is the conversion of .msg files
|
4
|
+
into proper rfc2822 emails, independent of outlook, or any platform
|
5
|
+
dependencies etc. In fact it's currently pure ruby, so it should be
|
6
|
+
easy to get started with.
|
4
7
|
|
5
|
-
|
6
|
-
|
7
|
-
|
8
|
+
There's also work-in-progress pst support (unfortunately outlook 97
|
9
|
+
only currently), based on libpst, making this project more of a general
|
10
|
+
ruby mapi message store conversion library now (though some significant
|
11
|
+
cleaning up has to happen first).
|
8
12
|
|
9
|
-
It draws on
|
10
|
-
Neither are complete yet, however, but I think
|
13
|
+
It draws on <tt>msgconvert.pl</tt>, but tries to take a cleaner and
|
14
|
+
more complete approach. Neither is complete yet, but I think
|
15
|
+
that this project provides a clean foundation upon which to work on
|
16
|
+
a good converter for msg files for use in outlook migrations etc.
|
11
17
|
|
12
18
|
I am happy to accept patches, give commit bits etc.
|
13
19
|
|
14
20
|
Please let me know how it works for you, any feedback would be welcomed.
|
15
21
|
|
16-110
|
-
[old README lines 16-110 removed; this view shows only the fragments "=", "msg", and "..." for them]
|
111
|
-
# and as a fallback, the symbolic lookup will automatically use named properties,
|
112
|
-
# which can be seen:
|
113
|
-
props.resolve :keywords
|
114
|
-
# => #<Key {00020329-0000-0000-c000-000000000046}/"Keywords">
|
115
|
-
|
116
|
-
# which allows this to work:
|
117
|
-
props.keywords # as above
|
118
|
-
}}}
|
119
|
-
|
120
|
-
With some more work, the property storage model should be able to reach feature
|
121
|
-
completion.
|
22
|
+
= Features
|
23
|
+
|
24
|
+
Broad features of the project:
|
25
|
+
|
26
|
+
* Can be used as a general msg library, where conversion to and working
|
27
|
+
on a standard format doesn't make sense.
|
28
|
+
|
29
|
+
* Supports conversion of msg files to standard formats, like rfc2822
|
30
|
+
emails, vCards, etc.
|
31
|
+
|
32
|
+
* Well commented, and easily extended.
|
33
|
+
|
34
|
+
* Most key .msg structures are understood, and only the parsing
|
35
|
+
code should require minor tweaks. Most of the remaining work is in achieving
|
36
|
+
high-fidelity conversion to standard formats (see [TODO]).
|
37
|
+
|
38
|
+
Features of the lower-level msg handling:
|
39
|
+
|
40
|
+
* Supports both types of property storage (large ones in +substg+
|
41
|
+
files, and small ones in the +properties+ file).
|
42
|
+
|
43
|
+
* Complete support for named properties in different GUID namespaces.
|
44
|
+
|
45
|
+
* Support for mapping property codes to symbolic names, with many
|
46
|
+
included.
|
47
|
+
|
48
|
+
* RTF decompression support included, as well as HTML extraction from
|
49
|
+
RTF where appropriate (both in pure ruby, see <tt>lib/msg/rtf.rb</tt>)
|
50
|
+
|
51
|
+
* Initial RTF converter, for providing a readable body when only RTF
|
52
|
+
exists (needs work)
|
53
|
+
|
54
|
+
* Initial support for handling embedded ole files, converting nested
|
55
|
+
.msg files to message/rfc822 attachments, and serializing others
|
56
|
+
as ole file attachments (allows you to view embedded excel for example).
|
57
|
+
|
58
|
+
= Usage
|
59
|
+
|
60
|
+
At the command line, it is simple to convert individual msg files
|
61
|
+
to .eml, or to convert a batch to an mbox format file. See help for
|
62
|
+
details:
|
63
|
+
|
64
|
+
msgtool -c some_email.msg > some_email.eml
|
65
|
+
msgtool -m *.msg > mbox
|
66
|
+
|
67
|
+
There is also a fairly complete and easy to use high level library
|
68
|
+
access:
|
69
|
+
|
70
|
+
require 'msg'
|
71
|
+
|
72
|
+
msg = Msg.open filename
|
73
|
+
|
74
|
+
# access to the 3 main data stores, if you want to poke with the msg
|
75
|
+
# internals
|
76
|
+
msg.recipients
|
77
|
+
# => [#<Recipient:'\'Marley, Bob\' <bob.marley@gmail.com>'>]
|
78
|
+
msg.attachments
|
79
|
+
# => [#<Attachment filename='blah1.tif'>, #<Attachment filename='blah2.tif'>]
|
80
|
+
msg.properties
|
81
|
+
# => #<Properties ... normalized_subject='Testing' ...
|
82
|
+
# creation_time=#<DateTime: 2454042.45074714,0,2299161> ...>
|
83
|
+
|
84
|
+
To completely abstract away all msg peculiarities, convert the msg
|
85
|
+
to a mime object. The message as a whole, and some of its main parts
|
86
|
+
support conversion to mime objects.
|
87
|
+
|
88
|
+
msg.attachments.first.to_mime
|
89
|
+
# => #<Mime content_type='application/octet-stream'>
|
90
|
+
mime = msg.to_mime
|
91
|
+
puts mime.to_tree
|
92
|
+
# =>
|
93
|
+
- #<Mime content_type='multipart/mixed'>
|
94
|
+
|- #<Mime content_type='multipart/alternative'>
|
95
|
+
| |- #<Mime content_type='text/plain'>
|
96
|
+
| \- #<Mime content_type='text/html'>
|
97
|
+
|- #<Mime content_type='application/octet-stream'>
|
98
|
+
\- #<Mime content_type='application/octet-stream'>
|
99
|
+
|
100
|
+
# convert mime object to serialised form,
|
101
|
+
# inclusive of attachments etc. (not ideal in memory, but it's WIP).
|
102
|
+
puts mime.to_s
|
103
|
+
|
104
|
+
= Other
|
105
|
+
|
106
|
+
For more information, see
|
107
|
+
|
108
|
+
* TODO
|
109
|
+
|
110
|
+
* MsgDetails[http://code.google.com/p/ruby-msg/wiki/MsgDetails]
|
111
|
+
|
112
|
+
* OleDetails[http://code.google.com/p/ruby-ole/wiki/OleDetails]
|
113
|
+
|
114
|
+
* msgconv[http://www.matijs.net/software/msgconv/], the original
|
115
|
+
perl converter.
|
116
|
+
|
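The tree printed by <tt>mime.to_tree</tt> in the README's Usage section can be illustrated with a small stand-alone sketch. The +Part+ struct below is a hypothetical stand-in, not the library's Mime class, and the indentation widths are an assumption because the excerpt above does not preserve them. It only shows how a nested multipart structure could map onto the <tt>|-</tt> and <tt>\-</tt> branch markers.

```ruby
# Hypothetical stand-in for a MIME part: a content type plus optional child parts.
# This is NOT the library's Mime class, only an illustration of the tree layout.
Part = Struct.new(:content_type, :parts) do
  def label
    "#<Mime content_type='#{content_type}'>"
  end

  # Render this part and its children, one line per part.
  # The last child of each parent gets a '\-' marker, the others '|-'.
  def to_tree(prefix = '', last = true, root = true)
    branch = root ? '- ' : prefix + (last ? '\\- ' : '|- ')
    child_prefix = root ? '  ' : prefix + (last ? '   ' : '|  ')
    children = parts || []
    lines = children.each_with_index.map do |part, i|
      part.to_tree(child_prefix, i == children.size - 1, false)
    end
    "#{branch}#{label}\n" + lines.join
  end
end

message = Part.new('multipart/mixed', [
  Part.new('multipart/alternative', [
    Part.new('text/plain'),
    Part.new('text/html')
  ]),
  Part.new('application/octet-stream'),
  Part.new('application/octet-stream')
])

tree = message.to_tree
puts tree
```

Passing the prefix down the recursion keeps each node unaware of its depth; a parent only tells a child whether it is the last sibling.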
data/Rakefile
CHANGED
@@ -8,55 +8,69 @@ require 'fileutils'
|
|
8
8
|
|
9
9
|
$:.unshift 'lib'
|
10
10
|
|
11
|
-
require 'msg'
|
11
|
+
require 'mapi/msg'
|
12
12
|
|
13
13
|
PKG_NAME = 'ruby-msg'
|
14
|
-
PKG_VERSION =
|
14
|
+
PKG_VERSION = Mapi::VERSION
|
15
15
|
|
16
16
|
task :default => [:test]
|
17
17
|
|
18
18
|
Rake::TestTask.new(:test) do |t|
|
19
|
-
t.test_files = FileList["test/test_*.rb"]
|
20
|
-
t.warning =
|
19
|
+
t.test_files = FileList["test/test_*.rb"] - ['test/test_pst.rb']
|
20
|
+
t.warning = false
|
21
21
|
t.verbose = true
|
22
22
|
end
|
23
23
|
|
24
|
-
|
25
|
-
|
26
|
-
|
27
|
-
|
24
|
+
begin
|
25
|
+
require 'rcov/rcovtask'
|
26
|
+
# NOTE: this will not do anything until you add some tests
|
27
|
+
desc "Create a cross-referenced code coverage report"
|
28
|
+
Rcov::RcovTask.new do |t|
|
29
|
+
t.test_files = FileList['test/test*.rb']
|
30
|
+
t.ruby_opts << "-Ilib" # in order to use this rcov
|
31
|
+
t.rcov_opts << "--xrefs" # comment to disable cross-references
|
32
|
+
t.rcov_opts << "--exclude /usr/local/lib/site_ruby"
|
33
|
+
t.verbose = true
|
34
|
+
end
|
35
|
+
rescue LoadError
|
36
|
+
# Rcov not available
|
37
|
+
end
|
38
|
+
|
39
|
+
Rake::RDocTask.new do |t|
|
40
|
+
t.rdoc_dir = 'doc'
|
41
|
+
t.title = "#{PKG_NAME} documentation"
|
42
|
+
t.options += %w[--main README --line-numbers --inline-source --tab-width 2]
|
43
|
+
t.rdoc_files.include 'lib/**/*.rb'
|
44
|
+
t.rdoc_files.include 'README'
|
28
45
|
end
|
29
46
|
|
30
47
|
spec = Gem::Specification.new do |s|
|
31
|
-
s.name
|
32
|
-
s.version
|
33
|
-
s.summary
|
48
|
+
s.name = PKG_NAME
|
49
|
+
s.version = PKG_VERSION
|
50
|
+
s.summary = %q{Ruby Msg library.}
|
34
51
|
s.description = %q{A library for reading Outlook msg files, and for converting them to RFC2822 emails.}
|
35
|
-
s.authors
|
36
|
-
s.email
|
37
|
-
s.homepage
|
38
|
-
|
39
|
-
|
40
|
-
s.executables = ['
|
41
|
-
s.files
|
42
|
-
s.files
|
43
|
-
s.files += Dir.glob("test/test_*.rb")
|
44
|
-
s.files += Dir.glob("bin/*")
|
52
|
+
s.authors = ["Charles Lowe"]
|
53
|
+
s.email = %q{aquasync@gmail.com}
|
54
|
+
s.homepage = %q{http://code.google.com/p/ruby-msg}
|
55
|
+
s.rubyforge_project = %q{ruby-msg}
|
56
|
+
|
57
|
+
s.executables = ['mapitool']
|
58
|
+
s.files = FileList['data/*.yaml', 'Rakefile', 'README', 'FIXES']
|
59
|
+
s.files += FileList['lib/**/*.rb', 'test/test_*.rb', 'bin/*']
|
45
60
|
|
46
|
-
s.has_rdoc
|
47
|
-
s.
|
61
|
+
s.has_rdoc = true
|
62
|
+
s.extra_rdoc_files = ['README']
|
63
|
+
s.rdoc_options += ['--main', 'README',
|
48
64
|
'--title', "#{PKG_NAME} documentation",
|
49
65
|
'--tab-width', '2']
|
50
66
|
|
51
|
-
|
52
|
-
s.
|
53
|
-
|
54
|
-
s.add_dependency 'ruby-ole', '>=1.2.1'
|
67
|
+
s.add_dependency 'ruby-ole', '>=1.2.4'
|
68
|
+
s.add_dependency 'vpim', '>=0.360'
|
55
69
|
end
|
56
70
|
|
57
71
|
Rake::GemPackageTask.new(spec) do |p|
|
58
72
|
p.gem_spec = spec
|
59
|
-
p.need_tar = true
|
73
|
+
p.need_tar = false #true
|
60
74
|
p.need_zip = false
|
61
75
|
p.package_dir = 'build'
|
62
76
|
end
|
data/bin/mapitool
ADDED
@@ -0,0 +1,195 @@
|
|
1
|
+
#! /usr/bin/ruby
|
2
|
+
|
3
|
+
$:.unshift File.dirname(__FILE__) + '/../lib'
|
4
|
+
|
5
|
+
require 'optparse'
|
6
|
+
require 'rubygems'
|
7
|
+
require 'mapi/msg'
|
8
|
+
require 'mapi/pst'
|
9
|
+
require 'mapi/convert'
|
10
|
+
require 'time'
|
11
|
+
|
12
|
+
class Mapitool
|
13
|
+
attr_reader :files, :opts
|
14
|
+
def initialize files, opts
|
15
|
+
@files, @opts = files, opts
|
16
|
+
seen_pst = false
|
17
|
+
raise ArgumentError, 'Must specify 1 or more input files.' if files.empty?
|
18
|
+
files.map! do |f|
|
19
|
+
ext = File.extname(f.downcase)[1..-1]
|
20
|
+
raise ArgumentError, 'Unsupported file type - %s' % f unless ext =~ /^(msg|pst)$/
|
21
|
+
raise ArgumentError, 'Experimental pst support not enabled' if ext == 'pst' and !opts[:enable_pst]
|
22
|
+
[ext.to_sym, f]
|
23
|
+
end
|
24
|
+
if dir = opts[:output_dir]
|
25
|
+
Dir.mkdir(dir) unless File.directory?(dir)
|
26
|
+
end
|
27
|
+
end
|
28
|
+
|
29
|
+
def each_message(&block)
|
30
|
+
files.each do |format, filename|
|
31
|
+
if format == :pst
|
32
|
+
if filter_path = opts[:filter_path]
|
33
|
+
filter_path = filter_path.tr("\\", '/').gsub(/\/+/, '/').sub(/^\//, '').sub(/\/$/, '')
|
34
|
+
end
|
35
|
+
open filename do |io|
|
36
|
+
pst = Mapi::Pst.new io
|
37
|
+
pst.each do |message|
|
38
|
+
next unless message.type == :message
|
39
|
+
if filter_path
|
40
|
+
next unless message.path =~ /^#{Regexp.quote filter_path}(\/|$)/i
|
41
|
+
end
|
42
|
+
yield message
|
43
|
+
end
|
44
|
+
end
|
45
|
+
else
|
46
|
+
Mapi::Msg.open filename, &block
|
47
|
+
end
|
48
|
+
end
|
49
|
+
end
|
50
|
+
|
51
|
+
def run
|
52
|
+
each_message(&method(:process_message))
|
53
|
+
end
|
54
|
+
|
55
|
+
def make_unique filename
|
56
|
+
@map ||= {}
|
57
|
+
return @map[filename] if !opts[:individual] and @map[filename]
|
58
|
+
try = filename
|
59
|
+
i = 1
|
60
|
+
try = filename.gsub(/(\.[^.]+)$/, ".#{i += 1}\\1") while File.exist?(try)
|
61
|
+
@map[filename] = try
|
62
|
+
try
|
63
|
+
end
|
64
|
+
|
65
|
+
def process_message message
|
66
|
+
# TODO make this more informative
|
67
|
+
mime_type = message.mime_type
|
68
|
+
return unless pair = Mapi::Message::CONVERSION_MAP[mime_type]
|
69
|
+
|
70
|
+
combined_map = {
|
71
|
+
'eml' => 'Mail.mbox',
|
72
|
+
'vcf' => 'Contacts.vcf',
|
73
|
+
'txt' => 'Posts.txt'
|
74
|
+
}
|
75
|
+
|
76
|
+
# TODO handle merged mode, pst, etc etc...
|
77
|
+
case message
|
78
|
+
when Mapi::Msg
|
79
|
+
if opts[:individual]
|
80
|
+
filename = message.root.ole.io.path.gsub(/msg$/i, pair.last)
|
81
|
+
else
|
82
|
+
filename = combined_map[pair.last] or raise NotImplementedError
|
83
|
+
end
|
84
|
+
when Mapi::Pst::Item
|
85
|
+
if opts[:individual]
|
86
|
+
filename = "#{message.subject.tr ' ', '_'}.#{pair.last}".gsub(/[^A-Za-z0-9.()\[\]{}-]/, '_')
|
87
|
+
else
|
88
|
+
filename = combined_map[pair.last] or raise NotImplementedError
|
89
|
+
filename = (message.path.tr(' /', '_.').gsub(/[^A-Za-z0-9.()\[\]{}-]/, '_') + '.' + File.extname(filename)).squeeze('.')
|
90
|
+
end
|
91
|
+
dir = File.dirname(message.instance_variable_get(:@desc).pst.io.path)
|
92
|
+
filename = File.join dir, filename
|
93
|
+
else
|
94
|
+
raise
|
95
|
+
end
|
96
|
+
|
97
|
+
if dir = opts[:output_dir]
|
98
|
+
filename = File.join dir, File.basename(filename)
|
99
|
+
end
|
100
|
+
|
101
|
+
filename = make_unique filename
|
102
|
+
|
103
|
+
write_message = proc do |f|
|
104
|
+
data = message.send(pair.first).to_s
|
105
|
+
if !opts[:individual] and pair.last == 'eml'
|
106
|
+
# we do the append > style mbox quoting (mboxrd, I think it's called), as it
|
107
|
+
# is the only one that can be robustly un-quoted. evolution doesn't use this!
|
108
|
+
f.puts "From mapitool@localhost #{Time.now.rfc2822}"
|
109
|
+
#munge_headers mime, opts
|
110
|
+
data.each_line do |line|
|
111
|
+
if line =~ /^>*From /o
|
112
|
+
f.print '>' + line
|
113
|
+
else
|
114
|
+
f.print line
|
115
|
+
end
|
116
|
+
end
|
117
|
+
else
|
118
|
+
f.write data
|
119
|
+
end
|
120
|
+
end
|
121
|
+
|
122
|
+
if opts[:stdout]
|
123
|
+
write_message[STDOUT]
|
124
|
+
else
|
125
|
+
open filename, 'a', &write_message
|
126
|
+
end
|
127
|
+
end
|
128
|
+
|
129
|
+
def munge_headers mime, opts
|
130
|
+
opts[:header_defaults].each do |s|
|
131
|
+
key, val = s.match(/(.*?):\s+(.*)/)[1..-1]
|
132
|
+
mime.headers[key] = [val] if mime.headers[key].empty?
|
133
|
+
end
|
134
|
+
end
|
135
|
+
end
|
136
|
+
|
137
|
+
def mapitool
|
138
|
+
opts = {:verbose => false, :action => :convert, :header_defaults => []}
|
139
|
+
op = OptionParser.new do |op|
|
140
|
+
op.banner = "Usage: mapitool [options] [files]"
|
141
|
+
#op.separator ''
|
142
|
+
#op.on('-c', '--convert', 'Convert input files (default)') { opts[:action] = :convert }
|
143
|
+
op.separator ''
|
144
|
+
op.on('-o', '--output-dir DIR', 'Put all output files in DIR') { |d| opts[:output_dir] = d }
|
145
|
+
op.on('-i', '--[no-]individual', 'Do not combine converted files') { |i| opts[:individual] = i }
|
146
|
+
op.on('-s', '--stdout', 'Write all data to stdout') { opts[:stdout] = true }
|
147
|
+
op.on('-f', '--filter-path PATH', 'Only process pst items in PATH') { |path| opts[:filter_path] = path }
|
148
|
+
op.on( '--enable-pst', 'Turn on experimental PST support') { opts[:enable_pst] = true }
|
149
|
+
#op.on('-d', '--header-default STR', 'Provide a default value for top level mail header') { |hd| opts[:header_defaults] << hd }
|
150
|
+
# --enable-pst
|
151
|
+
op.separator ''
|
152
|
+
op.on('-v', '--[no-]verbose', 'Run verbosely') { |v| opts[:verbose] = v }
|
153
|
+
op.on_tail('-h', '--help', 'Show this message') { puts op; exit }
|
154
|
+
end
|
155
|
+
|
156
|
+
files = op.parse ARGV
|
157
|
+
|
158
|
+
# for windows. see issue #2
|
159
|
+
STDOUT.binmode
|
160
|
+
|
161
|
+
Mapi::Log.level = Ole::Log.level = opts[:verbose] ? Logger::WARN : Logger::FATAL
|
162
|
+
|
163
|
+
tool = begin
|
164
|
+
Mapitool.new(files, opts)
|
165
|
+
rescue ArgumentError
|
166
|
+
puts $!
|
167
|
+
puts op
|
168
|
+
exit 1
|
169
|
+
end
|
170
|
+
|
171
|
+
tool.run
|
172
|
+
end
|
173
|
+
|
174
|
+
mapitool
|
175
|
+
|
176
|
+
__END__
|
177
|
+
|
178
|
+
mapitool [options] [files]
|
179
|
+
|
180
|
+
files is a list of *.msg & *.pst files.
|
181
|
+
|
182
|
+
one of the options should be some sort of path filter to apply to pst items.
|
183
|
+
|
184
|
+
--filter-path=
|
185
|
+
--filter-type=eml,vcf
|
186
|
+
|
187
|
+
with that out of the way, the entire list of files can be converted into a
|
188
|
+
list of items (with meta data about the source).
|
189
|
+
|
190
|
+
--convert
|
191
|
+
--[no-]separate one output file per item or combined output
|
192
|
+
--stdout
|
193
|
+
--output-dir=.
|
194
|
+
|
195
|
+
|
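The mbox quoting done inside +process_message+ above can be tried in isolation. The following is a minimal sketch, assuming the same rule as the script (any line matching <tt>/^>*From /</tt> gains one leading <tt>></tt>). The helper names +append_mboxrd+ and +unquote_mboxrd+ are illustrative and do not exist in mapitool, and the unquoting step is an addition to show why this style is reversible.

```ruby
require 'stringio'
require 'time'

# Append one message to an mbox stream with mboxrd-style quoting.
# Helper name and default sender are illustrative, not part of mapitool.
def append_mboxrd(io, data, sender = 'mapitool@localhost', time = Time.now)
  # Separator line, following the script's use of Time#rfc2822.
  io.puts "From #{sender} #{time.rfc2822}"
  data.each_line do |line|
    # Lines that look like a separator (even if already quoted) get one more '>'.
    io.print '>' if line =~ /\A>*From /
    io.print line
  end
  io.puts unless data.end_with?("\n")
end

# Undo the quoting: strip exactly one '>' from lines matching /^>+From /.
def unquote_mboxrd(body)
  body.each_line.map { |line| line.sub(/\A>(>*From )/, '\1') }.join
end

message = "Subject: hi\n\nFrom here on, quoted.\n>From an older reply\nplain line\n"

out = StringIO.new
append_mboxrd(out, message, 'mapitool@localhost', Time.at(0).utc)
puts out.string
```

Because already-quoted lines also gain a <tt>></tt>, removing one <tt>></tt> from every <tt>>+From </tt> line restores the original body exactly, which is the robustness the script's comment refers to.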