ruby-msg 1.3.1 → 1.4.0
- data/README +108 -113
- data/Rakefile +42 -28
- data/bin/mapitool +195 -0
- data/lib/mapi.rb +109 -0
- data/lib/mapi/convert.rb +61 -0
- data/lib/mapi/convert/contact.rb +142 -0
- data/lib/mapi/convert/note-mime.rb +274 -0
- data/lib/mapi/convert/note-tmail.rb +287 -0
- data/lib/mapi/msg.rb +440 -0
- data/lib/mapi/property_set.rb +269 -0
- data/lib/mapi/pst.rb +1806 -0
- data/lib/mapi/rtf.rb +169 -0
- data/lib/mapi/types.rb +51 -0
- data/lib/rtf.rb +0 -9
- data/test/test_convert_contact.rb +60 -0
- data/test/test_convert_note.rb +66 -0
- data/test/test_mime.rb +4 -2
- data/test/test_msg.rb +29 -0
- data/test/test_property_set.rb +116 -0
- data/test/test_types.rb +17 -0
- metadata +78 -48
- data/bin/msgtool +0 -65
- data/lib/msg.rb +0 -522
- data/lib/msg/properties.rb +0 -532
- data/lib/msg/rtf.rb +0 -236
data/README
CHANGED
@@ -1,121 +1,116 @@
-
+= Introduction
 
-
+Generally, the goal of the project is the conversion of .msg files
+into proper rfc2822 emails, independent of outlook, or any platform
+dependencies etc. In fact it's currently pure ruby, so it should be
+easy to get started with.
 
-
-
-
+There's also work-in-progress pst support (unfortunately outlook 97
+only currently), based on libpst, making this project more of a general
+ruby mapi message store conversion library now (though some significant
+cleaning up has to happen first).
 
-It draws on
-Neither are complete yet, however, but I think
+It draws on <tt>msgconvert.pl</tt>, but tries to take a cleaner and
+more complete approach. Neither are complete yet, however, but I think
+that this project provides a clean foundation upon which to work on
+a good converter for msg files for use in outlook migrations etc.
 
 I am happy to accept patches, give commit bits etc.
 
 Please let me know how it works for you, any feedback would be welcomed.
 
-=
-
-
-
-
-
-
-msg
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-...
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-# and as a fallback, the symbolic lookup will automatically use named properties,
-# which can be seen:
-props.resolve :keywords
-# => #<Key {00020329-0000-0000-c000-000000000046}/"Keywords">
-
-# which allows this to work:
-props.keywords # as above
-}}}
-
-With some more work, the property storage model should be able to reach feature
-completion.
+= Features
+
+Broad features of the project:
+
+* Can be used as a general msg library, where conversion to and working
+  on a standard format doesn't make sense.
+
+* Supports conversion of msg files to standard formats, like rfc2822
+  emails, vCards, etc.
+
+* Well commented, and easily extended.
+
+* Most key .msg structures are understood, and only the parsing
+  code should require minor tweaks. Most of the remaining work is in achieving
+  high-fidelity conversion to standard formats (see [TODO]).
+
+Features of the lower-level msg handling:
+
+* Supports both types of property storage (large ones in +substg+
+  files, and small ones in the +properties+ file).
+
+* Complete support for named properties in different GUID namespaces.
+
+* Support for mapping property codes to symbolic names, with many
+  included.
+
+* RTF decompression support included, as well as HTML extraction from
+  RTF where appropriate (both in pure ruby, see <tt>lib/msg/rtf.rb</tt>)
+
+* Initial RTF converter, for providing a readable body when only RTF
+  exists (needs work)
+
+* Initial support for handling embedded ole files, converting nested
+  .msg files to message/rfc822 attachments, and serializing others
+  as ole file attachments (allows you to view embedded excel for example).
+
+= Usage
+
+At the command line, it is simple to convert individual msg files
+to .eml, or to convert a batch to an mbox format file. See help for
+details:
+
+  msgtool -c some_email.msg > some_email.eml
+  msgtool -m *.msg > mbox
+
+There is also a fairly complete and easy to use high level library
+access:
+
+  require 'msg'
+
+  msg = Msg.open filename
+
+  # access to the 3 main data stores, if you want to poke with the msg
+  # internals
+  msg.recipients
+  # => [#<Recipient:'\'Marley, Bob\' <bob.marley@gmail.com>'>]
+  msg.attachments
+  # => [#<Attachment filename='blah1.tif'>, #<Attachment filename='blah2.tif'>]
+  msg.properties
+  # => #<Properties ... normalized_subject='Testing' ...
+  #    creation_time=#<DateTime: 2454042.45074714,0,2299161> ...>
+
+To completely abstract away all msg peculiarities, convert the msg
+to a mime object. The message as a whole, and some of its main parts
+support conversion to mime objects.
+
+  msg.attachments.first.to_mime
+  # => #<Mime content_type='application/octet-stream'>
+  mime = msg.to_mime
+  puts mime.to_tree
+  # =>
+  - #<Mime content_type='multipart/mixed'>
+    |- #<Mime content_type='multipart/alternative'>
+    |  |- #<Mime content_type='text/plain'>
+    |  \- #<Mime content_type='text/html'>
+    |- #<Mime content_type='application/octet-stream'>
+    \- #<Mime content_type='application/octet-stream'>
+
+  # convert mime object to serialised form,
+  # inclusive of attachments etc. (not ideal in memory, but it's wip).
+  puts mime.to_s
+
+= Other
+
+For more information, see
+
+* TODO
+
+* MsgDetails[http://code.google.com/p/ruby-msg/wiki/MsgDetails]
+
+* OleDetails[http://code.google.com/p/ruby-ole/wiki/OleDetails]
+
+* msgconv[http://www.matijs.net/software/msgconv/], the original
+  perl converter.
+
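The README's <tt>mime.to_tree</tt> output is a nested listing in which each child is introduced by <tt>|-</tt> and the last child by <tt>\-</tt>. The sketch below shows one way such a listing could be produced. It is not the library's implementation: +Part+ and +tree_lines+ are hypothetical names invented for this example, and the exact spacing of the real output is not known from the diff.

```ruby
# Hypothetical stand-in for a MIME part: a content type plus optional child parts.
Part = Struct.new(:content_type, :parts)

# Return the lines of an ASCII tree for +part+. Each child is introduced with
# "|-", the last child with "\-", and deeper levels are indented under a "|"
# rail for as long as further siblings follow.
def tree_lines(part)
  lines = ["#<Part content_type='#{part.content_type}'>"]
  children = part.parts || []
  children.each_with_index do |child, index|
    last = index == children.size - 1
    head, *rest = tree_lines(child)
    lines << (last ? '\\- ' : '|- ') + head
    rest.each { |line| lines << (last ? '   ' : '|  ') + line }
  end
  lines
end

message = Part.new('multipart/mixed', [
  Part.new('multipart/alternative', [Part.new('text/plain'), Part.new('text/html')]),
  Part.new('application/octet-stream')
])
puts tree_lines(message)
```

Building the tree as an array of lines keeps the recursion simple: each level only has to prefix the lines returned by its children.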
data/Rakefile
CHANGED
@@ -8,55 +8,69 @@ require 'fileutils'
 
 $:.unshift 'lib'
 
-require 'msg'
+require 'mapi/msg'
 
 PKG_NAME = 'ruby-msg'
-PKG_VERSION =
+PKG_VERSION = Mapi::VERSION
 
 task :default => [:test]
 
 Rake::TestTask.new(:test) do |t|
-  t.test_files = FileList["test/test_*.rb"]
-  t.warning =
+  t.test_files = FileList["test/test_*.rb"] - ['test/test_pst.rb']
+  t.warning = false
   t.verbose = true
 end
 
-
-
-
-
+begin
+  require 'rcov/rcovtask'
+  # NOTE: this will not do anything until you add some tests
+  desc "Create a cross-referenced code coverage report"
+  Rcov::RcovTask.new do |t|
+    t.test_files = FileList['test/test*.rb']
+    t.ruby_opts << "-Ilib" # in order to use this rcov
+    t.rcov_opts << "--xrefs" # comment to disable cross-references
+    t.rcov_opts << "--exclude /usr/local/lib/site_ruby"
+    t.verbose = true
+  end
+rescue LoadError
+  # Rcov not available
+end
+
+Rake::RDocTask.new do |t|
+  t.rdoc_dir = 'doc'
+  t.title = "#{PKG_NAME} documentation"
+  t.options += %w[--main README --line-numbers --inline-source --tab-width 2]
+  t.rdoc_files.include 'lib/**/*.rb'
+  t.rdoc_files.include 'README'
 end
 
 spec = Gem::Specification.new do |s|
-  s.name
-  s.version
-  s.summary
+  s.name = PKG_NAME
+  s.version = PKG_VERSION
+  s.summary = %q{Ruby Msg library.}
   s.description = %q{A library for reading Outlook msg files, and for converting them to RFC2822 emails.}
-  s.authors
-  s.email
-  s.homepage
-
-
-  s.executables = ['
-  s.files
-  s.files
-  s.files += Dir.glob("test/test_*.rb")
-  s.files += Dir.glob("bin/*")
+  s.authors = ["Charles Lowe"]
+  s.email = %q{aquasync@gmail.com}
+  s.homepage = %q{http://code.google.com/p/ruby-msg}
+  s.rubyforge_project = %q{ruby-msg}
+
+  s.executables = ['mapitool']
+  s.files = FileList['data/*.yaml', 'Rakefile', 'README', 'FIXES']
+  s.files += FileList['lib/**/*.rb', 'test/test_*.rb', 'bin/*']
 
-  s.has_rdoc
-  s.
+  s.has_rdoc = true
+  s.extra_rdoc_files = ['README']
+  s.rdoc_options += ['--main', 'README',
     '--title', "#{PKG_NAME} documentation",
     '--tab-width', '2']
 
-
-  s.
-
-  s.add_dependency 'ruby-ole', '>=1.2.1'
+  s.add_dependency 'ruby-ole', '>=1.2.4'
+  s.add_dependency 'vpim', '>=0.360'
 end
 
 Rake::GemPackageTask.new(spec) do |p|
   p.gem_spec = spec
-  p.need_tar = true
+  p.need_tar = false #true
   p.need_zip = false
   p.package_dir = 'build'
 end
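The new Rakefile wraps the rcov task in <tt>begin ... rescue LoadError</tt> so that the build still works when rcov is not installed. A minimal sketch of the same optional-dependency pattern follows. The helper name +optional_require+ is an assumption made for this example and does not appear in the package.

```ruby
# Try to load an optional library. Returns true when it could be loaded (or
# was loaded already), false when it is not installed.
# NOTE: optional_require is a hypothetical helper, not part of ruby-msg.
def optional_require(name)
  require name
  true
rescue LoadError
  false
end

if optional_require('rcov/rcovtask')
  puts 'rcov available: the coverage task would be defined here'
else
  puts 'rcov not installed: skipping the coverage task'
end
```

Only +LoadError+ is rescued, so a library that is installed but fails for another reason while loading still raises.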
data/bin/mapitool
ADDED
@@ -0,0 +1,195 @@
+#! /usr/bin/ruby
+
+$:.unshift File.dirname(__FILE__) + '/../lib'
+
+require 'optparse'
+require 'rubygems'
+require 'mapi/msg'
+require 'mapi/pst'
+require 'mapi/convert'
+require 'time'
+
+class Mapitool
+  attr_reader :files, :opts
+  def initialize files, opts
+    @files, @opts = files, opts
+    seen_pst = false
+    raise ArgumentError, 'Must specify 1 or more input files.' if files.empty?
+    files.map! do |f|
+      ext = File.extname(f.downcase)[1..-1]
+      raise ArgumentError, 'Unsupported file type - %s' % f unless ext =~ /^(msg|pst)$/
+      raise ArgumentError, 'Experimental pst support not enabled' if ext == 'pst' and !opts[:enable_pst]
+      [ext.to_sym, f]
+    end
+    if dir = opts[:output_dir]
+      Dir.mkdir(dir) unless File.directory?(dir)
+    end
+  end
+
+  def each_message(&block)
+    files.each do |format, filename|
+      if format == :pst
+        if filter_path = opts[:filter_path]
+          filter_path = filter_path.tr("\\", '/').gsub(/\/+/, '/').sub(/^\//, '').sub(/\/$/, '')
+        end
+        open filename do |io|
+          pst = Mapi::Pst.new io
+          pst.each do |message|
+            next unless message.type == :message
+            if filter_path
+              next unless message.path =~ /^#{Regexp.quote filter_path}(\/|$)/i
+            end
+            yield message
+          end
+        end
+      else
+        Mapi::Msg.open filename, &block
+      end
+    end
+  end
+
+  def run
+    each_message(&method(:process_message))
+  end
+
+  def make_unique filename
+    @map ||= {}
+    return @map[filename] if !opts[:individual] and @map[filename]
+    try = filename
+    i = 1
+    try = filename.gsub(/(\.[^.]+)$/, ".#{i += 1}\\1") while File.exist?(try)
+    @map[filename] = try
+    try
+  end
+
+  def process_message message
+    # TODO make this more informative
+    mime_type = message.mime_type
+    return unless pair = Mapi::Message::CONVERSION_MAP[mime_type]
+
+    combined_map = {
+      'eml' => 'Mail.mbox',
+      'vcf' => 'Contacts.vcf',
+      'txt' => 'Posts.txt'
+    }
+
+    # TODO handle merged mode, pst, etc etc...
+    case message
+    when Mapi::Msg
+      if opts[:individual]
+        filename = message.root.ole.io.path.gsub(/msg$/i, pair.last)
+      else
+        filename = combined_map[pair.last] or raise NotImplementedError
+      end
+    when Mapi::Pst::Item
+      if opts[:individual]
+        filename = "#{message.subject.tr ' ', '_'}.#{pair.last}".gsub(/[^A-Za-z0-9.()\[\]{}-]/, '_')
+      else
+        filename = combined_map[pair.last] or raise NotImplementedError
+        filename = (message.path.tr(' /', '_.').gsub(/[^A-Za-z0-9.()\[\]{}-]/, '_') + '.' + File.extname(filename)).squeeze('.')
+      end
+      dir = File.dirname(message.instance_variable_get(:@desc).pst.io.path)
+      filename = File.join dir, filename
+    else
+      raise
+    end
+
+    if dir = opts[:output_dir]
+      filename = File.join dir, File.basename(filename)
+    end
+
+    filename = make_unique filename
+
+    write_message = proc do |f|
+      data = message.send(pair.first).to_s
+      if !opts[:individual] and pair.last == 'eml'
+        # we do the append > style mbox quoting (mboxrd I think it's called), as it
+        # is the only one that can be robustly un-quoted. evolution doesn't use this!
+        f.puts "From mapitool@localhost #{Time.now.rfc2822}"
+        #munge_headers mime, opts
+        data.each_line do |line|
+          if line =~ /^>*From /o
+            f.print '>' + line
+          else
+            f.print line
+          end
+        end
+      else
+        f.write data
+      end
+    end
+
+    if opts[:stdout]
+      write_message[STDOUT]
+    else
+      open filename, 'a', &write_message
+    end
+  end
+
+  def munge_headers mime, opts
+    opts[:header_defaults].each do |s|
+      key, val = s.match(/(.*?):\s+(.*)/)[1..-1]
+      mime.headers[key] = [val] if mime.headers[key].empty?
+    end
+  end
+end
+
+def mapitool
+  opts = {:verbose => false, :action => :convert, :header_defaults => []}
+  op = OptionParser.new do |op|
+    op.banner = "Usage: mapitool [options] [files]"
+    #op.separator ''
+    #op.on('-c', '--convert', 'Convert input files (default)') { opts[:action] = :convert }
+    op.separator ''
+    op.on('-o', '--output-dir DIR', 'Put all output files in DIR') { |d| opts[:output_dir] = d }
+    op.on('-i', '--[no-]individual', 'Do not combine converted files') { |i| opts[:individual] = i }
+    op.on('-s', '--stdout', 'Write all data to stdout') { opts[:stdout] = true }
+    op.on('-f', '--filter-path PATH', 'Only process pst items in PATH') { |path| opts[:filter_path] = path }
+    op.on(      '--enable-pst', 'Turn on experimental PST support') { opts[:enable_pst] = true }
+    #op.on('-d', '--header-default STR', 'Provide a default value for top level mail header') { |hd| opts[:header_defaults] << hd }
+    # --enable-pst
+    op.separator ''
+    op.on('-v', '--[no-]verbose', 'Run verbosely') { |v| opts[:verbose] = v }
+    op.on_tail('-h', '--help', 'Show this message') { puts op; exit }
+  end
+
+  files = op.parse ARGV
+
+  # for windows. see issue #2
+  STDOUT.binmode
+
+  Mapi::Log.level = Ole::Log.level = opts[:verbose] ? Logger::WARN : Logger::FATAL
+
+  tool = begin
+    Mapitool.new(files, opts)
+  rescue ArgumentError
+    puts $!
+    puts op
+    exit 1
+  end
+
+  tool.run
+end
+
+mapitool
+
+__END__
+
+mapitool [options] [files]
+
+files is a list of *.msg & *.pst files.
+
+one of the options should be some sort of path filter to apply to pst items.
+
+--filter-path=
+--filter-type=eml,vcf
+
+with that out of the way, the entire list of files can be converted into a
+list of items (with meta data about the source).
+
+--convert
+--[no-]separate  one output file per item or combined output
+--stdout
+--output-dir=.
+
+
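Two pieces of logic in mapitool are easy to try in isolation: the normalisation and matching of the <tt>--filter-path</tt> value, and the mboxrd-style quoting of body lines that begin with "From ". The sketch below restates both using only the standard library. The method names +normalize_filter_path+, +under_path?+ and +write_mboxrd+ are assumptions made for this example and do not exist in the package. The sample PST folder paths are also assumed.

```ruby
require 'stringio'
require 'time'

# Turn backslashes into slashes, collapse repeated slashes, and strip one
# leading and one trailing slash, as mapitool does for --filter-path.
def normalize_filter_path(path)
  path.tr('\\', '/').gsub(%r{/+}, '/').sub(%r{\A/}, '').sub(%r{/\z}, '')
end

# True when item_path is the filter folder itself or lies below it.
# The comparison is case-insensitive, matching the /i flag in mapitool.
def under_path?(item_path, filter_path)
  !!(item_path =~ /\A#{Regexp.quote(filter_path)}(\/|\z)/i)
end

# Append one message to an mbox stream with mboxrd quoting: every body line
# that starts with zero or more ">" followed by "From " gains one extra ">".
def write_mboxrd(io, message_text, time = Time.now)
  io.puts "From mapitool@localhost #{time.rfc2822}"
  message_text.each_line do |line|
    io.print(line.match?(/\A>*From /) ? ">#{line}" : line)
  end
end

filter = normalize_filter_path('\\Inbox\\\\Projects/')
puts filter
puts under_path?('inbox/projects/2008', filter)

out = StringIO.new
write_mboxrd(out, "Hello\nFrom here on it is quoted\n>From was already quoted\n")
puts out.string
```

Because every matching line gains exactly one ">", a reader can undo the quoting by stripping one ">" from any line that matches <tt>^>+From </tt>. That reversibility is why the comment in mapitool prefers this style.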