ruby-ole 1.2.6 → 1.2.7

data/ChangeLog CHANGED
@@ -1,3 +1,15 @@
+ == 1.2.7 / 2008-08-12
+
+ - Prepare Ole::Types::PropertySet for write support.
+ - Introduce Ole::Storage#meta_data as an easy interface to meta data stored
+ within various property sets.
+ - Add new --metadata action to oletool to dump said metadata.
+ - Add new --mimetype action to oletool, and corresponding Ole::Storage#mime_type
+ function to try to guess mime type of a file based on some simple heuristics.
+ - Restructure project files a bit, and pull in file_system & meta_data support
+ by default.
+ - More tests - now have 100% coverage.
+
  == 1.2.6 / 2008-07-21
 
  - Fix FileClass#expand_path to work properly on darwin (issue #2)
data/README ADDED
@@ -0,0 +1,27 @@
+ = Introduction
+
+ For now, see the docs for the Ole::Storage class.
+
+ = TODO
+
+ == 1.2.8
+
+ * fix property sets a bit more. see TODO in Ole::Storage::MetaData
+ * fix mode strings - like truncate when using 'w+', supporting append
+ 'a+' modes etc. done?
+ * make ranges io obey readable vs writeable modes.
+ * more RangesIO completion. ie, doesn't support #<< at the moment.
+ * ability to zero out padding and unused blocks
+ * case insensitive mode for ole/file_system?
+
+ == 1.3.1
+
+ * fix this README :). maybe move todo out, and put something useful here.
+
+ == Longer term
+
+ * more benchmarking, profiling, and speed fixes. was thinking vs other
+ ruby filesystems (eg, vs File/Dir itself, and vs rubyzip), and vs other
+ ole implementations (maybe perl's, and poifs) just to check its in the
+ ballpark, with no remaining silly bottlenecks.
+ * supposedly vba does something weird to ole files. test that.
data/Rakefile CHANGED
@@ -55,7 +55,7 @@ spec = Gem::Specification.new do |s|
  s.rubyforge_project = %q{ruby-ole}
 
  s.executables = ['oletool']
- s.files = ['Rakefile', 'ChangeLog', 'data/propids.yaml']
+ s.files = ['README', 'Rakefile', 'ChangeLog', 'data/propids.yaml']
  s.files += FileList['lib/**/*.rb']
  s.files += FileList['test/test_*.rb', 'test/*.doc']
  s.files += FileList['test/oleWithDirs.ole', 'test/test_SummaryInformation']
@@ -63,6 +63,7 @@ spec = Gem::Specification.new do |s|
  s.test_files = FileList['test/test_*.rb']
 
  s.has_rdoc = true
+ s.extra_rdoc_files = ['README']
  s.rdoc_options += [
  '--main', 'README',
  '--title', "#{PKG_NAME} documentation",
@@ -11,6 +11,8 @@ def oletool
  op.separator ''
  op.on('-t', '--tree', 'Dump ole trees for files (default)') { opts[:action] = :tree }
  op.on('-r', '--repack', 'Repack the ole files in canonical form') { opts[:action] = :repack }
+ op.on('-m', '--mimetype', 'Print the guessed mime types') { opts[:action] = :mimetype }
+ op.on('-y', '--metadata', 'Dump the internal meta data as YAML') { opts[:action] = :metadata }
  op.separator ''
  op.on('-v', '--[no-]verbose', 'Run verbosely') { |v| opts[:verbose] = v }
  op.on_tail('-h', '--help', 'Show this message') { puts op; exit }
@@ -27,7 +29,11 @@ def oletool
  when :tree
  Ole::Storage.open(file) { |ole| puts ole.root.to_tree }
  when :repack
- Ole::Storage.open file, 'r+', &:repack
+ Ole::Storage.open file, 'rb+', &:repack
+ when :metadata
+ Ole::Storage.open(file) { |ole| y ole.meta_data.to_h }
+ when :mimetype
+ puts Ole::Storage.open(file) { |ole| ole.meta_data.mime_type }
  end
  end
  end
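The option handling in the hunk above follows one pattern: each flag only records an action symbol in `opts`, and a later `case` dispatches on it. A minimal, self-contained sketch of that pattern is below. The `parse_action` helper name is an assumption for illustration, the `:tree` default is taken from the "(default)" note in the help text, and the sketch does not call ruby-ole itself.

```ruby
require 'optparse'

# Hypothetical helper (not part of ruby-ole): returns the selected action
# and the remaining non-option arguments for a given argv.
def parse_action(argv)
  opts = { action: :tree } # '--tree' is marked "(default)" in the help text
  op = OptionParser.new
  op.on('-t', '--tree')     { opts[:action] = :tree }
  op.on('-r', '--repack')   { opts[:action] = :repack }
  op.on('-m', '--mimetype') { opts[:action] = :mimetype }
  op.on('-y', '--metadata') { opts[:action] = :metadata }
  files = op.parse(argv) # non-destructive parse; returns the leftover arguments
  [opts[:action], files]
end

action, files = parse_action(['-y', 'test.doc'])
puts "#{action} on #{files.inspect}"
```

Keeping the flags free of side effects beyond setting `opts[:action]` means that adding an action, as this release does twice, only needs one `op.on` line and one `when` branch.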
@@ -1,424 +1,2 @@
- #
- # = Introduction
- #
- # This file intends to provide file system-like api support, a la <tt>zip/zipfilesystem</tt>.
- #
- # Ideally, this will be the recommended interface, allowing Ole::Storage, Dir, and
- # Zip::ZipFile to be used exchangably. It should be possible to write recursive copy using
- # the plain api, such that you can copy dirs/files agnostically between any of ole docs, dirs,
- # and zip files.
- #
- # = Usage
- #
- # Currently you can do something like the following:
- #
- # Ole::Storage.open 'test.doc' do |ole|
- # ole.dir.entries '/' # => [".", "..", "\001Ole", "1Table", "\001CompObj", ...]
- # ole.file.read "\001CompObj" # => "\001\000\376\377\003\n\000\000\377\377..."
- # end
- #
- # = Notes
- #
- # <tt>Ole::Storage</tt> files can have multiple files with the same name,
- # or with / in the name, and other things that are probably invalid anyway.
- # This API is unable to access those files, but of course the core, low-
- # level API can.
- #
- # need to implement some more IO functions on RangesIO, like #puts, #print
- # etc, like AbstractOutputStream from zipfile.
- #
- # = TODO
- #
- # - check Dir.mkdir, and File.open, and File.rename, to add in filename
- # length checks (max 32 / 31 or something).
- # do the automatic truncation, and add in any necessary warnings.
- #
- # - File.split('a/') == File.split('a') == ['.', 'a']
- # the implication of this, is that things that try to force directory
- # don't work. like, File.rename('a', 'b'), should work if a is a file
- # or directory, but File.rename('a/', 'b') should only work if a is
- # a directory. tricky, need to clean things up a bit more.
- # i think a general path name => dirent method would work, with flags
- # about what should raise an error.
- #
- # - Need to look at streamlining things after getting all the tests passing,
- # as this file's getting pretty long - almost half the real implementation.
- # and is probably more inefficient than necessary.
- # too many exceptions in the expected path of certain functions.
- #
- # - should look at profiles before and after switching ruby-msg to use
- # the filesystem api.
- #
-
- require 'ole/storage'
-
- module Ole # :nodoc:
- class Storage
- def file
- @file ||= FileClass.new self
- end
-
- def dir
- @dir ||= DirClass.new self
- end
-
- # tries to get a dirent for path. return nil if it doesn't exist
- # (change it)
- def dirent_from_path path
- dirent = @root
- path = file.expand_path path
- path = path.sub(/^\/*/, '').sub(/\/*$/, '').split(/\/+/)
- until path.empty?
- return nil if dirent.file?
- return nil unless dirent = dirent/path.shift
- end
- dirent
- end
-
- class FileClass
- class Stat
- attr_reader :ftype, :size, :blocks, :blksize
- attr_reader :nlink, :uid, :gid, :dev, :rdev, :ino
- def initialize dirent
- @dirent = dirent
- @size = dirent.size
- if file?
- @ftype = 'file'
- bat = dirent.ole.bat_for_size(dirent.size)
- @blocks = bat.chain(dirent.first_block).length
- @blksize = bat.block_size
- else
- @ftype = 'directory'
- @blocks = 0
- @blksize = 0
- end
- # a lot of these are bogus. ole file format has no analogs
- @nlink = 1
- @uid, @gid = 0, 0
- @dev, @rdev = 0, 0
- @ino = 0
- # need to add times - atime, mtime, ctime.
- end
-
- alias rdev_major :rdev
- alias rdev_minor :rdev
-
- def file?
- @dirent.file?
- end
-
- def directory?
- @dirent.dir?
- end
-
- def inspect
- pairs = (instance_variables - ['@dirent']).map do |n|
- "#{n[1..-1]}=#{instance_variable_get n}"
- end
- "#<#{self.class} #{pairs * ', '}>"
- end
- end
-
- def initialize ole
- @ole = ole
- end
-
- def expand_path path
- # get the raw stored pwd value (its blank for root)
- pwd = @ole.dir.instance_variable_get :@pwd
- # its only absolute if it starts with a '/'
- path = "#{pwd}/#{path}" unless path =~ /^\//
- # at this point its already absolute. we use File.expand_path
- # just for the .. and . handling
- # No longer use RUBY_PLATFORM =~ /win/ as it matches darwin. better way?
- if File::ALT_SEPARATOR == "\\"
- File.expand_path(path)[2..-1]
- else
- File.expand_path path
- end
- end
-
- # +orig_path+ is just so that we can use the requested path
- # in the error messages even if it has been already modified
- def dirent_from_path path, orig_path=nil
- orig_path ||= path
- dirent = @ole.dirent_from_path path
- raise Errno::ENOENT, orig_path unless dirent
- raise Errno::EISDIR, orig_path if dirent.dir?
- dirent
- end
- private :dirent_from_path
-
- def exists? path
- !!@ole.dirent_from_path(path)
- end
- alias exist? :exists?
-
- def file? path
- dirent = @ole.dirent_from_path path
- dirent and dirent.file?
- end
-
- def directory? path
- dirent = @ole.dirent_from_path path
- dirent and dirent.dir?
- end
-
- def open path, mode='r', &block
- if IO::Mode.new(mode).create?
- begin
- dirent = dirent_from_path path
- rescue Errno::ENOENT
- # maybe instead of repeating this everywhere, i should have
- # a get_parent_dirent function.
- parent_path, basename = File.split expand_path(path)
- parent = @ole.dir.send :dirent_from_path, parent_path, path
- parent.children << dirent = Dirent.new(@ole, :type => :file, :name => basename)
- end
- else
- dirent = dirent_from_path path
- end
- dirent.open mode, &block
- end
-
- # explicit wrapper instead of alias to inhibit block
- def new path, mode='r'
- open path, mode
- end
-
- def size path
- dirent_from_path(path).size
- rescue Errno::EISDIR
- # kind of arbitrary. I'm getting 4096 from ::File, but
- # the zip tests want 0.
- 0
- end
-
- def stat path
- # we do this to allow dirs.
- dirent = @ole.dirent_from_path path
- raise Errno::ENOENT, path unless dirent
- Stat.new dirent
- end
-
- def read path
- open path, &:read
- end
-
- # most of the work this function does is moving the dirent between
- # 2 parents. the actual name changing is quite simple.
- # File.rename can move a file into another folder, which is why i've
- # done it too, though i think its not always possible...
- #
- # FIXME File.rename can be used for directories too....
- def rename from_path, to_path
- # check what we want to rename from exists. do it this
- # way to allow directories.
- dirent = @ole.dirent_from_path from_path
- raise Errno::ENOENT, from_path unless dirent
- # delete what we want to rename to if necessary
- begin
- unlink to_path
- rescue Errno::ENOENT
- # we actually get here, but rcov doesn't think so
- end
- # reparent the dirent
- from_parent_path, from_basename = File.split expand_path(from_path)
- to_parent_path, to_basename = File.split expand_path(to_path)
- from_parent = @ole.dir.send :dirent_from_path, from_parent_path, from_path
- to_parent = @ole.dir.send :dirent_from_path, to_parent_path, to_path
- from_parent.children.delete dirent
- # and also change its name
- dirent.name = to_basename
- to_parent.children << dirent
- 0
- end
-
- # crappy copy from Dir.
- def unlink(*paths)
- paths.each do |path|
- dirent = @ole.dirent_from_path path
- # i think we should free all of our blocks from the
- # allocation table.
- # i think if you run repack, all free blocks should get zeroed,
- # but currently the original data is there unmodified.
- open(path) { |f| f.truncate 0 }
- # remove ourself from our parent, so we won't be part of the dir
- # tree at save time.
- parent_path, basename = File.split expand_path(path)
- parent = @ole.dir.send :dirent_from_path, parent_path, path
- parent.children.delete dirent
- end
- paths.length # hmmm. as per ::File ?
- end
- alias delete :unlink
- end
-
- #
- # an *instance* of this class is supposed to provide similar methods
- # to the class methods of Dir itself.
- #
- # pretty complete. like zip/zipfilesystem's implementation, i provide
- # everything except chroot and glob. glob could be done with a glob
- # to regex regex, and then simply match in the entries array... although
- # recursive glob complicates that somewhat.
- #
- # Dir.chroot, Dir.glob, Dir.[], and Dir.tmpdir is the complete list.
- class DirClass
- def initialize ole
- @ole = ole
- @pwd = ''
- end
-
- # +orig_path+ is just so that we can use the requested path
- # in the error messages even if it has been already modified
- def dirent_from_path path, orig_path=nil
- orig_path ||= path
- dirent = @ole.dirent_from_path path
- raise Errno::ENOENT, orig_path unless dirent
- raise Errno::ENOTDIR, orig_path unless dirent.dir?
- dirent
- end
- private :dirent_from_path
-
- def open path
- dir = Dir.new path, entries(path)
- if block_given?
- yield dir
- else
- dir
- end
- end
-
- # as for file, explicit alias to inhibit block
- def new path
- open path
- end
-
- # pwd is always stored without the trailing slash. we handle
- # the root case here
- def pwd
- if @pwd.empty?
- '/'
- else
- @pwd
- end
- end
- alias getwd :pwd
-
- def chdir orig_path
- # make path absolute, squeeze slashes, and remove trailing slash
- path = @ole.file.expand_path(orig_path).gsub(/\/+/, '/').sub(/\/$/, '')
- # this is just for the side effects of the exceptions if invalid
- dirent_from_path path, orig_path
- if block_given?
- old_pwd = @pwd
- begin
- @pwd = path
- yield
- ensure
- @pwd = old_pwd
- end
- else
- @pwd = path
- 0
- end
- end
-
- def entries path
- dirent = dirent_from_path path
- # Not sure about adding on the dots...
- entries = %w[. ..] + dirent.children.map(&:name)
- # do some checks about un-reachable files
- seen = {}
- entries.each do |n|
- Log.warn "inaccessible file (filename contains slash) - #{n.inspect}" if n['/']
- Log.warn "inaccessible file (duplicate filename) - #{n.inspect}" if seen[n]
- seen[n] = true
- end
- entries
- end
-
- def foreach path, &block
- entries(path).each(&block)
- end
-
- # there are some other important ones, like:
- # chroot (!), glob etc etc. for now, i think
- def mkdir path
- # as for rmdir below:
- parent_path, basename = File.split @ole.file.expand_path(path)
- # note that we will complain about the full path despite accessing
- # the parent path. this is consistent with ::Dir
- parent = dirent_from_path parent_path, path
- # now, we first should ensure that it doesn't already exist
- # either as a file or a directory.
- raise Errno::EEXIST, path if parent/basename
- parent.children << Dirent.new(@ole, :type => :dir, :name => basename)
- 0
- end
-
- def rmdir path
- dirent = dirent_from_path path
- raise Errno::ENOTEMPTY, path unless dirent.children.empty?
-
- # now delete it, how to do that? the canonical representation that is
- # maintained is the root tree, and the children array. we must remove it
- # from the children array.
- # we need the parent then. this sucks but anyway:
- # we need to split the path. but before we can do that, we need
- # to expand it first. eg. say we need the parent to unlink
- # a/b/../c. the parent should be a, not a/b/.., or a/b.
- parent_path, basename = File.split @ole.file.expand_path(path)
- # this shouldn't be able to fail if the above didn't
- parent = dirent_from_path parent_path
- # note that the way this currently works, on save and repack time this will get
- # reflected. to work properly, ie to make a difference now it would have to re-write
- # the dirent. i think that Ole::Storage#close will handle that. and maybe include a
- # #repack.
- parent.children.delete dirent
- 0 # hmmm. as per ::Dir ?
- end
- alias delete :rmdir
- alias unlink :rmdir
-
- # note that there is nothing remotely ole specific about
- # this class. it simply provides the dir like sequential access
- # methods on top of an array.
- # hmm, doesn't throw the IOError's on use of a closed directory...
- class Dir
- include Enumerable
-
- attr_reader :path, :entries, :pos
- def initialize path, entries
- @path, @entries, @pos = path, entries, 0
- end
-
- def each(&block)
- entries.each(&block)
- end
-
- def close
- end
-
- def read
- entries[pos]
- ensure
- @pos += 1 if pos < entries.length
- end
-
- def pos= pos
- @pos = [[0, pos].max, entries.length].min
- end
-
- def rewind
- @pos = 0
- end
-
- alias tell :pos
- alias seek :pos=
- end
- end
- end
- end
-
+ # keeping this file around for now, but will delete later on...
+ require 'ole/storage/file_system'
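For reference, the path lookup in the removed `Ole::Storage#dirent_from_path` trims leading and trailing slashes, splits on runs of slashes, and walks down one name at a time. It returns nil when a component is missing or when it reaches a file before the path is used up. A rough sketch of the same walk over a plain nested Hash follows. The `lookup` helper and the Hash-based `TREE` are assumptions for illustration, not ruby-ole data structures, and the `expand_path` step that the real method performs first is skipped here.

```ruby
# Hypothetical stand-in for a dirent tree: a Hash is a directory,
# a String is a file's contents.
TREE = {
  'docs' => { 'readme' => 'hello' },
  'data' => 'raw bytes'
}

# Hypothetical helper mirroring the walk in the removed dirent_from_path.
def lookup(tree, path)
  node = tree
  # same normalisation as the removed code: trim slashes, split on runs of '/'
  parts = path.sub(/^\/*/, '').sub(/\/*$/, '').split(/\/+/)
  until parts.empty?
    return nil unless node.is_a?(Hash) # cannot descend into a file
    node = node[parts.shift]
    return nil if node.nil?            # missing path component
  end
  node
end

p lookup(TREE, '/docs//readme/')
p lookup(TREE, '/data/extra')
```

Returning nil instead of raising keeps the walk reusable: in the removed code, the `FileClass` and `DirClass` wrappers decide separately whether a miss becomes `Errno::ENOENT`, `Errno::EISDIR`, or `Errno::ENOTDIR`.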