external 0.1.0 → 0.3.0

Files changed (49)
  1. data/History +7 -0
  2. data/MIT-LICENSE +1 -3
  3. data/README +162 -127
  4. data/lib/external.rb +2 -3
  5. data/lib/external/base.rb +174 -47
  6. data/lib/external/chunkable.rb +131 -105
  7. data/lib/external/enumerable.rb +78 -33
  8. data/lib/external/io.rb +163 -398
  9. data/lib/external/patches/ruby_1_8_io.rb +31 -0
  10. data/lib/external/patches/windows_io.rb +53 -0
  11. data/lib/external/patches/windows_utils.rb +27 -0
  12. data/lib/external/utils.rb +148 -0
  13. data/lib/external_archive.rb +840 -0
  14. data/lib/external_array.rb +57 -0
  15. data/lib/external_index.rb +1053 -0
  16. metadata +42 -58
  17. data/lib/ext_arc.rb +0 -108
  18. data/lib/ext_arr.rb +0 -727
  19. data/lib/ext_ind.rb +0 -1120
  20. data/test/benchmarks/benchmarks_20070918.txt +0 -45
  21. data/test/benchmarks/benchmarks_20070921.txt +0 -91
  22. data/test/benchmarks/benchmarks_20071006.txt +0 -147
  23. data/test/benchmarks/test_copy_file.rb +0 -80
  24. data/test/benchmarks/test_pos_speed.rb +0 -47
  25. data/test/benchmarks/test_read_time.rb +0 -55
  26. data/test/cached_ext_ind_test.rb +0 -219
  27. data/test/check/benchmark_check.rb +0 -441
  28. data/test/check/namespace_conflicts_check.rb +0 -23
  29. data/test/check/pack_check.rb +0 -90
  30. data/test/ext_arc_test.rb +0 -286
  31. data/test/ext_arr/alt_sep.txt +0 -3
  32. data/test/ext_arr/cr_lf_input.txt +0 -3
  33. data/test/ext_arr/input.index +0 -0
  34. data/test/ext_arr/input.txt +0 -1
  35. data/test/ext_arr/inputb.index +0 -0
  36. data/test/ext_arr/inputb.txt +0 -1
  37. data/test/ext_arr/lf_input.txt +0 -3
  38. data/test/ext_arr/lines.txt +0 -19
  39. data/test/ext_arr/without_index.txt +0 -1
  40. data/test/ext_arr_test.rb +0 -534
  41. data/test/ext_ind_test.rb +0 -1472
  42. data/test/external/base_test.rb +0 -74
  43. data/test/external/chunkable_test.rb +0 -182
  44. data/test/external/index/input.index +0 -0
  45. data/test/external/index/inputb.index +0 -0
  46. data/test/external/io_test.rb +0 -414
  47. data/test/external_test_helper.rb +0 -31
  48. data/test/external_test_suite.rb +0 -4
  49. data/test/test_array.rb +0 -1192
data/History CHANGED
@@ -1,3 +1,10 @@
+ == 0.3.0 / 2008-10-27
+
+ Major update with refactoring (ex ExtArr is now ExternalArray)
+ and greatly expanded testing. [] and []= methods all Externals
+ now comply with the Array specification in RubySpec[rubyspec.org].
+ Implementation of other methods is under way.
+
  == 0.1.0 / 2007-12-10 revision 23
 
  Initial release with working [] and []= methods
data/MIT-LICENSE CHANGED
@@ -1,6 +1,4 @@
- Copyright (c) 2006-2007, Regents of the University of Colorado.
- Developer:: Simon Chiang, Biomolecular Structure Program, Hansen Lab
- Support:: CU Denver School of Medicine Deans Academic Enrichment Fund
+ Copyright (c) 2006-2008, Regents of the University of Colorado.
 
  Permission is hereby granted, free of charge, to any person obtaining a copy of this
  software and associated documentation files (the "Software"), to deal in the Software
data/README CHANGED
@@ -4,165 +4,200 @@ Indexing and array-like access to data stored on disk rather than in memory.
 
  == Description
 
- External provides an easy way to index files such that array-like calls can store and
- retrieve entries directly from the file without loading it into memory. The indexes can
- be cached for performance or stored on disk alongside the data file, in essence giving you
- arbitrarily large arrays.
+ External provides a way to index and access array data directly from a file
+ without loading it into memory. Indexes may be cached in memory or stored
+ on disk with the data file, in essence giving you arbitrarily large arrays.
+ Externals automatically chunk and buffer methods like <tt>each</tt> so that
+ the memory footprint remains low even during enumeration.
 
- The main classes of external provide array-like access to the following:
- * ExtInd (External Index) -- formatted binary data
- * ExtArr (External Array) -- externally stored ruby objects
- * ExtArc (External Archive) -- externally stored string data
+ The main External classes are:
 
- ExtArc is a subclass of ExtArr specialized for string archival files, formats like FASTA
- where entries are strings delimited by '>':
+ * ExternalIndex -- for formatted binary data
+ * ExternalArchive -- for string data
+ * ExternalArray -- for objects (serialized as YAML)
 
- >Q9BXQ0|Q9BXQ0_HUMAN Tissue transglutaminase (Fragment) - Homo sapiens (Human).
- LEPFSGKALCSWSIC
- >P02452|CO1A1_HUMAN Collagen alpha-1(I) chain - Homo sapiens (Human).
- MFSFVDLRLLLLLAATALLTHGQEEGQVEGQDEDIPPITCVQNGLRYHDRDVWKPEPCRI
- CVCDNGKVLCDDVICDETKNCPGAEVPEGECCPVCPDGSESPTDQETTGVEGPKGDTGPR
- GPRGPAGPPGRDGIPGQPGLPGPPGPPGPPGPPGLGGNFAPQLSYGYDEKSTGGISVPGP
- ...
+ The array-like behavior of these classes is developed using modified versions
+ of the RubySpec[http://rubyspec.org] specification for Array. The idea is to
+ eventually duck-type all Array methods, including sort and collect, with
+ acceptable performance.
 
- The array-like behavior of these classes is developed against modified versions of the
- Array tests themselves, and often uses the exact same tests. The idea is to eventually
- duck-type all Array methods, including sort and collect, with acceptable performance.
+ * Rubyforge[http://rubyforge.org/projects/external]
+ * Lighthouse[http://bahuvrihi.lighthouseapp.com/projects/10590-external]
+ * Github[http://github.com/bahuvrihi/external/tree/master]
 
- === Bugs/Known Issues
+ ==== Bugs/Known Issues
 
  * only a limited set of array methods are currently supported
- * reindexing of ExtArr does not work for arrays containing yaml strings
- * yaml serialization/deserialization of some strings do not reproduce identical input
- and so will not be faithfully store in ExtArr. Carriage return string are notable:
- "\r", "\r\n", "string_with_\r\n_internal", as are chains of newlines: "\n", "\n\n"
- * documentation is poor at the moment
+ * currently only [] and []= are fully tested vs RubySpec
+ * documentation is patchy
 
- --
- == Performance
- ++
+ Note also that YAML dump/load of some objects doesn't work or doesn't
+ reproduce the object; such objects will not be properly stored in an
+ ExternalArray. Problematic objects include:
 
- == Info
+ Proc and Class:
 
- Copyright (c) 2006-2007, Regents of the University of Colorado.
- Developer:: {Simon Chiang}[http://bahuvrihi.wordpress.com], {Biomolecular Structure Program}[http://biomol.uchsc.edu/], {Hansen Lab}[http://hsc-proteomics.uchsc.edu/hansenlab/]
- Support:: CU Denver School of Medicine Deans Academic Enrichment Fund
- Licence:: MIT-Style
+ block = lambda {}
+ YAML.load(YAML.dump(block)) # !> TypeError: allocator undefined for Proc
+ YAML.dump(Object) # !> TypeError: can't dump anonymous class Class
 
- == Installation
+ Carriage returns ("\r"):
 
- External is available from RubyForge[http://rubyforge.org/projects/external]. Use:
+ YAML.load(YAML.dump("\r")) # => nil
+ YAML.load(YAML.dump("\r\n")) # => ""
+ YAML.load(YAML.dump("string with \r\n inside")) # => "string with \n inside"
 
- % gem install external
+ Chains of newlines ("\n"):
 
- == Usage
+ YAML.load(YAML.dump("\n")) # => ""
+ YAML.load(YAML.dump("\n\n")) # => ""
+
+ DateTime is loaded as Time:
 
- === ExtArr
+ YAML.load(YAML.dump(DateTime.now)).class # => Time
+
+ == Usage
 
- ExtArr can be initialized from data using the [] operator and used as an array.
+ === ExternalArray
 
- ea = ExtArr[1, 2.2, "cat", {:key => 'value'}]
- ea[2] # => "cat"
- ea.last # => {:key => 'value'}
- ea << [:a, :b]
- ea.to_a # => [1, 2.2, "cat", {:key => 'value'}, [:a, :b]]
+ ExternalArray can be initialized from data using the [] operator and used like
+ an array.
 
- Behind the scenes, ExtArr serializes and stores entries on a data source (io) and builds an
- ExtInd that tracks where each entry begins and ends.
+ a = ExternalArray['str', {'key' => 'value'}]
+ a[0] # => 'str'
+ a.last # => {'key' => 'value'}
+ a << [1,2]; a.to_a # => ['str', {'key' => 'value'}, [1,2]]
 
- ea.io.class # => Tempfile
- ea.io.rewind
- ea.io.read # => "--- 1\n--- 2.2\n--- cat\n--- \n:key: value\n--- \n- :a\n- :b\n"
+ ExternalArray serializes and stores entries to an io while building an io_index
+ that tracks the start and length of each entry. By default ExternalArray
+ will serialize to a Tempfile and use an Array as the io_index:
 
- ea.index.class # => ExtInd
- ea.index.to_a # => [[0, 6], [6, 8], [14, 8], [22, 17], [39, 15]]
+ a.io.class # => Tempfile
+ a.io.rewind; a.io.read # => "--- str\n--- \nkey: value\n--- \n- 1\n- 2\n"
+ a.io_index.class # => Array
+ a.io_index.to_a # => [[0, 8], [8, 16], [24, 13]]
 
- By default External supports File, Tempfile, and StringIO data sources. If no data source is
- given (as above), the external array is initialized to a Tempfile so that it will be cleaned
- up on exit.
+ To save this data more permanently, provide a path to <tt>close</tt>; the tempfile
+ is moved to the path and a binary index file will be created:
 
- ExtArr can be initialized from existing data sources. In this case, ExtArr tries to find and
- load an existing index; if the index doesn't exist, then you have to reindex the data manually.
+ a.close('example.yml')
+ File.read('example.yml') # => "--- str\n--- \nkey: value\n--- \n- 1\n- 2\n"
+
+ index = File.read('example.index')
+ index.unpack('I*') # => [0, 8, 8, 16, 24, 13]
 
- File.open('path/to/file.txt', "w+") do |file|
- file << "--- 1\n--- 2.2\n--- cat\n--- \n:key: value\n--- \n- :a\n- :b\n"
- file.flush
+ ExternalArray provides <tt>open</tt> to create ExternalArrays from an existing
+ file; the instance will use an index file if it exists and automatically
+ reindex the data if it does not. Manual calls to reindex may be necessary when
+ you initialize an ExternalArray with <tt>new</tt> instead of <tt>open</tt>:
 
- index_filepath = ExtArr.default_index_filepath(file.path)
- File.exists?(index_filepath) # => false
-
- ea = ExtArr.new(file)
- ea.to_a # => []
- ea.reindex
- ea.to_a # => [1, 2.2, "cat", {:key => 'value'}, [:a, :b]]
+ # use of an existing index file
+ ExternalArray.open('example.yml') do |b|
+ File.basename(b.io_index.io.path) # => 'example.index'
+ b.to_a # => ['str', {'key' => 'value'}, [1,2]]
  end
 
- ExtArr provides an open method for easy access to file data:
-
- ExtArr.open('path/to/file.txt') do |ea|
- # ...
+ # automatic reindexing
+ FileUtils.rm('example.index')
+ ExternalArray.open('example.yml') do |b|
+ b.to_a # => ['str', {'key' => 'value'}, [1,2]]
  end
-
- === ExtArc
-
- ExtArc is a subclass of ExtArr designed for string archival files. Rather than serialize and
- load ruby objects to and from the data file, ExtArc simply read and writes strings. In
- addition, ExtArc provides additional reindexing methods designed to make reindexing easy.
-
- arc = ExtArc[">swift", ">brown", ">fox"]
- arc[2] # => ">fox"
- arc.to_a # => [">swift", ">brown", ">fox"]
-
- arc.io.class # => Tempfile
- arc.io.rewind
- arc.io.read # => ">swift>brown>fox"
-
- File.open('path/to/file.txt', "w+") do |file|
- file << ">swift>brown>fox"
- file.flush
-
- # Reindex by a separation string
- arc = ExtArc.new(file)
- arc.to_a # => []
- arc.reindex_by_sep(:sep_string => ">", :entry_follows_sep => true)
- arc.to_a # => [">swift", ">brown", ">fox"]
-
- # Reindex by scanning an entry
- arc = ExtArc.new(file)
- arc.to_a # => []
- arc.reindex_by_scan(/>\w*/)
- arc.to_a # => [">swift", ">brown", ">fox"]
+
+ # manual reindexing
+ file = File.open('example.yml')
+ c = ExternalArray.new(file)
+
+ c.to_a # => []
+ c.reindex
+ c.to_a # => ['str', {'key' => 'value'}, [1,2]]
+
+ === ExternalArchive
+
+ ExternalArchive is exactly like ExternalArray except that it only stores
+ strings (ExternalArray is actually a subclass of ExternalArchive which
+ dumps/loads strings).
+
+ arc = ExternalArchive["swift", "brown", "fox"]
+ arc[2] # => "fox"
+ arc.to_a # => ["swift", "brown", "fox"]
+ arc.io.rewind; arc.io.read # => "swiftbrownfox"
+
+ ExternalArchive is useful as a base for classes to access archival data.
+ Here is a simple parser for FASTA[http://en.wikipedia.org/wiki/Fasta_format]
+ data:
+
+ # A sample FASTA entry
+ # >gi|5524211|gb|AAD44166.1| cytochrome b [Elephas maximus maximus]
+ # LCLYTHIGRNIYYGSYLYSETWNTGIMLLLITMATAFMGYVLPWGQMSFWGATVITNLFSAIPYIGTNLV
+ # EWIWGGFSVDKATLNRFFAFHFILPFTMVALAGVHLTFLHETGSNNPLGLTSDSDKIPFHPYYTIKDFLG
+ # LLILILLLLLLALLSPDMLGDPDNHMPADPLNTPLHIKPEWYFLFAYAILRSVPNKLGGVLALFLSIVIL
+ # GLMPFLHTSKHRSMMLRPLSQALFWTLTMDLLTLTWIGSQPVEYPYTIIGQMASILYFSIILAFLPIAGX
+ # IENY
+
+ class FastaEntry
+ attr_reader :header, :body
+
+ def initialize(str)
+ @body = str.split(/\r?\n/)
+ @header = body.shift
+ end
  end
-
- === ExtInd
-
- ExtInd provides array-like access to formatted binary data. The index of ExtArr is an
- ExtInd constructed to access data formatted as 'II'; two integers corresponding to the
- start position and length of entries in the ExtArr data source. For simple, repetitive
- formats like 'II', processing is optimized to use a general format and frame.
-
- ea = ExtArr.new
- ea.index.class # => ExtInd
- index = ea.index
-
- index.format # => 'I*'
- index.frame # => 2
- index << [1,2]
- index << [3,4]
- index.to_a # => [[1,2],[3,4]]
-
- ExtInd handles arbitrary packing formats, opening many possibilites:
-
- File.open('path/to/file', "w+") do |file|
+
+ class FastaArchive < ExternalArchive
+ def str_to_entry(str); FastaEntry.new(str); end
+ def entry_to_str(entry); ([entry.header] + entry.body).join("\n"); end
+
+ def reindex
+ reindex_by_sep('>', :entry_follows_sep => true)
+ end
+ end
+
+ require 'open-uri'
+ fasta = FastaArchive.new open('http://external.rubyforge.org/doc/tiny_fasta.txt')
+ fasta.reindex
+
+ fasta.length # => 5
+ fasta[0].body # => ["MEVNILAFIATTLFVLVPTAFLLIIYVKTVSQSD"]
+
+ The non-redundant {NCBI protein database}[ftp://ftp.ncbi.nih.gov/blast/db/FASTA/]
+ contains greater than 7 million FASTA entries in a 3.56 GB file; ExternalArchive
+ is targeted at files that size, where lazy loading of data and a small memory
+ footprint are critical.
+
+ === ExternalIndex
+
+ ExternalIndex provides array-like access to formatted binary data. The index of an
+ uncached ExternalArray is an ExternalIndex configured for binary data like 'II'; two
+ integers corresponding to the start position and length an entry.
+
+ index = ExternalIndex[1, 2, 3, 4, 5, 6, {:format => 'II'}]
+ index.format # => 'I*'
+ index.frame # => 2
+ index[1] # => [3,4]
+ index.to_a # => [[1,2], [3,4], [5,6]]
+
+ ExternalIndex handles arbitrary packing formats, opening many possibilities:
+
+ Tempfile.new('sample.txt') do |file|
  file << [1,2,3].pack("IQS")
  file << [4,5,6].pack("IQS")
  file << [7,8,9].pack("IQS")
  file.flush
 
- index = ExtInd.new(file, :format => "IQS")
- index[1] # => [4,5,6]
- index.to_a # => [[1,2,3],[4,5,6],[7,8,9]]
+ index = ExternalIndex.new(file, :format => "IQS")
+ index[1] # => [4,5,6]
+ index.to_a # => [[1,2,3], [4,5,6], [7,8,9]]
  end
 
- Note: at the moment formats must be specified longhand, ie 'III' cannot be written as 'I3',
- and the native size directives for sSiIlL are not supported.
+ == Installation
+
+ External is available from RubyForge[http://rubyforge.org/projects/external]. Use:
+
+ % gem install external
+
+ == Info
+
+ Copyright (c) 2006-2008, Regents of the University of Colorado.
+ Developer:: {Simon Chiang}[http://bahuvrihi.wordpress.com], {Biomolecular Structure Program}[http://biomol.uchsc.edu/], {Hansen Lab}[http://hsc-proteomics.uchsc.edu/hansenlab/]
+ Support:: CU Denver School of Medicine Deans Academic Enrichment Fund
+ Licence:: {MIT-Style}[link:files/MIT-LICENSE.html]
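
The new README describes the index as a flat run of (start, length) pairs packed as 'I*'. The lookup this implies can be sketched with the standard library alone, using the data and index values quoted in the README. This is only an illustration under those assumptions: `YAML_DATA`, `INDEX_BYTES`, `FRAME_BYTES`, and `fetch_entry` are hypothetical names and are not part of the external gem.

```ruby
require 'stringio'
require 'yaml'

# Serialized entries, copied from the README example.
YAML_DATA = "--- str\n--- \nkey: value\n--- \n- 1\n- 2\n"

# Flat index of (start, length) pairs packed as native unsigned ints,
# matching the README's index.unpack('I*') output.
INDEX_BYTES = [0, 8, 8, 16, 24, 13].pack('I*')

# Size in bytes of one (start, length) frame on this platform.
FRAME_BYTES = [0, 0].pack('II').bytesize

# Hypothetical helper (not the gem's API): decodes only the nth frame
# of the index, then seeks to that entry and reads exactly its bytes.
def fetch_entry(io, index_bytes, n)
  start, length = index_bytes.byteslice(n * FRAME_BYTES, FRAME_BYTES).unpack('II')
  io.seek(start)
  io.read(length)
end

io = StringIO.new(YAML_DATA)
puts fetch_entry(io, INDEX_BYTES, 0).inspect             # "--- str\n"
puts YAML.load(fetch_entry(io, INDEX_BYTES, 2)).inspect  # [1, 2]
```

Because each frame has a fixed byte size, entry n is located by arithmetic on the index rather than by scanning the data, which is what keeps the memory footprint small for large files.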
data/lib/external.rb CHANGED
@@ -1,3 +1,2 @@
- require 'ext_ind'
- require 'ext_arr'
- require 'ext_arc'
+ $:.unshift File.expand_path(File.dirname(__FILE__))
+ require 'external_array'
data/lib/external/base.rb CHANGED
@@ -1,66 +1,65 @@
- require 'external/io'
- require 'external/chunkable'
- require 'external/enumerable'
+ # For some inexplicable reason yaml MUST be required before
+ # tempfile in order for ExtArrTest::test_LSHIFT to pass.
+ # Otherwise it fails with 'TypeError: allocator undefined for Proc'
+
+ require 'yaml'
  require 'tempfile'
 
+ require 'external/enumerable'
+ require 'external/io'
+
  module External
 
- #--
- # Base provides the basic array functionality shared by ExtArr and Index,
- # essentially wrapping the IO functions required to access and utilized external
- # array data with the standard array functions. Bases can be opened with
- # in any of the IO modes; the capabilities of Base will be reduced accordingly
- # (ie read-only Bases cannot write values using []=, for instance).
- #
- # It is VERY IMPORTANT to realize that the underlying IO will be opened using the
- # given mode. The 'w' mode will overwrite all existing data; 'r+' is a safer mode
- # for full read-write functionality. Note that since Base actively scans over
- # the IO, append modes essentially behaves like write, but does not overwrite existing
- # data.
- #
- # To work properly, Base must be subclassed with methods:
- # * length
- # * io_fetch
- #++
- #
- #
+ # Base provides shared IO and Array-like methods used by ExternalArchive,
+ # ExternalArray, and ExternalIndex.
  class Base
  class << self
- def open(fd=nil, mode="r", options={})
- fd = File.open(fd, mode) unless fd == nil
- ab = self.new(fd, options)
+
+ # Initializes an instance of self with File.open(path, mode) as an io.
+ # As with File.open, the instance will be passed to the block and
+ # closed when the block returns. If no block is given, open returns
+ # the new instance.
+ #
+ # Nil may be provided as an fd, in which case a Tempfile will be
+ # used (in which case mode gets ignored as Tempfiles always open
+ # in 'r+' mode).
+ def open(path=nil, mode="rb", *argv)
+ path = File.open(path, mode) unless path == nil
+ base = new(path, *argv)
 
  if block_given?
  begin
- yield(ab)
+ yield(base)
  ensure
- ab.close
+ base.close
  end
  else
- ab
+ base
  end
  end
  end
 
  include External::Enumerable
  include External::Chunkable
-
+
+ # The underlying io for self.
  attr_reader :io
 
- # Initializes a new Base given the file descriptor, mode and options.
- # (see open_io for details on what io is opened for a given file descriptor)
- #
- # If mode contains an 's', then the Base will be initialized in strio
- # mode where the underlying IO will be a StringIO. In this case the fd
- # will be used as the string to initialize the StringIO.
- #
- # Standard options for Base include:
- # nil_value:: the value written to file for nils, and converted to nil on read
- # (default ' ')
- # max_gap:: the maximum gap size used by Offset (default 10000)
- # max_chunk_size:: the chunk size used by Offset (default 1M)
+ # The default tempfile basename for Base instances
+ # initialized without an io.
+ TEMPFILE_BASENAME = "external_base"
+
+ # Creates a new instance of self with the specified io. A
+ # nil io causes initialization with a Tempfile; a string
+ # io will be converted into a StringIO.
  def initialize(io=nil)
- self.io = (io.nil? ? Tempfile.new("array_base") : io)
+ self.io = case io
+ when nil then Tempfile.new(TEMPFILE_BASENAME)
+ when String then StringIO.new(io)
+ else io
+ end
+
+ @enumerate_to_a = true
  end
 
  # True if io is closed.
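
The hunk above gives `open` a block form that closes the instance in an `ensure` clause, and makes `initialize` pick an io by input type. A minimal standalone sketch of those two behaviors follows. `SketchBase` is a hypothetical stand-in written for illustration, not the gem's `External::Base`, and it omits the `Io` extension and the Enumerable/Chunkable mixins.

```ruby
require 'stringio'
require 'tempfile'

# Hypothetical stand-in for External::Base (illustration only).
class SketchBase
  attr_reader :io

  # With a block: yields the instance and closes it when the block
  # returns or raises. Without a block: returns the open instance.
  def self.open(path = nil, mode = "rb", *argv)
    io = path.nil? ? nil : File.open(path, mode)
    base = new(io, *argv)
    return base unless block_given?

    begin
      yield(base)
    ensure
      base.close
    end
  end

  # nil becomes a Tempfile, a String becomes a StringIO, and any
  # other object is used as the io directly.
  def initialize(io = nil)
    @io = case io
          when nil    then Tempfile.new("sketch_base")
          when String then StringIO.new(io)
          else io
          end
  end

  def closed?
    io.closed?
  end

  def close
    io.close unless io.closed?
  end
end

captured = nil
SketchBase.open { |base| captured = base }
puts captured.closed?  # true
```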
@@ -68,18 +67,146 @@ module External
  io.closed?
  end
 
- # Closes io.
- def close
+ # Closes io. If a path is specified, io will be dumped to it. If
+ # io is a File or Tempfile, the existing file is moved (not dumped)
+ # to path. Raises an error if path already exists and overwrite is
+ # not specified.
+ def close(path=nil, overwrite=false)
+ result = !io.closed?
+
+ if path
+ if File.exists?(path) && !overwrite
+ raise ArgumentError, "already exists: #{path}"
+ end
+
+ case io
+ when File, Tempfile
+ io.close unless io.closed?
+ FileUtils.move(io.path, path)
+ else
+ io.flush
+ io.rewind
+ File.open(path, "w") do |file|
+ file << io.read(io.default_blksize) while !io.eof?
+ end
+ end
+ end
+
  io.close unless io.closed?
+ result
+ end
+
+ # Flushes the io and resets the io length. Returns self
+ def flush
+ io.flush
+ io.reset_length
+ self
+ end
+
+ # Returns a duplicate of self. This can be a slow operation
+ # as it may involve copying the full contents of one large
+ # file to another.
+ def dup
+ flush
+ another.concat(self)
+ end
+
+ # Returns another instance of self. Must be
+ # implemented in a subclass.
+ def another
+ raise NotImplementedError
+ end
+
+ ###########################
+ # Array methods
+ ###########################
+
+ # Returns true if _self_ contains no elements
+ def empty?
+ length == 0
+ end
+
+ def eql?(another)
+ self == another
+ end
+
+ # Returns the first n entries (default 1)
+ def first(n=nil)
+ n.nil? ? self[0] : self[0,n]
+ end
+
+ # Alias for []
+ def slice(one, two = nil)
+ self[one, two]
+ end
+
+ # Returns self.
+ #--
+ # Warning -- errors show up when this doesn't return
+ # an Array... however to return an array with to_ary
+ # may mean converting a Base into an Array for
+ # insertions... see/modify convert_to_ary
+ def to_ary
+ self
+ end
+
+ #
+ def inspect
+ "#<#{self.class}:#{object_id} #{ellipse_inspect(self)}>"
  end
 
  protected
 
- # Sets io and extends the input io with External::Position.
- def io=(io)
- io.extend External::IO unless io.kind_of?(External::IO)
+ # Sets io and extends the input io with Io.
+ def io=(io) # :nodoc:
+ io.extend Io unless io.kind_of?(Io)
  @io = io
  end
+
+ # converts obj to an int using the <tt>to_int</tt>
+ # method, if the object responds to <tt>to_int</tt>
+ def convert_to_int(obj) # :nodoc:
+ obj.respond_to?(:to_int) ? obj.to_int : obj
+ end
 
+ # converts obj to an array using the <tt>to_ary</tt>
+ # method, if the object responds to <tt>to_ary</tt>
+ def convert_to_ary(obj) # :nodoc:
+ obj == nil ? [] : obj.respond_to?(:to_ary) ? obj.to_ary : [obj]
+ end
+
+ # a more array-compliant version of Chunkable#split_range
+ def split_range(range, total=length) # :nodoc:
+ # split the range
+ start = convert_to_int(range.begin)
+ raise TypeError, "can't convert #{range.begin.class} into Integer" unless start.kind_of?(Integer)
+ start += total if start < 0
+
+ finish = convert_to_int(range.end)
+ raise TypeError, "can't convert #{range.end.class} into Integer" unless finish.kind_of?(Integer)
+ finish += total if finish < 0
+
+ length = finish - start
+ length -= 1 if range.exclude_end?
+
+ [start, length]
+ end
+
+ # helper to inspect large arrays
+ def ellipse_inspect(array) # :nodoc:
+ if array.length > 10
+ "[#{collect_join(array[0,5])} ... #{collect_join(array[-5,5])}] (length = #{array.length})"
+ else
+ "[#{collect_join(array.to_a)}]"
+ end
+ end
+
+ # another helper to inspect large arrays
+ def collect_join(array) # :nodoc:
+ array.collect do |obj|
+ obj.inspect
+ end.join(', ')
+ end
+
  end
  end
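
The `split_range` helper added in this hunk resolves negative range endpoints against a total length and returns a start plus a span. The sketch below copies that logic into free-standing methods so it can be run on its own; the explicit `total` argument replaces the default of `length` used in the diff. As written in the diff, the second value is `finish - start` (less one for an exclusive range) rather than an element count, and how callers adjust it is not shown in this diff.

```ruby
# Standalone copy of the conversion helper from the diff.
def convert_to_int(obj)
  obj.respond_to?(:to_int) ? obj.to_int : obj
end

# Standalone copy of split_range from the diff; total must be passed
# explicitly here because there is no enclosing object with a length.
def split_range(range, total)
  start = convert_to_int(range.begin)
  raise TypeError, "can't convert #{range.begin.class} into Integer" unless start.kind_of?(Integer)
  start += total if start < 0

  finish = convert_to_int(range.end)
  raise TypeError, "can't convert #{range.end.class} into Integer" unless finish.kind_of?(Integer)
  finish += total if finish < 0

  length = finish - start
  length -= 1 if range.exclude_end?

  [start, length]
end

p split_range(1..3, 10)    # [1, 2]
p split_range(-3..-1, 10)  # [7, 2]
p split_range(0...5, 10)   # [0, 4]
```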