down 4.1.1 → 4.2.0

checksums.yaml CHANGED
@@ -1,7 +1,7 @@
  ---
  SHA1:
-   metadata.gz: 5e6bbd2b9b51b4b88834180552227abece0bed9a
-   data.tar.gz: 102d96a339e3a252af86a59d99dd992ad80a2bac
+   metadata.gz: 443491ec05f03e75593726f54314b04587637bf2
+   data.tar.gz: a04c39fd140b7266f7ea80bfee2da4f649c079c4
  SHA512:
-   metadata.gz: ae39c2f9a1e29a0d75af0ec43efcb02b9d5774a3aa3eb4c0ec1d73cd14552b7a405050e3c39b81857db97cda8355b744823bc375523f2a99a4966e5a5ac35767
-   data.tar.gz: a3f8183ed4fdd120cc35dfbb69231975c86411f23e3e17007b737edf7a8963735af807679eb8e704af3e58828f8901283094b3c23ed620e0e2f2468a66e79320
+   metadata.gz: 1a79349a849954e2e68b5bcf187e79825255c9a1c3fd5f1ea81dc12f2d97e4d20a5419443a6e3d3942e82b3af8615fcce24cfcdbd4ded091048d7f02e17d5039
+   data.tar.gz: 2fb8ed137234217f44790f35a767dead06eecff2174fba0960711ebde04d90ff4ee76f20dced2ab2bd6a48b7e5b31c335556504d6892d9603f6c89092cc7512b
@@ -0,0 +1,274 @@
+ ## 4.2.0 (2017-12-22)
+
+ * Handle `:max_redirects` in `Down::NetHttp#open` and follow up to 2 redirects by default (@janko-m)
+
+ ## 4.1.1 (2017-10-15)
+
+ * Raise all system call exceptions as `Down::ConnectionError` in `Down::NetHttp` (@janko-m)
+
+ * Raise `Errno::ETIMEDOUT` as `Down::TimeoutError` in `Down::NetHttp` (@janko-m)
+
+ * Raise `Addressable::URI::InvalidURIError` as `Down::InvalidUrl` in `Down::Http` (@janko-m)
+
+ ## 4.1.0 (2017-08-29)
+
+ * Fix `FiberError` occurring on `Down::NetHttp.open` when response is chunked and gzipped (@janko-m)
+
+ * Use a default `User-Agent` in `Down::NetHttp.open` (@janko-m)
+
+ * Fix raw read timeout error sometimes being raised instead of `Down::TimeoutError` in `Down.open` (@janko-m)
+
+ * `Down::ChunkedIO` can now be parsed by the CSV Ruby standard library (@janko-m)
+
+ * Implement `Down::ChunkedIO#gets` (@janko-m)
+
+ * Implement `Down::ChunkedIO#pos` (@janko-m)
+
+ ## 4.0.1 (2017-07-08)
+
+ * Load and assign the `NetHttp` backend immediately on `require "down"` (@janko-m)
+
+ * Remove undocumented `Down::ChunkedIO#backend=` that was added in 4.0.0 to avoid confusion (@janko-m)
+
+ ## 4.0.0 (2017-06-24)
+
+ * Don't apply `Down.download` and `Down.open` overrides when loading a backend (@janko-m)
+
+ * Remove `Down::Http.client` attribute accessor (@janko-m)
+
+ * Make `Down::NetHttp`, `Down::Http`, and `Down::Wget` classes instead of modules (@janko-m)
+
+ * Remove `Down.copy_to_tempfile` (@janko-m)
+
+ * Add Wget backend (@janko-m)
+
+ * Add `:content_length_proc` and `:progress_proc` to the HTTP.rb backend (@janko-m)
+
+ * Halve string allocations in `Down::ChunkedIO#readpartial` when buffer string is not used (@janko-m)
+
+ ## 3.2.0 (2017-06-21)
50
+
51
+ * Add `Down::ChunkedIO#readpartial` for more memory efficient reading (@janko-m)
52
+
53
+ * Fix `Down::ChunkedIO` not returning second part of the last chunk if it was previously partially read (@janko-m)
54
+
55
+ * Strip internal variables from `Down::ChunkedIO#inspect` and show only the important ones (@janko-m)
56
+
57
+ * Add `Down::ChunkedIO#closed?` (@janko-m)
58
+
59
+ * Add `Down::ChunkedIO#rewindable?` (@janko-m)
60
+
61
+ * In `Down::ChunkedIO` only create the Tempfile if it's going to be used (@janko-m)
62
+
63
+ ## 3.1.0 (2017-06-16)
64
+
65
+ * Split `Down::NotFound` into explanatory exceptions (@janko-m)
66
+
67
+ * Add `:read_timeout` and `:open_timeout` options to `Down::NetHttp.open` (@janko-m)
68
+
69
+ * Return an `Integer` in `data[:status]` on a result of `Down.open` when using the HTTP.rb strategy (@janko-m)
70
+
71
+ ## 3.0.0 (2017-05-24)
72
+
73
+ * Make `Down.open` pass encoding from content type charset to `Down::ChunkedIO` (@janko-m)
74
+
75
+ * Add `:encoding` option to `Down::ChunkedIO.new` for specifying the encoding of returned content (@janko-m)
76
+
77
+ * Add HTTP.rb backend as an alternative to Net::HTTP (@janko-m)
78
+
79
+ * Stop testing on MRI 2.1 (@janko-m)
80
+
81
+ * Forward cookies from the `Set-Cookie` response header when redirecting (@janko-m)
82
+
83
+ * Add `frozen-string-literal: true` comments for less string allocations on Ruby 2.3+ (@janko-m)
84
+
85
+ * Modify `#content_type` to return nil instead of `application/octet-stream` when `Content-Type` is blank in `Down.download` (@janko-m)
86
+
87
+ * `Down::ChunkedIO#read`, `#each_chunk`, `#eof?`, `rewind` now raise an `IOError` when `Down::ChunkedIO` has been closed (@janko-m)
88
+
89
+ * `Down::ChunkedIO` now caches only the content that has been read (@janko-m)
90
+
91
+ * Add `Down::ChunkedIO#size=` to allow assigning size after the `Down::ChunkedIO` has been instantiated (@janko-m)
92
+
93
+ * Make `:size` an optional argument in `Down::ChunkedIO` (@janko-m)
94
+
95
+ * Call enumerator's `ensure` block when `Down::ChunkedIO#close` is called (@janko-m)
96
+
97
+ * Add `:rewindable` option to `Down::ChunkedIO` and `Down.open` for disabling caching read content into a file (@janko-m)
98
+
99
+ * Drop support for MRI 2.0 (@janko-m)
100
+
101
+ * Drop support for MRI 1.9.3 (@janko-m)
102
+
103
+ * Remove deprecated `:progress` option (@janko-m)
104
+
105
+ * Remove deprecated `:timeout` option (@janko-m)
106
+
107
+ * Reraise only a subset of exceptions as `Down::NotFound` in `Down.download` (@janko-m)
108
+
109
+ * Support streaming of "Transfer-Encoding: chunked" responses in `Down.open` again (@janko-m)
110
+
111
+ * Remove deprecated `Down.stream` (@janko-m)
112
+
+ ## 2.5.1 (2017-05-13)
+
+ * Remove URL from the error messages (@janko-m)
+
+ ## 2.5.0 (2017-05-03)
+
+ * Support both Strings and `URI` objects in `Down.download` and `Down.open` (@olleolleolle)
+
+ * Work around a `CGI.unescape` bug in Ruby 2.4.
+
+ * Apply HTTP Basic authentication contained in URLs in `Down.open`.
+
+ * Raise `Down::NotFound` on 4xx and 5xx responses in `Down.open`.
+
+ * Write `:status` and `:headers` information to `Down::ChunkedIO#data` in `Down.open`.
+
+ * Add `#data` attribute to `Down::ChunkedIO` for saving custom result data.
+
+ * Don't save retrieved chunks into the file in `Down::ChunkedIO#each_chunk`.
+
+ * Add `:proxy` option to `Down.download` and `Down.open`.
+
+ ## 2.4.3 (2017-04-06)
+
+ * Show the input URL in the `Down::Error` message.
+
+ ## 2.4.2 (2017-03-28)
+
+ * Don't raise `StopIteration` in `Down::ChunkedIO` when `:chunks` is an empty
+   Enumerator.
+
+ ## 2.4.1 (2017-03-23)
+
+ * Correctly detect empty filename from `Content-Disposition` header, and
+   in this case continue extracting filename from URL.
+
+ ## 2.4.0 (2017-03-19)
+
+ * Allow `Down.open` to accept request headers as options with String keys,
+   just like `Down.download` does.
+
+ * Decode URI-decoded filenames from the `Content-Disposition` header
+
+ * Parse filenames without quotes from the `Content-Disposition` header
+
+ ## 2.3.8 (2016-11-07)
+
+ * Work around `Transfer-Encoding: chunked` responses by downloading whole
+   response body.
+
+ ## 2.3.7 (2016-11-06)
+
+ * In `Down.open` send requests using the URI *path* instead of the full URI.
+
+ ## 2.3.6 (2016-07-26)
+
+ * Read #original_filename from the "Content-Disposition" header.
+
+ * Extract `Down::ChunkedIO` into a file, so that it can be required separately.
+
+ * In `Down.stream` close the IO after reading from it.
+
+ ## 2.3.5 (2016-07-18)
+
+ * Prevent reading the whole response body when the IO returned by `Down.open`
+   is closed.
+
+ ## 2.3.4 (2016-07-14)
+
+ * Require `net/http`
+
+ ## 2.3.3 (2016-06-23)
+
+ * Improve `Down::ChunkedIO` (and thus `Down.open`):
+
+   - `#each_chunk` and `#read` now automatically call `:on_close` when all
+     chunks were downloaded
+
+   - `#eof?` had incorrect behaviour, where it would return true if
+     everything was downloaded, instead of only when it's also at the end of
+     the file
+
+   - `#close` can now be called multiple times, as the `:on_close` will always
+     be called only once
+
+   - end of download is now detected immediately when the last chunk was
+     downloaded (as opposed to after trying to read the next one)
+
+ ## 2.3.2 (2016-06-22)
202
+
203
+ * Add `Down.open` for IO-like streaming, and deprecate `Down.stream` (janko-m)
204
+
205
+ * Allow URLs with basic authentication (`http://user:password@example.com`) (janko-m)
206
+
207
+ ## ~~2.3.1 (2016-06-22)~~ (yanked)
208
+
209
+ ## ~~2.3.0 (2016-06-22)~~ (yanked)
210
+
211
+ ## 2.2.1 (2016-06-06)
212
+
213
+ * Make Down work on Windows (martinsefcik)
214
+
215
+ * Close an internal file descriptor that was left open (martinsefcik)
216
+
217
+ ## 2.2.0 (2016-05-19)
218
+
219
+ * Add ability to follow redirects, and allow maximum of 2 redirects by default (janko-m)
220
+
221
+ * Fix a potential Windows issue when extracting `#original_filename` (janko-m)
222
+
223
+ * Fix `#original_filename` being incomplete if filename contains a slash (janko-m)
224
+
225
+ ## 2.1.0 (2016-04-12)
226
+
227
+ * Make `:progress_proc` and `:content_length_proc` work with `:max_size` (janko-m)
228
+
229
+ * Deprecate `:progress` in favor of open-uri's `:progress_proc` (janko-m)
230
+
231
+ * Deprecate `:timeout` in favor of open-uri's `:open_timeout` and `:read_timeout` (janko-m)
232
+
233
+ * Add `Down.stream` for streaming remote files in chunks (janko-m)
234
+
235
+ * Replace deprecated `URI.encode` with `CGI.unescape` in downloaded file's `#original_filename` (janko-m)
236
+
237
+ ## 2.0.1 (2016-03-06)
238
+
239
+ * Add error message when file was to large, and use a simple error message for other generic download failures (janko-m)
240
+
241
+ ## 2.0.0 (2016-02-03)
242
+
243
+ * Fix an issue where valid URLs were transformed into invalid URLs (janko-m)
244
+
245
+ - All input URLs now have to be properly encoded, which should already be the
246
+ case in most situations.
247
+
248
+ * Include the error class when download fails (janko-m)
249
+
250
+ ## 1.1.0 (2016-01-26)
251
+
252
+ * Forward all additional options to open-uri (janko-m)
253
+
254
+ ## 1.0.5 (2015-12-18)
255
+
256
+ * Move the open-uri file to the new location instead of copying it (janko-m)
257
+
258
+ ## 1.0.4 (2015-11-19)
259
+
260
+ * Delete the old open-uri file after using it (janko-m)
261
+
262
+ ## 1.0.3 (2015-11-16)
263
+
264
+ * Fix `#download` and `#copy_to_tempfile` not preserving the file extension (janko-m)
265
+
266
+ * Fix `#copy_to_tempfile` not working when given a nested basename (janko-m)
267
+
268
+ ## 1.0.2 (2015-10-24)
269
+
270
+ * Fix Down not working with Ruby 1.9.3 (janko-m)
271
+
272
+ ## 1.0.1 (2015-10-01)
273
+
274
+ * Don't allow redirects when downloading files (janko-m)
data/README.md CHANGED
@@ -263,6 +263,10 @@ default maximum of 2 redirects will be followed, but you can change it via the
  Down::NetHttp.download("http://example.com/image.jpg") # 2 redirects allowed
  Down::NetHttp.download("http://example.com/image.jpg", max_redirects: 5) # 5 redirects allowed
  Down::NetHttp.download("http://example.com/image.jpg", max_redirects: 0) # 0 redirects allowed
+
+ Down::NetHttp.open("http://example.com/image.jpg") # 2 redirects allowed
+ Down::NetHttp.open("http://example.com/image.jpg", max_redirects: 5) # 5 redirects allowed
+ Down::NetHttp.open("http://example.com/image.jpg", max_redirects: 0) # 0 redirects allowed
  ```
 
  #### Proxy
@@ -343,7 +347,7 @@ Net::HTTP include:
  All additional options will be forwarded to `HTTP::Client#request`:
 
  ```rb
- Down::Http.download("http://example.org/image.jpg", timeout: { open: 3 })
+ Down::Http.download("http://example.org/image.jpg", timeout: { connect: 3 })
  Down::Http.open("http://example.org/image.jpg", follow: { max_hops: 0 })
  ```
 
@@ -351,16 +355,16 @@ If you prefer to add options using the chainable API, you can pass a block:
 
  ```rb
  Down::Http.open("http://example.org/image.jpg") do |client|
-   client.timeout(open: 3)
+   client.timeout(connect: 3)
  end
  ```
 
  You can also initialize the backend with default options:
 
  ```rb
- http = Down::Http.new(timeout: { open: 3 })
+ http = Down::Http.new(timeout: { connect: 3 })
  # or
- http = Down::Http.new(HTTP.timeout(open: 3))
+ http = Down::Http.new(HTTP.timeout(connect: 3))
 
  http.download("http://example.com/image.jpg")
  http.open("http://example.com/image.jpg")
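The README hunks above describe backends that carry default options (for example a default of 2 redirects) which individual calls can override. As a rough illustration of that layering, and not the gem's actual code, here is a self-contained sketch. `SketchBackend`, `DEFAULTS`, and `effective_options` are hypothetical names introduced only for this example.

```ruby
# Hypothetical stand-in for a backend class; not part of the down gem.
class SketchBackend
  # Assumed library-level default, mirroring the "2 redirects" in the README.
  DEFAULTS = { max_redirects: 2 }.freeze

  # Options given at construction time override the library defaults.
  def initialize(options = {})
    @options = DEFAULTS.merge(options)
  end

  # Options given per call override the instance-level options.
  def effective_options(options = {})
    @options.merge(options)
  end
end

plain  = SketchBackend.new
custom = SketchBackend.new(max_redirects: 5)

plain.effective_options.fetch(:max_redirects)                   # library default
custom.effective_options.fetch(:max_redirects)                  # instance default
custom.effective_options(max_redirects: 0).fetch(:max_redirects) # per-call override
```

Because `Hash#merge` keeps the right-hand value on key conflicts, an explicit `max_redirects: 0` survives both merges instead of being replaced by a default.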
@@ -12,7 +12,7 @@ Gem::Specification.new do |spec|
    spec.email = ["janko.marohnic@gmail.com"]
    spec.license = "MIT"
 
-   spec.files = Dir["README.md", "LICENSE.txt", "*.gemspec", "lib/**/*.rb"]
+   spec.files = Dir["README.md", "LICENSE.txt", "CHANGELOG.md", "*.gemspec", "lib/**/*.rb"]
    spec.require_path = "lib"
 
    spec.add_development_dependency "minitest", "~> 5.8"
@@ -4,6 +4,26 @@ require "tempfile"
  require "fiber"
 
  module Down
+   # Wraps an enumerator that yields chunks of content into an IO object. It
+   # implements some essential IO methods:
+   #
+   # * IO#read
+   # * IO#readpartial
+   # * IO#gets
+   # * IO#size
+   # * IO#pos
+   # * IO#eof?
+   # * IO#rewind
+   # * IO#close
+   #
+   # By default the Down::ChunkedIO caches all read content into a tempfile,
+   # allowing it to be rewindable. If rewindability won't be used, it can be
+   # disabled by setting `:rewindable` to false, which eliminates any disk I/O.
+   #
+   # Any cleanup code (i.e. ensure block) that the given enumerator carries is
+   # guaranteed to get executed, either when all content has been retrieved or
+   # when Down::ChunkedIO is closed. One can also specify an `:on_close`
+   # callback that will also get executed in those situations.
    class ChunkedIO
      attr_accessor :size, :data, :encoding
 
@@ -12,23 +32,36 @@ module Down
        @size = size
        @on_close = on_close
        @data = data
-       @encoding = find_encoding(encoding || Encoding::BINARY)
+       @encoding = find_encoding(encoding || "binary")
        @rewindable = rewindable
        @buffer = nil
-       @bytes_read = 0
+       @position = 0
 
-       retrieve_chunk
+       retrieve_chunk # fetch first chunk so that we know whether the file is empty
      end
 
+     # Yields elements of the underlying enumerator.
      def each_chunk
-       raise IOError, "closed stream" if closed?
+       fail IOError, "closed stream" if closed?
+
+       return enum_for(__method__) unless block_given?
 
-       return enum_for(__method__) if !block_given?
        yield retrieve_chunk until chunks_depleted?
      end
 
+     # Implements IO#read semantics. Without arguments it retrieves and returns
+     # all content.
+     #
+     # With `length` argument returns exactly that number of bytes if they're
+     # available.
+     #
+     # With `outbuf` argument each call will return that same string object,
+     # where the value is replaced with retrieved content.
+     #
+     # If end of file is reached, returns empty string if called without
+     # arguments, or nil if called with arguments. Raises IOError if closed.
      def read(length = nil, outbuf = nil)
-       raise IOError, "closed stream" if closed?
+       fail IOError, "closed stream" if closed?
 
        remaining_length = length
 
@@ -47,8 +80,22 @@ module Down
        data.to_s unless length && (data.nil? || data.empty?)
      end
 
+     # Implements IO#gets semantics. Without arguments it retrieves lines of
+     # content separated by newlines.
+     #
+     # With `separator` argument it does the following:
+     #
+     # * if `separator` is a nonempty string returns chunks of content
+     #   surrounded with that sequence of bytes
+     # * if `separator` is an empty string returns paragraphs of content
+     #   (content delimited by two newlines)
+     # * if `separator` is nil returns all content
+     #
+     # With `limit` argument returns maximum of that amount of bytes.
+     #
+     # Returns nil if end of file is reached. Raises IOError if closed.
      def gets(separator_or_limit = $/, limit = nil)
-       raise IOError, "closed stream" if closed?
+       fail IOError, "closed stream" if closed?
 
        if separator_or_limit.is_a?(Integer)
          separator = $/
@@ -84,8 +131,22 @@ module Down
        line
      end
 
+     # Implements IO#readpartial semantics. If there is any content readily
+     # available reads from it, otherwise fetches and reads from the next chunk.
+     # It writes to and reads from the cache when needed.
+     #
+     # Without arguments it either returns all content that's readily available,
+     # or the next chunk. This is useful when you don't care about the size of
+     # chunks and you want to minimize string allocations.
+     #
+     # With `length` argument returns maximum of that amount of bytes.
+     #
+     # With `outbuf` argument each call will return that same string object,
+     # where the value is replaced with retrieved content.
+     #
+     # Raises EOFError if end of file is reached. Raises IOError if closed.
      def readpartial(length = nil, outbuf = nil)
-       raise IOError, "closed stream" if closed?
+       fail IOError, "closed stream" if closed?
 
        data = outbuf.replace("").force_encoding(@encoding) if outbuf
 
@@ -95,7 +156,7 @@ module Down
        end
 
        if @buffer.nil? && (data.nil? || data.empty?)
-         raise EOFError, "end of file reached" if chunks_depleted?
+         fail EOFError, "end of file reached" if chunks_depleted?
          @buffer = retrieve_chunk
        end
 
@@ -123,50 +184,63 @@ module Down
          end
        end
 
-       @bytes_read += data.bytesize
+       @position += data.bytesize
 
        data
      end
 
+     # Implements IO#pos semantics. Returns the current position of the
+     # Down::ChunkedIO.
      def pos
-       @bytes_read
+       @position
      end
 
+     # Implements IO#eof? semantics. Returns whether we've reached end of file.
+     # It returns true if cache is at the end and there is no more content to
+     # retrieve. Raises IOError if closed.
      def eof?
-       raise IOError, "closed stream" if closed?
+       fail IOError, "closed stream" if closed?
 
        return false if cache && !cache.eof?
        @buffer.nil? && chunks_depleted?
      end
 
+     # Implements IO#rewind semantics. Rewinds the Down::ChunkedIO by rewinding
+     # the cache and setting the position to the beginning of the file. Raises
+     # IOError if closed or not rewindable.
      def rewind
-       raise IOError, "closed stream" if closed?
-       raise IOError, "this Down::ChunkedIO is not rewindable" if cache.nil?
+       fail IOError, "closed stream" if closed?
+       fail IOError, "this Down::ChunkedIO is not rewindable" if cache.nil?
 
        cache.rewind
-       @bytes_read = 0
+       @position = 0
      end
 
+     # Implements IO#close semantics. Closes the Down::ChunkedIO by terminating
+     # chunk retrieval and deleting the cached content.
      def close
        return if @closed
 
        chunks_fiber.resume(:terminate) if chunks_fiber.alive?
-       @buffer = nil
        cache.close! if cache
+       @buffer = nil
        @closed = true
      end
 
+     # Returns whether the Down::ChunkedIO has been closed.
      def closed?
        !!@closed
      end
 
+     # Returns whether the Down::ChunkedIO was specified as rewindable.
      def rewindable?
        @rewindable
      end
 
+     # Returns useful information about the Down::ChunkedIO object.
      def inspect
        string = String.new
-       string << "#<Down::ChunkedIO"
+       string << "#<#{self.class.name}"
        string << " chunks=#{@chunks.inspect}"
        string << " size=#{size.inspect}"
        string << " encoding=#{encoding.inspect}"
@@ -179,20 +253,29 @@ module Down
 
      private
 
+     # If Down::ChunkedIO is specified as rewindable, returns a new Tempfile for
+     # writing read content to. This allows the Down::ChunkedIO to be rewound.
      def cache
        @cache ||= Tempfile.new("down-chunked_io", binmode: true) if @rewindable
      end
 
+     # Returns current chunk and retrieves the next chunk. If next chunk is nil,
+     # we know we've reached EOF.
      def retrieve_chunk
        chunk = @next_chunk
        @next_chunk = chunks_fiber.resume
        chunk.force_encoding(@encoding) if chunk
      end
 
+     # Returns whether there is no content left to retrieve.
      def chunks_depleted?
        !chunks_fiber.alive?
      end
 
+     # Creates a Fiber wrapper around the underlying enumerator. The advantage
+     # of using a Fiber here is that we can terminate the chunk retrieval, in a
+     # way that executes any cleanup code that the enumerator carries. At the
+     # end of iteration the :on_close callback is executed if one was specified.
      def chunks_fiber
        @chunks_fiber ||= Fiber.new do
          begin
@@ -206,6 +289,8 @@ module Down
        end
      end
 
+     # Finds encoding by name. If the encoding couldn't be found, falls back to
+     # the generic binary encoding.
      def find_encoding(encoding)
        Encoding.find(encoding)
      rescue ArgumentError
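The `chunks_fiber` comment in the hunks above says that wrapping the enumerator in a Fiber allows chunk retrieval to be terminated while still running the enumerator's cleanup code. The following is a minimal, self-contained sketch of that idea under stated assumptions; it is not the gem's implementation. The names `log`, `chunks`, and `fiber`, and the exact handling of the `:terminate` signal, are illustrative choices for this example.

```ruby
require "fiber" # only needed for Fiber#alive? on older Rubies

log = []

# Hypothetical chunk source that carries its own cleanup code.
chunks = Enumerator.new do |yielder|
  begin
    yielder << "first"
    yielder << "second"
    yielder << "third"
  ensure
    log << :enumerator_cleanup
  end
end

fiber = Fiber.new do
  begin
    chunks.each do |chunk|
      action = Fiber.yield chunk    # hand one chunk to the caller, then pause
      break if action == :terminate # caller asked to stop early
    end
  ensure
    log << :on_close                # stands in for an :on_close callback
  end
  nil
end

first_chunk = fiber.resume # runs the enumerator up to its first chunk
fiber.resume(:terminate)   # resumes inside the block and breaks out of #each
```

Breaking out of `#each` unwinds through the enumerator's own block, so its `ensure` runs before the Fiber's `ensure`, even though only one of the three chunks was consumed.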
@@ -9,17 +9,14 @@ require "cgi"
  require "base64"
 
  if Gem::Version.new(HTTP::VERSION) < Gem::Version.new("2.1.0")
-   fail "Down requires HTTP.rb version 2.1.0 or higher"
+   fail "Down::Http requires HTTP.rb version 2.1.0 or higher"
  end
 
  module Down
    class Http < Backend
-     def initialize(client_or_options = nil)
-       options = client_or_options
-       options = client_or_options.default_options if client_or_options.is_a?(HTTP::Client)
-
-       @client = HTTP.headers("User-Agent" => "Down/#{Down::VERSION}").follow(max_hops: 2)
-       @client = HTTP::Client.new(@client.default_options.merge(options)) if options
+     def initialize(client_or_options = {})
+       options = client_or_options.is_a?(HTTP::Client) ? client_or_options.default_options : client_or_options
+       @options = { headers: { "User-Agent" => "Down/#{Down::VERSION}" }, follow: { max_hops: 2 } }.merge(options)
      end
 
      def download(url, max_size: nil, progress_proc: nil, content_length_proc: nil, **options, &block)
@@ -59,47 +56,55 @@ module Down
      end
 
      def open(url, rewindable: true, **options, &block)
-       begin
-         response = get(url, **options, &block)
-       rescue => exception
-         request_error!(exception)
-       end
+       response = get(url, **options, &block)
 
        response_error!(response) unless response.status.success?
 
-       body_chunks = Enumerator.new do |yielder|
-         begin
-           response.body.each { |chunk| yielder << chunk }
-         rescue => exception
-           request_error!(exception)
-         end
-       end
-
        Down::ChunkedIO.new(
-         chunks: body_chunks,
+         chunks: enum_for(:stream_body, response),
          size: response.content_length,
          encoding: response.content_type.charset,
          rewindable: rewindable,
-         on_close: (-> { response.connection.close } unless @client.persistent?),
+         on_close: (-> { response.connection.close } unless default_client.persistent?),
          data: { status: response.code, headers: response.headers.to_h, response: response },
        )
      end
 
      private
 
+     def default_client
+       @default_client ||= HTTP::Client.new(@options)
+     end
+
      def get(url, **options, &block)
+       url = process_url(url, options)
+
+       client = default_client
+       client = block.call(client) if block
+
+       client.get(url, options)
+     rescue => exception
+       request_error!(exception)
+     end
+
+     def stream_body(response, &block)
+       response.body.each(&block)
+     rescue => exception
+       request_error!(exception)
+     end
+
+     def process_url(url, options)
        uri = HTTP::URI.parse(url)
 
        if uri.user || uri.password
          user, pass = uri.user, uri.password
          authorization = "Basic #{Base64.strict_encode64("#{user}:#{pass}")}"
-         (options[:headers] ||= {}).merge!("Authorization" => authorization)
+         options[:headers] ||= {}
+         options[:headers].merge!("Authorization" => authorization)
          uri.user = uri.password = nil
        end
 
-       client = @client
-       client = block.call(client) if block
-       client.get(url, options)
+       uri.to_s
      end
 
      def response_error!(response)
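The new `process_url` method in the hunk above moves credentials embedded in the URL into a Basic `Authorization` header and returns the URL without them. A rough stdlib-only sketch of the same idea follows. It uses Ruby's `URI` rather than `HTTP::URI`, and `Array#pack("m0")` as an equivalent of `Base64.strict_encode64`; `split_credentials` is a hypothetical helper name, not something the gem defines.

```ruby
require "uri"

# Hypothetical helper: returns the URL stripped of credentials plus the
# headers needed to send those credentials via HTTP Basic authentication.
def split_credentials(url)
  uri = URI.parse(url)
  headers = {}

  if uri.user || uri.password
    # "m0" is strict Base64 (no line breaks), like Base64.strict_encode64.
    token = ["#{uri.user}:#{uri.password}"].pack("m0")
    headers["Authorization"] = "Basic #{token}"
    uri.user = uri.password = nil # drop credentials from the URL itself
  end

  [uri.to_s, headers]
end

clean_url, auth_headers = split_credentials("http://user:secret@example.com/image.jpg")
```

Returning the cleaned URL as a string, as the hunk does with `uri.to_s`, keeps the credentials out of the request line while the header still carries them.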
@@ -12,14 +12,14 @@ require "cgi"
  module Down
    class NetHttp < Backend
      def initialize(options = {})
-       @options = { "User-Agent" => "Down/#{Down::VERSION}" }.merge(options)
+       @options = { "User-Agent" => "Down/#{Down::VERSION}", max_redirects: 2 }.merge(options)
      end
 
-     def download(uri, options = {})
+     def download(url, options = {})
        options = @options.merge(options)
 
        max_size = options.delete(:max_size)
-       max_redirects = options.delete(:max_redirects) || 2
+       max_redirects = options.delete(:max_redirects)
        progress_proc = options.delete(:progress_proc)
        content_length_proc = options.delete(:content_length_proc)
 
@@ -56,14 +56,7 @@ module Down
 
        open_uri_options.merge!(options)
 
-       tries = max_redirects + 1
-
-       begin
-         uri = URI(uri)
-         raise Down::InvalidUrl, "URL scheme needs to be http or https" unless uri.is_a?(URI::HTTP)
-       rescue URI::InvalidURIError => exception
-         raise Down::InvalidUrl, exception.message
-       end
+       uri = ensure_uri(url)
 
        if uri.user || uri.password
          open_uri_options[:http_basic_authentication] ||= [uri.user, uri.password]
@@ -71,56 +64,123 @@ module Down
          uri.password = nil
        end
 
-       begin
-         downloaded_file = uri.open(open_uri_options)
-       rescue OpenURI::HTTPRedirect => exception
-         if (tries -= 1) > 0
-           uri = exception.uri
+       open_uri_file = open_uri(uri, open_uri_options, follows_remaining: max_redirects)
 
-           if !exception.io.meta["set-cookie"].to_s.empty?
-             open_uri_options["Cookie"] = exception.io.meta["set-cookie"]
-           end
+       tempfile = ensure_tempfile(open_uri_file)
+       tempfile.extend Down::NetHttp::DownloadedFile
 
-           retry
-         else
-           raise Down::TooManyRedirects, "too many redirects"
-         end
-       rescue OpenURI::HTTPError => exception
-         code, message = exception.io.status
-         response_class = Net::HTTPResponse::CODE_TO_OBJ.fetch(code)
-         response = response_class.new(nil, code, message)
-         exception.io.metas.each do |name, values|
-           values.each { |value| response.add_field(name, value) }
+       tempfile
+     end
+
+     def open(url, options = {})
+       options = @options.merge(options)
+
+       uri = ensure_uri(url)
+
+       request = Fiber.new do
+         net_http_request(uri, options) do |response|
+           Fiber.yield response
          end
+       end
 
-         response_error!(response)
-       rescue => exception
-         request_error!(exception)
+       response = request.resume
+
+       response_error!(response) unless response.is_a?(Net::HTTPSuccess)
+
+       Down::ChunkedIO.new(
+         chunks: enum_for(:stream_body, response),
+         size: response["Content-Length"] && response["Content-Length"].to_i,
+         encoding: response.type_params["charset"],
+         rewindable: options.fetch(:rewindable, true),
+         on_close: -> { request.resume }, # close HTTP connection
+         data: {
+           status: response.code.to_i,
+           headers: response.each_header.inject({}) { |headers, (downcased_name, value)|
+             name = downcased_name.split("-").map(&:capitalize).join("-")
+             headers.merge!(name => value)
+           },
+           response: response,
+         },
+       )
+     end
+
+     private
+
+     def open_uri(uri, options, follows_remaining: 0)
+       downloaded_file = uri.open(options)
+     rescue OpenURI::HTTPRedirect => exception
+       raise Down::TooManyRedirects, "too many redirects" if follows_remaining == 0
+
+       uri = exception.uri
+
+       if !exception.io.meta["set-cookie"].to_s.empty?
+         options["Cookie"] = exception.io.meta["set-cookie"]
        end
 
-       # open-uri will return a StringIO instead of a Tempfile if the filesize is
-       # less than 10 KB, so if it happens we convert it back to Tempfile. We want
-       # to do this with a Tempfile as well, because open-uri doesn't preserve the
-       # file extension, so we want to run it against #copy_to_tempfile which
-       # does.
-       open_uri_file = downloaded_file
-       downloaded_file = copy_to_tempfile(uri.path, open_uri_file)
-       OpenURI::Meta.init downloaded_file, open_uri_file
-
-       downloaded_file.extend Down::NetHttp::DownloadedFile
-       downloaded_file
+       follows_remaining -= 1
+       retry
+     rescue OpenURI::HTTPError => exception
+       code, message = exception.io.status
+       response_class = Net::HTTPResponse::CODE_TO_OBJ.fetch(code)
+       response = response_class.new(nil, code, message)
+       exception.io.metas.each do |name, values|
+         values.each { |value| response.add_field(name, value) }
+       end
+
+       response_error!(response)
+     rescue => exception
+       request_error!(exception)
      end
 
-     def open(uri, options = {})
-       options = @options.merge(options)
+     # Converts the open-uri result file into a Tempfile if it isn't already,
+     # and makes sure the Tempfile has the correct file extension.
+     def ensure_tempfile(open_uri_file)
+       extension = File.extname(open_uri_file.base_uri.path)
+       tempfile = Tempfile.new(["down-net_http", extension], binmode: true)
+
+       if open_uri_file.is_a?(Tempfile)
+         # Windows requires file descriptors to be closed before files are moved
+         open_uri_file.close
+         tempfile.close
+         FileUtils.mv open_uri_file.path, tempfile.path
+       else # open-uri returns a StringIO when there is less than 10KB of content
+         IO.copy_stream(open_uri_file, tempfile)
+         open_uri_file.close
+       end
+
+       tempfile.open
+       OpenURI::Meta.init tempfile, open_uri_file # adds open-uri methods
+
+       tempfile
+     end
+
+     def net_http_request(uri, options, follows_remaining: options.fetch(:max_redirects, 2), &block)
+       http, request = create_net_http(uri, options)
 
        begin
-         uri = URI(uri)
-         raise Down::InvalidUrl, "URL scheme needs to be http or https" unless uri.is_a?(URI::HTTP)
-       rescue URI::InvalidURIError => exception
-         raise Down::InvalidUrl, exception.message
+         response = http.start do
+           http.request(request) do |response|
+             unless response.is_a?(Net::HTTPRedirection)
+               yield response
+               response.instance_variable_set("@read", true) # mark response as read
+             end
+           end
+         end
+       rescue => exception
+         request_error!(exception)
+       end
+
+       if response.is_a?(Net::HTTPRedirection)
+         raise Down::TooManyRedirects if follows_remaining == 0
+
+         location = URI.parse(response["Location"])
+         location = uri + location if location.relative?
+
+         net_http_request(location, options, follows_remaining: follows_remaining - 1, &block)
        end
+     end
 
+     def create_net_http(uri, options)
        http_class = Net::HTTP
 
        if options[:proxy]
@@ -154,62 +214,21 @@ module Down
        get = Net::HTTP::Get.new(uri.request_uri, request_headers)
        get.basic_auth(uri.user, uri.password) if uri.user || uri.password
 
-       request = Fiber.new do
-         http.start do
-           http.request(get) do |response|
-             Fiber.yield response
-             response.instance_variable_set("@read", true)
-           end
-         end
-       end
-
-       begin
-         response = request.resume
-       rescue => exception
-         request_error!(exception)
-       end
-
-       response_error!(response) unless (200..299).cover?(response.code.to_i)
-
-       body_chunks = Enumerator.new do |yielder|
-         begin
-           response.read_body { |chunk| yielder << chunk }
-         rescue => exception
-           request_error!(exception)
-         end
-       end
-
-       Down::ChunkedIO.new(
-         chunks: body_chunks,
-         size: response["Content-Length"] && response["Content-Length"].to_i,
-         encoding: response.type_params["charset"],
-         rewindable: options.fetch(:rewindable, true),
-         on_close: -> { request.resume }, # close HTTP connection
-         data: {
-           status: response.code.to_i,
-           headers: response.each_header.inject({}) { |headers, (downcased_name, value)|
-             name = downcased_name.split("-").map(&:capitalize).join("-")
-             headers.merge!(name => value)
-           },
-           response: response,
-         },
-       )
+       [http, get]
      end
 
-     private
+     def stream_body(response, &block)
+       response.read_body(&block)
+     rescue => exception
+       request_error!(exception)
+     end
 
-     def copy_to_tempfile(basename, io)
-       tempfile = Tempfile.new(["down-net_http", File.extname(basename)], binmode: true)
-       if io.is_a?(OpenURI::Meta) && io.is_a?(Tempfile)
-         io.close
-         tempfile.close
-         FileUtils.mv io.path, tempfile.path
-       else
-         IO.copy_stream(io, tempfile)
-         io.rewind
-       end
-       tempfile.open
-       tempfile
+     def ensure_uri(url)
+       uri = URI(url)
+       raise Down::InvalidUrl, "URL scheme needs to be http or https" unless uri.is_a?(URI::HTTP)
+       uri
+     rescue URI::InvalidURIError => exception
+       raise Down::InvalidUrl, exception.message
      end
 
      def response_error!(response)
@@ -227,7 +246,7 @@ module Down
 
      def request_error!(exception)
        case exception
-       when Errno::ETIMEDOUT, Net::OpenTimeout
+       when Net::OpenTimeout
          raise Down::TimeoutError, "timed out waiting for connection to open"
        when Net::ReadTimeout
          raise Down::TimeoutError, "timed out while reading data"
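The `net_http_request` method added in the hunks above follows redirects recursively: it resolves a relative `Location` against the current URI and decrements `follows_remaining` until it reaches zero. The sketch below imitates that control flow without any network access, so it is only an approximation of the technique and not the gem's code. `REDIRECTS` is a hypothetical in-memory stand-in for server responses, and `resolve` and the local `TooManyRedirects` class are names invented for this example.

```ruby
require "uri"

# Hypothetical stand-in for Down::TooManyRedirects.
class TooManyRedirects < StandardError; end

# Hypothetical "server": maps a URL to the Location header it would send.
REDIRECTS = {
  "http://example.com/a" => "/b",                       # relative Location
  "http://example.com/b" => "http://cdn.example.com/c", # absolute Location
}.freeze

# Returns the final URL after following redirects within the given budget.
def resolve(url, follows_remaining: 2)
  uri = URI(url)
  location = REDIRECTS[uri.to_s]
  return uri.to_s if location.nil? # not a redirect, so we are done

  raise TooManyRedirects, "too many redirects" if follows_remaining == 0

  target = URI.parse(location)
  target = uri + target if target.relative? # resolve against the current URI
  resolve(target.to_s, follows_remaining: follows_remaining - 1)
end

final_url = resolve("http://example.com/a")
```

With the default budget of 2 both hops are followed; calling `resolve("http://example.com/a", follows_remaining: 1)` raises `TooManyRedirects` on the second hop, and a budget of 0 rejects the first redirect outright.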
@@ -1,5 +1,5 @@
  # frozen-string-literal: true
 
  module Down
-   VERSION = "4.1.1"
+   VERSION = "4.2.0"
  end
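Version strings such as the `VERSION` constant bumped above are compared elsewhere in this diff with `Gem::Version` (the HTTP.rb version check) rather than as plain strings. A short sketch of why that matters, using only RubyGems classes that ship with Ruby; the variable names are illustrative.

```ruby
require "rubygems" # normally already loaded; kept so the sketch is self-contained

old_version = Gem::Version.new("4.1.1")
new_version = Gem::Version.new("4.2.0")

# Segment-wise comparison: 4.2.0 is newer than 4.1.1.
upgrade = new_version > old_version

# Plain strings compare character by character, so "10.0.0" sorts before "2.1.0".
string_says_older = "10.0.0" < "2.1.0"

# Gem::Version compares numeric segments, so 10.0.0 is not older than 2.1.0.
version_says_older = Gem::Version.new("10.0.0") < Gem::Version.new("2.1.0")
```

This is why a minimum-version guard built on string comparison would wrongly reject a two-digit major version, while one built on `Gem::Version` would not.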
metadata CHANGED
@@ -1,14 +1,14 @@
  --- !ruby/object:Gem::Specification
  name: down
  version: !ruby/object:Gem::Version
-   version: 4.1.1
+   version: 4.2.0
  platform: ruby
  authors:
  - Janko Marohnić
  autorequire:
  bindir: bin
  cert_chain: []
- date: 2017-10-14 00:00:00.000000000 Z
+ date: 2017-12-22 00:00:00.000000000 Z
  dependencies:
  - !ruby/object:Gem::Dependency
    name: minitest
@@ -101,6 +101,7 @@ executables: []
  extensions: []
  extra_rdoc_files: []
  files:
+ - CHANGELOG.md
  - LICENSE.txt
  - README.md
  - down.gemspec