down 4.1.1 → 4.2.0

checksums.yaml CHANGED
@@ -1,7 +1,7 @@
  ---
  SHA1:
-   metadata.gz: 5e6bbd2b9b51b4b88834180552227abece0bed9a
-   data.tar.gz: 102d96a339e3a252af86a59d99dd992ad80a2bac
+   metadata.gz: 443491ec05f03e75593726f54314b04587637bf2
+   data.tar.gz: a04c39fd140b7266f7ea80bfee2da4f649c079c4
  SHA512:
-   metadata.gz: ae39c2f9a1e29a0d75af0ec43efcb02b9d5774a3aa3eb4c0ec1d73cd14552b7a405050e3c39b81857db97cda8355b744823bc375523f2a99a4966e5a5ac35767
-   data.tar.gz: a3f8183ed4fdd120cc35dfbb69231975c86411f23e3e17007b737edf7a8963735af807679eb8e704af3e58828f8901283094b3c23ed620e0e2f2468a66e79320
+   metadata.gz: 1a79349a849954e2e68b5bcf187e79825255c9a1c3fd5f1ea81dc12f2d97e4d20a5419443a6e3d3942e82b3af8615fcce24cfcdbd4ded091048d7f02e17d5039
+   data.tar.gz: 2fb8ed137234217f44790f35a767dead06eecff2174fba0960711ebde04d90ff4ee76f20dced2ab2bd6a48b7e5b31c335556504d6892d9603f6c89092cc7512b
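The digests above are the values recorded in `checksums.yaml` for each archive. As a minimal sketch of how such a value could be checked locally with Ruby's standard `digest` library, assuming a hypothetical helper name `checksum_matches?` and a placeholder file path (neither is part of the gem or of RubyGems):

```ruby
require "digest"

# Hypothetical helper (not part of the gem): returns true when the file at
# `path` hashes to `expected_hex` under the given digest class.
def checksum_matches?(path, expected_hex, algorithm: Digest::SHA512)
  actual_hex = algorithm.file(path).hexdigest # streams the file from disk
  actual_hex == expected_hex.downcase
end

# Usage (the path and the expected value are placeholders):
#   checksum_matches?("data.tar.gz", "<SHA512 value from checksums.yaml>")
#   checksum_matches?("data.tar.gz", "<SHA1 value>", algorithm: Digest::SHA1)
```

Passing the digest class as a keyword keeps the same helper usable for both the SHA1 and SHA512 entries.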
data/CHANGELOG.md ADDED
@@ -0,0 +1,274 @@
+ ## 4.2.0 (2017-12-22)
+
+ * Handle `:max_redirects` in `Down::NetHttp#open` and follow up to 2 redirects by default (@janko-m)
+
+ ## 4.1.1 (2017-10-15)
+
+ * Raise all system call exceptions as `Down::ConnectionError` in `Down::NetHttp` (@janko-m)
+
+ * Raise `Errno::ETIMEDOUT` as `Down::TimeoutError` in `Down::NetHttp` (@janko-m)
+
+ * Raise `Addressable::URI::InvalidURIError` as `Down::InvalidUrl` in `Down::Http` (@janko-m)
+
+ ## 4.1.0 (2017-08-29)
+
+ * Fix `FiberError` occurring on `Down::NetHttp.open` when response is chunked and gzipped (@janko-m)
+
+ * Use a default `User-Agent` in `Down::NetHttp.open` (@janko-m)
+
+ * Fix raw read timeout error sometimes being raised instead of `Down::TimeoutError` in `Down.open` (@janko-m)
+
+ * `Down::ChunkedIO` can now be parsed by the CSV Ruby standard library (@janko-m)
+
+ * Implement `Down::ChunkedIO#gets` (@janko-m)
+
+ * Implement `Down::ChunkedIO#pos` (@janko-m)
+
+ ## 4.0.1 (2017-07-08)
+
+ * Load and assign the `NetHttp` backend immediately on `require "down"` (@janko-m)
+
+ * Remove undocumented `Down::ChunkedIO#backend=` that was added in 4.0.0 to avoid confusion (@janko-m)
+
+ ## 4.0.0 (2017-06-24)
+
+ * Don't apply `Down.download` and `Down.open` overrides when loading a backend (@janko-m)
+
+ * Remove `Down::Http.client` attribute accessor (@janko-m)
+
+ * Make `Down::NetHttp`, `Down::Http`, and `Down::Wget` classes instead of modules (@janko-m)
+
+ * Remove `Down.copy_to_tempfile` (@janko-m)
+
+ * Add Wget backend (@janko-m)
+
+ * Add `:content_length_proc` and `:progress_proc` to the HTTP.rb backend (@janko-m)
+
+ * Halve string allocations in `Down::ChunkedIO#readpartial` when buffer string is not used (@janko-m)
+
+ ## 3.2.0 (2017-06-21)
+
+ * Add `Down::ChunkedIO#readpartial` for more memory efficient reading (@janko-m)
+
+ * Fix `Down::ChunkedIO` not returning second part of the last chunk if it was previously partially read (@janko-m)
+
+ * Strip internal variables from `Down::ChunkedIO#inspect` and show only the important ones (@janko-m)
+
+ * Add `Down::ChunkedIO#closed?` (@janko-m)
+
+ * Add `Down::ChunkedIO#rewindable?` (@janko-m)
+
+ * In `Down::ChunkedIO` only create the Tempfile if it's going to be used (@janko-m)
+
+ ## 3.1.0 (2017-06-16)
+
+ * Split `Down::NotFound` into explanatory exceptions (@janko-m)
+
+ * Add `:read_timeout` and `:open_timeout` options to `Down::NetHttp.open` (@janko-m)
+
+ * Return an `Integer` in `data[:status]` on a result of `Down.open` when using the HTTP.rb strategy (@janko-m)
+
+ ## 3.0.0 (2017-05-24)
+
+ * Make `Down.open` pass encoding from content type charset to `Down::ChunkedIO` (@janko-m)
+
+ * Add `:encoding` option to `Down::ChunkedIO.new` for specifying the encoding of returned content (@janko-m)
+
+ * Add HTTP.rb backend as an alternative to Net::HTTP (@janko-m)
+
+ * Stop testing on MRI 2.1 (@janko-m)
+
+ * Forward cookies from the `Set-Cookie` response header when redirecting (@janko-m)
+
+ * Add `frozen-string-literal: true` comments for fewer string allocations on Ruby 2.3+ (@janko-m)
+
+ * Modify `#content_type` to return nil instead of `application/octet-stream` when `Content-Type` is blank in `Down.download` (@janko-m)
+
+ * `Down::ChunkedIO#read`, `#each_chunk`, `#eof?`, `#rewind` now raise an `IOError` when `Down::ChunkedIO` has been closed (@janko-m)
+
+ * `Down::ChunkedIO` now caches only the content that has been read (@janko-m)
+
+ * Add `Down::ChunkedIO#size=` to allow assigning size after the `Down::ChunkedIO` has been instantiated (@janko-m)
+
+ * Make `:size` an optional argument in `Down::ChunkedIO` (@janko-m)
+
+ * Call enumerator's `ensure` block when `Down::ChunkedIO#close` is called (@janko-m)
+
+ * Add `:rewindable` option to `Down::ChunkedIO` and `Down.open` for disabling caching read content into a file (@janko-m)
+
+ * Drop support for MRI 2.0 (@janko-m)
+
+ * Drop support for MRI 1.9.3 (@janko-m)
+
+ * Remove deprecated `:progress` option (@janko-m)
+
+ * Remove deprecated `:timeout` option (@janko-m)
+
+ * Reraise only a subset of exceptions as `Down::NotFound` in `Down.download` (@janko-m)
+
+ * Support streaming of "Transfer-Encoding: chunked" responses in `Down.open` again (@janko-m)
+
+ * Remove deprecated `Down.stream` (@janko-m)
+
+ ## 2.5.1 (2017-05-13)
+
+ * Remove URL from the error messages (@janko-m)
+
+ ## 2.5.0 (2017-05-03)
+
+ * Support both Strings and `URI` objects in `Down.download` and `Down.open` (@olleolleolle)
+
+ * Work around a `CGI.unescape` bug in Ruby 2.4.
+
+ * Apply HTTP Basic authentication contained in URLs in `Down.open`.
+
+ * Raise `Down::NotFound` on 4xx and 5xx responses in `Down.open`.
+
+ * Write `:status` and `:headers` information to `Down::ChunkedIO#data` in `Down.open`.
+
+ * Add `#data` attribute to `Down::ChunkedIO` for saving custom result data.
+
+ * Don't save retrieved chunks into the file in `Down::ChunkedIO#each_chunk`.
+
+ * Add `:proxy` option to `Down.download` and `Down.open`.
+
+ ## 2.4.3 (2017-04-06)
+
+ * Show the input URL in the `Down::Error` message.
+
+ ## 2.4.2 (2017-03-28)
+
+ * Don't raise `StopIteration` in `Down::ChunkedIO` when `:chunks` is an empty
+   Enumerator.
+
+ ## 2.4.1 (2017-03-23)
+
+ * Correctly detect empty filename from `Content-Disposition` header, and
+   in this case continue extracting filename from URL.
+
+ ## 2.4.0 (2017-03-19)
+
+ * Allow `Down.open` to accept request headers as options with String keys,
+   just like `Down.download` does.
+
+ * Decode URI-encoded filenames from the `Content-Disposition` header
+
+ * Parse filenames without quotes from the `Content-Disposition` header
+
+ ## 2.3.8 (2016-11-07)
+
+ * Work around `Transfer-Encoding: chunked` responses by downloading whole
+   response body.
+
+ ## 2.3.7 (2016-11-06)
+
+ * In `Down.open` send requests using the URI *path* instead of the full URI.
+
+ ## 2.3.6 (2016-07-26)
+
+ * Read `#original_filename` from the "Content-Disposition" header.
+
+ * Extract `Down::ChunkedIO` into a file, so that it can be required separately.
+
+ * In `Down.stream` close the IO after reading from it.
+
+ ## 2.3.5 (2016-07-18)
+
+ * Prevent reading the whole response body when the IO returned by `Down.open`
+   is closed.
+
+ ## 2.3.4 (2016-07-14)
+
+ * Require `net/http`
+
+ ## 2.3.3 (2016-06-23)
+
+ * Improve `Down::ChunkedIO` (and thus `Down.open`):
+
+   - `#each_chunk` and `#read` now automatically call `:on_close` when all
+     chunks were downloaded
+
+   - `#eof?` had incorrect behaviour, where it would return true if
+     everything was downloaded, instead of only when it's also at the end of
+     the file
+
+   - `#close` can now be called multiple times, as the `:on_close` will always
+     be called only once
+
+   - end of download is now detected immediately when the last chunk was
+     downloaded (as opposed to after trying to read the next one)
+
+ ## 2.3.2 (2016-06-22)
+
+ * Add `Down.open` for IO-like streaming, and deprecate `Down.stream` (janko-m)
+
+ * Allow URLs with basic authentication (`http://user:password@example.com`) (janko-m)
+
+ ## ~~2.3.1 (2016-06-22)~~ (yanked)
+
+ ## ~~2.3.0 (2016-06-22)~~ (yanked)
+
+ ## 2.2.1 (2016-06-06)
+
+ * Make Down work on Windows (martinsefcik)
+
+ * Close an internal file descriptor that was left open (martinsefcik)
+
+ ## 2.2.0 (2016-05-19)
+
+ * Add ability to follow redirects, and allow maximum of 2 redirects by default (janko-m)
+
+ * Fix a potential Windows issue when extracting `#original_filename` (janko-m)
+
+ * Fix `#original_filename` being incomplete if filename contains a slash (janko-m)
+
+ ## 2.1.0 (2016-04-12)
+
+ * Make `:progress_proc` and `:content_length_proc` work with `:max_size` (janko-m)
+
+ * Deprecate `:progress` in favor of open-uri's `:progress_proc` (janko-m)
+
+ * Deprecate `:timeout` in favor of open-uri's `:open_timeout` and `:read_timeout` (janko-m)
+
+ * Add `Down.stream` for streaming remote files in chunks (janko-m)
+
+ * Replace deprecated `URI.encode` with `CGI.unescape` in downloaded file's `#original_filename` (janko-m)
+
+ ## 2.0.1 (2016-03-06)
+
+ * Add error message when file was too large, and use a simple error message for other generic download failures (janko-m)
+
+ ## 2.0.0 (2016-02-03)
+
+ * Fix an issue where valid URLs were transformed into invalid URLs (janko-m)
+
+   - All input URLs now have to be properly encoded, which should already be the
+     case in most situations.
+
+ * Include the error class when download fails (janko-m)
+
+ ## 1.1.0 (2016-01-26)
+
+ * Forward all additional options to open-uri (janko-m)
+
+ ## 1.0.5 (2015-12-18)
+
+ * Move the open-uri file to the new location instead of copying it (janko-m)
+
+ ## 1.0.4 (2015-11-19)
+
+ * Delete the old open-uri file after using it (janko-m)
+
+ ## 1.0.3 (2015-11-16)
+
+ * Fix `#download` and `#copy_to_tempfile` not preserving the file extension (janko-m)
+
+ * Fix `#copy_to_tempfile` not working when given a nested basename (janko-m)
+
+ ## 1.0.2 (2015-10-24)
+
+ * Fix Down not working with Ruby 1.9.3 (janko-m)
+
+ ## 1.0.1 (2015-10-01)
+
+ * Don't allow redirects when downloading files (janko-m)
data/README.md CHANGED
@@ -263,6 +263,10 @@ default maximum of 2 redirects will be followed, but you can change it via the
  Down::NetHttp.download("http://example.com/image.jpg") # 2 redirects allowed
  Down::NetHttp.download("http://example.com/image.jpg", max_redirects: 5) # 5 redirects allowed
  Down::NetHttp.download("http://example.com/image.jpg", max_redirects: 0) # 0 redirects allowed
+
+ Down::NetHttp.open("http://example.com/image.jpg") # 2 redirects allowed
+ Down::NetHttp.open("http://example.com/image.jpg", max_redirects: 5) # 5 redirects allowed
+ Down::NetHttp.open("http://example.com/image.jpg", max_redirects: 0) # 0 redirects allowed
  ```

  #### Proxy
@@ -343,7 +347,7 @@ Net::HTTP include:
  All additional options will be forwarded to `HTTP::Client#request`:

  ```rb
- Down::Http.download("http://example.org/image.jpg", timeout: { open: 3 })
+ Down::Http.download("http://example.org/image.jpg", timeout: { connect: 3 })
  Down::Http.open("http://example.org/image.jpg", follow: { max_hops: 0 })
  ```

@@ -351,16 +355,16 @@ If you prefer to add options using the chainable API, you can pass a block:

  ```rb
  Down::Http.open("http://example.org/image.jpg") do |client|
-   client.timeout(open: 3)
+   client.timeout(connect: 3)
  end
  ```

  You can also initialize the backend with default options:

  ```rb
- http = Down::Http.new(timeout: { open: 3 })
+ http = Down::Http.new(timeout: { connect: 3 })
  # or
- http = Down::Http.new(HTTP.timeout(open: 3))
+ http = Down::Http.new(HTTP.timeout(connect: 3))

  http.download("http://example.com/image.jpg")
  http.open("http://example.com/image.jpg")
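The README changes above describe three layers of configuration: a built-in default of 2 redirects, defaults passed when a backend is initialized, and options passed per call. The following is a minimal sketch of that precedence using plain hashes. `SketchBackend` and `effective_options` are hypothetical names chosen for illustration and do not exist in the gem.

```ruby
# Hypothetical stand-in for a backend object; it only models option merging
# and performs no network requests.
class SketchBackend
  BUILT_IN_DEFAULTS = { max_redirects: 2 }.freeze

  def initialize(options = {})
    # Instance-level defaults override the built-in ones.
    @options = BUILT_IN_DEFAULTS.merge(options)
  end

  # Returns the options a single call would end up using.
  # Per-call options override instance-level defaults.
  def effective_options(options = {})
    @options.merge(options)
  end
end

backend = SketchBackend.new
backend.effective_options                   # built-in default applies
backend.effective_options(max_redirects: 0) # per-call value wins

custom = SketchBackend.new(max_redirects: 5)
custom.effective_options                    # instance-level default applies
```

Because `Hash#merge` is shallow, a nested option such as a headers hash would be replaced as a whole rather than combined key by key.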
data/down.gemspec CHANGED
@@ -12,7 +12,7 @@ Gem::Specification.new do |spec|
    spec.email = ["janko.marohnic@gmail.com"]
    spec.license = "MIT"

-   spec.files = Dir["README.md", "LICENSE.txt", "*.gemspec", "lib/**/*.rb"]
+   spec.files = Dir["README.md", "LICENSE.txt", "CHANGELOG.md", "*.gemspec", "lib/**/*.rb"]
    spec.require_path = "lib"

    spec.add_development_dependency "minitest", "~> 5.8"
data/lib/down/chunked_io.rb CHANGED
@@ -4,6 +4,26 @@ require "tempfile"
4
4
  require "fiber"
5
5
 
6
6
  module Down
7
+ # Wraps an enumerator that yields chunks of content into an IO object. It
8
+ # implements some essential IO methods:
9
+ #
10
+ # * IO#read
11
+ # * IO#readpartial
12
+ # * IO#gets
13
+ # * IO#size
14
+ # * IO#pos
15
+ # * IO#eof?
16
+ # * IO#rewind
17
+ # * IO#close
18
+ #
19
+ # By default the Down::ChunkedIO caches all read content into a tempfile,
20
+ # allowing it to be rewindable. If rewindability won't be used, it can be
21
+ # disabled by setting `:rewindable` to false, which eliminates any disk I/O.
22
+ #
23
+ # Any cleanup code (i.e. ensure block) that the given enumerator carries is
24
+ # guaranteed to get executed, either when all content has been retrieved or
25
+ # when Down::ChunkedIO is closed. One can also specify an `:on_close`
26
+ # callback that will also get executed in those situations.
7
27
  class ChunkedIO
8
28
  attr_accessor :size, :data, :encoding
9
29
 
@@ -12,23 +32,36 @@ module Down
12
32
  @size = size
13
33
  @on_close = on_close
14
34
  @data = data
15
- @encoding = find_encoding(encoding || Encoding::BINARY)
35
+ @encoding = find_encoding(encoding || "binary")
16
36
  @rewindable = rewindable
17
37
  @buffer = nil
18
- @bytes_read = 0
38
+ @position = 0
19
39
 
20
- retrieve_chunk
40
+ retrieve_chunk # fetch first chunk so that we know whether the file is empty
21
41
  end
22
42
 
43
+ # Yields elements of the underlying enumerator.
23
44
  def each_chunk
24
- raise IOError, "closed stream" if closed?
45
+ fail IOError, "closed stream" if closed?
46
+
47
+ return enum_for(__method__) unless block_given?
25
48
 
26
- return enum_for(__method__) if !block_given?
27
49
  yield retrieve_chunk until chunks_depleted?
28
50
  end
29
51
 
52
+ # Implements IO#read semantics. Without arguments it retrieves and returns
53
+ # all content.
54
+ #
55
+ # With `length` argument returns exactly that number of bytes if they're
56
+ # available.
57
+ #
58
+ # With `outbuf` argument each call will return that same string object,
59
+ # where the value is replaced with retrieved content.
60
+ #
61
+ # If end of file is reached, returns empty string if called without
62
+ # arguments, or nil if called with arguments. Raises IOError if closed.
30
63
  def read(length = nil, outbuf = nil)
31
- raise IOError, "closed stream" if closed?
64
+ fail IOError, "closed stream" if closed?
32
65
 
33
66
  remaining_length = length
34
67
 
@@ -47,8 +80,22 @@ module Down
47
80
  data.to_s unless length && (data.nil? || data.empty?)
48
81
  end
49
82
 
83
+ # Implements IO#gets semantics. Without arguments it retrieves lines of
84
+ # content separated by newlines.
85
+ #
86
+ # With `separator` argument it does the following:
87
+ #
88
+ # * if `separator` is a nonempty string returns chunks of content
89
+ # surrounded with that sequence of bytes
90
+ # * if `separator` is an empty string returns paragraphs of content
91
+ # (content delimited by two newlines)
92
+ # * if `separator` is nil returns all content
93
+ #
94
+ # With `limit` argument returns maximum of that amount of bytes.
95
+ #
96
+ # Returns nil if end of file is reached. Raises IOError if closed.
50
97
  def gets(separator_or_limit = $/, limit = nil)
51
- raise IOError, "closed stream" if closed?
98
+ fail IOError, "closed stream" if closed?
52
99
 
53
100
  if separator_or_limit.is_a?(Integer)
54
101
  separator = $/
@@ -84,8 +131,22 @@ module Down
84
131
  line
85
132
  end
86
133
 
134
+ # Implements IO#readpartial semantics. If there is any content readily
135
+ # available reads from it, otherwise fetches and reads from the next chunk.
136
+ # It writes to and reads from the cache when needed.
137
+ #
138
+ # Without arguments it either returns all content that's readily available,
139
+ # or the next chunk. This is useful when you don't care about the size of
140
+ # chunks and you want to minimize string allocations.
141
+ #
142
+ # With `length` argument returns maximum of that amount of bytes.
143
+ #
144
+ # With `outbuf` argument each call will return that same string object,
145
+ # where the value is replaced with retrieved content.
146
+ #
147
+ # Raises EOFError if end of file is reached. Raises IOError if closed.
87
148
  def readpartial(length = nil, outbuf = nil)
88
- raise IOError, "closed stream" if closed?
149
+ fail IOError, "closed stream" if closed?
89
150
 
90
151
  data = outbuf.replace("").force_encoding(@encoding) if outbuf
91
152
 
@@ -95,7 +156,7 @@ module Down
95
156
  end
96
157
 
97
158
  if @buffer.nil? && (data.nil? || data.empty?)
98
- raise EOFError, "end of file reached" if chunks_depleted?
159
+ fail EOFError, "end of file reached" if chunks_depleted?
99
160
  @buffer = retrieve_chunk
100
161
  end
101
162
 
@@ -123,50 +184,63 @@ module Down
123
184
  end
124
185
  end
125
186
 
126
- @bytes_read += data.bytesize
187
+ @position += data.bytesize
127
188
 
128
189
  data
129
190
  end
130
191
 
192
+ # Implements IO#pos semantics. Returns the current position of the
193
+ # Down::ChunkedIO.
131
194
  def pos
132
- @bytes_read
195
+ @position
133
196
  end
134
197
 
198
+ # Implements IO#eof? semantics. Returns whether we've reached end of file.
199
+ # It returns true if cache is at the end and there is no more content to
200
+ # retrieve. Raises IOError if closed.
135
201
  def eof?
136
- raise IOError, "closed stream" if closed?
202
+ fail IOError, "closed stream" if closed?
137
203
 
138
204
  return false if cache && !cache.eof?
139
205
  @buffer.nil? && chunks_depleted?
140
206
  end
141
207
 
208
+ # Implements IO#rewind semantics. Rewinds the Down::ChunkedIO by rewinding
209
+ # the cache and setting the position to the beginning of the file. Raises
210
+ # IOError if closed or not rewindable.
142
211
  def rewind
143
- raise IOError, "closed stream" if closed?
144
- raise IOError, "this Down::ChunkedIO is not rewindable" if cache.nil?
212
+ fail IOError, "closed stream" if closed?
213
+ fail IOError, "this Down::ChunkedIO is not rewindable" if cache.nil?
145
214
 
146
215
  cache.rewind
147
- @bytes_read = 0
216
+ @position = 0
148
217
  end
149
218
 
219
+ # Implements IO#close semantics. Closes the Down::ChunkedIO by terminating
220
+ # chunk retrieval and deleting the cached content.
150
221
  def close
151
222
  return if @closed
152
223
 
153
224
  chunks_fiber.resume(:terminate) if chunks_fiber.alive?
154
- @buffer = nil
155
225
  cache.close! if cache
226
+ @buffer = nil
156
227
  @closed = true
157
228
  end
158
229
 
230
+ # Returns whether the Down::ChunkedIO has been closed.
159
231
  def closed?
160
232
  !!@closed
161
233
  end
162
234
 
235
+ # Returns whether the Down::ChunkedIO was specified as rewindable.
163
236
  def rewindable?
164
237
  @rewindable
165
238
  end
166
239
 
240
+ # Returns useful information about the Down::ChunkedIO object.
167
241
  def inspect
168
242
  string = String.new
169
- string << "#<Down::ChunkedIO"
243
+ string << "#<#{self.class.name}"
170
244
  string << " chunks=#{@chunks.inspect}"
171
245
  string << " size=#{size.inspect}"
172
246
  string << " encoding=#{encoding.inspect}"
@@ -179,20 +253,29 @@ module Down
179
253
 
180
254
  private
181
255
 
256
+ # If Down::ChunkedIO is specified as rewindable, returns a new Tempfile for
257
+ # writing read content to. This allows the Down::ChunkedIO to be rewound.
182
258
  def cache
183
259
  @cache ||= Tempfile.new("down-chunked_io", binmode: true) if @rewindable
184
260
  end
185
261
 
262
+ # Returns current chunk and retrieves the next chunk. If next chunk is nil,
263
+ # we know we've reached EOF.
186
264
  def retrieve_chunk
187
265
  chunk = @next_chunk
188
266
  @next_chunk = chunks_fiber.resume
189
267
  chunk.force_encoding(@encoding) if chunk
190
268
  end
191
269
 
270
+ # Returns whether there is any content left to retrieve.
192
271
  def chunks_depleted?
193
272
  !chunks_fiber.alive?
194
273
  end
195
274
 
275
+ # Creates a Fiber wrapper around the underlying enumerator. The advantage
276
+ # of using a Fiber here is that we can terminate the chunk retrieval, in a
277
+ # way that executes any cleanup code that the enumerator carries. At the
278
+ # end of iteration the :on_close callback is executed if one was specified.
196
279
  def chunks_fiber
197
280
  @chunks_fiber ||= Fiber.new do
198
281
  begin
@@ -206,6 +289,8 @@ module Down
206
289
  end
207
290
  end
208
291
 
292
+ # Finds encoding by name. If the encoding couldn't be found, falls back to
293
+ # the generic binary encoding.
209
294
  def find_encoding(encoding)
210
295
  Encoding.find(encoding)
211
296
  rescue ArgumentError
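The comments added in this file explain that the chunk enumerator is wrapped in a Fiber so that retrieval can be terminated early while still running the enumerator's cleanup code and the `:on_close` callback. The following is a minimal, self-contained sketch of that idea. `TinyChunkReader` is a hypothetical class written for illustration, not the gem's `Down::ChunkedIO`, and it omits buffering, caching and encoding.

```ruby
require "fiber"

# Hypothetical, stripped-down reader that only demonstrates the Fiber wrapper.
class TinyChunkReader
  def initialize(chunks, on_close: nil)
    @fiber = Fiber.new do
      begin
        chunks.each do |chunk|
          action = Fiber.yield chunk    # hand one chunk to the caller
          break if action == :terminate # unwinds through the enumerator's ensure
        end
      ensure
        on_close.call if on_close       # runs on natural end and on early close
      end
      nil # final return value of the fiber: no more chunks
    end
    @next_chunk = @fiber.resume         # prefetch the first chunk
  end

  # Returns the current chunk and prefetches the next one; nil at the end.
  def next_chunk
    chunk = @next_chunk
    @next_chunk = @fiber.alive? ? @fiber.resume : nil
    chunk
  end

  # Terminates chunk retrieval; safe to call more than once.
  def close
    @fiber.resume(:terminate) if @fiber.alive?
  end
end
```

Prefetching one chunk in the constructor means the fiber has always been started by the time `close` is called, so resuming it with `:terminate` reaches the `break` instead of starting the iteration.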
data/lib/down/http.rb CHANGED
@@ -9,17 +9,14 @@ require "cgi"
9
9
  require "base64"
10
10
 
11
11
  if Gem::Version.new(HTTP::VERSION) < Gem::Version.new("2.1.0")
12
- fail "Down requires HTTP.rb version 2.1.0 or higher"
12
+ fail "Down::Http requires HTTP.rb version 2.1.0 or higher"
13
13
  end
14
14
 
15
15
  module Down
16
16
  class Http < Backend
17
- def initialize(client_or_options = nil)
18
- options = client_or_options
19
- options = client_or_options.default_options if client_or_options.is_a?(HTTP::Client)
20
-
21
- @client = HTTP.headers("User-Agent" => "Down/#{Down::VERSION}").follow(max_hops: 2)
22
- @client = HTTP::Client.new(@client.default_options.merge(options)) if options
17
+ def initialize(client_or_options = {})
18
+ options = client_or_options.is_a?(HTTP::Client) ? client_or_options.default_options : client_or_options
19
+ @options = { headers: { "User-Agent" => "Down/#{Down::VERSION}" }, follow: { max_hops: 2 } }.merge(options)
23
20
  end
24
21
 
25
22
  def download(url, max_size: nil, progress_proc: nil, content_length_proc: nil, **options, &block)
@@ -59,47 +56,55 @@ module Down
59
56
  end
60
57
 
61
58
  def open(url, rewindable: true, **options, &block)
62
- begin
63
- response = get(url, **options, &block)
64
- rescue => exception
65
- request_error!(exception)
66
- end
59
+ response = get(url, **options, &block)
67
60
 
68
61
  response_error!(response) unless response.status.success?
69
62
 
70
- body_chunks = Enumerator.new do |yielder|
71
- begin
72
- response.body.each { |chunk| yielder << chunk }
73
- rescue => exception
74
- request_error!(exception)
75
- end
76
- end
77
-
78
63
  Down::ChunkedIO.new(
79
- chunks: body_chunks,
64
+ chunks: enum_for(:stream_body, response),
80
65
  size: response.content_length,
81
66
  encoding: response.content_type.charset,
82
67
  rewindable: rewindable,
83
- on_close: (-> { response.connection.close } unless @client.persistent?),
68
+ on_close: (-> { response.connection.close } unless default_client.persistent?),
84
69
  data: { status: response.code, headers: response.headers.to_h, response: response },
85
70
  )
86
71
  end
87
72
 
88
73
  private
89
74
 
75
+ def default_client
76
+ @default_client ||= HTTP::Client.new(@options)
77
+ end
78
+
90
79
  def get(url, **options, &block)
80
+ url = process_url(url, options)
81
+
82
+ client = default_client
83
+ client = block.call(client) if block
84
+
85
+ client.get(url, options)
86
+ rescue => exception
87
+ request_error!(exception)
88
+ end
89
+
90
+ def stream_body(response, &block)
91
+ response.body.each(&block)
92
+ rescue => exception
93
+ request_error!(exception)
94
+ end
95
+
96
+ def process_url(url, options)
91
97
  uri = HTTP::URI.parse(url)
92
98
 
93
99
  if uri.user || uri.password
94
100
  user, pass = uri.user, uri.password
95
101
  authorization = "Basic #{Base64.strict_encode64("#{user}:#{pass}")}"
96
- (options[:headers] ||= {}).merge!("Authorization" => authorization)
102
+ options[:headers] ||= {}
103
+ options[:headers].merge!("Authorization" => authorization)
97
104
  uri.user = uri.password = nil
98
105
  end
99
106
 
100
- client = @client
101
- client = block.call(client) if block
102
- client.get(url, options)
107
+ uri.to_s
103
108
  end
104
109
 
105
110
  def response_error!(response)
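The new `process_url` method in this file moves credentials embedded in the URL into an `Authorization` header and returns the URL without them. Below is a rough sketch of the same idea using only Ruby's standard `URI` class in place of `HTTP::URI`. The helper name `split_credentials` is an assumption made for this example, and `pack("m0")` is used as a stand-in for `Base64.strict_encode64`.

```ruby
require "uri"

# Hypothetical helper: returns [url_without_credentials, headers].
def split_credentials(url, headers = {})
  uri = URI.parse(url)

  if uri.user || uri.password
    # pack("m0") produces Base64 without line breaks
    token = ["#{uri.user}:#{uri.password}"].pack("m0")
    headers = headers.merge("Authorization" => "Basic #{token}")
    uri.user = uri.password = nil # strip credentials from the URL
  end

  [uri.to_s, headers]
end

# Usage with a placeholder URL:
#   url, headers = split_credentials("http://user:secret@example.com/image.jpg")
```

Unlike the method in the diff, this sketch returns a new headers hash instead of mutating the options passed in.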
data/lib/down/net_http.rb CHANGED
@@ -12,14 +12,14 @@ require "cgi"
12
12
  module Down
13
13
  class NetHttp < Backend
14
14
  def initialize(options = {})
15
- @options = { "User-Agent" => "Down/#{Down::VERSION}" }.merge(options)
15
+ @options = { "User-Agent" => "Down/#{Down::VERSION}", max_redirects: 2 }.merge(options)
16
16
  end
17
17
 
18
- def download(uri, options = {})
18
+ def download(url, options = {})
19
19
  options = @options.merge(options)
20
20
 
21
21
  max_size = options.delete(:max_size)
22
- max_redirects = options.delete(:max_redirects) || 2
22
+ max_redirects = options.delete(:max_redirects)
23
23
  progress_proc = options.delete(:progress_proc)
24
24
  content_length_proc = options.delete(:content_length_proc)
25
25
 
@@ -56,14 +56,7 @@ module Down
56
56
 
57
57
  open_uri_options.merge!(options)
58
58
 
59
- tries = max_redirects + 1
60
-
61
- begin
62
- uri = URI(uri)
63
- raise Down::InvalidUrl, "URL scheme needs to be http or https" unless uri.is_a?(URI::HTTP)
64
- rescue URI::InvalidURIError => exception
65
- raise Down::InvalidUrl, exception.message
66
- end
59
+ uri = ensure_uri(url)
67
60
 
68
61
  if uri.user || uri.password
69
62
  open_uri_options[:http_basic_authentication] ||= [uri.user, uri.password]
@@ -71,56 +64,123 @@ module Down
71
64
  uri.password = nil
72
65
  end
73
66
 
74
- begin
75
- downloaded_file = uri.open(open_uri_options)
76
- rescue OpenURI::HTTPRedirect => exception
77
- if (tries -= 1) > 0
78
- uri = exception.uri
67
+ open_uri_file = open_uri(uri, open_uri_options, follows_remaining: max_redirects)
79
68
 
80
- if !exception.io.meta["set-cookie"].to_s.empty?
81
- open_uri_options["Cookie"] = exception.io.meta["set-cookie"]
82
- end
69
+ tempfile = ensure_tempfile(open_uri_file)
70
+ tempfile.extend Down::NetHttp::DownloadedFile
83
71
 
84
- retry
85
- else
86
- raise Down::TooManyRedirects, "too many redirects"
87
- end
88
- rescue OpenURI::HTTPError => exception
89
- code, message = exception.io.status
90
- response_class = Net::HTTPResponse::CODE_TO_OBJ.fetch(code)
91
- response = response_class.new(nil, code, message)
92
- exception.io.metas.each do |name, values|
93
- values.each { |value| response.add_field(name, value) }
72
+ tempfile
73
+ end
74
+
75
+ def open(url, options = {})
76
+ options = @options.merge(options)
77
+
78
+ uri = ensure_uri(url)
79
+
80
+ request = Fiber.new do
81
+ net_http_request(uri, options) do |response|
82
+ Fiber.yield response
94
83
  end
84
+ end
95
85
 
96
- response_error!(response)
97
- rescue => exception
98
- request_error!(exception)
86
+ response = request.resume
87
+
88
+ response_error!(response) unless response.is_a?(Net::HTTPSuccess)
89
+
90
+ Down::ChunkedIO.new(
91
+ chunks: enum_for(:stream_body, response),
92
+ size: response["Content-Length"] && response["Content-Length"].to_i,
93
+ encoding: response.type_params["charset"],
94
+ rewindable: options.fetch(:rewindable, true),
95
+ on_close: -> { request.resume }, # close HTTP connection
96
+ data: {
97
+ status: response.code.to_i,
98
+ headers: response.each_header.inject({}) { |headers, (downcased_name, value)|
99
+ name = downcased_name.split("-").map(&:capitalize).join("-")
100
+ headers.merge!(name => value)
101
+ },
102
+ response: response,
103
+ },
104
+ )
105
+ end
106
+
107
+ private
108
+
109
+ def open_uri(uri, options, follows_remaining: 0)
110
+ downloaded_file = uri.open(options)
111
+ rescue OpenURI::HTTPRedirect => exception
112
+ raise Down::TooManyRedirects, "too many redirects" if follows_remaining == 0
113
+
114
+ uri = exception.uri
115
+
116
+ if !exception.io.meta["set-cookie"].to_s.empty?
117
+ options["Cookie"] = exception.io.meta["set-cookie"]
99
118
  end
100
119
 
101
- # open-uri will return a StringIO instead of a Tempfile if the filesize is
102
- # less than 10 KB, so if it happens we convert it back to Tempfile. We want
103
- # to do this with a Tempfile as well, because open-uri doesn't preserve the
104
- # file extension, so we want to run it against #copy_to_tempfile which
105
- # does.
106
- open_uri_file = downloaded_file
107
- downloaded_file = copy_to_tempfile(uri.path, open_uri_file)
108
- OpenURI::Meta.init downloaded_file, open_uri_file
109
-
110
- downloaded_file.extend Down::NetHttp::DownloadedFile
111
- downloaded_file
120
+ follows_remaining -= 1
121
+ retry
122
+ rescue OpenURI::HTTPError => exception
123
+ code, message = exception.io.status
124
+ response_class = Net::HTTPResponse::CODE_TO_OBJ.fetch(code)
125
+ response = response_class.new(nil, code, message)
126
+ exception.io.metas.each do |name, values|
127
+ values.each { |value| response.add_field(name, value) }
128
+ end
129
+
130
+ response_error!(response)
131
+ rescue => exception
132
+ request_error!(exception)
112
133
  end
113
134
 
114
- def open(uri, options = {})
115
- options = @options.merge(options)
135
+ # Converts the open-uri result file into a Tempfile if it isn't already,
136
+ # and makes sure the Tempfile has the correct file extension.
137
+ def ensure_tempfile(open_uri_file)
138
+ extension = File.extname(open_uri_file.base_uri.path)
139
+ tempfile = Tempfile.new(["down-net_http", extension], binmode: true)
140
+
141
+ if open_uri_file.is_a?(Tempfile)
142
+ # Windows requires file descriptors to be closed before files are moved
143
+ open_uri_file.close
144
+ tempfile.close
145
+ FileUtils.mv open_uri_file.path, tempfile.path
146
+ else # open-uri returns a StringIO when there is less than 10KB of content
147
+ IO.copy_stream(open_uri_file, tempfile)
148
+ open_uri_file.close
149
+ end
150
+
151
+ tempfile.open
152
+ OpenURI::Meta.init tempfile, open_uri_file # adds open-uri methods
153
+
154
+ tempfile
155
+ end
156
+
157
+ def net_http_request(uri, options, follows_remaining: options.fetch(:max_redirects, 2), &block)
158
+ http, request = create_net_http(uri, options)
116
159
 
117
160
  begin
118
- uri = URI(uri)
119
- raise Down::InvalidUrl, "URL scheme needs to be http or https" unless uri.is_a?(URI::HTTP)
120
- rescue URI::InvalidURIError => exception
121
- raise Down::InvalidUrl, exception.message
161
+ response = http.start do
162
+ http.request(request) do |response|
163
+ unless response.is_a?(Net::HTTPRedirection)
164
+ yield response
165
+ response.instance_variable_set("@read", true) # mark response as read
166
+ end
167
+ end
168
+ end
169
+ rescue => exception
170
+ request_error!(exception)
171
+ end
172
+
173
+ if response.is_a?(Net::HTTPRedirection)
174
+ raise Down::TooManyRedirects if follows_remaining == 0
175
+
176
+ location = URI.parse(response["Location"])
177
+ location = uri + location if location.relative?
178
+
179
+ net_http_request(location, options, follows_remaining: follows_remaining - 1, &block)
122
180
  end
181
+ end
123
182
 
183
+ def create_net_http(uri, options)
124
184
  http_class = Net::HTTP
125
185
 
126
186
  if options[:proxy]
@@ -154,62 +214,21 @@ module Down
154
214
  get = Net::HTTP::Get.new(uri.request_uri, request_headers)
155
215
  get.basic_auth(uri.user, uri.password) if uri.user || uri.password
156
216
 
157
- request = Fiber.new do
158
- http.start do
159
- http.request(get) do |response|
160
- Fiber.yield response
161
- response.instance_variable_set("@read", true)
162
- end
163
- end
164
- end
165
-
166
- begin
167
- response = request.resume
168
- rescue => exception
169
- request_error!(exception)
170
- end
171
-
172
- response_error!(response) unless (200..299).cover?(response.code.to_i)
173
-
174
- body_chunks = Enumerator.new do |yielder|
175
- begin
176
- response.read_body { |chunk| yielder << chunk }
177
- rescue => exception
178
- request_error!(exception)
179
- end
180
- end
181
-
182
- Down::ChunkedIO.new(
183
- chunks: body_chunks,
184
- size: response["Content-Length"] && response["Content-Length"].to_i,
185
- encoding: response.type_params["charset"],
186
- rewindable: options.fetch(:rewindable, true),
187
- on_close: -> { request.resume }, # close HTTP connection
188
- data: {
189
- status: response.code.to_i,
190
- headers: response.each_header.inject({}) { |headers, (downcased_name, value)|
191
- name = downcased_name.split("-").map(&:capitalize).join("-")
192
- headers.merge!(name => value)
193
- },
194
- response: response,
195
- },
196
- )
217
+ [http, get]
197
218
  end
198
219
 
199
- private
220
+ def stream_body(response, &block)
221
+ response.read_body(&block)
222
+ rescue => exception
223
+ request_error!(exception)
224
+ end
200
225
 
201
- def copy_to_tempfile(basename, io)
202
- tempfile = Tempfile.new(["down-net_http", File.extname(basename)], binmode: true)
203
- if io.is_a?(OpenURI::Meta) && io.is_a?(Tempfile)
204
- io.close
205
- tempfile.close
206
- FileUtils.mv io.path, tempfile.path
207
- else
208
- IO.copy_stream(io, tempfile)
209
- io.rewind
210
- end
211
- tempfile.open
212
- tempfile
226
+ def ensure_uri(url)
227
+ uri = URI(url)
228
+ raise Down::InvalidUrl, "URL scheme needs to be http or https" unless uri.is_a?(URI::HTTP)
229
+ uri
230
+ rescue URI::InvalidURIError => exception
231
+ raise Down::InvalidUrl, exception.message
213
232
  end
214
233
 
215
234
  def response_error!(response)
@@ -227,7 +246,7 @@ module Down
227
246
 
228
247
  def request_error!(exception)
229
248
  case exception
230
- when Errno::ETIMEDOUT, Net::OpenTimeout
249
+ when Net::OpenTimeout
231
250
  raise Down::TimeoutError, "timed out waiting for connection to open"
232
251
  when Net::ReadTimeout
233
252
  raise Down::TimeoutError, "timed out while reading data"
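The new `net_http_request` method follows redirects recursively, counting down `follows_remaining` and resolving relative `Location` values against the current URI. The sketch below reproduces only that control flow without any network access. The `fetch` callable, the `follow_redirects` name and the local `TooManyRedirects` class are assumptions made for illustration and are not the gem's API.

```ruby
require "uri"

# Local stand-in for the gem's error class, defined here to stay self-contained.
class TooManyRedirects < StandardError; end

# `fetch` is a hypothetical callable that takes a URI and returns
# [status_code, headers_hash]; a real implementation would perform the request.
def follow_redirects(uri, fetch, follows_remaining: 2)
  status, headers = fetch.call(uri)
  return [uri, status] unless (300..399).cover?(status)

  raise TooManyRedirects, "too many redirects" if follows_remaining == 0

  location = URI.parse(headers.fetch("Location"))
  location = uri + location if location.relative? # resolve against current URI

  follow_redirects(location, fetch, follows_remaining: follows_remaining - 1)
end
```

With `follows_remaining: 0` the first redirect response raises immediately, which corresponds to the README's `max_redirects: 0` case.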
data/lib/down/version.rb CHANGED
@@ -1,5 +1,5 @@
  # frozen-string-literal: true

  module Down
-   VERSION = "4.1.1"
+   VERSION = "4.2.0"
  end
metadata CHANGED
@@ -1,14 +1,14 @@
  --- !ruby/object:Gem::Specification
  name: down
  version: !ruby/object:Gem::Version
-   version: 4.1.1
+   version: 4.2.0
  platform: ruby
  authors:
  - Janko Marohnić
  autorequire:
  bindir: bin
  cert_chain: []
- date: 2017-10-14 00:00:00.000000000 Z
+ date: 2017-12-22 00:00:00.000000000 Z
  dependencies:
  - !ruby/object:Gem::Dependency
    name: minitest
@@ -101,6 +101,7 @@ executables: []
  extensions: []
  extra_rdoc_files: []
  files:
+ - CHANGELOG.md
  - LICENSE.txt
  - README.md
  - down.gemspec