down 4.8.1 → 5.2.0

checksums.yaml CHANGED
@@ -1,7 +1,7 @@
1
1
  ---
2
2
  SHA256:
3
- metadata.gz: ab86935222e38ea1981821660b1ae90ff71a4a739706468d2c7bfaadd48b4956
4
- data.tar.gz: 8c8b1245a41f426e4ecd4c378c1bd3ed2ac967b92b58f421c3ea6dc9bee78c2e
3
+ metadata.gz: d48d573f542ac195a462c3bab0ebe1546c8dd78b0a5210eed1d1fed1f66b674f
4
+ data.tar.gz: 492b997d3e889475544267d753df5fb900d28728131602e6950a57dfacb7842c
5
5
  SHA512:
6
- metadata.gz: 73bd4f1f52e1791171b9ed3d68b50d4ac294e12de2bb947df04d3fffb2bb960904ff35c5f89d158add762b137652818720c0ea05ac41ebc119d7d34d57e5cbdb
7
- data.tar.gz: d0b0f7c9e3ef8e602beeb7ef8b3216c090c863ea1434c8cf40096da6833546aa525cc1b99b3b448ed92d70494d0e25f6d0df72dcf286b59c0b335c9fcd6885ac
6
+ metadata.gz: e0fa81667368033a51588f37ae8a5fca2198ce341268e5c6fc24a8241be1af7ce88cd4bc7e8076f69af41b3e0318a4c8733c28a9d173dcdae750b59a9e2401e6
7
+ data.tar.gz: 568632e1c75daa84a838675ae49ceb844a4500e781b8303cf3c1259974b1d01a26e3d2420e08d0d544c81ef312dc790cafdd47fcdc2ed6f3a853b91f8a99bd48
@@ -1,3 +1,47 @@
1
+ ## 5.2.0 (2020-09-20)
2
+
3
+ * Add `:uri_normalizer` option to `Down::NetHttp` (@janko)
4
+
5
+ * Add `:http_basic_authentication` option to `Down::NetHttp#download` (@janko)
6
+
7
+ * Fix uninitialized instance variables warnings in `Down::ChunkedIO` (@janko)
8
+
9
+ * Handle unknown HTTP error codes in `Down::NetHttp` (@darndt)
10
+
11
+ ## 5.1.1 (2020-02-04)
12
+
13
+ * Fix keyword arguments warnings on Ruby 2.7 in `Down.download` and `Down.open` (@janko)
14
+
15
+ ## 5.1.0 (2020-01-09)
16
+
17
+ * Fix keyword arguments warnings on Ruby 2.7 (@janko)
18
+
19
+ * Fix `FrozenError` exception in `Down::ChunkedIO#readpartial` (@janko)
20
+
21
+ * Deprecate passing headers as top-level options in `Down::NetHttp` (@janko)
22
+
23
+ ## 5.0.1 (2019-12-20)
24
+
25
+ * In `Down::NetHttp` only use Addressable normalization if `URI.parse` fails (@coding-chimp)
26
+
27
+ ## 5.0.0 (2019-09-26)
28
+
29
+ * Change `ChunkedIO#each_chunk` to return chunks in original encoding (@janko)
30
+
31
+ * Always return binary strings in `ChunkedIO#readpartial` (@janko)
32
+
33
+ * Handle frozen chunks in `Down::ChunkedIO` (@janko)
34
+
35
+ * Change `ChunkedIO#gets` to return lines in specified encoding (@janko)
36
+
37
+ * Halve memory allocation for `ChunkedIO#gets` (@janko)
38
+
39
+ * Halve memory allocation for `ChunkedIO#read` without arguments (@janko)
40
+
41
+ * Drop support for `HTTP::Client` argument in `Down::HTTP.new` (@janko)
42
+
43
+ * Repurpose `Down::NotFound` to be raised on `404 Not Found` response (@janko)
44
+
1
45
  ## 4.8.1 (2019-05-01)
2
46
 
3
47
  * Make `ChunkedIO#read`/`#readpartial` with length always return strings in binary encoding (@janko)
data/README.md CHANGED
@@ -1,13 +1,13 @@
1
1
  # Down
2
2
 
3
3
  Down is a utility tool for streaming, flexible and safe downloading of remote
4
- files. It can use [open-uri] + `Net::HTTP`, [HTTP.rb] or `wget` as the backend
4
+ files. It can use [open-uri] + `Net::HTTP`, [http.rb] or `wget` as the backend
5
5
  HTTP library.
6
6
 
7
7
  ## Installation
8
8
 
9
9
  ```rb
10
- gem "down", "~> 4.4"
10
+ gem "down", "~> 5.0"
11
11
  ```
12
12
 
13
13
  ## Downloading
@@ -57,8 +57,12 @@ specific location on disk, you can specify the `:destination` option:
57
57
 
58
58
  ```rb
59
59
  Down.download("http://example.com/image.jpg", destination: "/path/to/destination")
60
+ #=> nil
60
61
  ```
61
62
 
63
+ In this case `Down.download` won't have any return value, so if you need a File
64
+ object you'll have to create it manually.
65
+
62
66
  ### Basic authentication
63
67
 
64
68
  `Down.download` and `Down.open` will automatically detect and apply HTTP basic
@@ -103,6 +107,16 @@ remote_file.eof? #=> true
103
107
  remote_file.close # closes the HTTP connection and deletes the internal Tempfile
104
108
  ```
105
109
 
110
+ The following IO methods are implemented:
111
+
112
+ * `#read` & `#readpartial`
113
+ * `#gets`
114
+ * `#seek`
115
+ * `#pos` & `#tell`
116
+ * `#eof?`
117
+ * `#rewind`
118
+ * `#close`
119
+
106
120
  ### Caching
107
121
 
108
122
  By default the downloaded content is internally cached into a `Tempfile`, so
@@ -147,10 +161,10 @@ remote_file.data[:headers] #=> { ... }
147
161
  remote_file.data[:response] # returns the response object
148
162
  ```
149
163
 
150
- Note that `Down::NotFound` error will automatically be raised if response
151
- status was 4xx or 5xx.
164
+ Note that a `Down::ResponseError` exception will automatically be raised if
165
+ response status was 4xx or 5xx.
152
166
 
153
- ### `Down::ChunkedIO`
167
+ ### Down::ChunkedIO
154
168
 
155
169
  The `Down.open` performs HTTP logic and returns an instance of
156
170
  `Down::ChunkedIO`. However, `Down::ChunkedIO` is a generic class that can wrap
@@ -196,21 +210,23 @@ the `Down::Error` subclasses. This is Down's exception hierarchy:
196
210
 
197
211
  * `Down::Error`
198
212
  * `Down::TooLarge`
199
- * `Down::NotFound`
200
- * `Down::InvalidUrl`
201
- * `Down::TooManyRedirects`
202
- * `Down::ResponseError`
203
- * `Down::ClientError`
204
- * `Down::ServerError`
205
- * `Down::ConnectionError`
206
- * `Down::TimeoutError`
207
- * `Down::SSLError`
213
+ * `Down::InvalidUrl`
214
+ * `Down::TooManyRedirects`
215
+ * `Down::ResponseError`
216
+ * `Down::ClientError`
217
+ * `Down::NotFound`
218
+ * `Down::ServerError`
219
+ * `Down::ConnectionError`
220
+ * `Down::TimeoutError`
221
+ * `Down::SSLError`
208
222
 
209
223
  ## Backends
210
224
 
211
- By default Down implements `Down.download` and `Down.open` using the built-in
212
- [open-uri] + [Net::HTTP] Ruby standard libraries. However, there are other
213
- backends as well, see the sections below.
225
+ The following backends are available:
226
+
227
+ * [Down::NetHttp](#downnethttp) (default)
228
+ * [Down::Http](#downhttp)
229
+ * [Down::Wget](#downwget)
214
230
 
215
231
  You can use the backend directly:
216
232
 
@@ -232,7 +248,10 @@ Down.download("...")
232
248
  Down.open("...")
233
249
  ```
234
250
 
235
- ### open-uri + Net::HTTP
251
+ ### Down::NetHttp
252
+
253
+ The `Down::NetHttp` backend implements downloads using [open-uri] and
254
+ [Net::HTTP].
236
255
 
237
256
  ```rb
238
257
  gem "down", "~> 4.4"
@@ -314,6 +333,18 @@ Down::NetHttp.open("http://example.com/image.jpg",
314
333
  ssl_verify_mode: OpenSSL::SSL::VERIFY_PEER)
315
334
  ```
316
335
 
336
+ #### URI normalization
337
+
338
+ If the URL isn't parseable by `URI.parse`, `Down::NetHttp` will
339
+ attempt to normalize the URL using [Addressable::URI], URI-escaping
340
+ any potentially unescaped characters. You can change the normalizer
341
+ via the `:uri_normalizer` option:
342
+
343
+ ```rb
344
+ # this skips URL normalization
345
+ Down::NetHttp.download("http://example.com/image.jpg", uri_normalizer: -> (url) { url })
346
+ ```
347
+
317
348
  #### Additional options
318
349
 
319
350
  Any additional options passed to `Down.download` will be forwarded to
@@ -334,7 +365,9 @@ net_http.download("http://example.com/image.jpg")
334
365
  net_http.open("http://example.com/image.jpg")
335
366
  ```
336
367
 
337
- ### HTTP.rb
368
+ ### Down::Http
369
+
370
+ The `Down::Http` backend implements downloads using the [http.rb] gem.
338
371
 
339
372
  ```rb
340
373
  gem "down", "~> 4.4"
@@ -350,7 +383,7 @@ io = Down::Http.open("http://nature.com/forest.jpg")
350
383
  io #=> #<Down::ChunkedIO ...>
351
384
  ```
352
385
 
353
- Some features that give the HTTP.rb backend an advantage over `open-uri` +
386
+ Some features that give the http.rb backend an advantage over `open-uri` and
354
387
  `Net::HTTP` include:
355
388
 
356
389
  * Low memory usage (**10x less** than `open-uri`/`Net::HTTP`)
@@ -401,7 +434,10 @@ down = Down::Http.new(method: :post)
401
434
  down.download("http://example.org/image.jpg")
402
435
  ```
403
436
 
404
- ### Wget (experimental)
437
+ ### Down::Wget (experimental)
438
+
439
+ The `Down::Wget` backend implements downloads using the `wget` command line
440
+ utility.
405
441
 
406
442
  ```rb
407
443
  gem "down", "~> 4.4"
@@ -418,9 +454,8 @@ io = Down::Wget.open("http://nature.com/forest.jpg")
418
454
  io #=> #<Down::ChunkedIO ...>
419
455
  ```
420
456
 
421
- The Wget backend uses the `wget` command line utility for downloading. One
422
- major advantage of `wget` is that it automatically resumes downloads that were
423
- interrupted due to network failures, which is very useful when you're
457
+ One major advantage of `wget` is that it automatically resumes downloads that
458
+ were interrupted due to network failures, which is very useful when you're
424
459
  downloading large files.
425
460
 
426
461
  However, the Wget backend should still be considered experimental, as it wasn't
@@ -447,10 +482,12 @@ wget.open("http://nature.com/forest.jpg")
447
482
 
448
483
  ## Supported Ruby versions
449
484
 
450
- * MRI 2.2
451
485
  * MRI 2.3
452
486
  * MRI 2.4
453
- * JRuby
487
+ * MRI 2.5
488
+ * MRI 2.6
489
+ * MRI 2.7
490
+ * JRuby 9.2
454
491
 
455
492
  ## Development
456
493
 
@@ -469,6 +506,6 @@ you'll need to have Docker installed and running.
469
506
 
470
507
  [open-uri]: http://ruby-doc.org/stdlib-2.3.0/libdoc/open-uri/rdoc/OpenURI.html
471
508
  [Net::HTTP]: https://ruby-doc.org/stdlib-2.4.1/libdoc/net/http/rdoc/Net/HTTP.html
472
- [HTTP.rb]: https://github.com/httprb/http
509
+ [http.rb]: https://github.com/httprb/http
473
510
  [Addressable::URI]: https://github.com/sporkmonger/addressable
474
511
  [kennethreitz/httpbin]: https://github.com/kennethreitz/httpbin
@@ -4,7 +4,7 @@ Gem::Specification.new do |spec|
4
4
  spec.name = "down"
5
5
  spec.version = Down::VERSION
6
6
 
7
- spec.required_ruby_version = ">= 2.1"
7
+ spec.required_ruby_version = ">= 2.3"
8
8
 
9
9
  spec.summary = "Robust streaming downloads using Net::HTTP, HTTP.rb or wget."
10
10
  spec.homepage = "https://github.com/janko/down"
@@ -20,8 +20,9 @@ Gem::Specification.new do |spec|
20
20
  spec.add_development_dependency "minitest", "~> 5.8"
21
21
  spec.add_development_dependency "mocha", "~> 1.5"
22
22
  spec.add_development_dependency "rake"
23
- spec.add_development_dependency "http", "~> 4.0"
23
+ spec.add_development_dependency "http", "~> 4.3"
24
24
  spec.add_development_dependency "posix-spawn" unless RUBY_ENGINE == "jruby"
25
25
  spec.add_development_dependency "http_parser.rb"
26
26
  spec.add_development_dependency "docker-api"
27
+ spec.add_development_dependency "warning" if RUBY_VERSION >= "2.4"
27
28
  end
@@ -6,12 +6,12 @@ require "down/net_http"
6
6
  module Down
7
7
  module_function
8
8
 
9
- def download(*args, &block)
10
- backend.download(*args, &block)
9
+ def download(*args, **options, &block)
10
+ backend.download(*args, **options, &block)
11
11
  end
12
12
 
13
- def open(*args, &block)
14
- backend.open(*args, &block)
13
+ def open(*args, **options, &block)
14
+ backend.open(*args, **options, &block)
15
15
  end
16
16
 
17
17
  # Allows setting a backend via a symbol or a downloader object.
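The hunk above captures and forwards keyword arguments explicitly, which matches the "Fix keyword arguments warnings on Ruby 2.7" changelog entries. Here is a minimal sketch of the same delegation pattern. The `Facade` module and `FakeBackend` class are hypothetical stand-ins and are not part of Down.

```rb
# Hypothetical stand-in for a backend; not part of Down.
class FakeBackend
  def download(url, max_size: nil, **options)
    { url: url, max_size: max_size, options: options }
  end
end

# Hypothetical facade mirroring the delegation style shown in the diff.
module Facade
  module_function

  def backend
    @backend ||= FakeBackend.new
  end

  # Positional and keyword arguments are captured and forwarded separately,
  # so Ruby 2.7+ never has to convert a trailing hash into keywords.
  def download(*args, **options, &block)
    backend.download(*args, **options, &block)
  end
end

result = Facade.download("http://example.com/image.jpg", max_size: 5, headers: {})
result[:max_size] #=> 5
result[:options]  #=> {:headers=>{}}
```

With only `*args` in the signature, the options would arrive as a trailing positional hash, and Ruby 2.7 warns when such a hash is implicitly converted to keywords.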
@@ -9,12 +9,12 @@ require "fileutils"
9
9
 
10
10
  module Down
11
11
  class Backend
12
- def self.download(*args, &block)
13
- new.download(*args, &block)
12
+ def self.download(*args, **options, &block)
13
+ new.download(*args, **options, &block)
14
14
  end
15
15
 
16
- def self.open(*args, &block)
17
- new.open(*args, &block)
16
+ def self.open(*args, **options, &block)
17
+ new.open(*args, **options, &block)
18
18
  end
19
19
 
20
20
  private
@@ -36,6 +36,8 @@ module Down
36
36
  @rewindable = rewindable
37
37
  @buffer = nil
38
38
  @position = 0
39
+ @next_chunk = nil
40
+ @closed = false
39
41
 
40
42
  retrieve_chunk # fetch first chunk so that we know whether the file is empty
41
43
  end
@@ -63,21 +65,20 @@ module Down
63
65
  def read(length = nil, outbuf = nil)
64
66
  fail IOError, "closed stream" if closed?
65
67
 
66
- remaining_length = length
68
+ data = outbuf.clear.force_encoding(Encoding::BINARY) if outbuf
69
+ data ||= "".b
67
70
 
68
- begin
69
- data = readpartial(remaining_length, outbuf)
70
- data = data.dup unless outbuf
71
- remaining_length = length - data.bytesize if length
72
- rescue EOFError
73
- end
71
+ remaining_length = length
74
72
 
75
73
  until remaining_length == 0 || eof?
76
- data << readpartial(remaining_length)
74
+ data << readpartial(remaining_length, buffer ||= String.new)
77
75
  remaining_length = length - data.bytesize if length
78
76
  end
79
77
 
80
- data.to_s unless length && length > 0 && (data.nil? || data.empty?)
78
+ buffer.clear if buffer # deallocate string
79
+
80
+ data.force_encoding(@encoding) unless length
81
+ data unless data.empty? && length && length > 0
81
82
  end
82
83
 
83
84
  # Implements IO#gets semantics. Without arguments it retrieves lines of
@@ -108,27 +109,33 @@ module Down
108
109
 
109
110
  separator = "\n\n" if separator.empty?
110
111
 
111
- begin
112
- data = readpartial(limit)
112
+ data = String.new
113
113
 
114
- until data.include?(separator) || data.bytesize == limit || eof?
115
- remaining_length = limit - data.bytesize if limit
116
- data << readpartial(remaining_length, outbuf ||= String.new)
117
- end
114
+ until data.include?(separator) || data.bytesize == limit || eof?
115
+ remaining_length = limit - data.bytesize if limit
116
+ data << readpartial(remaining_length, buffer ||= String.new)
117
+ end
118
+
119
+ buffer.clear if buffer # deallocate buffer
120
+
121
+ line, extra = data.split(separator, 2)
122
+ line << separator if data.include?(separator)
118
123
 
119
- line, extra = data.split(separator, 2)
120
- line << separator if data.include?(separator)
124
+ data.clear # deallocate data
121
125
 
126
+ if extra
122
127
  if cache
123
- cache.pos -= extra.to_s.bytesize
128
+ cache.pos -= extra.bytesize
124
129
  else
125
- @buffer = @buffer.to_s.prepend(extra.to_s)
130
+ if @buffer
131
+ @buffer.prepend(extra)
132
+ else
133
+ @buffer = extra
134
+ end
126
135
  end
127
- rescue EOFError
128
- line = nil
129
136
  end
130
137
 
131
- line
138
+ line.force_encoding(@encoding) if line
132
139
  end
133
140
 
134
141
  # Implements IO#readpartial semantics. If there is any content readily
@@ -139,33 +146,33 @@ module Down
139
146
  # or the next chunk. This is useful when you don't care about the size of
140
147
  # chunks and you want to minimize string allocations.
141
148
  #
142
- # With `length` argument returns maximum of that amount of bytes.
149
+ # With `maxlen` argument returns maximum of that amount of bytes (default
150
+ # is 16KB).
143
151
  #
144
152
  # With `outbuf` argument each call will return that same string object,
145
153
  # where the value is replaced with retrieved content.
146
154
  #
147
155
  # Raises EOFError if end of file is reached. Raises IOError if closed.
148
- def readpartial(length = nil, outbuf = nil)
156
+ def readpartial(maxlen = nil, outbuf = nil)
149
157
  fail IOError, "closed stream" if closed?
150
158
 
151
- data = outbuf.clear.force_encoding(@encoding) if outbuf
159
+ maxlen ||= 16*1024
152
160
 
153
- return data.to_s if length == 0
161
+ data = cache.read(maxlen, outbuf) if cache && !cache.eof?
162
+ data ||= outbuf.clear.force_encoding(Encoding::BINARY) if outbuf
163
+ data ||= "".b
154
164
 
155
- if cache && !cache.eof?
156
- data = cache.read(length, outbuf)
157
- data.force_encoding(@encoding)
158
- end
165
+ return data if maxlen == 0
159
166
 
160
- if @buffer.nil? && (data.nil? || data.empty?)
167
+ if @buffer.nil? && data.empty?
161
168
  fail EOFError, "end of file reached" if chunks_depleted?
162
169
  @buffer = retrieve_chunk
163
170
  end
164
171
 
165
- remaining_length = data && length ? length - data.bytesize : length
172
+ remaining_length = maxlen - data.bytesize
166
173
 
167
174
  unless @buffer.nil? || remaining_length == 0
168
- if remaining_length && remaining_length < @buffer.bytesize
175
+ if remaining_length < @buffer.bytesize
169
176
  buffered_data = @buffer.byteslice(0, remaining_length)
170
177
  @buffer = @buffer.byteslice(remaining_length..-1)
171
178
  else
@@ -173,21 +180,46 @@ module Down
173
180
  @buffer = nil
174
181
  end
175
182
 
176
- if data
177
- data << buffered_data
178
- else
179
- data = buffered_data
180
- end
183
+ data << buffered_data
181
184
 
182
185
  cache.write(buffered_data) if cache
183
186
 
184
- buffered_data.clear unless buffered_data.equal?(data)
187
+ buffered_data.clear unless buffered_data.frozen?
185
188
  end
186
189
 
187
190
  @position += data.bytesize
188
191
 
189
- data.force_encoding(Encoding::BINARY) if length
190
- data
192
+ data.force_encoding(Encoding::BINARY)
193
+ end
194
+
195
+ # Implements IO#seek semantics.
196
+ def seek(amount, whence = IO::SEEK_SET)
197
+ fail Errno::ESPIPE, "Illegal seek" if cache.nil?
198
+
199
+ case whence
200
+ when IO::SEEK_SET, :SET
201
+ target_pos = amount
202
+ when IO::SEEK_CUR, :CUR
203
+ target_pos = @position + amount
204
+ when IO::SEEK_END, :END
205
+ unless chunks_depleted?
206
+ cache.seek(0, IO::SEEK_END)
207
+ IO.copy_stream(self, File::NULL)
208
+ end
209
+
210
+ target_pos = cache.size + amount
211
+ else
212
+ fail ArgumentError, "invalid whence: #{whence.inspect}"
213
+ end
214
+
215
+ if target_pos <= cache.size
216
+ cache.seek(target_pos)
217
+ else
218
+ cache.seek(0, IO::SEEK_END)
219
+ IO.copy_stream(self, File::NULL, target_pos - cache.size)
220
+ end
221
+
222
+ @position = cache.pos
191
223
  end
192
224
 
193
225
  # Implements IO#pos semantics. Returns the current position of the
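The new `#seek` is documented as following `IO#seek` semantics. The three `whence` modes can be illustrated with a plain `StringIO`, used here only as a stand-in for a rewindable `Down::ChunkedIO`. Per the diff, a `ChunkedIO` without a cache raises `Errno::ESPIPE` instead.

```rb
require "stringio"

io = StringIO.new("hello world")

io.seek(6)                      # IO::SEEK_SET is the default: absolute offset
from_start = io.read(5)         #=> "world"

io.seek(-11, IO::SEEK_CUR)      # relative to the current position (11 -> 0)
from_current = io.read(5)       #=> "hello"

io.seek(-5, IO::SEEK_END)       # relative to the end of the content
from_end = io.read(5)           #=> "world"

position = io.tell              # alias of #pos, as added in the diff
```

Judging by the diff, seeking past what has been cached makes `ChunkedIO` read ahead and discard output (`IO.copy_stream(self, File::NULL, ...)`) until the target position is reached. A forward seek on a remote file can therefore trigger a download of the intervening bytes.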
@@ -195,6 +227,7 @@ module Down
195
227
  def pos
196
228
  @position
197
229
  end
230
+ alias tell pos
198
231
 
199
232
  # Implements IO#eof? semantics. Returns whether we've reached end of file.
200
233
  # It returns true if cache is at the end and there is no more content to
@@ -272,7 +305,7 @@ module Down
272
305
  def retrieve_chunk
273
306
  chunk = @next_chunk
274
307
  @next_chunk = chunks_fiber.resume
275
- chunk.force_encoding(@encoding) if chunk
308
+ chunk
276
309
  end
277
310
 
278
311
  # Returns whether there is any content left to retrieve.
@@ -7,20 +7,17 @@ module Down
7
7
  # raised when the file is larger than the specified maximum size
8
8
  class TooLarge < Error; end
9
9
 
10
- # raised when the file failed to be retrieved for whatever reason
11
- class NotFound < Error; end
12
-
13
10
  # raised when the given URL couldn't be parsed
14
- class InvalidUrl < NotFound; end
11
+ class InvalidUrl < Error; end
15
12
 
16
13
  # raised when the number of redirects was larger than the specified maximum
17
- class TooManyRedirects < NotFound; end
14
+ class TooManyRedirects < Error; end
18
15
 
19
16
  # raised when response returned 4xx or 5xx response
20
- class ResponseError < NotFound
17
+ class ResponseError < Error
21
18
  attr_reader :response
22
19
 
23
- def initialize(message, response: nil)
20
+ def initialize(message, response = nil)
24
21
  super(message)
25
22
  @response = response
26
23
  end
@@ -29,15 +26,18 @@ module Down
29
26
  # raised when response returned 4xx response
30
27
  class ClientError < ResponseError; end
31
28
 
29
+ # raised when response returned 404 response
30
+ class NotFound < ClientError; end
31
+
32
32
  # raised when response returned 5xx response
33
33
  class ServerError < ResponseError; end
34
34
 
35
35
  # raised when there was an error connecting to the server
36
- class ConnectionError < NotFound; end
36
+ class ConnectionError < Error; end
37
37
 
38
38
  # raised when connecting to the server too longer than the specified timeout
39
39
  class TimeoutError < ConnectionError; end
40
40
 
41
41
  # raised when an SSL error was raised
42
- class SSLError < NotFound; end
42
+ class SSLError < Error; end
43
43
  end
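The practical effect of the reparenting is that `rescue Down::NotFound` no longer catches connection, SSL, or redirect failures, only 404 responses. The sketch below reproduces part of the new class layout in a stand-in `Sketch` namespace so that it runs without the gem. The `error_class_for` and `classify` helpers are hypothetical.

```rb
# Stand-in namespace reproducing part of the class layout from the diff;
# the real classes live under `Down`.
module Sketch
  class Error < StandardError; end
  class ResponseError < Error; end
  class ClientError < ResponseError; end
  class NotFound < ClientError; end
  class ServerError < ResponseError; end
  class ConnectionError < Error; end
  class TimeoutError < ConnectionError; end

  # Same ordering as the `response_error!` methods in this diff:
  # 404 has to be matched before the broader 400..499 range.
  def self.error_class_for(code)
    case code
    when 404      then NotFound
    when 400..499 then ClientError
    when 500..599 then ServerError
    else               ResponseError
    end
  end
end

# Hypothetical helper showing which rescue clause handles each error.
def classify(error_class)
  raise error_class, "boom"
rescue Sketch::NotFound
  :not_found
rescue Sketch::ResponseError
  :bad_response
rescue Sketch::Error
  :other_failure
end

classify(Sketch.error_class_for(404)) #=> :not_found
classify(Sketch.error_class_for(403)) #=> :bad_response
classify(Sketch::TimeoutError)        #=> :other_failure
```

Code written against 4.x that used `rescue Down::NotFound` as a catch-all would likely need `rescue Down::Error` after upgrading.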
@@ -12,12 +12,7 @@ module Down
12
12
  # Provides streaming downloads implemented with HTTP.rb.
13
13
  class Http < Backend
14
14
  # Initializes the backend with common defaults.
15
- def initialize(options = {}, &block)
16
- if options.is_a?(HTTP::Client)
17
- warn "[Down] Passing an HTTP::Client object to Down::Http#initialize is deprecated and won't be supported in Down 5. Use the block initialization instead."
18
- options = options.default_options.to_hash
19
- end
20
-
15
+ def initialize(**options, &block)
21
16
  @method = options.delete(:method) || :get
22
17
  @client = HTTP
23
18
  .headers("User-Agent" => "Down/#{Down::VERSION}")
@@ -111,9 +106,10 @@ module Down
111
106
 
112
107
  # Raises non-sucessful response as a Down::ResponseError.
113
108
  def response_error!(response)
114
- args = [response.status.to_s, response: response]
109
+ args = [response.status.to_s, response]
115
110
 
116
111
  case response.code
112
+ when 404 then raise Down::NotFound.new(*args)
117
113
  when 400..499 then raise Down::ClientError.new(*args)
118
114
  when 500..599 then raise Down::ServerError.new(*args)
119
115
  else raise Down::ResponseError.new(*args)
@@ -12,27 +12,34 @@ require "fileutils"
12
12
  module Down
13
13
  # Provides streaming downloads implemented with Net::HTTP and open-uri.
14
14
  class NetHttp < Backend
15
+ URI_NORMALIZER = -> (url) do
16
+ addressable_uri = Addressable::URI.parse(url)
17
+ addressable_uri.normalize.to_s
18
+ end
19
+
15
20
  # Initializes the backend with common defaults.
16
- def initialize(options = {})
17
- @options = {
18
- "User-Agent" => "Down/#{Down::VERSION}",
19
- max_redirects: 2,
20
- open_timeout: 30,
21
- read_timeout: 30,
22
- }.merge(options)
21
+ def initialize(*args, **options)
22
+ @options = merge_options({
23
+ headers: { "User-Agent" => "Down/#{Down::VERSION}" },
24
+ max_redirects: 2,
25
+ open_timeout: 30,
26
+ read_timeout: 30,
27
+ uri_normalizer: URI_NORMALIZER,
28
+ }, *args, **options)
23
29
  end
24
30
 
25
31
  # Downloads a remote file to disk using open-uri. Accepts any open-uri
26
32
  # options, and a few more.
27
- def download(url, options = {})
28
- options = @options.merge(options)
33
+ def download(url, *args, **options)
34
+ options = merge_options(@options, *args, **options)
29
35
 
30
36
  max_size = options.delete(:max_size)
31
37
  max_redirects = options.delete(:max_redirects)
32
38
  progress_proc = options.delete(:progress_proc)
33
39
  content_length_proc = options.delete(:content_length_proc)
34
40
  destination = options.delete(:destination)
35
- headers = options.delete(:headers) || {}
41
+ headers = options.delete(:headers)
42
+ uri_normalizer = options.delete(:uri_normalizer)
36
43
 
37
44
  # Use open-uri's :content_lenth_proc or :progress_proc to raise an
38
45
  # exception early if the file is too large.
@@ -74,7 +81,7 @@ module Down
74
81
  open_uri_options.merge!(options)
75
82
  open_uri_options.merge!(headers)
76
83
 
77
- uri = ensure_uri(addressable_normalize(url))
84
+ uri = ensure_uri(normalize_uri(url, uri_normalizer: uri_normalizer))
78
85
 
79
86
  # Handle basic authentication in the remote URL.
80
87
  if uri.user || uri.password
@@ -95,13 +102,17 @@ module Down
95
102
 
96
103
  # Starts retrieving the remote file using Net::HTTP and returns an IO-like
97
104
  # object which downloads the response body on-demand.
98
- def open(url, options = {})
99
- uri = ensure_uri(addressable_normalize(url))
100
- options = @options.merge(options)
105
+ def open(url, *args, **options)
106
+ options = merge_options(@options, *args, **options)
107
+
108
+ max_redirects = options.delete(:max_redirects)
109
+ uri_normalizer = options.delete(:uri_normalizer)
110
+
111
+ uri = ensure_uri(normalize_uri(url, uri_normalizer: uri_normalizer))
101
112
 
102
113
  # Create a Fiber that halts when response headers are received.
103
114
  request = Fiber.new do
104
- net_http_request(uri, options) do |response|
115
+ net_http_request(uri, options, follows_remaining: max_redirects) do |response|
105
116
  Fiber.yield response
106
117
  end
107
118
  end
@@ -131,7 +142,7 @@ module Down
131
142
  private
132
143
 
133
144
  # Calls open-uri's URI::HTTP#open method. Additionally handles redirects.
134
- def open_uri(uri, options, follows_remaining: 0)
145
+ def open_uri(uri, options, follows_remaining:)
135
146
  uri.open(options)
136
147
  rescue OpenURI::HTTPRedirect => exception
137
148
  raise Down::TooManyRedirects, "too many redirects" if follows_remaining == 0
@@ -186,7 +197,7 @@ module Down
186
197
  end
187
198
 
188
199
  # Makes a Net::HTTP request and follows redirects.
189
- def net_http_request(uri, options, follows_remaining: options.fetch(:max_redirects, 2), &block)
200
+ def net_http_request(uri, options, follows_remaining:, &block)
190
201
  http, request = create_net_http(uri, options)
191
202
 
192
203
  begin
@@ -251,12 +262,13 @@ module Down
251
262
  http.read_timeout = options[:read_timeout] if options.key?(:read_timeout)
252
263
  http.open_timeout = options[:open_timeout] if options.key?(:open_timeout)
253
264
 
254
- headers = options.select { |key, value| key.is_a?(String) }
255
- headers.merge!(options[:headers]) if options[:headers]
265
+ headers = options[:headers].to_h
256
266
  headers["Accept-Encoding"] = "" # Net::HTTP's inflater causes FiberErrors
257
267
 
258
268
  get = Net::HTTP::Get.new(uri.request_uri, headers)
259
- get.basic_auth(uri.user, uri.password) if uri.user || uri.password
269
+
270
+ user, password = options[:http_basic_authentication] || [uri.user, uri.password]
271
+ get.basic_auth(user, password) if user || password
260
272
 
261
273
  [http, get]
262
274
  end
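In this hunk the `:http_basic_authentication` option takes precedence over credentials embedded in the URL. The precedence rule can be tried in isolation with a small sketch. The `basic_auth_credentials` helper is hypothetical and only mirrors the two changed lines.

```rb
require "uri"

# Mirrors the precedence in the hunk: an explicit :http_basic_authentication
# pair wins over credentials embedded in the URL. Returns nil when neither
# source provides credentials.
def basic_auth_credentials(uri, options)
  user, password = options[:http_basic_authentication] || [uri.user, uri.password]
  [user, password] if user || password
end

with_userinfo = URI("http://janko:secret@example.com/image.jpg")
plain         = URI("http://example.com/image.jpg")

basic_auth_credentials(with_userinfo, {})
#=> ["janko", "secret"]
basic_auth_credentials(with_userinfo, { http_basic_authentication: ["admin", "pass"] })
#=> ["admin", "pass"]
basic_auth_credentials(plain, {})
#=> nil
```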
@@ -284,9 +296,10 @@ module Down
284
296
  end
285
297
 
286
298
  # Makes sure that the URL is properly encoded.
287
- def addressable_normalize(url)
288
- addressable_uri = Addressable::URI.parse(url)
289
- addressable_uri.normalize.to_s
299
+ def normalize_uri(url, uri_normalizer:)
300
+ URI(url)
301
+ rescue URI::InvalidURIError
302
+ uri_normalizer.call(url)
290
303
  end
291
304
 
292
305
  # When open-uri raises an exception, it doesn't expose the response object.
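The new `normalize_uri` only falls back to the normalizer when the standard parser rejects the URL. The sketch below shows that parse-first, normalize-on-failure flow. `ESCAPE_SPACES` is a hypothetical normalizer used so the example runs without Addressable. Unlike the method in the diff, the sketch wraps the normalized string in `URI(...)` itself instead of leaving that to `ensure_uri`.

```rb
require "uri"

# Hypothetical normalizer for illustration only; Down's default normalizer
# uses Addressable::URI, which this sketch deliberately does not depend on.
ESCAPE_SPACES = ->(url) { url.gsub(" ", "%20") }

# Same shape as `normalize_uri` in the diff: try the standard parser first and
# call the normalizer only when parsing raises.
def normalize_uri(url, uri_normalizer:)
  URI(url)
rescue URI::InvalidURIError
  URI(uri_normalizer.call(url))
end

fixed = normalize_uri("http://example.com/some path.jpg", uri_normalizer: ESCAPE_SPACES)
fixed.to_s #=> "http://example.com/some%20path.jpg"

# A parseable URL never reaches the normalizer, so this lambda is not called.
untouched = normalize_uri("http://example.com/image.jpg",
                          uri_normalizer: ->(_url) { raise "should not be called" })
untouched.to_s #=> "http://example.com/image.jpg"
```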
@@ -295,7 +308,11 @@ module Down
295
308
  def rebuild_response_from_open_uri_exception(exception)
296
309
  code, message = exception.io.status
297
310
 
298
- response_class = Net::HTTPResponse::CODE_TO_OBJ.fetch(code)
311
+ response_class = Net::HTTPResponse::CODE_TO_OBJ.fetch(code) do |code|
312
+ Net::HTTPResponse::CODE_CLASS_TO_OBJ.fetch(code[0]) do
313
+ Net::HTTPUnknownResponse
314
+ end
315
+ end
299
316
  response = response_class.new(nil, code, message)
300
317
 
301
318
  exception.io.metas.each do |name, values|
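The nested `fetch` calls in this hunk implement a three-level lookup: exact status code, then status class, then `Net::HTTPUnknownResponse`. The lookup can be exercised on its own with a small sketch. The `response_class_for` helper name is hypothetical, and `"499"` is assumed to have no dedicated class in `Net::HTTP`.

```rb
require "net/http"

# Hypothetical helper wrapping the lookup from the diff.
def response_class_for(code)
  # 1. exact match on the three-digit code, e.g. "404"
  Net::HTTPResponse::CODE_TO_OBJ.fetch(code) do
    # 2. otherwise fall back to the class of the code, e.g. "4" for 4xx
    Net::HTTPResponse::CODE_CLASS_TO_OBJ.fetch(code[0]) do
      # 3. otherwise give up and use the generic response class
      Net::HTTPUnknownResponse
    end
  end
end

response_class_for("404") #=> Net::HTTPNotFound
response_class_for("499") #=> Net::HTTPClientError
response_class_for("999") #=> Net::HTTPUnknownResponse
```

Before this change, the bare `fetch(code)` would raise `KeyError` for any status code without a dedicated class.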
@@ -310,9 +327,10 @@ module Down
310
327
  code = response.code.to_i
311
328
  message = response.message.split(" ").map(&:capitalize).join(" ")
312
329
 
313
- args = ["#{code} #{message}", response: response]
330
+ args = ["#{code} #{message}", response]
314
331
 
315
332
  case response.code.to_i
333
+ when 404 then raise Down::NotFound.new(*args)
316
334
  when 400..499 then raise Down::ClientError.new(*args)
317
335
  when 500..599 then raise Down::ServerError.new(*args)
318
336
  else raise Down::ResponseError.new(*args)
@@ -335,6 +353,24 @@ module Down
335
353
  end
336
354
  end
337
355
 
356
+ # Merge default and ad-hoc options, merging nested headers.
357
+ def merge_options(options, headers = {}, **new_options)
358
+ # Deprecate passing headers as top-level options, taking into account
359
+ # that Ruby 2.7+ accepts kwargs with string keys.
360
+ if headers.any?
361
+ warn %([Down::NetHttp] Passing headers as top-level options has been deprecated, use the :headers option instead, e.g: `Down::NetHttp.download(headers: { "Key" => "Value", ... }, ...)`)
362
+ new_options[:headers] = headers
363
+ elsif new_options.any? { |key, value| key.is_a?(String) }
364
+ warn %([Down::NetHttp] Passing headers as top-level options has been deprecated, use the :headers option instead, e.g: `Down::NetHttp.download(headers: { "Key" => "Value", ... }, ...)`)
365
+ new_options[:headers] = new_options.select { |key, value| key.is_a?(String) }
366
+ new_options.reject! { |key, value| key.is_a?(String) }
367
+ end
368
+
369
+ options.merge(new_options) do |key, value1, value2|
370
+ key == :headers ? value1.merge(value2) : value2
371
+ end
372
+ end
373
+
338
374
  # Defines some additional attributes for the returned Tempfile (on top of what
339
375
  # OpenURI::Meta already defines).
340
376
  module DownloadedFile
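The last step of `merge_options` uses the block form of `Hash#merge` so that per-call headers are combined with the default headers instead of replacing them. Here is a minimal sketch of that one step, leaving out the deprecation handling. The `merge_with_nested_headers` helper and the `DEFAULTS` constant are hypothetical.

```rb
# Hypothetical defaults modelled on the ones in `Down::NetHttp#initialize`.
DEFAULTS = {
  headers: { "User-Agent" => "Down/5.2.0" },
  max_redirects: 2,
}.freeze

# The block passed to Hash#merge runs only for keys present in both hashes:
# headers are merged one level deep, every other option is overridden.
def merge_with_nested_headers(defaults, overrides)
  defaults.merge(overrides) do |key, old_value, new_value|
    key == :headers ? old_value.merge(new_value) : new_value
  end
end

merged = merge_with_nested_headers(
  DEFAULTS,
  { headers: { "Accept" => "image/*" }, max_redirects: 5 }
)

merged[:headers]       #=> {"User-Agent"=>"Down/5.2.0", "Accept"=>"image/*"}
merged[:max_redirects] #=> 5
```

A plain `defaults.merge(overrides)` would drop the default `User-Agent` header whenever a caller supplies any `:headers` of their own.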
@@ -1,5 +1,5 @@
1
1
  # frozen-string-literal: true
2
2
 
3
3
  module Down
4
- VERSION = "4.8.1"
4
+ VERSION = "5.2.0"
5
5
  end
metadata CHANGED
@@ -1,14 +1,14 @@
1
1
  --- !ruby/object:Gem::Specification
2
2
  name: down
3
3
  version: !ruby/object:Gem::Version
4
- version: 4.8.1
4
+ version: 5.2.0
5
5
  platform: ruby
6
6
  authors:
7
7
  - Janko Marohnić
8
8
  autorequire:
9
9
  bindir: bin
10
10
  cert_chain: []
11
- date: 2019-05-01 00:00:00.000000000 Z
11
+ date: 2020-09-20 00:00:00.000000000 Z
12
12
  dependencies:
13
13
  - !ruby/object:Gem::Dependency
14
14
  name: addressable
@@ -72,14 +72,14 @@ dependencies:
72
72
  requirements:
73
73
  - - "~>"
74
74
  - !ruby/object:Gem::Version
75
- version: '4.0'
75
+ version: '4.3'
76
76
  type: :development
77
77
  prerelease: false
78
78
  version_requirements: !ruby/object:Gem::Requirement
79
79
  requirements:
80
80
  - - "~>"
81
81
  - !ruby/object:Gem::Version
82
- version: '4.0'
82
+ version: '4.3'
83
83
  - !ruby/object:Gem::Dependency
84
84
  name: posix-spawn
85
85
  requirement: !ruby/object:Gem::Requirement
@@ -122,6 +122,20 @@ dependencies:
122
122
  - - ">="
123
123
  - !ruby/object:Gem::Version
124
124
  version: '0'
125
+ - !ruby/object:Gem::Dependency
126
+ name: warning
127
+ requirement: !ruby/object:Gem::Requirement
128
+ requirements:
129
+ - - ">="
130
+ - !ruby/object:Gem::Version
131
+ version: '0'
132
+ type: :development
133
+ prerelease: false
134
+ version_requirements: !ruby/object:Gem::Requirement
135
+ requirements:
136
+ - - ">="
137
+ - !ruby/object:Gem::Version
138
+ version: '0'
125
139
  description:
126
140
  email:
127
141
  - janko.marohnic@gmail.com
@@ -154,14 +168,14 @@ required_ruby_version: !ruby/object:Gem::Requirement
154
168
  requirements:
155
169
  - - ">="
156
170
  - !ruby/object:Gem::Version
157
- version: '2.1'
171
+ version: '2.3'
158
172
  required_rubygems_version: !ruby/object:Gem::Requirement
159
173
  requirements:
160
174
  - - ">="
161
175
  - !ruby/object:Gem::Version
162
176
  version: '0'
163
177
  requirements: []
164
- rubygems_version: 3.0.3
178
+ rubygems_version: 3.1.1
165
179
  signing_key:
166
180
  specification_version: 4
167
181
  summary: Robust streaming downloads using Net::HTTP, HTTP.rb or wget.