down 5.0.0 → 5.4.0

checksums.yaml CHANGED
@@ -1,7 +1,7 @@
1
1
  ---
2
2
  SHA256:
3
- metadata.gz: a1f7a1532b638ed92acdb3bdf74211dbd022e7a0de37d1fdbc25f665337d4bf1
4
- data.tar.gz: 7bbc2684d53e278376981b4dd4741c49bc0d8da0990ece28fb03776f39aea6fe
3
+ metadata.gz: e0544a70de3afab2a00c68df6923572a5533f05a0964b8c5186728019d04e55b
4
+ data.tar.gz: 3474da6a6c7a182aa02deb72139002ebf97ac98bee9c4a833de2b0413e373924
5
5
  SHA512:
6
- metadata.gz: 6a3e62293c5fa1e5b43a5c835af3804ab8307babfb573179057ec279239b72a16ff74b2b2e96df9da033a3e4e13be4af6adeb8ec00dd73a27f75942b646419de
7
- data.tar.gz: d267672bb2468c8998e271ff3849908c7beead00d8b9777799e6bc0abd1ab51ccf14d5c567c16f3bb30a41eb3c57b66ffbc2755d333055fba972d327a865acb4
6
+ metadata.gz: 64d22c0a25ddf60f2dea37cb03264ec5770a8c7877fb7a843052abca4b043ba9d2d0c8eed830fc2cec97bb852338b114e2907c357aaf4b68aad26a6860459f67
7
+ data.tar.gz: 992209a7e4201cab464958bac0960ba4a6dc0344cd410b7ceaec2a02b89ebc1bd876396349204bb0bfe175a279fbdb41900c9fee85eee7e98d74eec7afa2e7a0
data/CHANGELOG.md CHANGED
@@ -1,3 +1,61 @@
1
+ ## 5.4.0 (2022-12-26)
2
+
3
+ * Add new HTTPX backend, which supports HTTP/2 protocol among other features (@HoneyryderChuck)
4
+
5
+ ## 5.3.1 (2022-03-25)
6
+
7
+ * Correctly split cookie headers on `;` instead of `,` when forwarding them on redirects (@ermolaev)
8
+
9
+ ## 5.3.0 (2022-02-20)
10
+
11
+ * Add `:extension` argument to `Down.download` for overriding tempfile extension (@razum2um)
12
+
13
+ * Normalize response header names for http.rb and wget backends (@zarqman)
14
+
15
+ ## 5.2.4 (2021-09-12)
16
+
17
+ * Keep original cookies between redirections (@antprt)
18
+
19
+ ## 5.2.3 (2021-08-03)
20
+
21
+ * Bump addressable version requirement to 2.8+ to remediate vulnerability (@aldodelgado)
22
+
23
+ ## 5.2.2 (2021-05-27)
24
+
25
+ * Add info about received content length in `Down::TooLarge` error (@evheny0)
26
+
27
+ * Relax http.rb constraint to allow versions 5.x (@mgrunberg)
28
+
29
+ ## 5.2.1 (2021-04-26)
30
+
31
+ * Raise `Down::NotModified` on 304 response status in `Down::NetHttp#open` (@ellafeldmann)
32
+
33
+ ## 5.2.0 (2020-09-20)
34
+
35
+ * Add `:uri_normalizer` option to `Down::NetHttp` (@janko)
36
+
37
+ * Add `:http_basic_authentication` option to `Down::NetHttp#open` (@janko)
38
+
39
+ * Fix uninitialized instance variables warnings in `Down::ChunkedIO` (@janko)
40
+
41
+ * Handle unknown HTTP error codes in `Down::NetHttp` (@darndt)
42
+
43
+ ## 5.1.1 (2020-02-04)
44
+
45
+ * Fix keyword arguments warnings on Ruby 2.7 in `Down.download` and `Down.open` (@janko)
46
+
47
+ ## 5.1.0 (2020-01-09)
48
+
49
+ * Fix keyword arguments warnings on Ruby 2.7 (@janko)
50
+
51
+ * Fix `FrozenError` exception in `Down::ChunkedIO#readpartial` (@janko)
52
+
53
+ * Deprecate passing headers as top-level options in `Down::NetHttp` (@janko)
54
+
55
+ ## 5.0.1 (2019-12-20)
56
+
57
+ * In `Down::NetHttp` only use Addressable normalization if `URI.parse` fails (@coding-chimp)
58
+
1
59
  ## 5.0.0 (2019-09-26)
2
60
 
3
61
  * Change `ChunkedIO#each_chunk` to return chunks in original encoding (@janko)
data/README.md CHANGED
@@ -1,13 +1,13 @@
1
1
  # Down
2
2
 
3
3
  Down is a utility tool for streaming, flexible and safe downloading of remote
4
- files. It can use [open-uri] + `Net::HTTP`, [http.rb] or `wget` as the backend
5
- HTTP library.
4
+ files. It can use [open-uri] + `Net::HTTP`, [http.rb], [HTTPX], or `wget` as
5
+ the backend HTTP library.
6
6
 
7
7
  ## Installation
8
8
 
9
9
  ```rb
10
- gem "down", "~> 4.4"
10
+ gem "down", "~> 5.0"
11
11
  ```
12
12
 
13
13
  ## Downloading
@@ -63,6 +63,13 @@ Down.download("http://example.com/image.jpg", destination: "/path/to/destination
63
63
  In this case `Down.download` won't have any return value, so if you need a File
64
64
  object you'll have to create it manually.
65
65
 
66
+ You can also keep the tempfile, but override the extension:
67
+
68
+ ```rb
69
+ tempfile = Down.download("http://example.com/some/file", extension: "txt")
70
+ File.extname(tempfile.path) #=> ".txt"
71
+ ```
72
+
66
73
  ### Basic authentication
67
74
 
68
75
  `Down.download` and `Down.open` will automatically detect and apply HTTP basic
@@ -157,7 +164,7 @@ You can access the response status and headers of the HTTP request that was made
157
164
  ```rb
158
165
  remote_file = Down.open("http://example.com/image.jpg")
159
166
  remote_file.data[:status] #=> 200
160
- remote_file.data[:headers] #=> { ... }
167
+ remote_file.data[:headers] #=> { "Content-Type" => "image/jpeg", ... } (header names are normalized)
161
168
  remote_file.data[:response] # returns the response object
162
169
  ```
163
170
 
@@ -212,6 +219,7 @@ the `Down::Error` subclasses. This is Down's exception hierarchy:
212
219
  * `Down::TooLarge`
213
220
  * `Down::InvalidUrl`
214
221
  * `Down::TooManyRedirects`
222
+ * `Down::NotModified`
215
223
  * `Down::ResponseError`
216
224
  * `Down::ClientError`
217
225
  * `Down::NotFound`
@@ -226,6 +234,7 @@ The following backends are available:
226
234
 
227
235
  * [Down::NetHttp](#downnethttp) (default)
228
236
  * [Down::Http](#downhttp)
237
+ * [Down::Httpx](#downhttpx)
229
238
  * [Down::Wget](#downwget)
230
239
 
231
240
  You can use the backend directly:
@@ -251,10 +260,10 @@ Down.open("...")
251
260
  ### Down::NetHttp
252
261
 
253
262
  The `Down::NetHttp` backend implements downloads using [open-uri] and
254
- [Net::HTTP].
263
+ [Net::HTTP] standard libraries.
255
264
 
256
265
  ```rb
257
- gem "down", "~> 4.4"
266
+ gem "down", "~> 5.0"
258
267
  ```
259
268
  ```rb
260
269
  require "down/net_http"
@@ -333,6 +342,18 @@ Down::NetHttp.open("http://example.com/image.jpg",
333
342
  ssl_verify_mode: OpenSSL::SSL::VERIFY_PEER)
334
343
  ```
335
344
 
345
+ #### URI normalization
346
+
347
+ If the URL isn't parseable by `URI.parse`, `Down::NetHttp` will
348
+ attempt to normalize the URL using [Addressable::URI], URI-escaping
349
+ any potentially unescaped characters. You can change the normalizer
350
+ via the `:uri_normalizer` option:
351
+
352
+ ```rb
353
+ # this skips URL normalization
354
+ Down::NetHttp.download("http://example.com/image.jpg", uri_normalizer: -> (url) { url })
355
+ ```
356
+
336
357
  #### Additional options
337
358
 
338
359
  Any additional options passed to `Down.download` will be forwarded to
@@ -358,8 +379,8 @@ net_http.open("http://example.com/image.jpg")
358
379
  The `Down::Http` backend implements downloads using the [http.rb] gem.
359
380
 
360
381
  ```rb
361
- gem "down", "~> 4.4"
362
- gem "http", "~> 4.0"
382
+ gem "down", "~> 5.0"
383
+ gem "http", "~> 5.0"
363
384
  ```
364
385
  ```rb
365
386
  require "down/http"
@@ -422,13 +443,35 @@ down = Down::Http.new(method: :post)
422
443
  down.download("http://example.org/image.jpg")
423
444
  ```
424
445
 
446
+ ### Down::Httpx
447
+
448
+ The `Down::Httpx` backend implements downloads using the [HTTPX] gem, which
449
+ supports the HTTP/2 protocol, in addition to many other features.
450
+
451
+ ```rb
452
+ gem "down", "~> 5.0"
453
+ gem "httpx", "~> 0.22"
454
+ ```
455
+ ```rb
456
+ require "down/httpx"
457
+
458
+ tempfile = Down::Httpx.download("http://nature.com/forest.jpg")
459
+ tempfile #=> #<Tempfile:/var/folders/k7/6zx6dx6x7ys3rv3srh0nyfj00000gn/T/20150925-55456-z7vxqz.jpg>
460
+
461
+ io = Down::Httpx.open("http://nature.com/forest.jpg")
462
+ io #=> #<Down::ChunkedIO ...>
463
+ ```
464
+
465
+ It's implemented in much the same way as `Down::Http`, so be sure to check
466
+ its docs for ways to pass additional options.
467
+
425
468
  ### Down::Wget (experimental)
426
469
 
427
470
  The `Down::Wget` backend implements downloads using the `wget` command line
428
471
  utility.
429
472
 
430
473
  ```rb
431
- gem "down", "~> 4.4"
474
+ gem "down", "~> 5.0"
432
475
  gem "posix-spawn" # omit if on JRuby
433
476
  gem "http_parser.rb"
434
477
  ```
@@ -470,24 +513,30 @@ wget.open("http://nature.com/forest.jpg")
470
513
 
471
514
  ## Supported Ruby versions
472
515
 
473
- * MRI 2.2
474
516
  * MRI 2.3
475
517
  * MRI 2.4
476
518
  * MRI 2.5
477
519
  * MRI 2.6
478
- * JRuby
520
+ * MRI 2.7
521
+ * MRI 3.0
522
+ * MRI 3.1
523
+ * JRuby 9.3
479
524
 
480
525
  ## Development
481
526
 
482
- You can run tests with
527
+ Tests require that a [httpbin] server is running locally, which you can do via Docker:
528
+
529
+ ```sh
530
+ $ docker pull kennethreitz/httpbin
531
+ $ docker run -p 80:80 kennethreitz/httpbin
532
+ ```
533
+
534
+ Then you can run tests:
483
535
 
484
536
  ```
485
537
  $ bundle exec rake test
486
538
  ```
487
539
 
488
- The test suite pulls and runs [kennethreitz/httpbin] as a Docker container, so
489
- you'll need to have Docker installed and running.
490
-
491
540
  ## License
492
541
 
493
542
  [MIT](LICENSE.txt)
@@ -495,5 +544,6 @@ you'll need to have Docker installed and running.
495
544
  [open-uri]: http://ruby-doc.org/stdlib-2.3.0/libdoc/open-uri/rdoc/OpenURI.html
496
545
  [Net::HTTP]: https://ruby-doc.org/stdlib-2.4.1/libdoc/net/http/rdoc/Net/HTTP.html
497
546
  [http.rb]: https://github.com/httprb/http
547
+ [HTTPX]: https://github.com/HoneyryderChuck/httpx
498
548
  [Addressable::URI]: https://github.com/sporkmonger/addressable
499
- [kennethreitz/httpbin]: https://github.com/kennethreitz/httpbin
549
+ [httpbin]: https://github.com/postmanlabs/httpbin
data/down.gemspec CHANGED
@@ -4,7 +4,7 @@ Gem::Specification.new do |spec|
4
4
  spec.name = "down"
5
5
  spec.version = Down::VERSION
6
6
 
7
- spec.required_ruby_version = ">= 2.1"
7
+ spec.required_ruby_version = ">= 2.3"
8
8
 
9
9
  spec.summary = "Robust streaming downloads using Net::HTTP, HTTP.rb or wget."
10
10
  spec.homepage = "https://github.com/janko/down"
@@ -15,13 +15,19 @@ Gem::Specification.new do |spec|
15
15
  spec.files = Dir["README.md", "LICENSE.txt", "CHANGELOG.md", "*.gemspec", "lib/**/*.rb"]
16
16
  spec.require_path = "lib"
17
17
 
18
- spec.add_dependency "addressable", "~> 2.5"
18
+ spec.add_dependency "addressable", "~> 2.8"
19
19
 
20
20
  spec.add_development_dependency "minitest", "~> 5.8"
21
21
  spec.add_development_dependency "mocha", "~> 1.5"
22
22
  spec.add_development_dependency "rake"
23
- spec.add_development_dependency "http", "~> 4.0"
23
+ spec.add_development_dependency "httpx", "~> 0.22", ">= 0.22.2"
24
+ # http 5.0 drops support for Ruby 2.3 and 2.4. We still support those versions.
25
+ if RUBY_VERSION >= "2.5"
26
+ spec.add_development_dependency "http", "~> 5.0"
27
+ else
28
+ spec.add_development_dependency "http", "~> 4.3"
29
+ end
24
30
  spec.add_development_dependency "posix-spawn" unless RUBY_ENGINE == "jruby"
25
- spec.add_development_dependency "http_parser.rb"
26
- spec.add_development_dependency "docker-api"
31
+ spec.add_development_dependency "http_parser.rb" unless RUBY_ENGINE == "jruby"
32
+ spec.add_development_dependency "warning" if RUBY_VERSION >= "2.4"
27
33
  end
data/lib/down/backend.rb CHANGED
@@ -9,12 +9,12 @@ require "fileutils"
9
9
 
10
10
  module Down
11
11
  class Backend
12
- def self.download(*args, &block)
13
- new.download(*args, &block)
12
+ def self.download(*args, **options, &block)
13
+ new.download(*args, **options, &block)
14
14
  end
15
15
 
16
- def self.open(*args, &block)
17
- new.open(*args, &block)
16
+ def self.open(*args, **options, &block)
17
+ new.open(*args, **options, &block)
18
18
  end
19
19
 
20
20
  private
@@ -29,5 +29,12 @@ module Down
29
29
 
30
30
  nil
31
31
  end
32
+
33
+ def normalize_headers(response_headers)
34
+ response_headers.inject({}) do |headers, (downcased_name, value)|
35
+ name = downcased_name.split("-").map(&:capitalize).join("-")
36
+ headers.merge!(name => value)
37
+ end
38
+ end
32
39
  end
33
40
  end
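The `normalize_headers` helper added above can be tried in isolation. The following is a minimal sketch that copies the method body from the diff into a top-level method; the sample header hash is hypothetical and only stands in for what a backend might report in lowercase.

```ruby
# Standalone copy of the helper added to Down::Backend in this diff.
def normalize_headers(response_headers)
  response_headers.inject({}) do |headers, (downcased_name, value)|
    # "content-type" -> ["content", "type"] -> ["Content", "Type"] -> "Content-Type"
    name = downcased_name.split("-").map(&:capitalize).join("-")
    headers.merge!(name => value)
  end
end

# Hypothetical lowercase headers, not taken from a real response.
sample = { "content-type" => "image/jpeg", "etag" => "abc123" }
puts normalize_headers(sample).inspect
```

Each dash-separated segment is capitalized on its own, so `etag` comes out as `Etag` rather than `ETag`.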
data/lib/down/chunked_io.rb CHANGED
@@ -36,6 +36,8 @@ module Down
36
36
  @rewindable = rewindable
37
37
  @buffer = nil
38
38
  @position = 0
39
+ @next_chunk = nil
40
+ @closed = false
39
41
 
40
42
  retrieve_chunk # fetch first chunk so that we know whether the file is empty
41
43
  end
@@ -63,7 +65,9 @@ module Down
63
65
  def read(length = nil, outbuf = nil)
64
66
  fail IOError, "closed stream" if closed?
65
67
 
66
- data = outbuf.to_s.clear.force_encoding(Encoding::BINARY)
68
+ data = outbuf.clear.force_encoding(Encoding::BINARY) if outbuf
69
+ data ||= "".b
70
+
67
71
  remaining_length = length
68
72
 
69
73
  until remaining_length == 0 || eof?
@@ -142,7 +146,8 @@ module Down
142
146
  # or the next chunk. This is useful when you don't care about the size of
143
147
  # chunks and you want to minimize string allocations.
144
148
  #
145
- # With `length` argument returns maximum of that amount of bytes.
149
+ # With `maxlen` argument returns maximum of that amount of bytes (default
150
+ # is 16KB).
146
151
  #
147
152
  # With `outbuf` argument each call will return that same string object,
148
153
  # where the value is replaced with retrieved content.
@@ -154,7 +159,8 @@ module Down
154
159
  maxlen ||= 16*1024
155
160
 
156
161
  data = cache.read(maxlen, outbuf) if cache && !cache.eof?
157
- data ||= outbuf.to_s.clear
162
+ data ||= outbuf.clear.force_encoding(Encoding::BINARY) if outbuf
163
+ data ||= "".b
158
164
 
159
165
  return data if maxlen == 0
160
166
 
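The buffer handling changed in `read` and `readpartial` can be sketched on its own. This is an illustration under assumptions: `binary_buffer` is a hypothetical helper name, and the body mirrors the two lines the diff adds. The changelog lists a `FrozenError` fix in `Down::ChunkedIO#readpartial`, which may be related, but the diff itself does not state the motivation.

```ruby
# Hypothetical helper mirroring the new buffer setup: reuse the caller's
# buffer when one is given, otherwise allocate a fresh binary string.
def binary_buffer(outbuf = nil)
  data = outbuf.clear.force_encoding(Encoding::BINARY) if outbuf
  data ||= "".b
  data
end

reused = String.new("stale content")
result = binary_buffer(reused)
puts result.equal?(reused)        # same object, now emptied
puts binary_buffer.encoding.name  # encoding of the freshly allocated buffer
```

When no buffer is passed, the result is a new mutable string, so later writes into it do not depend on the state of a shared literal.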
data/lib/down/errors.rb CHANGED
@@ -13,11 +13,14 @@ module Down
13
13
  # raised when the number of redirects was larger than the specified maximum
14
14
  class TooManyRedirects < Error; end
15
15
 
16
+ # raised when the requested resource has not been modified
17
+ class NotModified < Error; end
18
+
16
19
  # raised when response returned 4xx or 5xx response
17
20
  class ResponseError < Error
18
21
  attr_reader :response
19
22
 
20
- def initialize(message, response: nil)
23
+ def initialize(message, response = nil)
21
24
  super(message)
22
25
  @response = response
23
26
  end
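The effect of the two changes in this file can be sketched with a standalone hierarchy. The `DownSketch` module below is a hypothetical stand-in that re-declares only the classes needed for the illustration, and `:fake_response` is a placeholder for a real response object.

```ruby
# Hypothetical stand-in for the relevant part of Down's error hierarchy.
module DownSketch
  class Error < StandardError; end
  class NotModified < Error; end

  class ResponseError < Error
    attr_reader :response

    # The response is now a second positional argument, not a keyword.
    def initialize(message, response = nil)
      super(message)
      @response = response
    end
  end

  class ClientError < ResponseError; end
  class NotFound < ClientError; end
end

begin
  raise DownSketch::NotFound.new("404 Not Found", :fake_response)
rescue DownSketch::ResponseError => error
  puts "#{error.message} (response: #{error.response.inspect})"
end
```

Because `NotModified` inherits from `Error` directly, a `rescue` clause for `ResponseError` does not catch it; callers that care about 304 responses need to rescue it explicitly or rescue the base `Error`.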
data/lib/down/http.rb CHANGED
@@ -1,6 +1,6 @@
1
1
  # frozen-string-literal: true
2
2
 
3
- gem "http", ">= 2.1.0", "< 5"
3
+ gem "http", ">= 2.1.0", "< 6"
4
4
 
5
5
  require "http"
6
6
 
@@ -12,7 +12,7 @@ module Down
12
12
  # Provides streaming downloads implemented with HTTP.rb.
13
13
  class Http < Backend
14
14
  # Initializes the backend with common defaults.
15
- def initialize(options = {}, &block)
15
+ def initialize(**options, &block)
16
16
  @method = options.delete(:method) || :get
17
17
  @client = HTTP
18
18
  .headers("User-Agent" => "Down/#{Down::VERSION}")
@@ -25,16 +25,16 @@ module Down
25
25
 
26
26
  # Downloads the remote file to disk. Accepts HTTP.rb options via a hash or a
27
27
  # block, and some additional options as well.
28
- def download(url, max_size: nil, progress_proc: nil, content_length_proc: nil, destination: nil, **options, &block)
28
+ def download(url, max_size: nil, progress_proc: nil, content_length_proc: nil, destination: nil, extension: nil, **options, &block)
29
29
  response = request(url, **options, &block)
30
30
 
31
31
  content_length_proc.call(response.content_length) if content_length_proc && response.content_length
32
32
 
33
33
  if max_size && response.content_length && response.content_length > max_size
34
- raise Down::TooLarge, "file is too large (max is #{max_size/1024/1024}MB)"
34
+ raise Down::TooLarge, "file is too large (#{response.content_length/1024/1024}MB, max is #{max_size/1024/1024}MB)"
35
35
  end
36
36
 
37
- extname = File.extname(response.uri.path)
37
+ extname = extension ? ".#{extension}" : File.extname(response.uri.path)
38
38
  tempfile = Tempfile.new(["down-http", extname], binmode: true)
39
39
 
40
40
  stream_body(response) do |chunk|
@@ -44,7 +44,7 @@ module Down
44
44
  progress_proc.call(tempfile.size) if progress_proc
45
45
 
46
46
  if max_size && tempfile.size > max_size
47
- raise Down::TooLarge, "file is too large (max is #{max_size/1024/1024}MB)"
47
+ raise Down::TooLarge, "file is too large (#{tempfile.size/1024/1024}MB, max is #{max_size/1024/1024}MB)"
48
48
  end
49
49
  end
50
50
 
@@ -52,7 +52,7 @@ module Down
52
52
 
53
53
  tempfile.extend Down::Http::DownloadedFile
54
54
  tempfile.url = response.uri.to_s
55
- tempfile.headers = response.headers.to_h
55
+ tempfile.headers = normalize_headers(response.headers.to_h)
56
56
 
57
57
  download_result(tempfile, destination)
58
58
  rescue
@@ -71,7 +71,11 @@ module Down
71
71
  size: response.content_length,
72
72
  encoding: response.content_type.charset,
73
73
  rewindable: rewindable,
74
- data: { status: response.code, headers: response.headers.to_h, response: response },
74
+ data: {
75
+ status: response.code,
76
+ headers: normalize_headers(response.headers.to_h),
77
+ response: response
78
+ },
75
79
  )
76
80
  end
77
81
 
@@ -106,7 +110,7 @@ module Down
106
110
 
107
111
  # Raises non-successful response as a Down::ResponseError.
108
112
  def response_error!(response)
109
- args = [response.status.to_s, response: response]
113
+ args = [response.status.to_s, response]
110
114
 
111
115
  case response.code
112
116
  when 404 then raise Down::NotFound.new(*args)
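The `:extension` handling added to `download` boils down to one expression that picks the tempfile suffix. Here is a minimal sketch of just that step, assuming a hypothetical helper `tempfile_for` and a hypothetical `down-sketch` prefix; it uses only the standard library and performs no HTTP request.

```ruby
require "tempfile"

# Hypothetical helper: choose the tempfile suffix the way the diff does,
# preferring an explicit extension over the one found in the URL path.
def tempfile_for(url_path, extension: nil)
  extname = extension ? ".#{extension}" : File.extname(url_path)
  Tempfile.new(["down-sketch", extname], binmode: true)
end

overridden = tempfile_for("/some/file", extension: "txt")
inferred   = tempfile_for("/images/forest.jpg")
puts File.extname(overridden.path)
puts File.extname(inferred.path)
overridden.close!
inferred.close!
```

Note that the extension is passed without a leading dot; the dot is added when the suffix is built.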
data/lib/down/httpx.rb ADDED
@@ -0,0 +1,175 @@
1
+ # frozen-string-literal: true
2
+
3
+ require "uri"
4
+ require "tempfile"
5
+ require "httpx"
6
+
7
+ require "down/backend"
8
+
9
+
10
+ module Down
11
+ # Provides streaming downloads implemented with HTTPX.
12
+ class Httpx < Backend
13
+ # Initializes the backend
14
+
15
+ USER_AGENT = "Down/#{Down::VERSION}"
16
+
17
+ def initialize(**options, &block)
18
+ @method = options.delete(:method) || :get
19
+ headers = options.delete(:headers) || {}
20
+ @client = HTTPX
21
+ .plugin(:follow_redirects, max_redirects: 2)
22
+ .plugin(:basic_authentication)
23
+ .plugin(:stream)
24
+ .with(
25
+ headers: { "user-agent": USER_AGENT }.merge(headers),
26
+ timeout: { connect_timeout: 30, write_timeout: 30, read_timeout: 30 },
27
+ **options
28
+ )
29
+
30
+ @client = block.call(@client) if block
31
+ end
32
+
33
+
34
+ # Downloads the remote file to disk. Accepts HTTPX options via a hash or a
35
+ # block, and some additional options as well.
36
+ def download(url, max_size: nil, progress_proc: nil, content_length_proc: nil, destination: nil, extension: nil, **options, &block)
37
+ client = @client
38
+
39
+ response = request(client, url, **options, &block)
40
+
41
+ content_length = nil
42
+
43
+ if response.headers.key?("content-length")
44
+ content_length = response.headers["content-length"].to_i
45
+
46
+ content_length_proc.call(content_length) if content_length_proc
47
+
48
+ if max_size && content_length > max_size
49
+ response.close
50
+ raise Down::TooLarge, "file is too large (#{content_length/1024/1024}MB, max is #{max_size/1024/1024}MB)"
51
+ end
52
+ end
53
+
54
+ extname = extension ? ".#{extension}" : File.extname(response.uri.path)
55
+ tempfile = Tempfile.new(["down-http", extname], binmode: true)
56
+
57
+ stream_body(response) do |chunk|
58
+ tempfile.write(chunk)
59
+ chunk.clear # deallocate string
60
+
61
+ progress_proc.call(tempfile.size) if progress_proc
62
+
63
+ if max_size && tempfile.size > max_size
64
+ raise Down::TooLarge, "file is too large (#{tempfile.size/1024/1024}MB, max is #{max_size/1024/1024}MB)"
65
+ end
66
+ end
67
+
68
+ tempfile.open # flush written content
69
+
70
+ tempfile.extend DownloadedFile
71
+ tempfile.url = response.uri.to_s
72
+ tempfile.headers = normalize_headers(response.headers.to_h)
73
+ tempfile.content_type = response.content_type.mime_type
74
+ tempfile.charset = response.body.encoding
75
+
76
+ download_result(tempfile, destination)
77
+ rescue
78
+ tempfile.close! if tempfile
79
+ raise
80
+ end
81
+
82
+ # Starts retrieving the remote file and returns an IO-like object which
83
+ # downloads the response body on-demand. Accepts HTTPX options via a hash
84
+ # or a block.
85
+ def open(url, rewindable: true, **options, &block)
86
+ response = request(@client, url, stream: true, **options, &block)
87
+ size = response.headers["content-length"]
88
+ size = size.to_i if size
89
+ Down::ChunkedIO.new(
90
+ chunks: enum_for(:stream_body, response),
91
+ size: size,
92
+ encoding: response.body.encoding,
93
+ rewindable: rewindable,
94
+ data: {
95
+ status: response.status,
96
+ headers: normalize_headers(response.headers.to_h),
97
+ response: response
98
+ },
99
+ )
100
+ end
101
+
102
+ private
103
+
104
+ # Yields chunks of the response body to the block.
105
+ def stream_body(response, &block)
106
+ response.each(&block)
107
+ rescue => exception
108
+ request_error!(exception)
109
+ end
110
+
111
+ def request(client, url, method: @method, **options, &block)
112
+ response = send_request(client, method, url, **options, &block)
113
+ response.raise_for_status
114
+ response_error!(response) unless (200..299).include?(response.status)
115
+ response
116
+ rescue HTTPX::HTTPError
117
+ response_error!(response)
118
+ rescue => error
119
+ request_error!(error)
120
+ end
121
+
122
+ def send_request(client, method, url, **options, &block)
123
+ uri = URI(url)
124
+ client = @client
125
+ if uri.user || uri.password
126
+ client = client.basic_auth(uri.user, uri.password)
127
+ uri.user = uri.password = nil
128
+ end
129
+ client = block.call(client) if block
130
+
131
+ client.request(method, uri, stream: true, **options)
132
+ rescue => exception
133
+ request_error!(exception)
134
+ end
135
+
136
+ # Raises non-successful response as a Down::ResponseError.
137
+ def response_error!(response)
138
+ args = [response.status.to_s, response]
139
+
140
+ case response.status
141
+ when 300..399 then raise Down::TooManyRedirects, "too many redirects"
142
+ when 404 then raise Down::NotFound.new(*args)
143
+ when 400..499 then raise Down::ClientError.new(*args)
144
+ when 500..599 then raise Down::ServerError.new(*args)
145
+ else raise Down::ResponseError.new(*args)
146
+ end
147
+ end
148
+
149
+ # Re-raise HTTPX exceptions as Down::Error exceptions.
150
+ def request_error!(exception)
151
+ case exception
152
+ when URI::Error, HTTPX::UnsupportedSchemeError
153
+ raise Down::InvalidUrl, exception.message
154
+ when HTTPX::ConnectionError
155
+ raise Down::ConnectionError, exception.message
156
+ when HTTPX::TimeoutError
157
+ raise Down::TimeoutError, exception.message
158
+ when OpenSSL::SSL::SSLError
159
+ raise Down::SSLError, exception.message
160
+ else
161
+ raise exception
162
+ end
163
+ end
164
+
165
+ # Defines some additional attributes for the returned Tempfile.
166
+ module DownloadedFile
167
+ attr_accessor :url, :headers, :charset, :content_type
168
+
169
+ def original_filename
170
+ Utils.filename_from_content_disposition(headers["Content-Disposition"]) ||
171
+ Utils.filename_from_path(URI.parse(url).path)
172
+ end
173
+ end
174
+ end
175
+ end
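The status dispatch in `response_error!` above depends on the order of its `when` branches. The sketch below reproduces only that ordering with a hypothetical method, `error_name_for`, which returns the error class name as a string instead of raising, so it can run without the gem.

```ruby
# Hypothetical mirror of the case statement in Down::Httpx#response_error!,
# returning a name instead of raising.
def error_name_for(status)
  case status
  when 300..399 then "Down::TooManyRedirects"
  when 404      then "Down::NotFound"
  when 400..499 then "Down::ClientError"
  when 500..599 then "Down::ServerError"
  else               "Down::ResponseError"
  end
end

[302, 404, 403, 503, 600].each do |status|
  puts "#{status} -> #{error_name_for(status)}"
end
```

The `404` branch has to come before `400..499`, since `case` takes the first matching branch. A 3xx status reaching this point is reported as too many redirects, which matches the diff's use of the redirect-following plugin with a fixed limit.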
data/lib/down/net_http.rb CHANGED
@@ -12,27 +12,35 @@ require "fileutils"
12
12
  module Down
13
13
  # Provides streaming downloads implemented with Net::HTTP and open-uri.
14
14
  class NetHttp < Backend
15
+ URI_NORMALIZER = -> (url) do
16
+ addressable_uri = Addressable::URI.parse(url)
17
+ addressable_uri.normalize.to_s
18
+ end
19
+
15
20
  # Initializes the backend with common defaults.
16
- def initialize(options = {})
17
- @options = {
18
- "User-Agent" => "Down/#{Down::VERSION}",
19
- max_redirects: 2,
20
- open_timeout: 30,
21
- read_timeout: 30,
22
- }.merge(options)
21
+ def initialize(*args, **options)
22
+ @options = merge_options({
23
+ headers: { "User-Agent" => "Down/#{Down::VERSION}" },
24
+ max_redirects: 2,
25
+ open_timeout: 30,
26
+ read_timeout: 30,
27
+ uri_normalizer: URI_NORMALIZER,
28
+ }, *args, **options)
23
29
  end
24
30
 
25
31
  # Downloads a remote file to disk using open-uri. Accepts any open-uri
26
32
  # options, and a few more.
27
- def download(url, options = {})
28
- options = @options.merge(options)
33
+ def download(url, *args, **options)
34
+ options = merge_options(@options, *args, **options)
29
35
 
30
36
  max_size = options.delete(:max_size)
31
37
  max_redirects = options.delete(:max_redirects)
32
38
  progress_proc = options.delete(:progress_proc)
33
39
  content_length_proc = options.delete(:content_length_proc)
34
40
  destination = options.delete(:destination)
35
- headers = options.delete(:headers) || {}
41
+ headers = options.delete(:headers)
42
+ uri_normalizer = options.delete(:uri_normalizer)
43
+ extension = options.delete(:extension)
36
44
 
37
45
  # Use open-uri's :content_length_proc or :progress_proc to raise an
38
46
  # exception early if the file is too large.
@@ -42,13 +50,13 @@ module Down
42
50
  open_uri_options = {
43
51
  content_length_proc: proc { |size|
44
52
  if size && max_size && size > max_size
45
- raise Down::TooLarge, "file is too large (max is #{max_size/1024/1024}MB)"
53
+ raise Down::TooLarge, "file is too large (#{size/1024/1024}MB, max is #{max_size/1024/1024}MB)"
46
54
  end
47
55
  content_length_proc.call(size) if content_length_proc
48
56
  },
49
57
  progress_proc: proc { |current_size|
50
58
  if max_size && current_size > max_size
51
- raise Down::TooLarge, "file is too large (max is #{max_size/1024/1024}MB)"
59
+ raise Down::TooLarge, "file is too large (#{current_size/1024/1024}MB, max is #{max_size/1024/1024}MB)"
52
60
  end
53
61
  progress_proc.call(current_size) if progress_proc
54
62
  },
@@ -74,7 +82,7 @@ module Down
74
82
  open_uri_options.merge!(options)
75
83
  open_uri_options.merge!(headers)
76
84
 
77
- uri = ensure_uri(addressable_normalize(url))
85
+ uri = ensure_uri(normalize_uri(url, uri_normalizer: uri_normalizer))
78
86
 
79
87
  # Handle basic authentication in the remote URL.
80
88
  if uri.user || uri.password
@@ -86,7 +94,8 @@ module Down
86
94
  open_uri_file = open_uri(uri, open_uri_options, follows_remaining: max_redirects)
87
95
 
88
96
  # Handle the fact that open-uri returns StringIOs for small files.
89
- tempfile = ensure_tempfile(open_uri_file, File.extname(open_uri_file.base_uri.path))
97
+ extname = extension ? ".#{extension}" : File.extname(open_uri_file.base_uri.path)
98
+ tempfile = ensure_tempfile(open_uri_file, extname)
90
99
  OpenURI::Meta.init tempfile, open_uri_file # add back open-uri methods
91
100
  tempfile.extend Down::NetHttp::DownloadedFile
92
101
 
@@ -95,13 +104,17 @@ module Down
95
104
 
96
105
  # Starts retrieving the remote file using Net::HTTP and returns an IO-like
97
106
  # object which downloads the response body on-demand.
98
- def open(url, options = {})
99
- uri = ensure_uri(addressable_normalize(url))
100
- options = @options.merge(options)
107
+ def open(url, *args, **options)
108
+ options = merge_options(@options, *args, **options)
109
+
110
+ max_redirects = options.delete(:max_redirects)
111
+ uri_normalizer = options.delete(:uri_normalizer)
112
+
113
+ uri = ensure_uri(normalize_uri(url, uri_normalizer: uri_normalizer))
101
114
 
102
115
  # Create a Fiber that halts when response headers are received.
103
116
  request = Fiber.new do
104
- net_http_request(uri, options) do |response|
117
+ net_http_request(uri, options, follows_remaining: max_redirects) do |response|
105
118
  Fiber.yield response
106
119
  end
107
120
  end
@@ -119,10 +132,7 @@ module Down
119
132
  on_close: -> { request.resume }, # close HTTP connection
120
133
  data: {
121
134
  status: response.code.to_i,
122
- headers: response.each_header.inject({}) { |headers, (downcased_name, value)|
123
- name = downcased_name.split("-").map(&:capitalize).join("-")
124
- headers.merge!(name => value)
125
- },
135
+ headers: normalize_headers(response.each_header),
126
136
  response: response,
127
137
  },
128
138
  )
@@ -131,7 +141,7 @@ module Down
131
141
  private
132
142
 
133
143
  # Calls open-uri's URI::HTTP#open method. Additionally handles redirects.
134
- def open_uri(uri, options, follows_remaining: 0)
144
+ def open_uri(uri, options, follows_remaining:)
135
145
  uri.open(options)
136
146
  rescue OpenURI::HTTPRedirect => exception
137
147
  raise Down::TooManyRedirects, "too many redirects" if follows_remaining == 0
@@ -147,7 +157,11 @@ module Down
147
157
 
148
158
  # forward cookies on the redirect
149
159
  if !exception.io.meta["set-cookie"].to_s.empty?
150
- options["Cookie"] = exception.io.meta["set-cookie"]
160
+ options["Cookie"] ||= ''
161
+ # Add new cookies avoiding duplication
162
+ new_cookies = exception.io.meta["set-cookie"].to_s.split(';').map(&:strip)
163
+ old_cookies = options["Cookie"].split(';')
164
+ options["Cookie"] = (old_cookies | new_cookies).join(';')
151
165
  end
152
166
 
153
167
  follows_remaining -= 1
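The cookie forwarding introduced in this hunk can be exercised without any HTTP traffic. The sketch below wraps the added lines in a hypothetical helper, `merge_cookies`; the cookie strings are made up for illustration.

```ruby
# Hypothetical helper wrapping the merge logic added in this hunk.
def merge_cookies(existing, set_cookie)
  new_cookies = set_cookie.to_s.split(';').map(&:strip)
  old_cookies = existing.to_s.split(';')
  # Array#| keeps the first occurrence of each entry, dropping duplicates.
  (old_cookies | new_cookies).join(';')
end

first_hop  = merge_cookies("", "session=abc; Path=/")
second_hop = merge_cookies(first_hop, "session=abc; theme=dark")
puts first_hop
puts second_hop
```

The merge treats every `;`-separated piece of the `Set-Cookie` value as an entry, so attributes such as `Path=/` are carried along next to the actual cookie pairs.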
@@ -186,18 +200,18 @@ module Down
186
200
  end
187
201
 
188
202
  # Makes a Net::HTTP request and follows redirects.
189
- def net_http_request(uri, options, follows_remaining: options.fetch(:max_redirects, 2), &block)
203
+ def net_http_request(uri, options, follows_remaining:, &block)
190
204
  http, request = create_net_http(uri, options)
191
205
 
192
206
  begin
193
207
  response = http.start do
194
- http.request(request) do |response|
195
- unless response.is_a?(Net::HTTPRedirection)
196
- yield response
208
+ http.request(request) do |resp|
209
+ unless resp.is_a?(Net::HTTPRedirection)
210
+ yield resp
197
211
  # In certain cases the caller wants to download only one portion
198
212
  # of the file and close the connection, so we tell Net::HTTP that
199
213
  # it shouldn't continue retrieving it.
200
- response.instance_variable_set("@read", true)
214
+ resp.instance_variable_set("@read", true)
201
215
  end
202
216
  end
203
217
  end
@@ -205,7 +219,9 @@ module Down
205
219
  request_error!(exception)
206
220
  end
207
221
 
208
- if response.is_a?(Net::HTTPRedirection)
222
+ if response.is_a?(Net::HTTPNotModified)
223
+ raise Down::NotModified
224
+ elsif response.is_a?(Net::HTTPRedirection)
209
225
  raise Down::TooManyRedirects if follows_remaining == 0
210
226
 
211
227
  # fail if redirect URI is not a valid http or https URL
@@ -251,12 +267,13 @@ module Down
251
267
  http.read_timeout = options[:read_timeout] if options.key?(:read_timeout)
252
268
  http.open_timeout = options[:open_timeout] if options.key?(:open_timeout)
253
269
 
254
- headers = options.select { |key, value| key.is_a?(String) }
255
- headers.merge!(options[:headers]) if options[:headers]
270
+ headers = options[:headers].to_h
256
271
  headers["Accept-Encoding"] = "" # Net::HTTP's inflater causes FiberErrors
257
272
 
258
273
  get = Net::HTTP::Get.new(uri.request_uri, headers)
259
- get.basic_auth(uri.user, uri.password) if uri.user || uri.password
274
+
275
+ user, password = options[:http_basic_authentication] || [uri.user, uri.password]
276
+ get.basic_auth(user, password) if user || password
260
277
 
261
278
  [http, get]
262
279
  end
@@ -284,9 +301,10 @@ module Down
284
301
  end
285
302
 
286
303
  # Makes sure that the URL is properly encoded.
287
- def addressable_normalize(url)
288
- addressable_uri = Addressable::URI.parse(url)
289
- addressable_uri.normalize.to_s
304
+ def normalize_uri(url, uri_normalizer:)
305
+ URI(url)
306
+ rescue URI::InvalidURIError
307
+ uri_normalizer.call(url)
290
308
  end
291
309
 
292
310
  # When open-uri raises an exception, it doesn't expose the response object.
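The new `normalize_uri` only calls the normalizer when `URI(url)` fails. A minimal sketch of that fallback follows; the method body is copied from the diff, while `ESCAPE_SPACES` is a hypothetical normalizer used here in place of the Addressable-based default so that the example needs only the standard library.

```ruby
require "uri"

# Hypothetical normalizer standing in for the Addressable-based default.
ESCAPE_SPACES = ->(url) { url.gsub(" ", "%20") }

# Same shape as the method in the diff: parse first, normalize on failure.
def normalize_uri(url, uri_normalizer:)
  URI(url)
rescue URI::InvalidURIError
  uri_normalizer.call(url)
end

puts normalize_uri("http://example.com/image.jpg", uri_normalizer: ESCAPE_SPACES)
puts normalize_uri("http://example.com/my image.jpg", uri_normalizer: ESCAPE_SPACES)
```

The two paths return different types: a `URI` object when parsing succeeds and whatever the normalizer returns otherwise, which is why the callers in the diff pass the result through `ensure_uri`.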
@@ -295,7 +313,11 @@ module Down
295
313
  def rebuild_response_from_open_uri_exception(exception)
296
314
  code, message = exception.io.status
297
315
 
298
- response_class = Net::HTTPResponse::CODE_TO_OBJ.fetch(code)
316
+ response_class = Net::HTTPResponse::CODE_TO_OBJ.fetch(code) do |c|
317
+ Net::HTTPResponse::CODE_CLASS_TO_OBJ.fetch(c[0]) do
318
+ Net::HTTPUnknownResponse
319
+ end
320
+ end
299
321
  response = response_class.new(nil, code, message)
300
322
 
301
323
  exception.io.metas.each do |name, values|
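The nested `fetch` calls added above resolve a status code in three steps: exact code, then code class by first digit, then a generic fallback. The sketch below isolates that lookup in a hypothetical helper, `response_class_for`, using the same `Net::HTTPResponse` constants as the diff; the status codes `499` and `999` are chosen as examples of codes without a dedicated class.

```ruby
require "net/http"

# Hypothetical helper isolating the lookup added in this hunk.
def response_class_for(code)
  Net::HTTPResponse::CODE_TO_OBJ.fetch(code) do |c|
    # No exact match: fall back to the class for the first digit ("4" -> 4xx).
    Net::HTTPResponse::CODE_CLASS_TO_OBJ.fetch(c[0]) do
      Net::HTTPUnknownResponse
    end
  end
end

%w[404 499 999].each do |code|
  puts "#{code} -> #{response_class_for(code)}"
end
```

With this fallback an unregistered code still yields a usable response class instead of raising `KeyError`.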
@@ -310,7 +332,7 @@ module Down
310
332
  code = response.code.to_i
311
333
  message = response.message.split(" ").map(&:capitalize).join(" ")
312
334
 
313
- args = ["#{code} #{message}", response: response]
335
+ args = ["#{code} #{message}", response]
314
336
 
315
337
  case response.code.to_i
316
338
  when 404 then raise Down::NotFound.new(*args)
@@ -336,6 +358,24 @@ module Down
336
358
  end
337
359
  end
338
360
 
361
+ # Merge default and ad-hoc options, merging nested headers.
362
+ def merge_options(options, headers = {}, **new_options)
363
+ # Deprecate passing headers as top-level options, taking into account
364
+ # that Ruby 2.7+ accepts kwargs with string keys.
365
+ if headers.any?
366
+ warn %([Down::NetHttp] Passing headers as top-level options has been deprecated, use the :headers option instead, e.g: `Down::NetHttp.download(headers: { "Key" => "Value", ... }, ...)`)
367
+ new_options[:headers] = headers
368
+ elsif new_options.any? { |key, value| key.is_a?(String) }
369
+ warn %([Down::NetHttp] Passing headers as top-level options has been deprecated, use the :headers option instead, e.g: `Down::NetHttp.download(headers: { "Key" => "Value", ... }, ...)`)
370
+ new_options[:headers] = new_options.select { |key, value| key.is_a?(String) }
371
+ new_options.reject! { |key, value| key.is_a?(String) }
372
+ end
373
+
374
+ options.merge(new_options) do |key, value1, value2|
375
+ key == :headers ? value1.merge(value2) : value2
376
+ end
377
+ end
378
+
339
379
  # Defines some additional attributes for the returned Tempfile (on top of what
340
380
  # OpenURI::Meta already defines).
341
381
  module DownloadedFile
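Note on `merge_options` above: the final `Hash#merge` with a block is what keeps default headers when ad-hoc headers are passed, while every other option is simply overridden. A standalone sketch of that merge rule, with illustrative option values that are assumptions and not taken from the gem:

```ruby
# Illustrative values only; not the gem's defaults.
defaults  = { headers: { "User-Agent" => "Down" }, max_redirects: 2 }
overrides = { headers: { "Authorization" => "Bearer token" }, max_redirects: 5 }

merged = defaults.merge(overrides) do |key, value1, value2|
  # Nested headers are combined; any other conflicting key takes the new value.
  key == :headers ? value1.merge(value2) : value2
end
```

After the merge, `merged[:headers]` contains both the `User-Agent` and `Authorization` entries, and `merged[:max_redirects]` is `5`.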
data/lib/down/version.rb CHANGED
@@ -1,5 +1,5 @@
1
1
  # frozen-string-literal: true
2
2
 
3
3
  module Down
4
- VERSION = "5.0.0"
4
+ VERSION = "5.4.0"
5
5
  end
data/lib/down/wget.rb CHANGED
@@ -29,16 +29,16 @@ module Down
29
29
 
30
30
  # Downlods the remote file to disk. Accepts wget command-line options and
31
31
  # some additional options as well.
32
- def download(url, *args, max_size: nil, content_length_proc: nil, progress_proc: nil, destination: nil, **options)
32
+ def download(url, *args, max_size: nil, content_length_proc: nil, progress_proc: nil, destination: nil, extension: nil, **options)
33
33
  io = open(url, *args, **options, rewindable: false)
34
34
 
35
35
  content_length_proc.call(io.size) if content_length_proc && io.size
36
36
 
37
37
  if max_size && io.size && io.size > max_size
38
- raise Down::TooLarge, "file is too large (max is #{max_size/1024/1024}MB)"
38
+ raise Down::TooLarge, "file is too large (#{io.size/1024/1024}MB, max is #{max_size/1024/1024}MB)"
39
39
  end
40
40
 
41
- extname = File.extname(URI(url).path)
41
+ extname = extension ? ".#{extension}" : File.extname(URI(url).path)
42
42
  tempfile = Tempfile.new(["down-wget", extname], binmode: true)
43
43
 
44
44
  until io.eof?
@@ -49,7 +49,7 @@ module Down
49
49
  progress_proc.call(tempfile.size) if progress_proc
50
50
 
51
51
  if max_size && tempfile.size > max_size
52
- raise Down::TooLarge, "file is too large (max is #{max_size/1024/1024}MB)"
52
+ raise Down::TooLarge, "file is too large (#{tempfile.size/1024/1024}MB, max is #{max_size/1024/1024}MB)"
53
53
  end
54
54
  end
55
55
 
@@ -94,7 +94,7 @@ module Down
94
94
  raise Down::Error, "failed to parse response headers"
95
95
  end
96
96
 
97
- headers = parser.headers
97
+ headers = normalize_headers(parser.headers)
98
98
  status = parser.status_code
99
99
 
100
100
  content_length = headers["Content-Length"].to_i if headers["Content-Length"]
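Note on the `extension:` argument added to `download` above: when given, it replaces the extension derived from the URL path for the tempfile name. The sketch below isolates that one expression in a hypothetical `tempfile_for` helper so it can be run without wget or the gem.

```ruby
require "tempfile"
require "uri"

# Hypothetical helper isolating the extname logic from the diff.
def tempfile_for(url, extension: nil)
  extname = extension ? ".#{extension}" : File.extname(URI(url).path)
  Tempfile.new(["down-wget", extname], binmode: true)
end

derived    = tempfile_for("http://example.com/report.csv")
overridden = tempfile_for("http://example.com/report.csv", extension: "txt")

derived_ext    = File.extname(derived.path)
overridden_ext = File.extname(overridden.path)

# Clean up the temporary files created by the sketch.
[derived, overridden].each(&:close!)
```

The extension is passed without a leading dot, matching the `".#{extension}"` interpolation in the diff.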
data/lib/down.rb CHANGED
@@ -6,12 +6,12 @@ require "down/net_http"
6
6
  module Down
7
7
  module_function
8
8
 
9
- def download(*args, &block)
10
- backend.download(*args, &block)
9
+ def download(*args, **options, &block)
10
+ backend.download(*args, **options, &block)
11
11
  end
12
12
 
13
- def open(*args, &block)
14
- backend.open(*args, &block)
13
+ def open(*args, **options, &block)
14
+ backend.open(*args, **options, &block)
15
15
  end
16
16
 
17
17
  # Allows setting a backend via a symbol or a downloader object.
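Note on the `down.rb` hunk above: `download` and `open` now capture keyword arguments explicitly with `**options` and forward them as keywords, which matters on Ruby 3 where a trailing hash is no longer converted to keywords implicitly. A minimal sketch of the same delegation, using made-up `FakeBackend` and `Facade` modules in place of the real backend and the `Down` module:

```ruby
# Hypothetical stand-in for a backend that accepts keyword options.
module FakeBackend
  def self.download(url, max_size: nil, **options)
    { url: url, max_size: max_size, options: options }
  end
end

# Hypothetical stand-in for the Down module's delegating methods.
module Facade
  module_function

  def download(*args, **options, &block)
    FakeBackend.download(*args, **options, &block)
  end
end

result = Facade.download("http://example.com/file", max_size: 1024, extension: "txt")
```

The keyword arguments arrive at the backend as keywords, so `max_size` binds to its named parameter and `extension` lands in `options`.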
metadata CHANGED
@@ -1,14 +1,14 @@
1
1
  --- !ruby/object:Gem::Specification
2
2
  name: down
3
3
  version: !ruby/object:Gem::Version
4
- version: 5.0.0
4
+ version: 5.4.0
5
5
  platform: ruby
6
6
  authors:
7
7
  - Janko Marohnić
8
- autorequire:
8
+ autorequire:
9
9
  bindir: bin
10
10
  cert_chain: []
11
- date: 2019-09-26 00:00:00.000000000 Z
11
+ date: 2022-12-26 00:00:00.000000000 Z
12
12
  dependencies:
13
13
  - !ruby/object:Gem::Dependency
14
14
  name: addressable
@@ -16,14 +16,14 @@ dependencies:
16
16
  requirements:
17
17
  - - "~>"
18
18
  - !ruby/object:Gem::Version
19
- version: '2.5'
19
+ version: '2.8'
20
20
  type: :runtime
21
21
  prerelease: false
22
22
  version_requirements: !ruby/object:Gem::Requirement
23
23
  requirements:
24
24
  - - "~>"
25
25
  - !ruby/object:Gem::Version
26
- version: '2.5'
26
+ version: '2.8'
27
27
  - !ruby/object:Gem::Dependency
28
28
  name: minitest
29
29
  requirement: !ruby/object:Gem::Requirement
@@ -66,20 +66,40 @@ dependencies:
66
66
  - - ">="
67
67
  - !ruby/object:Gem::Version
68
68
  version: '0'
69
+ - !ruby/object:Gem::Dependency
70
+ name: httpx
71
+ requirement: !ruby/object:Gem::Requirement
72
+ requirements:
73
+ - - "~>"
74
+ - !ruby/object:Gem::Version
75
+ version: '0.22'
76
+ - - ">="
77
+ - !ruby/object:Gem::Version
78
+ version: 0.22.2
79
+ type: :development
80
+ prerelease: false
81
+ version_requirements: !ruby/object:Gem::Requirement
82
+ requirements:
83
+ - - "~>"
84
+ - !ruby/object:Gem::Version
85
+ version: '0.22'
86
+ - - ">="
87
+ - !ruby/object:Gem::Version
88
+ version: 0.22.2
69
89
  - !ruby/object:Gem::Dependency
70
90
  name: http
71
91
  requirement: !ruby/object:Gem::Requirement
72
92
  requirements:
73
93
  - - "~>"
74
94
  - !ruby/object:Gem::Version
75
- version: '4.0'
95
+ version: '5.0'
76
96
  type: :development
77
97
  prerelease: false
78
98
  version_requirements: !ruby/object:Gem::Requirement
79
99
  requirements:
80
100
  - - "~>"
81
101
  - !ruby/object:Gem::Version
82
- version: '4.0'
102
+ version: '5.0'
83
103
  - !ruby/object:Gem::Dependency
84
104
  name: posix-spawn
85
105
  requirement: !ruby/object:Gem::Requirement
@@ -109,7 +129,7 @@ dependencies:
109
129
  - !ruby/object:Gem::Version
110
130
  version: '0'
111
131
  - !ruby/object:Gem::Dependency
112
- name: docker-api
132
+ name: warning
113
133
  requirement: !ruby/object:Gem::Requirement
114
134
  requirements:
115
135
  - - ">="
@@ -122,7 +142,7 @@ dependencies:
122
142
  - - ">="
123
143
  - !ruby/object:Gem::Version
124
144
  version: '0'
125
- description:
145
+ description:
126
146
  email:
127
147
  - janko.marohnic@gmail.com
128
148
  executables: []
@@ -138,6 +158,7 @@ files:
138
158
  - lib/down/chunked_io.rb
139
159
  - lib/down/errors.rb
140
160
  - lib/down/http.rb
161
+ - lib/down/httpx.rb
141
162
  - lib/down/net_http.rb
142
163
  - lib/down/utils.rb
143
164
  - lib/down/version.rb
@@ -146,7 +167,7 @@ homepage: https://github.com/janko/down
146
167
  licenses:
147
168
  - MIT
148
169
  metadata: {}
149
- post_install_message:
170
+ post_install_message:
150
171
  rdoc_options: []
151
172
  require_paths:
152
173
  - lib
@@ -154,15 +175,15 @@ required_ruby_version: !ruby/object:Gem::Requirement
154
175
  requirements:
155
176
  - - ">="
156
177
  - !ruby/object:Gem::Version
157
- version: '2.1'
178
+ version: '2.3'
158
179
  required_rubygems_version: !ruby/object:Gem::Requirement
159
180
  requirements:
160
181
  - - ">="
161
182
  - !ruby/object:Gem::Version
162
183
  version: '0'
163
184
  requirements: []
164
- rubygems_version: 3.0.3
165
- signing_key:
185
+ rubygems_version: 3.4.1
186
+ signing_key:
166
187
  specification_version: 4
167
188
  summary: Robust streaming downloads using Net::HTTP, HTTP.rb or wget.
168
189
  test_files: []