down 5.0.0 → 5.4.0

checksums.yaml CHANGED
@@ -1,7 +1,7 @@
1
1
  ---
2
2
  SHA256:
3
- metadata.gz: a1f7a1532b638ed92acdb3bdf74211dbd022e7a0de37d1fdbc25f665337d4bf1
4
- data.tar.gz: 7bbc2684d53e278376981b4dd4741c49bc0d8da0990ece28fb03776f39aea6fe
3
+ metadata.gz: e0544a70de3afab2a00c68df6923572a5533f05a0964b8c5186728019d04e55b
4
+ data.tar.gz: 3474da6a6c7a182aa02deb72139002ebf97ac98bee9c4a833de2b0413e373924
5
5
  SHA512:
6
- metadata.gz: 6a3e62293c5fa1e5b43a5c835af3804ab8307babfb573179057ec279239b72a16ff74b2b2e96df9da033a3e4e13be4af6adeb8ec00dd73a27f75942b646419de
7
- data.tar.gz: d267672bb2468c8998e271ff3849908c7beead00d8b9777799e6bc0abd1ab51ccf14d5c567c16f3bb30a41eb3c57b66ffbc2755d333055fba972d327a865acb4
6
+ metadata.gz: 64d22c0a25ddf60f2dea37cb03264ec5770a8c7877fb7a843052abca4b043ba9d2d0c8eed830fc2cec97bb852338b114e2907c357aaf4b68aad26a6860459f67
7
+ data.tar.gz: 992209a7e4201cab464958bac0960ba4a6dc0344cd410b7ceaec2a02b89ebc1bd876396349204bb0bfe175a279fbdb41900c9fee85eee7e98d74eec7afa2e7a0
data/CHANGELOG.md CHANGED
@@ -1,3 +1,61 @@
1
+ ## 5.4.0 (2022-12-26)
2
+
3
+ * Add new HTTPX backend, which supports HTTP/2 protocol among other features (@HoneyryderChuck)
4
+
5
+ ## 5.3.1 (2022-03-25)
6
+
7
+ * Correctly split cookie headers on `;` instead of `,` when forwarding them on redirects (@ermolaev)
8
+
9
+ ## 5.3.0 (2022-02-20)
10
+
11
+ * Add `:extension` argument to `Down.download` for overriding tempfile extension (@razum2um)
12
+
13
+ * Normalize response header names for http.rb and wget backends (@zarqman)
14
+
15
+ ## 5.2.4 (2021-09-12)
16
+
17
+ * Keep original cookies between redirections (@antprt)
18
+
19
+ ## 5.2.3 (2021-08-03)
20
+
21
+ * Bump addressable version requirement to 2.8+ to remediate vulnerability (@aldodelgado)
22
+
23
+ ## 5.2.2 (2021-05-27)
24
+
25
+ * Add info about received content length in `Down::TooLarge` error (@evheny0)
26
+
27
+ * Relax http.rb constraint to allow versions 5.x (@mgrunberg)
28
+
29
+ ## 5.2.1 (2021-04-26)
30
+
31
+ * Raise `Down::NotModified` on 304 response status in `Down::NetHttp#open` (@ellafeldmann)
32
+
33
+ ## 5.2.0 (2020-09-20)
34
+
35
+ * Add `:uri_normalizer` option to `Down::NetHttp` (@janko)
36
+
37
+ * Add `:http_basic_authentication` option to `Down::NetHttp#open` (@janko)
38
+
39
+ * Fix uninitialized instance variables warnings in `Down::ChunkedIO` (@janko)
40
+
41
+ * Handle unknown HTTP error codes in `Down::NetHttp` (@darndt)
42
+
43
+ ## 5.1.1 (2020-02-04)
44
+
45
+ * Fix keyword arguments warnings on Ruby 2.7 in `Down.download` and `Down.open` (@janko)
46
+
47
+ ## 5.1.0 (2020-01-09)
48
+
49
+ * Fix keyword arguments warnings on Ruby 2.7 (@janko)
50
+
51
+ * Fix `FrozenError` exception in `Down::ChunkedIO#readpartial` (@janko)
52
+
53
+ * Deprecate passing headers as top-level options in `Down::NetHttp` (@janko)
54
+
55
+ ## 5.0.1 (2019-12-20)
56
+
57
+ * In `Down::NetHttp` only use Addressable normalization if `URI.parse` fails (@coding-chimp)
58
+
1
59
  ## 5.0.0 (2019-09-26)
2
60
 
3
61
  * Change `ChunkedIO#each_chunk` to return chunks in original encoding (@janko)
data/README.md CHANGED
@@ -1,13 +1,13 @@
1
1
  # Down
2
2
 
3
3
  Down is a utility tool for streaming, flexible and safe downloading of remote
4
- files. It can use [open-uri] + `Net::HTTP`, [http.rb] or `wget` as the backend
5
- HTTP library.
4
+ files. It can use [open-uri] + `Net::HTTP`, [http.rb], [HTTPX], or `wget` as
5
+ the backend HTTP library.
6
6
 
7
7
  ## Installation
8
8
 
9
9
  ```rb
10
- gem "down", "~> 4.4"
10
+ gem "down", "~> 5.0"
11
11
  ```
12
12
 
13
13
  ## Downloading
@@ -63,6 +63,13 @@ Down.download("http://example.com/image.jpg", destination: "/path/to/destination
63
63
  In this case `Down.download` won't have any return value, so if you need a File
64
64
  object you'll have to create it manually.
65
65
 
66
+ You can also keep the tempfile, but override the extension:
67
+
68
+ ```rb
69
+ tempfile = Down.download("http://example.com/some/file", extension: "txt")
70
+ File.extname(tempfile.path) #=> ".txt"
71
+ ```
72
+
66
73
  ### Basic authentication
67
74
 
68
75
  `Down.download` and `Down.open` will automatically detect and apply HTTP basic
@@ -157,7 +164,7 @@ You can access the response status and headers of the HTTP request that was made
157
164
  ```rb
158
165
  remote_file = Down.open("http://example.com/image.jpg")
159
166
  remote_file.data[:status] #=> 200
160
- remote_file.data[:headers] #=> { ... }
167
+ remote_file.data[:headers] #=> { "Content-Type" => "image/jpeg", ... } (header names are normalized)
161
168
  remote_file.data[:response] # returns the response object
162
169
  ```
163
170
 
@@ -212,6 +219,7 @@ the `Down::Error` subclasses. This is Down's exception hierarchy:
212
219
  * `Down::TooLarge`
213
220
  * `Down::InvalidUrl`
214
221
  * `Down::TooManyRedirects`
222
+ * `Down::NotModified`
215
223
  * `Down::ResponseError`
216
224
  * `Down::ClientError`
217
225
  * `Down::NotFound`
@@ -226,6 +234,7 @@ The following backends are available:
226
234
 
227
235
  * [Down::NetHttp](#downnethttp) (default)
228
236
  * [Down::Http](#downhttp)
237
+ * [Down::Httpx](#downhttpx)
229
238
  * [Down::Wget](#downwget)
230
239
 
231
240
  You can use the backend directly:
@@ -251,10 +260,10 @@ Down.open("...")
251
260
  ### Down::NetHttp
252
261
 
253
262
  The `Down::NetHttp` backend implements downloads using [open-uri] and
254
- [Net::HTTP].
263
+ [Net::HTTP] standard libraries.
255
264
 
256
265
  ```rb
257
- gem "down", "~> 4.4"
266
+ gem "down", "~> 5.0"
258
267
  ```
259
268
  ```rb
260
269
  require "down/net_http"
@@ -333,6 +342,18 @@ Down::NetHttp.open("http://example.com/image.jpg",
333
342
  ssl_verify_mode: OpenSSL::SSL::VERIFY_PEER)
334
343
  ```
335
344
 
345
+ #### URI normalization
346
+
347
+ If the URL isn't parseable by `URI.parse`, `Down::NetHttp` will
348
+ attempt to normalize the URL using [Addressable::URI], URI-escaping
349
+ any potentially unescaped characters. You can change the normalizer
350
+ via the `:uri_normalizer` option:
351
+
352
+ ```rb
353
+ # this skips URL normalization
354
+ Down::NetHttp.download("http://example.com/image.jpg", uri_normalizer: -> (url) { url })
355
+ ```
356
+
336
357
  #### Additional options
337
358
 
338
359
  Any additional options passed to `Down.download` will be forwarded to
@@ -358,8 +379,8 @@ net_http.open("http://example.com/image.jpg")
358
379
  The `Down::Http` backend implements downloads using the [http.rb] gem.
359
380
 
360
381
  ```rb
361
- gem "down", "~> 4.4"
362
- gem "http", "~> 4.0"
382
+ gem "down", "~> 5.0"
383
+ gem "http", "~> 5.0"
363
384
  ```
364
385
  ```rb
365
386
  require "down/http"
@@ -422,13 +443,35 @@ down = Down::Http.new(method: :post)
422
443
  down.download("http://example.org/image.jpg")
423
444
  ```
424
445
 
446
+ ### Down::Httpx
447
+
448
+ The `Down::Httpx` backend implements downloads using the [HTTPX] gem, which
449
+ supports the HTTP/2 protocol, in addition to many other features.
450
+
451
+ ```rb
452
+ gem "down", "~> 5.0"
453
+ gem "httpx", "~> 0.22"
454
+ ```
455
+ ```rb
456
+ require "down/httpx"
457
+
458
+ tempfile = Down::Httpx.download("http://nature.com/forest.jpg")
459
+ tempfile #=> #<Tempfile:/var/folders/k7/6zx6dx6x7ys3rv3srh0nyfj00000gn/T/20150925-55456-z7vxqz.jpg>
460
+
461
+ io = Down::Httpx.open("http://nature.com/forest.jpg")
462
+ io #=> #<Down::ChunkedIO ...>
463
+ ```
464
+
465
+ It's implemented in much the same way as `Down::Http`, so be sure to check
466
+ its docs for ways to pass additional options.
467
+
425
468
  ### Down::Wget (experimental)
426
469
 
427
470
  The `Down::Wget` backend implements downloads using the `wget` command line
428
471
  utility.
429
472
 
430
473
  ```rb
431
- gem "down", "~> 4.4"
474
+ gem "down", "~> 5.0"
432
475
  gem "posix-spawn" # omit if on JRuby
433
476
  gem "http_parser.rb"
434
477
  ```
@@ -470,24 +513,30 @@ wget.open("http://nature.com/forest.jpg")
470
513
 
471
514
  ## Supported Ruby versions
472
515
 
473
- * MRI 2.2
474
516
  * MRI 2.3
475
517
  * MRI 2.4
476
518
  * MRI 2.5
477
519
  * MRI 2.6
478
- * JRuby
520
+ * MRI 2.7
521
+ * MRI 3.0
522
+ * MRI 3.1
523
+ * JRuby 9.3
479
524
 
480
525
  ## Development
481
526
 
482
- You can run tests with
527
+ Tests require a [httpbin] server running locally, which you can start via Docker:
528
+
529
+ ```sh
530
+ $ docker pull kennethreitz/httpbin
531
+ $ docker run -p 80:80 kennethreitz/httpbin
532
+ ```
533
+
534
+ Then you can run tests:
483
535
 
484
536
  ```
485
537
  $ bundle exec rake test
486
538
  ```
487
539
 
488
- The test suite pulls and runs [kennethreitz/httpbin] as a Docker container, so
489
- you'll need to have Docker installed and running.
490
-
491
540
  ## License
492
541
 
493
542
  [MIT](LICENSE.txt)
@@ -495,5 +544,6 @@ you'll need to have Docker installed and running.
495
544
  [open-uri]: http://ruby-doc.org/stdlib-2.3.0/libdoc/open-uri/rdoc/OpenURI.html
496
545
  [Net::HTTP]: https://ruby-doc.org/stdlib-2.4.1/libdoc/net/http/rdoc/Net/HTTP.html
497
546
  [http.rb]: https://github.com/httprb/http
547
+ [HTTPX]: https://github.com/HoneyryderChuck/httpx
498
548
  [Addressable::URI]: https://github.com/sporkmonger/addressable
499
- [kennethreitz/httpbin]: https://github.com/kennethreitz/httpbin
549
+ [httpbin]: https://github.com/postmanlabs/httpbin
data/down.gemspec CHANGED
@@ -4,7 +4,7 @@ Gem::Specification.new do |spec|
4
4
  spec.name = "down"
5
5
  spec.version = Down::VERSION
6
6
 
7
- spec.required_ruby_version = ">= 2.1"
7
+ spec.required_ruby_version = ">= 2.3"
8
8
 
9
9
  spec.summary = "Robust streaming downloads using Net::HTTP, HTTP.rb or wget."
10
10
  spec.homepage = "https://github.com/janko/down"
@@ -15,13 +15,19 @@ Gem::Specification.new do |spec|
15
15
  spec.files = Dir["README.md", "LICENSE.txt", "CHANGELOG.md", "*.gemspec", "lib/**/*.rb"]
16
16
  spec.require_path = "lib"
17
17
 
18
- spec.add_dependency "addressable", "~> 2.5"
18
+ spec.add_dependency "addressable", "~> 2.8"
19
19
 
20
20
  spec.add_development_dependency "minitest", "~> 5.8"
21
21
  spec.add_development_dependency "mocha", "~> 1.5"
22
22
  spec.add_development_dependency "rake"
23
- spec.add_development_dependency "http", "~> 4.0"
23
+ spec.add_development_dependency "httpx", "~> 0.22", ">= 0.22.2"
24
+ # http 5.0 dropped support for Ruby 2.3 and 2.4, which we still support.
25
+ if RUBY_VERSION >= "2.5"
26
+ spec.add_development_dependency "http", "~> 5.0"
27
+ else
28
+ spec.add_development_dependency "http", "~> 4.3"
29
+ end
24
30
  spec.add_development_dependency "posix-spawn" unless RUBY_ENGINE == "jruby"
25
- spec.add_development_dependency "http_parser.rb"
26
- spec.add_development_dependency "docker-api"
31
+ spec.add_development_dependency "http_parser.rb" unless RUBY_ENGINE == "jruby"
32
+ spec.add_development_dependency "warning" if RUBY_VERSION >= "2.4"
27
33
  end
data/lib/down/backend.rb CHANGED
@@ -9,12 +9,12 @@ require "fileutils"
9
9
 
10
10
  module Down
11
11
  class Backend
12
- def self.download(*args, &block)
13
- new.download(*args, &block)
12
+ def self.download(*args, **options, &block)
13
+ new.download(*args, **options, &block)
14
14
  end
15
15
 
16
- def self.open(*args, &block)
17
- new.open(*args, &block)
16
+ def self.open(*args, **options, &block)
17
+ new.open(*args, **options, &block)
18
18
  end
19
19
 
20
20
  private
@@ -29,5 +29,12 @@ module Down
29
29
 
30
30
  nil
31
31
  end
32
+
33
+ def normalize_headers(response_headers)
34
+ response_headers.inject({}) do |headers, (downcased_name, value)|
35
+ name = downcased_name.split("-").map(&:capitalize).join("-")
36
+ headers.merge!(name => value)
37
+ end
38
+ end
32
39
  end
33
40
  end
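The header-name normalization introduced in this hunk can be tried outside the gem. The following is a minimal sketch, assuming a plain Ruby hash as input. It uses `each_with_object` rather than the `inject`/`merge!` form shown in the diff, and the sample header values are invented for illustration.

```ruby
# Standalone sketch of header-name normalization (not the gem's exact code).
def normalize_headers(response_headers)
  response_headers.each_with_object({}) do |(raw_name, value), headers|
    # "content-type" and "CONTENT-TYPE" both become "Content-Type".
    name = raw_name.split("-").map(&:capitalize).join("-")
    headers[name] = value
  end
end

# Sample headers are invented for illustration.
normalized = normalize_headers({
  "content-type"   => "image/jpeg",
  "CONTENT-LENGTH" => "42",
  "etag"           => "abc"
})
puts normalized.inspect
```

Because every segment is passed through `capitalize`, backends that report lowercase or uppercase names end up with the same keys, which is what lets callers look up `headers["Content-Type"]` regardless of backend.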
@@ -36,6 +36,8 @@ module Down
36
36
  @rewindable = rewindable
37
37
  @buffer = nil
38
38
  @position = 0
39
+ @next_chunk = nil
40
+ @closed = false
39
41
 
40
42
  retrieve_chunk # fetch first chunk so that we know whether the file is empty
41
43
  end
@@ -63,7 +65,9 @@ module Down
63
65
  def read(length = nil, outbuf = nil)
64
66
  fail IOError, "closed stream" if closed?
65
67
 
66
- data = outbuf.to_s.clear.force_encoding(Encoding::BINARY)
68
+ data = outbuf.clear.force_encoding(Encoding::BINARY) if outbuf
69
+ data ||= "".b
70
+
67
71
  remaining_length = length
68
72
 
69
73
  until remaining_length == 0 || eof?
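The new buffer handling in `read` can be illustrated in isolation. This is a sketch rather than the library's code: `prepare_buffer` is a hypothetical helper name, and the motivation stated in the comments is an inference from the 5.1.0 changelog entry about `FrozenError`, not something the diff states directly.

```ruby
# Hypothetical helper mirroring how the updated `read` prepares its buffer.
#
# The previous form, `outbuf.to_s.clear`, calls `clear` on `nil.to_s` when no
# buffer is given; on Ruby 2.7+ `nil.to_s` returns a frozen string, so that
# call raises FrozenError (assumed to be the reason for the change).
def prepare_buffer(outbuf = nil)
  # Reuse the caller's buffer when one is given: empty it and mark it binary.
  data = outbuf.clear.force_encoding(Encoding::BINARY) if outbuf
  # Otherwise allocate a fresh, mutable binary string.
  data || "".b
end

reused = String.new("stale content")
result = prepare_buffer(reused) # same object as `reused`, now empty
fresh  = prepare_buffer         # new empty binary string
fresh << "chunk"                # mutable, so appending works
```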
@@ -142,7 +146,8 @@ module Down
142
146
  # or the next chunk. This is useful when you don't care about the size of
143
147
  # chunks and you want to minimize string allocations.
144
148
  #
145
- # With `length` argument returns maximum of that amount of bytes.
149
+ # With `maxlen` argument returns maximum of that amount of bytes (default
150
+ # is 16KB).
146
151
  #
147
152
  # With `outbuf` argument each call will return that same string object,
148
153
  # where the value is replaced with retrieved content.
@@ -154,7 +159,8 @@ module Down
154
159
  maxlen ||= 16*1024
155
160
 
156
161
  data = cache.read(maxlen, outbuf) if cache && !cache.eof?
157
- data ||= outbuf.to_s.clear
162
+ data ||= outbuf.clear.force_encoding(Encoding::BINARY) if outbuf
163
+ data ||= "".b
158
164
 
159
165
  return data if maxlen == 0
160
166
 
data/lib/down/errors.rb CHANGED
@@ -13,11 +13,14 @@ module Down
13
13
  # raised when the number of redirects was larger than the specified maximum
14
14
  class TooManyRedirects < Error; end
15
15
 
16
+ # raised when the requested resource has not been modified
17
+ class NotModified < Error; end
18
+
16
19
  # raised when response returned 4xx or 5xx response
17
20
  class ResponseError < Error
18
21
  attr_reader :response
19
22
 
20
- def initialize(message, response: nil)
23
+ def initialize(message, response = nil)
21
24
  super(message)
22
25
  @response = response
23
26
  end
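To show how the changed constructor is called, here is a simplified stand-in for part of the error hierarchy. The `DownSketch` module name and the `fake_response` hash are hypothetical; only the class relationships and the positional `response` argument are taken from the diff.

```ruby
# Simplified stand-in for part of Down's error hierarchy (module name is hypothetical).
module DownSketch
  class Error < StandardError; end

  # Sits directly under Error, so it is not rescued as a ResponseError.
  class NotModified < Error; end

  class ResponseError < Error
    attr_reader :response

    # `response` is now an optional second positional argument.
    def initialize(message, response = nil)
      super(message)
      @response = response
    end
  end

  class ClientError < ResponseError; end
  class NotFound < ClientError; end
end

rescued = nil
begin
  fake_response = { status: 404 } # placeholder for a real response object
  raise DownSketch::NotFound.new("404 Not Found", fake_response)
rescue DownSketch::ResponseError => error
  rescued = error
end
```

Code that previously built these errors with `response: response` would need to pass the response positionally, as the backends in this diff now do with `[message, response]`.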
data/lib/down/http.rb CHANGED
@@ -1,6 +1,6 @@
1
1
  # frozen-string-literal: true
2
2
 
3
- gem "http", ">= 2.1.0", "< 5"
3
+ gem "http", ">= 2.1.0", "< 6"
4
4
 
5
5
  require "http"
6
6
 
@@ -12,7 +12,7 @@ module Down
12
12
  # Provides streaming downloads implemented with HTTP.rb.
13
13
  class Http < Backend
14
14
  # Initializes the backend with common defaults.
15
- def initialize(options = {}, &block)
15
+ def initialize(**options, &block)
16
16
  @method = options.delete(:method) || :get
17
17
  @client = HTTP
18
18
  .headers("User-Agent" => "Down/#{Down::VERSION}")
@@ -25,16 +25,16 @@ module Down
25
25
 
26
26
  # Downloads the remote file to disk. Accepts HTTP.rb options via a hash or a
27
27
  # block, and some additional options as well.
28
- def download(url, max_size: nil, progress_proc: nil, content_length_proc: nil, destination: nil, **options, &block)
28
+ def download(url, max_size: nil, progress_proc: nil, content_length_proc: nil, destination: nil, extension: nil, **options, &block)
29
29
  response = request(url, **options, &block)
30
30
 
31
31
  content_length_proc.call(response.content_length) if content_length_proc && response.content_length
32
32
 
33
33
  if max_size && response.content_length && response.content_length > max_size
34
- raise Down::TooLarge, "file is too large (max is #{max_size/1024/1024}MB)"
34
+ raise Down::TooLarge, "file is too large (#{response.content_length/1024/1024}MB, max is #{max_size/1024/1024}MB)"
35
35
  end
36
36
 
37
- extname = File.extname(response.uri.path)
37
+ extname = extension ? ".#{extension}" : File.extname(response.uri.path)
38
38
  tempfile = Tempfile.new(["down-http", extname], binmode: true)
39
39
 
40
40
  stream_body(response) do |chunk|
@@ -44,7 +44,7 @@ module Down
44
44
  progress_proc.call(tempfile.size) if progress_proc
45
45
 
46
46
  if max_size && tempfile.size > max_size
47
- raise Down::TooLarge, "file is too large (max is #{max_size/1024/1024}MB)"
47
+ raise Down::TooLarge, "file is too large (#{tempfile.size/1024/1024}MB, max is #{max_size/1024/1024}MB)"
48
48
  end
49
49
  end
50
50
 
@@ -52,7 +52,7 @@ module Down
52
52
 
53
53
  tempfile.extend Down::Http::DownloadedFile
54
54
  tempfile.url = response.uri.to_s
55
- tempfile.headers = response.headers.to_h
55
+ tempfile.headers = normalize_headers(response.headers.to_h)
56
56
 
57
57
  download_result(tempfile, destination)
58
58
  rescue
@@ -71,7 +71,11 @@ module Down
71
71
  size: response.content_length,
72
72
  encoding: response.content_type.charset,
73
73
  rewindable: rewindable,
74
- data: { status: response.code, headers: response.headers.to_h, response: response },
74
+ data: {
75
+ status: response.code,
76
+ headers: normalize_headers(response.headers.to_h),
77
+ response: response
78
+ },
75
79
  )
76
80
  end
77
81
 
@@ -106,7 +110,7 @@ module Down
106
110
 
107
111
  # Raises non-successful response as a Down::ResponseError.
108
112
  def response_error!(response)
109
- args = [response.status.to_s, response: response]
113
+ args = [response.status.to_s, response]
110
114
 
111
115
  case response.code
112
116
  when 404 then raise Down::NotFound.new(*args)
data/lib/down/httpx.rb ADDED
@@ -0,0 +1,175 @@
1
+ # frozen-string-literal: true
2
+
3
+ require "uri"
4
+ require "tempfile"
5
+ require "httpx"
6
+
7
+ require "down/backend"
8
+
9
+
10
+ module Down
11
+ # Provides streaming downloads implemented with HTTPX.
12
+ class Httpx < Backend
13
+ # Initializes the backend
14
+
15
+ USER_AGENT = "Down/#{Down::VERSION}"
16
+
17
+ def initialize(**options, &block)
18
+ @method = options.delete(:method) || :get
19
+ headers = options.delete(:headers) || {}
20
+ @client = HTTPX
21
+ .plugin(:follow_redirects, max_redirects: 2)
22
+ .plugin(:basic_authentication)
23
+ .plugin(:stream)
24
+ .with(
25
+ headers: { "user-agent": USER_AGENT }.merge(headers),
26
+ timeout: { connect_timeout: 30, write_timeout: 30, read_timeout: 30 },
27
+ **options
28
+ )
29
+
30
+ @client = block.call(@client) if block
31
+ end
32
+
33
+
34
+ # Downloads the remote file to disk. Accepts HTTPX options via a hash or a
35
+ # block, and some additional options as well.
36
+ def download(url, max_size: nil, progress_proc: nil, content_length_proc: nil, destination: nil, extension: nil, **options, &block)
37
+ client = @client
38
+
39
+ response = request(client, url, **options, &block)
40
+
41
+ content_length = nil
42
+
43
+ if response.headers.key?("content-length")
44
+ content_length = response.headers["content-length"].to_i
45
+
46
+ content_length_proc.call(content_length) if content_length_proc
47
+
48
+ if max_size && content_length > max_size
49
+ response.close
50
+ raise Down::TooLarge, "file is too large (#{content_length/1024/1024}MB, max is #{max_size/1024/1024}MB)"
51
+ end
52
+ end
53
+
54
+ extname = extension ? ".#{extension}" : File.extname(response.uri.path)
55
+ tempfile = Tempfile.new(["down-http", extname], binmode: true)
56
+
57
+ stream_body(response) do |chunk|
58
+ tempfile.write(chunk)
59
+ chunk.clear # deallocate string
60
+
61
+ progress_proc.call(tempfile.size) if progress_proc
62
+
63
+ if max_size && tempfile.size > max_size
64
+ raise Down::TooLarge, "file is too large (#{tempfile.size/1024/1024}MB, max is #{max_size/1024/1024}MB)"
65
+ end
66
+ end
67
+
68
+ tempfile.open # flush written content
69
+
70
+ tempfile.extend DownloadedFile
71
+ tempfile.url = response.uri.to_s
72
+ tempfile.headers = normalize_headers(response.headers.to_h)
73
+ tempfile.content_type = response.content_type.mime_type
74
+ tempfile.charset = response.body.encoding
75
+
76
+ download_result(tempfile, destination)
77
+ rescue
78
+ tempfile.close! if tempfile
79
+ raise
80
+ end
81
+
82
+ # Starts retrieving the remote file and returns an IO-like object which
83
+ # downloads the response body on-demand. Accepts HTTPX options via a hash
84
+ # or a block.
85
+ def open(url, rewindable: true, **options, &block)
86
+ response = request(@client, url, stream: true, **options, &block)
87
+ size = response.headers["content-length"]
88
+ size = size.to_i if size
89
+ Down::ChunkedIO.new(
90
+ chunks: enum_for(:stream_body, response),
91
+ size: size,
92
+ encoding: response.body.encoding,
93
+ rewindable: rewindable,
94
+ data: {
95
+ status: response.status,
96
+ headers: normalize_headers(response.headers.to_h),
97
+ response: response
98
+ },
99
+ )
100
+ end
101
+
102
+ private
103
+
104
+ # Yields chunks of the response body to the block.
105
+ def stream_body(response, &block)
106
+ response.each(&block)
107
+ rescue => exception
108
+ request_error!(exception)
109
+ end
110
+
111
+ def request(client, url, method: @method, **options, &block)
112
+ response = send_request(client, method, url, **options, &block)
113
+ response.raise_for_status
114
+ response_error!(response) unless (200..299).include?(response.status)
115
+ response
116
+ rescue HTTPX::HTTPError
117
+ response_error!(response)
118
+ rescue => error
119
+ request_error!(error)
120
+ end
121
+
122
+ def send_request(client, method, url, **options, &block)
123
+ uri = URI(url)
124
+ client = @client
125
+ if uri.user || uri.password
126
+ client = client.basic_auth(uri.user, uri.password)
127
+ uri.user = uri.password = nil
128
+ end
129
+ client = block.call(client) if block
130
+
131
+ client.request(method, uri, stream: true, **options)
132
+ rescue => exception
133
+ request_error!(exception)
134
+ end
135
+
136
+ # Raises non-successful response as a Down::ResponseError.
137
+ def response_error!(response)
138
+ args = [response.status.to_s, response]
139
+
140
+ case response.status
141
+ when 300..399 then raise Down::TooManyRedirects, "too many redirects"
142
+ when 404 then raise Down::NotFound.new(*args)
143
+ when 400..499 then raise Down::ClientError.new(*args)
144
+ when 500..599 then raise Down::ServerError.new(*args)
145
+ else raise Down::ResponseError.new(*args)
146
+ end
147
+ end
148
+
149
+ # Re-raise HTTPX exceptions as Down::Error exceptions.
150
+ def request_error!(exception)
151
+ case exception
152
+ when URI::Error, HTTPX::UnsupportedSchemeError
153
+ raise Down::InvalidUrl, exception.message
154
+ when HTTPX::ConnectionError
155
+ raise Down::ConnectionError, exception.message
156
+ when HTTPX::TimeoutError
157
+ raise Down::TimeoutError, exception.message
158
+ when OpenSSL::SSL::SSLError
159
+ raise Down::SSLError, exception.message
160
+ else
161
+ raise exception
162
+ end
163
+ end
164
+
165
+ # Defines some additional attributes for the returned Tempfile.
166
+ module DownloadedFile
167
+ attr_accessor :url, :headers, :charset, :content_type
168
+
169
+ def original_filename
170
+ Utils.filename_from_content_disposition(headers["Content-Disposition"]) ||
171
+ Utils.filename_from_path(URI.parse(url).path)
172
+ end
173
+ end
174
+ end
175
+ end
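The status-to-exception mapping in `response_error!` depends on branch order. The sketch below reproduces only that `case` logic and returns class names as strings so it runs without the gem; `error_class_name_for` is a hypothetical name. The reading that a leftover 3xx means the redirect limit was exhausted is an assumption based on the `follow_redirects` plugin being enabled.

```ruby
# Hypothetical helper reproducing the branch order of `response_error!`.
def error_class_name_for(status)
  case status
  when 300..399 then "Down::TooManyRedirects" # assumed: redirect limit exhausted
  when 404      then "Down::NotFound"         # must precede the general 4xx branch
  when 400..499 then "Down::ClientError"
  when 500..599 then "Down::ServerError"
  else               "Down::ResponseError"
  end
end

[302, 404, 418, 503, 600].each do |status|
  puts "#{status} -> #{error_class_name_for(status)}"
end
```

If the `404` branch came after `400..499`, it would never match, because `case` picks the first branch whose condition is satisfied.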
data/lib/down/net_http.rb CHANGED
@@ -12,27 +12,35 @@ require "fileutils"
12
12
  module Down
13
13
  # Provides streaming downloads implemented with Net::HTTP and open-uri.
14
14
  class NetHttp < Backend
15
+ URI_NORMALIZER = -> (url) do
16
+ addressable_uri = Addressable::URI.parse(url)
17
+ addressable_uri.normalize.to_s
18
+ end
19
+
15
20
  # Initializes the backend with common defaults.
16
- def initialize(options = {})
17
- @options = {
18
- "User-Agent" => "Down/#{Down::VERSION}",
19
- max_redirects: 2,
20
- open_timeout: 30,
21
- read_timeout: 30,
22
- }.merge(options)
21
+ def initialize(*args, **options)
22
+ @options = merge_options({
23
+ headers: { "User-Agent" => "Down/#{Down::VERSION}" },
24
+ max_redirects: 2,
25
+ open_timeout: 30,
26
+ read_timeout: 30,
27
+ uri_normalizer: URI_NORMALIZER,
28
+ }, *args, **options)
23
29
  end
24
30
 
25
31
  # Downloads a remote file to disk using open-uri. Accepts any open-uri
26
32
  # options, and a few more.
27
- def download(url, options = {})
28
- options = @options.merge(options)
33
+ def download(url, *args, **options)
34
+ options = merge_options(@options, *args, **options)
29
35
 
30
36
  max_size = options.delete(:max_size)
31
37
  max_redirects = options.delete(:max_redirects)
32
38
  progress_proc = options.delete(:progress_proc)
33
39
  content_length_proc = options.delete(:content_length_proc)
34
40
  destination = options.delete(:destination)
35
- headers = options.delete(:headers) || {}
41
+ headers = options.delete(:headers)
42
+ uri_normalizer = options.delete(:uri_normalizer)
43
+ extension = options.delete(:extension)
36
44
 
37
45
  # Use open-uri's :content_length_proc or :progress_proc to raise an
38
46
  # exception early if the file is too large.
@@ -42,13 +50,13 @@ module Down
42
50
  open_uri_options = {
43
51
  content_length_proc: proc { |size|
44
52
  if size && max_size && size > max_size
45
- raise Down::TooLarge, "file is too large (max is #{max_size/1024/1024}MB)"
53
+ raise Down::TooLarge, "file is too large (#{size/1024/1024}MB, max is #{max_size/1024/1024}MB)"
46
54
  end
47
55
  content_length_proc.call(size) if content_length_proc
48
56
  },
49
57
  progress_proc: proc { |current_size|
50
58
  if max_size && current_size > max_size
51
- raise Down::TooLarge, "file is too large (max is #{max_size/1024/1024}MB)"
59
+ raise Down::TooLarge, "file is too large (#{current_size/1024/1024}MB, max is #{max_size/1024/1024}MB)"
52
60
  end
53
61
  progress_proc.call(current_size) if progress_proc
54
62
  },
@@ -74,7 +82,7 @@ module Down
74
82
  open_uri_options.merge!(options)
75
83
  open_uri_options.merge!(headers)
76
84
 
77
- uri = ensure_uri(addressable_normalize(url))
85
+ uri = ensure_uri(normalize_uri(url, uri_normalizer: uri_normalizer))
78
86
 
79
87
  # Handle basic authentication in the remote URL.
80
88
  if uri.user || uri.password
@@ -86,7 +94,8 @@ module Down
86
94
  open_uri_file = open_uri(uri, open_uri_options, follows_remaining: max_redirects)
87
95
 
88
96
  # Handle the fact that open-uri returns StringIOs for small files.
89
- tempfile = ensure_tempfile(open_uri_file, File.extname(open_uri_file.base_uri.path))
97
+ extname = extension ? ".#{extension}" : File.extname(open_uri_file.base_uri.path)
98
+ tempfile = ensure_tempfile(open_uri_file, extname)
90
99
  OpenURI::Meta.init tempfile, open_uri_file # add back open-uri methods
91
100
  tempfile.extend Down::NetHttp::DownloadedFile
92
101
 
@@ -95,13 +104,17 @@ module Down
95
104
 
96
105
  # Starts retrieving the remote file using Net::HTTP and returns an IO-like
97
106
  # object which downloads the response body on-demand.
98
- def open(url, options = {})
99
- uri = ensure_uri(addressable_normalize(url))
100
- options = @options.merge(options)
107
+ def open(url, *args, **options)
108
+ options = merge_options(@options, *args, **options)
109
+
110
+ max_redirects = options.delete(:max_redirects)
111
+ uri_normalizer = options.delete(:uri_normalizer)
112
+
113
+ uri = ensure_uri(normalize_uri(url, uri_normalizer: uri_normalizer))
101
114
 
102
115
  # Create a Fiber that halts when response headers are received.
103
116
  request = Fiber.new do
104
- net_http_request(uri, options) do |response|
117
+ net_http_request(uri, options, follows_remaining: max_redirects) do |response|
105
118
  Fiber.yield response
106
119
  end
107
120
  end
@@ -119,10 +132,7 @@ module Down
119
132
  on_close: -> { request.resume }, # close HTTP connection
120
133
  data: {
121
134
  status: response.code.to_i,
122
- headers: response.each_header.inject({}) { |headers, (downcased_name, value)|
123
- name = downcased_name.split("-").map(&:capitalize).join("-")
124
- headers.merge!(name => value)
125
- },
135
+ headers: normalize_headers(response.each_header),
126
136
  response: response,
127
137
  },
128
138
  )
@@ -131,7 +141,7 @@ module Down
131
141
  private
132
142
 
133
143
  # Calls open-uri's URI::HTTP#open method. Additionally handles redirects.
134
- def open_uri(uri, options, follows_remaining: 0)
144
+ def open_uri(uri, options, follows_remaining:)
135
145
  uri.open(options)
136
146
  rescue OpenURI::HTTPRedirect => exception
137
147
  raise Down::TooManyRedirects, "too many redirects" if follows_remaining == 0
@@ -147,7 +157,11 @@ module Down
147
157
 
148
158
  # forward cookies on the redirect
149
159
  if !exception.io.meta["set-cookie"].to_s.empty?
150
- options["Cookie"] = exception.io.meta["set-cookie"]
160
+ options["Cookie"] ||= ''
161
+ # Add new cookies avoiding duplication
162
+ new_cookies = exception.io.meta["set-cookie"].to_s.split(';').map(&:strip)
163
+ old_cookies = options["Cookie"].split(';')
164
+ options["Cookie"] = (old_cookies | new_cookies).join(';')
151
165
  end
152
166
 
153
167
  follows_remaining -= 1
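The cookie handling in this hunk amounts to a set union over `;`-separated pairs. A minimal sketch follows, with `merge_cookies` as a hypothetical helper name and made-up cookie values.

```ruby
# Hypothetical helper mirroring the cookie merge performed on redirects.
def merge_cookies(existing, set_cookie)
  # Split the incoming Set-Cookie value on ";" and trim surrounding spaces.
  new_cookies = set_cookie.to_s.split(";").map(&:strip)
  # Previously forwarded cookies are already stored in "a=1;b=2" form.
  old_cookies = existing.to_s.split(";")
  # Array#| keeps order and drops duplicates.
  (old_cookies | new_cookies).join(";")
end

first_hop  = merge_cookies(nil, "a=1; b=2")
second_hop = merge_cookies(first_hop, "b=2; c=3")
puts second_hop
```

Note that this treats every `;`-separated segment as a cookie, so attributes such as `Path=/` in a `Set-Cookie` header would be forwarded as well; the sketch keeps that behavior rather than filtering them.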
@@ -186,18 +200,18 @@ module Down
186
200
  end
187
201
 
188
202
  # Makes a Net::HTTP request and follows redirects.
189
- def net_http_request(uri, options, follows_remaining: options.fetch(:max_redirects, 2), &block)
203
+ def net_http_request(uri, options, follows_remaining:, &block)
190
204
  http, request = create_net_http(uri, options)
191
205
 
192
206
  begin
193
207
  response = http.start do
194
- http.request(request) do |response|
195
- unless response.is_a?(Net::HTTPRedirection)
196
- yield response
208
+ http.request(request) do |resp|
209
+ unless resp.is_a?(Net::HTTPRedirection)
210
+ yield resp
197
211
  # In certain cases the caller wants to download only one portion
198
212
  # of the file and close the connection, so we tell Net::HTTP that
199
213
  # it shouldn't continue retrieving it.
200
- response.instance_variable_set("@read", true)
214
+ resp.instance_variable_set("@read", true)
201
215
  end
202
216
  end
203
217
  end
@@ -205,7 +219,9 @@ module Down
205
219
  request_error!(exception)
206
220
  end
207
221
 
208
- if response.is_a?(Net::HTTPRedirection)
222
+ if response.is_a?(Net::HTTPNotModified)
223
+ raise Down::NotModified
224
+ elsif response.is_a?(Net::HTTPRedirection)
209
225
  raise Down::TooManyRedirects if follows_remaining == 0
210
226
 
211
227
  # fail if redirect URI is not a valid http or https URL
@@ -251,12 +267,13 @@ module Down
251
267
  http.read_timeout = options[:read_timeout] if options.key?(:read_timeout)
252
268
  http.open_timeout = options[:open_timeout] if options.key?(:open_timeout)
253
269
 
254
- headers = options.select { |key, value| key.is_a?(String) }
255
- headers.merge!(options[:headers]) if options[:headers]
270
+ headers = options[:headers].to_h
256
271
  headers["Accept-Encoding"] = "" # Net::HTTP's inflater causes FiberErrors
257
272
 
258
273
  get = Net::HTTP::Get.new(uri.request_uri, headers)
259
- get.basic_auth(uri.user, uri.password) if uri.user || uri.password
274
+
275
+ user, password = options[:http_basic_authentication] || [uri.user, uri.password]
276
+ get.basic_auth(user, password) if user || password
260
277
 
261
278
  [http, get]
262
279
  end
@@ -284,9 +301,10 @@ module Down
284
301
  end
285
302
 
286
303
  # Makes sure that the URL is properly encoded.
287
- def addressable_normalize(url)
288
- addressable_uri = Addressable::URI.parse(url)
289
- addressable_uri.normalize.to_s
304
+ def normalize_uri(url, uri_normalizer:)
305
+ URI(url)
306
+ rescue URI::InvalidURIError
307
+ uri_normalizer.call(url)
290
308
  end
291
309
 
292
310
  # When open-uri raises an exception, it doesn't expose the response object.
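The fallback order in `normalize_uri` (try `URI` first, call the normalizer only on failure) can be demonstrated with the standard library alone. In this sketch the `escaper` lambda is a hypothetical stand-in for the Addressable-based default normalizer; it only escapes spaces.

```ruby
require "uri"

# Same shape as the method in the diff: parse first, normalize only on failure.
def normalize_uri(url, uri_normalizer:)
  URI(url)
rescue URI::InvalidURIError
  uri_normalizer.call(url)
end

# Hypothetical normalizer standing in for Addressable; escapes spaces only.
escaper = ->(url) { url.gsub(" ", "%20") }

valid   = normalize_uri("http://example.com/image.jpg", uri_normalizer: escaper)
escaped = normalize_uri("http://example.com/my image.jpg", uri_normalizer: escaper)

puts valid.to_s
puts escaped.to_s
```

A URL that `URI` already accepts is never passed to the normalizer, which matches the 5.0.1 changelog entry about only using Addressable normalization when parsing fails.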
@@ -295,7 +313,11 @@ module Down
295
313
  def rebuild_response_from_open_uri_exception(exception)
296
314
  code, message = exception.io.status
297
315
 
298
- response_class = Net::HTTPResponse::CODE_TO_OBJ.fetch(code)
316
+ response_class = Net::HTTPResponse::CODE_TO_OBJ.fetch(code) do |c|
317
+ Net::HTTPResponse::CODE_CLASS_TO_OBJ.fetch(c[0]) do
318
+ Net::HTTPUnknownResponse
319
+ end
320
+ end
299
321
  response = response_class.new(nil, code, message)
300
322
 
301
323
  exception.io.metas.each do |name, values|
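The nested `fetch` calls give a three-step lookup: exact status code, then status class, then a generic unknown response. The sketch below wraps that lookup in a hypothetical `response_class_for` helper and uses only constants from Ruby's `net/http`.

```ruby
require "net/http"

# Hypothetical wrapper around the lookup used in the diff.
def response_class_for(code)
  # 1. Exact match, e.g. "404" => Net::HTTPNotFound
  Net::HTTPResponse::CODE_TO_OBJ.fetch(code) do |c|
    # 2. Fall back to the status class by first digit, e.g. "4" => Net::HTTPClientError
    Net::HTTPResponse::CODE_CLASS_TO_OBJ.fetch(c[0]) do
      # 3. Anything else is treated as an unknown response
      Net::HTTPUnknownResponse
    end
  end
end

known       = response_class_for("404")
nonstandard = response_class_for("499")
unknown     = response_class_for("999")
```

Codes are strings here because that is how open-uri reports them in `exception.io.status`.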
@@ -310,7 +332,7 @@ module Down
310
332
  code = response.code.to_i
311
333
  message = response.message.split(" ").map(&:capitalize).join(" ")
312
334
 
313
- args = ["#{code} #{message}", response: response]
335
+ args = ["#{code} #{message}", response]
314
336
 
315
337
  case response.code.to_i
316
338
  when 404 then raise Down::NotFound.new(*args)
@@ -336,6 +358,24 @@ module Down
336
358
  end
337
359
  end
338
360
 
361
+ # Merge default and ad-hoc options, merging nested headers.
362
+ def merge_options(options, headers = {}, **new_options)
363
+ # Deprecate passing headers as top-level options, taking into account
364
+ # that Ruby 2.7+ accepts kwargs with string keys.
365
+ if headers.any?
366
+ warn %([Down::NetHttp] Passing headers as top-level options has been deprecated, use the :headers option instead, e.g: `Down::NetHttp.download(headers: { "Key" => "Value", ... }, ...)`)
367
+ new_options[:headers] = headers
368
+ elsif new_options.any? { |key, value| key.is_a?(String) }
369
+ warn %([Down::NetHttp] Passing headers as top-level options has been deprecated, use the :headers option instead, e.g: `Down::NetHttp.download(headers: { "Key" => "Value", ... }, ...)`)
370
+ new_options[:headers] = new_options.select { |key, value| key.is_a?(String) }
371
+ new_options.reject! { |key, value| key.is_a?(String) }
372
+ end
373
+
374
+ options.merge(new_options) do |key, value1, value2|
375
+ key == :headers ? value1.merge(value2) : value2
376
+ end
377
+ end
378
+
339
379
  # Defines some additional attributes for the returned Tempfile (on top of what
340
380
  # OpenURI::Meta already defines).
341
381
  module DownloadedFile
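Note on `merge_options` in the hunk above: the final step relies on `Hash#merge` with a block so that `:headers` is merged key by key while every other option is simply overridden. A standalone sketch of that behavior, using made-up option values (`defaults` and `per_call` are illustrative names, not part of the gem):

```ruby
# Illustrative values only; the merge block is the one used in merge_options.
defaults = { headers: { "User-Agent" => "Down" }, max_redirects: 2 }
per_call = { headers: { "Accept" => "image/png" }, max_redirects: 5 }

merged = defaults.merge(per_call) do |key, value1, value2|
  # The block runs only for keys present in both hashes.
  key == :headers ? value1.merge(value2) : value2
end

p merged
```

Because the block is invoked only on conflicting keys, options that appear in just one of the two hashes pass through unchanged.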
data/lib/down/version.rb CHANGED
@@ -1,5 +1,5 @@
1
1
  # frozen-string-literal: true
2
2
 
3
3
  module Down
4
- VERSION = "5.0.0"
4
+ VERSION = "5.4.0"
5
5
  end
data/lib/down/wget.rb CHANGED
@@ -29,16 +29,16 @@ module Down
29
29
 
30
30
  # Downloads the remote file to disk. Accepts wget command-line options and
31
31
  # some additional options as well.
32
- def download(url, *args, max_size: nil, content_length_proc: nil, progress_proc: nil, destination: nil, **options)
32
+ def download(url, *args, max_size: nil, content_length_proc: nil, progress_proc: nil, destination: nil, extension: nil, **options)
33
33
  io = open(url, *args, **options, rewindable: false)
34
34
 
35
35
  content_length_proc.call(io.size) if content_length_proc && io.size
36
36
 
37
37
  if max_size && io.size && io.size > max_size
38
- raise Down::TooLarge, "file is too large (max is #{max_size/1024/1024}MB)"
38
+ raise Down::TooLarge, "file is too large (#{io.size/1024/1024}MB, max is #{max_size/1024/1024}MB)"
39
39
  end
40
40
 
41
- extname = File.extname(URI(url).path)
41
+ extname = extension ? ".#{extension}" : File.extname(URI(url).path)
42
42
  tempfile = Tempfile.new(["down-wget", extname], binmode: true)
43
43
 
44
44
  until io.eof?
@@ -49,7 +49,7 @@ module Down
49
49
  progress_proc.call(tempfile.size) if progress_proc
50
50
 
51
51
  if max_size && tempfile.size > max_size
52
- raise Down::TooLarge, "file is too large (max is #{max_size/1024/1024}MB)"
52
+ raise Down::TooLarge, "file is too large (#{tempfile.size/1024/1024}MB, max is #{max_size/1024/1024}MB)"
53
53
  end
54
54
  end
55
55
 
@@ -94,7 +94,7 @@ module Down
94
94
  raise Down::Error, "failed to parse response headers"
95
95
  end
96
96
 
97
- headers = parser.headers
97
+ headers = normalize_headers(parser.headers)
98
98
  status = parser.status_code
99
99
 
100
100
  content_length = headers["Content-Length"].to_i if headers["Content-Length"]
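Note on the `extension:` argument added to `download` in this file: when given, it replaces the extension derived from the URL path for the tempfile name. A minimal sketch of that selection, using a hypothetical helper `tempfile_extname` (the helper name and the example URL are assumptions; the expression is the one from the diff):

```ruby
require "tempfile"
require "uri"

# Hypothetical helper wrapping the expression added in the hunk above.
def tempfile_extname(url, extension: nil)
  extension ? ".#{extension}" : File.extname(URI(url).path)
end

url = "http://example.com/image.jpg"

from_url      = tempfile_extname(url)
from_override = tempfile_extname(url, extension: "txt")

# The extension ends up as the suffix of the generated tempfile name.
tempfile = Tempfile.new(["down-wget", from_override], binmode: true)
path = tempfile.path
tempfile.close! # close and delete the file

p from_url, from_override, File.extname(path)
```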
data/lib/down.rb CHANGED
@@ -6,12 +6,12 @@ require "down/net_http"
6
6
  module Down
7
7
  module_function
8
8
 
9
- def download(*args, &block)
10
- backend.download(*args, &block)
9
+ def download(*args, **options, &block)
10
+ backend.download(*args, **options, &block)
11
11
  end
12
12
 
13
- def open(*args, &block)
14
- backend.open(*args, &block)
13
+ def open(*args, **options, &block)
14
+ backend.open(*args, **options, &block)
15
15
  end
16
16
 
17
17
  # Allows setting a backend via a symbol or a downloader object.
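Note on the `*args, **options` change to `Down.download` and `Down.open`: the diff does not state the motivation, but a likely one is that Ruby 3 no longer converts a trailing hash forwarded through `*args` into keyword arguments, so keywords have to be forwarded explicitly. A sketch of the forwarding pattern, with `FakeBackend` and `Forwarder` as hypothetical stand-ins for a backend and the `Down` module:

```ruby
# Hypothetical backend used only for this illustration.
module FakeBackend
  def self.download(url, max_size: nil, **options)
    { url: url, max_size: max_size, options: options }
  end
end

# Hypothetical forwarder mirroring the new Down.download signature.
module Forwarder
  module_function

  def download(*args, **options, &block)
    FakeBackend.download(*args, **options, &block)
  end
end

result = Forwarder.download("http://example.com/file", max_size: 1024, extension: "txt")
p result
```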
metadata CHANGED
@@ -1,14 +1,14 @@
1
1
  --- !ruby/object:Gem::Specification
2
2
  name: down
3
3
  version: !ruby/object:Gem::Version
4
- version: 5.0.0
4
+ version: 5.4.0
5
5
  platform: ruby
6
6
  authors:
7
7
  - Janko Marohnić
8
- autorequire:
8
+ autorequire:
9
9
  bindir: bin
10
10
  cert_chain: []
11
- date: 2019-09-26 00:00:00.000000000 Z
11
+ date: 2022-12-26 00:00:00.000000000 Z
12
12
  dependencies:
13
13
  - !ruby/object:Gem::Dependency
14
14
  name: addressable
@@ -16,14 +16,14 @@ dependencies:
16
16
  requirements:
17
17
  - - "~>"
18
18
  - !ruby/object:Gem::Version
19
- version: '2.5'
19
+ version: '2.8'
20
20
  type: :runtime
21
21
  prerelease: false
22
22
  version_requirements: !ruby/object:Gem::Requirement
23
23
  requirements:
24
24
  - - "~>"
25
25
  - !ruby/object:Gem::Version
26
- version: '2.5'
26
+ version: '2.8'
27
27
  - !ruby/object:Gem::Dependency
28
28
  name: minitest
29
29
  requirement: !ruby/object:Gem::Requirement
@@ -66,20 +66,40 @@ dependencies:
66
66
  - - ">="
67
67
  - !ruby/object:Gem::Version
68
68
  version: '0'
69
+ - !ruby/object:Gem::Dependency
70
+ name: httpx
71
+ requirement: !ruby/object:Gem::Requirement
72
+ requirements:
73
+ - - "~>"
74
+ - !ruby/object:Gem::Version
75
+ version: '0.22'
76
+ - - ">="
77
+ - !ruby/object:Gem::Version
78
+ version: 0.22.2
79
+ type: :development
80
+ prerelease: false
81
+ version_requirements: !ruby/object:Gem::Requirement
82
+ requirements:
83
+ - - "~>"
84
+ - !ruby/object:Gem::Version
85
+ version: '0.22'
86
+ - - ">="
87
+ - !ruby/object:Gem::Version
88
+ version: 0.22.2
69
89
  - !ruby/object:Gem::Dependency
70
90
  name: http
71
91
  requirement: !ruby/object:Gem::Requirement
72
92
  requirements:
73
93
  - - "~>"
74
94
  - !ruby/object:Gem::Version
75
- version: '4.0'
95
+ version: '5.0'
76
96
  type: :development
77
97
  prerelease: false
78
98
  version_requirements: !ruby/object:Gem::Requirement
79
99
  requirements:
80
100
  - - "~>"
81
101
  - !ruby/object:Gem::Version
82
- version: '4.0'
102
+ version: '5.0'
83
103
  - !ruby/object:Gem::Dependency
84
104
  name: posix-spawn
85
105
  requirement: !ruby/object:Gem::Requirement
@@ -109,7 +129,7 @@ dependencies:
109
129
  - !ruby/object:Gem::Version
110
130
  version: '0'
111
131
  - !ruby/object:Gem::Dependency
112
- name: docker-api
132
+ name: warning
113
133
  requirement: !ruby/object:Gem::Requirement
114
134
  requirements:
115
135
  - - ">="
@@ -122,7 +142,7 @@ dependencies:
122
142
  - - ">="
123
143
  - !ruby/object:Gem::Version
124
144
  version: '0'
125
- description:
145
+ description:
126
146
  email:
127
147
  - janko.marohnic@gmail.com
128
148
  executables: []
@@ -138,6 +158,7 @@ files:
138
158
  - lib/down/chunked_io.rb
139
159
  - lib/down/errors.rb
140
160
  - lib/down/http.rb
161
+ - lib/down/httpx.rb
141
162
  - lib/down/net_http.rb
142
163
  - lib/down/utils.rb
143
164
  - lib/down/version.rb
@@ -146,7 +167,7 @@ homepage: https://github.com/janko/down
146
167
  licenses:
147
168
  - MIT
148
169
  metadata: {}
149
- post_install_message:
170
+ post_install_message:
150
171
  rdoc_options: []
151
172
  require_paths:
152
173
  - lib
@@ -154,15 +175,15 @@ required_ruby_version: !ruby/object:Gem::Requirement
154
175
  requirements:
155
176
  - - ">="
156
177
  - !ruby/object:Gem::Version
157
- version: '2.1'
178
+ version: '2.3'
158
179
  required_rubygems_version: !ruby/object:Gem::Requirement
159
180
  requirements:
160
181
  - - ">="
161
182
  - !ruby/object:Gem::Version
162
183
  version: '0'
163
184
  requirements: []
164
- rubygems_version: 3.0.3
165
- signing_key:
185
+ rubygems_version: 3.4.1
186
+ signing_key:
166
187
  specification_version: 4
167
188
  summary: Robust streaming downloads using Net::HTTP, HTTP.rb or wget.
168
189
  test_files: []