down 4.8.1 → 5.2.0
- checksums.yaml +4 -4
- data/CHANGELOG.md +44 -0
- data/README.md +64 -27
- data/down.gemspec +3 -2
- data/lib/down.rb +4 -4
- data/lib/down/backend.rb +4 -4
- data/lib/down/chunked_io.rb +75 -42
- data/lib/down/errors.rb +9 -9
- data/lib/down/http.rb +3 -7
- data/lib/down/net_http.rb +61 -25
- data/lib/down/version.rb +1 -1
- metadata +20 -6
checksums.yaml
CHANGED
@@ -1,7 +1,7 @@
 ---
 SHA256:
-  metadata.gz:
-  data.tar.gz:
+  metadata.gz: d48d573f542ac195a462c3bab0ebe1546c8dd78b0a5210eed1d1fed1f66b674f
+  data.tar.gz: 492b997d3e889475544267d753df5fb900d28728131602e6950a57dfacb7842c
 SHA512:
-  metadata.gz:
-  data.tar.gz:
+  metadata.gz: e0fa81667368033a51588f37ae8a5fca2198ce341268e5c6fc24a8241be1af7ce88cd4bc7e8076f69af41b3e0318a4c8733c28a9d173dcdae750b59a9e2401e6
+  data.tar.gz: 568632e1c75daa84a838675ae49ceb844a4500e781b8303cf3c1259974b1d01a26e3d2420e08d0d544c81ef312dc790cafdd47fcdc2ed6f3a853b91f8a99bd48
data/CHANGELOG.md
CHANGED
@@ -1,3 +1,47 @@
+## 5.2.0 (2020-09-20)
+
+* Add `:uri_normalizer` option to `Down::NetHttp` (@janko)
+
+* Add `:http_basic_authentication` option to `Down::NetHttp#download` (@janko)
+
+* Fix uninitialized instance variables warnings in `Down::ChunkedIO` (@janko)
+
+* Handle unknown HTTP error codes in `Down::NetHttp` (@darndt)
+
+## 5.1.1 (2020-02-04)
+
+* Fix keyword arguments warnings on Ruby 2.7 in `Down.download` and `Down.open` (@janko)
+
+## 5.1.0 (2020-01-09)
+
+* Fix keyword arguments warnings on Ruby 2.7 (@janko)
+
+* Fix `FrozenError` exception in `Down::ChunkedIO#readpartial` (@janko)
+
+* Deprecate passing headers as top-level options in `Down::NetHttp` (@janko)
+
+## 5.0.1 (2019-12-20)
+
+* In `Down::NetHttp` only use Addressable normalization if `URI.parse` fails (@coding-chimp)
+
+## 5.0.0 (2019-09-26)
+
+* Change `ChunkedIO#each_chunk` to return chunks in original encoding (@janko)
+
+* Always return binary strings in `ChunkedIO#readpartial` (@janko)
+
+* Handle frozen chunks in `Down::ChunkedIO` (@janko)
+
+* Change `ChunkedIO#gets` to return lines in specified encoding (@janko)
+
+* Halve memory allocation for `ChunkedIO#gets` (@janko)
+
+* Halve memory allocation for `ChunkedIO#read` without arguments (@janko)
+
+* Drop support for `HTTP::Client` argument in `Down::HTTP.new` (@janko)
+
+* Repurpose `Down::NotFound` to be raised on `404 Not Found` response (@janko)
+
 ## 4.8.1 (2019-05-01)
 
 * Make `ChunkedIO#read`/`#readpartial` with length always return strings in binary encoding (@janko)
data/README.md
CHANGED
@@ -1,13 +1,13 @@
 # Down
 
 Down is a utility tool for streaming, flexible and safe downloading of remote
-files. It can use [open-uri] + `Net::HTTP`, [
+files. It can use [open-uri] + `Net::HTTP`, [http.rb] or `wget` as the backend
 HTTP library.
 
 ## Installation
 
 ```rb
-gem "down", "~>
+gem "down", "~> 5.0"
 ```
 
 ## Downloading
@@ -57,8 +57,12 @@ specific location on disk, you can specify the `:destination` option:
 
 ```rb
 Down.download("http://example.com/image.jpg", destination: "/path/to/destination")
+#=> nil
 ```
 
+In this case `Down.download` won't have any return value, so if you need a File
+object you'll have to create it manually.
+
 ### Basic authentication
 
 `Down.download` and `Down.open` will automatically detect and apply HTTP basic
@@ -103,6 +107,16 @@ remote_file.eof? #=> true
 remote_file.close # closes the HTTP connection and deletes the internal Tempfile
 ```
 
+The following IO methods are implemented:
+
+* `#read` & `#readpartial`
+* `#gets`
+* `#seek`
+* `#pos` & `#tell`
+* `#eof?`
+* `#rewind`
+* `#close`
+
 ### Caching
 
 By default the downloaded content is internally cached into a `Tempfile`, so
@@ -147,10 +161,10 @@ remote_file.data[:headers] #=> { ... }
 remote_file.data[:response] # returns the response object
 ```
 
-Note that `Down::
-status was 4xx or 5xx.
+Note that a `Down::ResponseError` exception will automatically be raised if
+response status was 4xx or 5xx.
 
-###
+### Down::ChunkedIO
 
 The `Down.open` performs HTTP logic and returns an instance of
 `Down::ChunkedIO`. However, `Down::ChunkedIO` is a generic class that can wrap
@@ -196,21 +210,23 @@ the `Down::Error` subclasses. This is Down's exception hierarchy:
 
 * `Down::Error`
 * `Down::TooLarge`
-* `Down::
-
-
-* `Down::
-* `Down::
-
-
-
-
+* `Down::InvalidUrl`
+* `Down::TooManyRedirects`
+* `Down::ResponseError`
+* `Down::ClientError`
+* `Down::NotFound`
+* `Down::ServerError`
+* `Down::ConnectionError`
+* `Down::TimeoutError`
+* `Down::SSLError`
 
 ## Backends
 
-
-
-
+The following backends are available:
+
+* [Down::NetHttp](#downnethttp) (default)
+* [Down::Http](#downhttp)
+* [Down::Wget](#downwget)
 
 You can use the backend directly:
 
@@ -232,7 +248,10 @@ Down.download("...")
 Down.open("...")
 ```
 
-###
+### Down::NetHttp
+
+The `Down::NetHttp` backend implements downloads using [open-uri] and
+[Net::HTTP].
 
 ```rb
 gem "down", "~> 4.4"
@@ -314,6 +333,18 @@ Down::NetHttp.open("http://example.com/image.jpg",
   ssl_verify_mode: OpenSSL::SSL::VERIFY_PEER)
 ```
 
+#### URI normalization
+
+If the URL isn't parseable by `URI.parse`, `Down::NetHttp` will
+attempt to normalize the URL using [Addressable::URI], URI-escaping
+any potentially unescaped characters. You can change the normalizer
+via the `:uri_normalizer` option:
+
+```rb
+# this skips URL normalization
+Down::NetHttp.download("http://example.com/image.jpg", uri_normalizer: -> (url) { url })
+```
+
 #### Additional options
 
 Any additional options passed to `Down.download` will be forwarded to
@@ -334,7 +365,9 @@ net_http.download("http://example.com/image.jpg")
 net_http.open("http://example.com/image.jpg")
 ```
 
-###
+### Down::Http
+
+The `Down::Http` backend implements downloads using the [http.rb] gem.
 
 ```rb
 gem "down", "~> 4.4"
@@ -350,7 +383,7 @@ io = Down::Http.open("http://nature.com/forest.jpg")
 io #=> #<Down::ChunkedIO ...>
 ```
 
-Some features that give the
+Some features that give the http.rb backend an advantage over `open-uri` and
 `Net::HTTP` include:
 
 * Low memory usage (**10x less** than `open-uri`/`Net::HTTP`)
@@ -401,7 +434,10 @@ down = Down::Http.new(method: :post)
 down.download("http://example.org/image.jpg")
 ```
 
-### Wget (experimental)
+### Down::Wget (experimental)
+
+The `Down::Wget` backend implements downloads using the `wget` command line
+utility.
 
 ```rb
 gem "down", "~> 4.4"
@@ -418,9 +454,8 @@ io = Down::Wget.open("http://nature.com/forest.jpg")
 io #=> #<Down::ChunkedIO ...>
 ```
 
-
-
-interrupted due to network failures, which is very useful when you're
+One major advantage of `wget` is that it automatically resumes downloads that
+were interrupted due to network failures, which is very useful when you're
 downloading large files.
 
 However, the Wget backend should still be considered experimental, as it wasn't
@@ -447,10 +482,12 @@ wget.open("http://nature.com/forest.jpg")
 
 ## Supported Ruby versions
 
-* MRI 2.2
 * MRI 2.3
 * MRI 2.4
-*
+* MRI 2.5
+* MRI 2.6
+* MRI 2.7
+* JRuby 9.2
 
 ## Development
 
@@ -469,6 +506,6 @@ you'll need to have Docker installed and running.
 
 [open-uri]: http://ruby-doc.org/stdlib-2.3.0/libdoc/open-uri/rdoc/OpenURI.html
 [Net::HTTP]: https://ruby-doc.org/stdlib-2.4.1/libdoc/net/http/rdoc/Net/HTTP.html
-[
+[http.rb]: https://github.com/httprb/http
 [Addressable::URI]: https://github.com/sporkmonger/addressable
 [kennethreitz/httpbin]: https://github.com/kennethreitz/httpbin
data/down.gemspec
CHANGED
@@ -4,7 +4,7 @@ Gem::Specification.new do |spec|
   spec.name = "down"
   spec.version = Down::VERSION
 
-  spec.required_ruby_version = ">= 2.
+  spec.required_ruby_version = ">= 2.3"
 
   spec.summary = "Robust streaming downloads using Net::HTTP, HTTP.rb or wget."
   spec.homepage = "https://github.com/janko/down"
@@ -20,8 +20,9 @@ Gem::Specification.new do |spec|
   spec.add_development_dependency "minitest", "~> 5.8"
   spec.add_development_dependency "mocha", "~> 1.5"
   spec.add_development_dependency "rake"
-  spec.add_development_dependency "http", "~> 4.
+  spec.add_development_dependency "http", "~> 4.3"
   spec.add_development_dependency "posix-spawn" unless RUBY_ENGINE == "jruby"
   spec.add_development_dependency "http_parser.rb"
   spec.add_development_dependency "docker-api"
+  spec.add_development_dependency "warning" if RUBY_VERSION >= "2.4"
 end
data/lib/down.rb
CHANGED
@@ -6,12 +6,12 @@ require "down/net_http"
 module Down
   module_function
 
-  def download(*args, &block)
-    backend.download(*args, &block)
+  def download(*args, **options, &block)
+    backend.download(*args, **options, &block)
   end
 
-  def open(*args, &block)
-    backend.open(*args, &block)
+  def open(*args, **options, &block)
+    backend.open(*args, **options, &block)
   end
 
   # Allows setting a backend via a symbol or a downloader object.
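Review note: the move from `(*args, &block)` to `(*args, **options, &block)` lines up with the 5.1.1 changelog entry about keyword-argument warnings on Ruby 2.7. A minimal sketch of the same delegation pattern follows; `FakeBackend` and `Facade` are hypothetical names used only for illustration and are not part of the gem.

```ruby
# Hypothetical stand-ins for illustration only; not part of the down gem.
class FakeBackend
  def download(url, max_size: nil, **rest)
    { url: url, max_size: max_size, rest: rest }
  end
end

module Facade
  module_function

  # Forward positional and keyword arguments separately, so keywords are
  # passed on as keywords rather than as a trailing positional hash.
  def download(*args, **options, &block)
    FakeBackend.new.download(*args, **options, &block)
  end
end

result = Facade.download("http://example.com/image.jpg", max_size: 5 * 1024 * 1024)
```

Capturing `**options` explicitly keeps the keywords intact at every hop of the delegation chain.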
data/lib/down/backend.rb
CHANGED
@@ -9,12 +9,12 @@ require "fileutils"
 
 module Down
   class Backend
-    def self.download(*args, &block)
-      new.download(*args, &block)
+    def self.download(*args, **options, &block)
+      new.download(*args, **options, &block)
     end
 
-    def self.open(*args, &block)
-      new.open(*args, &block)
+    def self.open(*args, **options, &block)
+      new.open(*args, **options, &block)
     end
 
     private
data/lib/down/chunked_io.rb
CHANGED
@@ -36,6 +36,8 @@ module Down
       @rewindable = rewindable
       @buffer = nil
       @position = 0
+      @next_chunk = nil
+      @closed = false
 
       retrieve_chunk # fetch first chunk so that we know whether the file is empty
     end
@@ -63,21 +65,20 @@ module Down
     def read(length = nil, outbuf = nil)
       fail IOError, "closed stream" if closed?
 
-
+      data = outbuf.clear.force_encoding(Encoding::BINARY) if outbuf
+      data ||= "".b
 
-
-        data = readpartial(remaining_length, outbuf)
-        data = data.dup unless outbuf
-        remaining_length = length - data.bytesize if length
-      rescue EOFError
-      end
+      remaining_length = length
 
       until remaining_length == 0 || eof?
-        data << readpartial(remaining_length)
+        data << readpartial(remaining_length, buffer ||= String.new)
         remaining_length = length - data.bytesize if length
       end
 
-
+      buffer.clear if buffer # deallocate string
+
+      data.force_encoding(@encoding) unless length
+      data unless data.empty? && length && length > 0
     end
 
     # Implements IO#gets semantics. Without arguments it retrieves lines of
@@ -108,27 +109,33 @@ module Down
 
       separator = "\n\n" if separator.empty?
 
-
-        data = readpartial(limit)
+      data = String.new
 
-
-
-
-
+      until data.include?(separator) || data.bytesize == limit || eof?
+        remaining_length = limit - data.bytesize if limit
+        data << readpartial(remaining_length, buffer ||= String.new)
+      end
+
+      buffer.clear if buffer # deallocate buffer
+
+      line, extra = data.split(separator, 2)
+      line << separator if data.include?(separator)
 
-
-        line << separator if data.include?(separator)
+      data.clear # deallocate data
 
+      if extra
         if cache
-          cache.pos -= extra.
+          cache.pos -= extra.bytesize
         else
-
+          if @buffer
+            @buffer.prepend(extra)
+          else
+            @buffer = extra
+          end
         end
-      rescue EOFError
-        line = nil
       end
 
-      line
+      line.force_encoding(@encoding) if line
     end
 
     # Implements IO#readpartial semantics. If there is any content readily
@@ -139,33 +146,33 @@ module Down
     # or the next chunk. This is useful when you don't care about the size of
     # chunks and you want to minimize string allocations.
     #
-    # With `
+    # With `maxlen` argument returns maximum of that amount of bytes (default
+    # is 16KB).
     #
     # With `outbuf` argument each call will return that same string object,
     # where the value is replaced with retrieved content.
     #
     # Raises EOFError if end of file is reached. Raises IOError if closed.
-    def readpartial(
+    def readpartial(maxlen = nil, outbuf = nil)
       fail IOError, "closed stream" if closed?
 
-
+      maxlen ||= 16*1024
 
-
+      data = cache.read(maxlen, outbuf) if cache && !cache.eof?
+      data ||= outbuf.clear.force_encoding(Encoding::BINARY) if outbuf
+      data ||= "".b
 
-      if
-        data = cache.read(length, outbuf)
-        data.force_encoding(@encoding)
-      end
+      return data if maxlen == 0
 
-      if @buffer.nil? &&
+      if @buffer.nil? && data.empty?
         fail EOFError, "end of file reached" if chunks_depleted?
         @buffer = retrieve_chunk
       end
 
-      remaining_length =
+      remaining_length = maxlen - data.bytesize
 
       unless @buffer.nil? || remaining_length == 0
-        if remaining_length
+        if remaining_length < @buffer.bytesize
           buffered_data = @buffer.byteslice(0, remaining_length)
           @buffer = @buffer.byteslice(remaining_length..-1)
         else
@@ -173,21 +180,46 @@ module Down
           @buffer = nil
         end
 
-
-          data << buffered_data
-        else
-          data = buffered_data
-        end
+        data << buffered_data
 
         cache.write(buffered_data) if cache
 
-        buffered_data.clear unless buffered_data.
+        buffered_data.clear unless buffered_data.frozen?
       end
 
       @position += data.bytesize
 
-      data.force_encoding(Encoding::BINARY)
-
+      data.force_encoding(Encoding::BINARY)
+    end
+
+    # Implements IO#seek semantics.
+    def seek(amount, whence = IO::SEEK_SET)
+      fail Errno::ESPIPE, "Illegal seek" if cache.nil?
+
+      case whence
+      when IO::SEEK_SET, :SET
+        target_pos = amount
+      when IO::SEEK_CUR, :CUR
+        target_pos = @position + amount
+      when IO::SEEK_END, :END
+        unless chunks_depleted?
+          cache.seek(0, IO::SEEK_END)
+          IO.copy_stream(self, File::NULL)
+        end
+
+        target_pos = cache.size + amount
+      else
+        fail ArgumentError, "invalid whence: #{whence.inspect}"
+      end
+
+      if target_pos <= cache.size
+        cache.seek(target_pos)
+      else
+        cache.seek(0, IO::SEEK_END)
+        IO.copy_stream(self, File::NULL, target_pos - cache.size)
+      end
+
+      @position = cache.pos
     end
 
     # Implements IO#pos semantics. Returns the current position of the
@@ -195,6 +227,7 @@ module Down
     def pos
       @position
     end
+    alias tell pos
 
     # Implements IO#eof? semantics. Returns whether we've reached end of file.
     # It returns true if cache is at the end and there is no more content to
@@ -272,7 +305,7 @@ module Down
     def retrieve_chunk
       chunk = @next_chunk
       @next_chunk = chunks_fiber.resume
-      chunk
+      chunk
     end
 
     # Returns whether there is any content left to retrieve.
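Review note: the rewritten `#gets` accumulates data until a separator shows up, splits once, and pushes the remainder back for the next read. The splitting step can be sketched in isolation as below. This is a simplified illustration, not the gem's code: `leftover` is a hypothetical local standing in for the `@buffer` instance variable, and the cache branch is left out.

```ruby
# Simplified illustration of the line-splitting step in the new ChunkedIO#gets.
# `leftover` is a hypothetical stand-in for the @buffer instance variable.
separator = "\n"
data      = String.new("first line\nsecond line\nthi")
leftover  = nil

# Split at the first separator only; split drops it, so it is re-appended.
line, extra = data.split(separator, 2)
line << separator if data.include?(separator)

# Anything after the first separator is pushed back for the next read.
if extra
  if leftover
    leftover.prepend(extra)
  else
    leftover = extra
  end
end
```

Guarding on `extra` means nothing is prepended when no separator was found, so the `to_s` conversions and the `rescue EOFError` of the old version are no longer needed.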
data/lib/down/errors.rb
CHANGED
@@ -7,20 +7,17 @@ module Down
   # raised when the file is larger than the specified maximum size
   class TooLarge < Error; end
 
-  # raised when the file failed to be retrieved for whatever reason
-  class NotFound < Error; end
-
   # raised when the given URL couldn't be parsed
-  class InvalidUrl <
+  class InvalidUrl < Error; end
 
   # raised when the number of redirects was larger than the specified maximum
-  class TooManyRedirects <
+  class TooManyRedirects < Error; end
 
   # raised when response returned 4xx or 5xx response
-  class ResponseError <
+  class ResponseError < Error
     attr_reader :response
 
-    def initialize(message, response
+    def initialize(message, response = nil)
       super(message)
       @response = response
     end
@@ -29,15 +26,18 @@ module Down
   # raised when response returned 4xx response
   class ClientError < ResponseError; end
 
+  # raised when response returned 404 response
+  class NotFound < ClientError; end
+
   # raised when response returned 5xx response
   class ServerError < ResponseError; end
 
   # raised when there was an error connecting to the server
-  class ConnectionError <
+  class ConnectionError < Error; end
 
   # raised when connecting to the server too longer than the specified timeout
   class TimeoutError < ConnectionError; end
 
   # raised when an SSL error was raised
-  class SSLError <
+  class SSLError < Error; end
 end
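Review note: with `NotFound` now a subclass of `ClientError`, code that rescues `ClientError` (or `ResponseError`) also catches 404 responses. The sketch below mirrors the class definitions from this file under a hypothetical `DownSketch` module so it runs without the gem. The `Error < StandardError` base and the `error_class_for` helper are assumptions made for illustration; the helper only imitates the branch order used by the backends' `response_error!` methods.

```ruby
# Stand-in module mirroring the class definitions in errors.rb;
# `DownSketch` is a hypothetical name so the example runs without the gem.
module DownSketch
  class Error < StandardError; end # assumed base class

  class ResponseError < Error
    attr_reader :response

    def initialize(message, response = nil)
      super(message)
      @response = response
    end
  end

  class ClientError < ResponseError; end
  class NotFound < ClientError; end
  class ServerError < ResponseError; end
end

# Hypothetical helper: same branch order as response_error!, where 404 has
# to be matched before the broader 400..499 range.
def error_class_for(code)
  case code
  when 404      then DownSketch::NotFound
  when 400..499 then DownSketch::ClientError
  when 500..599 then DownSketch::ServerError
  else               DownSketch::ResponseError
  end
end

rescued =
  begin
    raise error_class_for(404).new("404 Not Found")
  rescue DownSketch::ClientError => error
    error
  end
```

Because `response` now defaults to `nil`, the error can be raised with a message alone, as the sketch does.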
data/lib/down/http.rb
CHANGED
@@ -12,12 +12,7 @@ module Down
   # Provides streaming downloads implemented with HTTP.rb.
   class Http < Backend
     # Initializes the backend with common defaults.
-    def initialize(options
-      if options.is_a?(HTTP::Client)
-        warn "[Down] Passing an HTTP::Client object to Down::Http#initialize is deprecated and won't be supported in Down 5. Use the block initialization instead."
-        options = options.default_options.to_hash
-      end
-
+    def initialize(**options, &block)
       @method = options.delete(:method) || :get
       @client = HTTP
         .headers("User-Agent" => "Down/#{Down::VERSION}")
@@ -111,9 +106,10 @@ module Down
 
     # Raises non-sucessful response as a Down::ResponseError.
     def response_error!(response)
-      args = [response.status.to_s, response
+      args = [response.status.to_s, response]
 
       case response.code
+      when 404 then raise Down::NotFound.new(*args)
       when 400..499 then raise Down::ClientError.new(*args)
       when 500..599 then raise Down::ServerError.new(*args)
       else raise Down::ResponseError.new(*args)
data/lib/down/net_http.rb
CHANGED
@@ -12,27 +12,34 @@ require "fileutils"
 module Down
   # Provides streaming downloads implemented with Net::HTTP and open-uri.
   class NetHttp < Backend
+    URI_NORMALIZER = -> (url) do
+      addressable_uri = Addressable::URI.parse(url)
+      addressable_uri.normalize.to_s
+    end
+
     # Initializes the backend with common defaults.
-    def initialize(options
-      @options = {
-        "User-Agent" => "Down/#{Down::VERSION}",
-        max_redirects:
-        open_timeout:
-        read_timeout:
-
+    def initialize(*args, **options)
+      @options = merge_options({
+        headers: { "User-Agent" => "Down/#{Down::VERSION}" },
+        max_redirects: 2,
+        open_timeout: 30,
+        read_timeout: 30,
+        uri_normalizer: URI_NORMALIZER,
+      }, *args, **options)
     end
 
     # Downloads a remote file to disk using open-uri. Accepts any open-uri
     # options, and a few more.
-    def download(url, options
-      options = @options
+    def download(url, *args, **options)
+      options = merge_options(@options, *args, **options)
 
       max_size = options.delete(:max_size)
       max_redirects = options.delete(:max_redirects)
       progress_proc = options.delete(:progress_proc)
       content_length_proc = options.delete(:content_length_proc)
       destination = options.delete(:destination)
-      headers = options.delete(:headers)
+      headers = options.delete(:headers)
+      uri_normalizer = options.delete(:uri_normalizer)
 
       # Use open-uri's :content_lenth_proc or :progress_proc to raise an
       # exception early if the file is too large.
@@ -74,7 +81,7 @@ module Down
       open_uri_options.merge!(options)
       open_uri_options.merge!(headers)
 
-      uri = ensure_uri(
+      uri = ensure_uri(normalize_uri(url, uri_normalizer: uri_normalizer))
 
       # Handle basic authentication in the remote URL.
       if uri.user || uri.password
@@ -95,13 +102,17 @@ module Down
 
     # Starts retrieving the remote file using Net::HTTP and returns an IO-like
     # object which downloads the response body on-demand.
-    def open(url, options
-
-
+    def open(url, *args, **options)
+      options = merge_options(@options, *args, **options)
+
+      max_redirects = options.delete(:max_redirects)
+      uri_normalizer = options.delete(:uri_normalizer)
+
+      uri = ensure_uri(normalize_uri(url, uri_normalizer: uri_normalizer))
 
       # Create a Fiber that halts when response headers are received.
       request = Fiber.new do
-        net_http_request(uri, options) do |response|
+        net_http_request(uri, options, follows_remaining: max_redirects) do |response|
           Fiber.yield response
         end
       end
@@ -131,7 +142,7 @@ module Down
     private
 
     # Calls open-uri's URI::HTTP#open method. Additionally handles redirects.
-    def open_uri(uri, options, follows_remaining:
+    def open_uri(uri, options, follows_remaining:)
       uri.open(options)
     rescue OpenURI::HTTPRedirect => exception
       raise Down::TooManyRedirects, "too many redirects" if follows_remaining == 0
@@ -186,7 +197,7 @@ module Down
     end
 
     # Makes a Net::HTTP request and follows redirects.
-    def net_http_request(uri, options, follows_remaining
+    def net_http_request(uri, options, follows_remaining:, &block)
       http, request = create_net_http(uri, options)
 
       begin
@@ -251,12 +262,13 @@ module Down
       http.read_timeout = options[:read_timeout] if options.key?(:read_timeout)
       http.open_timeout = options[:open_timeout] if options.key?(:open_timeout)
 
-      headers = options.
-      headers.merge!(options[:headers]) if options[:headers]
+      headers = options[:headers].to_h
       headers["Accept-Encoding"] = "" # Net::HTTP's inflater causes FiberErrors
 
       get = Net::HTTP::Get.new(uri.request_uri, headers)
-
+
+      user, password = options[:http_basic_authentication] || [uri.user, uri.password]
+      get.basic_auth(user, password) if user || password
 
       [http, get]
     end
@@ -284,9 +296,10 @@ module Down
     end
 
     # Makes sure that the URL is properly encoded.
-    def
-
-
+    def normalize_uri(url, uri_normalizer:)
+      URI(url)
+    rescue URI::InvalidURIError
+      uri_normalizer.call(url)
     end
 
     # When open-uri raises an exception, it doesn't expose the response object.
@@ -295,7 +308,11 @@ module Down
     def rebuild_response_from_open_uri_exception(exception)
       code, message = exception.io.status
 
-      response_class = Net::HTTPResponse::CODE_TO_OBJ.fetch(code)
+      response_class = Net::HTTPResponse::CODE_TO_OBJ.fetch(code) do |code|
+        Net::HTTPResponse::CODE_CLASS_TO_OBJ.fetch(code[0]) do
+          Net::HTTPUnknownResponse
+        end
+      end
       response = response_class.new(nil, code, message)
 
       exception.io.metas.each do |name, values|
@@ -310,9 +327,10 @@ module Down
       code = response.code.to_i
       message = response.message.split(" ").map(&:capitalize).join(" ")
 
-      args = ["#{code} #{message}", response
+      args = ["#{code} #{message}", response]
 
       case response.code.to_i
+      when 404 then raise Down::NotFound.new(*args)
       when 400..499 then raise Down::ClientError.new(*args)
       when 500..599 then raise Down::ServerError.new(*args)
       else raise Down::ResponseError.new(*args)
@@ -335,6 +353,24 @@ module Down
       end
     end
 
+    # Merge default and ad-hoc options, merging nested headers.
+    def merge_options(options, headers = {}, **new_options)
+      # Deprecate passing headers as top-level options, taking into account
+      # that Ruby 2.7+ accepts kwargs with string keys.
+      if headers.any?
+        warn %([Down::NetHttp] Passing headers as top-level options has been deprecated, use the :headers option instead, e.g: `Down::NetHttp.download(headers: { "Key" => "Value", ... }, ...)`)
+        new_options[:headers] = headers
+      elsif new_options.any? { |key, value| key.is_a?(String) }
+        warn %([Down::NetHttp] Passing headers as top-level options has been deprecated, use the :headers option instead, e.g: `Down::NetHttp.download(headers: { "Key" => "Value", ... }, ...)`)
+        new_options[:headers] = new_options.select { |key, value| key.is_a?(String) }
+        new_options.reject! { |key, value| key.is_a?(String) }
+      end
+
+      options.merge(new_options) do |key, value1, value2|
+        key == :headers ? value1.merge(value2) : value2
+      end
+    end
+
     # Defines some additional attributes for the returned Tempfile (on top of what
     # OpenURI::Meta already defines).
     module DownloadedFile
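Review note: two of the changes in this file are easy to try in isolation. The sketch below first reproduces the nested-header merge at the end of `merge_options`, leaving out the deprecation branches, and then the fallback lookup for status codes that `Net::HTTP` has no dedicated class for. The method names `merge_nested_headers` and `response_class_for` and the sample option values are hypothetical, chosen for illustration; the `Net::HTTPResponse` constants come from Ruby's standard library.

```ruby
require "net/http"

# Standalone copy of the header-merging step from merge_options, without the
# deprecation branches; `merge_nested_headers` is a hypothetical name.
def merge_nested_headers(options, new_options)
  options.merge(new_options) do |key, value1, value2|
    key == :headers ? value1.merge(value2) : value2
  end
end

defaults = { headers: { "User-Agent" => "Down/5.2.0" }, max_redirects: 2 }
merged   = merge_nested_headers(defaults, { headers: { "Accept" => "image/*" }, max_redirects: 0 })

# Fallback lookup: exact code first, then the class of the first digit, then
# Net::HTTPUnknownResponse. `response_class_for` is a hypothetical name.
def response_class_for(code)
  Net::HTTPResponse::CODE_TO_OBJ.fetch(code) do |unknown|
    Net::HTTPResponse::CODE_CLASS_TO_OBJ.fetch(unknown[0]) do
      Net::HTTPUnknownResponse
    end
  end
end
```

The merge keeps the default `User-Agent` while adding per-call headers, whereas every other key is simply overridden by the newer value.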
data/lib/down/version.rb
CHANGED
metadata
CHANGED
@@ -1,14 +1,14 @@
 --- !ruby/object:Gem::Specification
 name: down
 version: !ruby/object:Gem::Version
-  version:
+  version: 5.2.0
 platform: ruby
 authors:
 - Janko Marohnić
 autorequire:
 bindir: bin
 cert_chain: []
-date:
+date: 2020-09-20 00:00:00.000000000 Z
 dependencies:
 - !ruby/object:Gem::Dependency
   name: addressable
@@ -72,14 +72,14 @@ dependencies:
     requirements:
     - - "~>"
       - !ruby/object:Gem::Version
-        version: '4.
+        version: '4.3'
   type: :development
   prerelease: false
   version_requirements: !ruby/object:Gem::Requirement
     requirements:
     - - "~>"
       - !ruby/object:Gem::Version
-        version: '4.
+        version: '4.3'
 - !ruby/object:Gem::Dependency
   name: posix-spawn
   requirement: !ruby/object:Gem::Requirement
@@ -122,6 +122,20 @@ dependencies:
     - - ">="
       - !ruby/object:Gem::Version
         version: '0'
+- !ruby/object:Gem::Dependency
+  name: warning
+  requirement: !ruby/object:Gem::Requirement
+    requirements:
+    - - ">="
+      - !ruby/object:Gem::Version
+        version: '0'
+  type: :development
+  prerelease: false
+  version_requirements: !ruby/object:Gem::Requirement
+    requirements:
+    - - ">="
+      - !ruby/object:Gem::Version
+        version: '0'
 description:
 email:
 - janko.marohnic@gmail.com
@@ -154,14 +168,14 @@ required_ruby_version: !ruby/object:Gem::Requirement
   requirements:
   - - ">="
     - !ruby/object:Gem::Version
-      version: '2.
+      version: '2.3'
 required_rubygems_version: !ruby/object:Gem::Requirement
   requirements:
   - - ">="
     - !ruby/object:Gem::Version
       version: '0'
 requirements: []
-rubygems_version: 3.
+rubygems_version: 3.1.1
 signing_key:
 specification_version: 4
 summary: Robust streaming downloads using Net::HTTP, HTTP.rb or wget.