down 4.8.1 → 5.2.0
- checksums.yaml +4 -4
- data/CHANGELOG.md +44 -0
- data/README.md +64 -27
- data/down.gemspec +3 -2
- data/lib/down.rb +4 -4
- data/lib/down/backend.rb +4 -4
- data/lib/down/chunked_io.rb +75 -42
- data/lib/down/errors.rb +9 -9
- data/lib/down/http.rb +3 -7
- data/lib/down/net_http.rb +61 -25
- data/lib/down/version.rb +1 -1
- metadata +20 -6
checksums.yaml
CHANGED
|
@@ -1,7 +1,7 @@
|
|
|
1
1
|
---
|
|
2
2
|
SHA256:
|
|
3
|
-
metadata.gz:
|
|
4
|
-
data.tar.gz:
|
|
3
|
+
metadata.gz: d48d573f542ac195a462c3bab0ebe1546c8dd78b0a5210eed1d1fed1f66b674f
|
|
4
|
+
data.tar.gz: 492b997d3e889475544267d753df5fb900d28728131602e6950a57dfacb7842c
|
|
5
5
|
SHA512:
|
|
6
|
-
metadata.gz:
|
|
7
|
-
data.tar.gz:
|
|
6
|
+
metadata.gz: e0fa81667368033a51588f37ae8a5fca2198ce341268e5c6fc24a8241be1af7ce88cd4bc7e8076f69af41b3e0318a4c8733c28a9d173dcdae750b59a9e2401e6
|
|
7
|
+
data.tar.gz: 568632e1c75daa84a838675ae49ceb844a4500e781b8303cf3c1259974b1d01a26e3d2420e08d0d544c81ef312dc790cafdd47fcdc2ed6f3a853b91f8a99bd48
|
data/CHANGELOG.md
CHANGED
|
@@ -1,3 +1,47 @@
|
|
|
1
|
+
## 5.2.0 (2020-09-20)
|
|
2
|
+
|
|
3
|
+
* Add `:uri_normalizer` option to `Down::NetHttp` (@janko)
|
|
4
|
+
|
|
5
|
+
* Add `:http_basic_authentication` option to `Down::NetHttp#download` (@janko)
|
|
6
|
+
|
|
7
|
+
* Fix uninitialized instance variables warnings in `Down::ChunkedIO` (@janko)
|
|
8
|
+
|
|
9
|
+
* Handle unknown HTTP error codes in `Down::NetHttp` (@darndt)
|
|
10
|
+
|
|
11
|
+
## 5.1.1 (2020-02-04)
|
|
12
|
+
|
|
13
|
+
* Fix keyword arguments warnings on Ruby 2.7 in `Down.download` and `Down.open` (@janko)
|
|
14
|
+
|
|
15
|
+
## 5.1.0 (2020-01-09)
|
|
16
|
+
|
|
17
|
+
* Fix keyword arguments warnings on Ruby 2.7 (@janko)
|
|
18
|
+
|
|
19
|
+
* Fix `FrozenError` exception in `Down::ChunkedIO#readpartial` (@janko)
|
|
20
|
+
|
|
21
|
+
* Deprecate passing headers as top-level options in `Down::NetHttp` (@janko)
|
|
22
|
+
|
|
23
|
+
## 5.0.1 (2019-12-20)
|
|
24
|
+
|
|
25
|
+
* In `Down::NetHttp` only use Addressable normalization if `URI.parse` fails (@coding-chimp)
|
|
26
|
+
|
|
27
|
+
## 5.0.0 (2019-09-26)
|
|
28
|
+
|
|
29
|
+
* Change `ChunkedIO#each_chunk` to return chunks in original encoding (@janko)
|
|
30
|
+
|
|
31
|
+
* Always return binary strings in `ChunkedIO#readpartial` (@janko)
|
|
32
|
+
|
|
33
|
+
* Handle frozen chunks in `Down::ChunkedIO` (@janko)
|
|
34
|
+
|
|
35
|
+
* Change `ChunkedIO#gets` to return lines in specified encoding (@janko)
|
|
36
|
+
|
|
37
|
+
* Halve memory allocation for `ChunkedIO#gets` (@janko)
|
|
38
|
+
|
|
39
|
+
* Halve memory allocation for `ChunkedIO#read` without arguments (@janko)
|
|
40
|
+
|
|
41
|
+
* Drop support for `HTTP::Client` argument in `Down::HTTP.new` (@janko)
|
|
42
|
+
|
|
43
|
+
* Repurpose `Down::NotFound` to be raised on `404 Not Found` response (@janko)
|
|
44
|
+
|
|
1
45
|
## 4.8.1 (2019-05-01)
|
|
2
46
|
|
|
3
47
|
* Make `ChunkedIO#read`/`#readpartial` with length always return strings in binary encoding (@janko)
|
data/README.md
CHANGED
|
@@ -1,13 +1,13 @@
|
|
|
1
1
|
# Down
|
|
2
2
|
|
|
3
3
|
Down is a utility tool for streaming, flexible and safe downloading of remote
|
|
4
|
-
files. It can use [open-uri] + `Net::HTTP`, [
|
|
4
|
+
files. It can use [open-uri] + `Net::HTTP`, [http.rb] or `wget` as the backend
|
|
5
5
|
HTTP library.
|
|
6
6
|
|
|
7
7
|
## Installation
|
|
8
8
|
|
|
9
9
|
```rb
|
|
10
|
-
gem "down", "~>
|
|
10
|
+
gem "down", "~> 5.0"
|
|
11
11
|
```
|
|
12
12
|
|
|
13
13
|
## Downloading
|
|
@@ -57,8 +57,12 @@ specific location on disk, you can specify the `:destination` option:
|
|
|
57
57
|
|
|
58
58
|
```rb
|
|
59
59
|
Down.download("http://example.com/image.jpg", destination: "/path/to/destination")
|
|
60
|
+
#=> nil
|
|
60
61
|
```
|
|
61
62
|
|
|
63
|
+
In this case `Down.download` won't have any return value, so if you need a File
|
|
64
|
+
object you'll have to create it manually.
|
|
65
|
+
|
|
62
66
|
### Basic authentication
|
|
63
67
|
|
|
64
68
|
`Down.download` and `Down.open` will automatically detect and apply HTTP basic
|
|
@@ -103,6 +107,16 @@ remote_file.eof? #=> true
|
|
|
103
107
|
remote_file.close # closes the HTTP connection and deletes the internal Tempfile
|
|
104
108
|
```
|
|
105
109
|
|
|
110
|
+
The following IO methods are implemented:
|
|
111
|
+
|
|
112
|
+
* `#read` & `#readpartial`
|
|
113
|
+
* `#gets`
|
|
114
|
+
* `#seek`
|
|
115
|
+
* `#pos` & `#tell`
|
|
116
|
+
* `#eof?`
|
|
117
|
+
* `#rewind`
|
|
118
|
+
* `#close`
|
|
119
|
+
|
|
106
120
|
### Caching
|
|
107
121
|
|
|
108
122
|
By default the downloaded content is internally cached into a `Tempfile`, so
|
|
@@ -147,10 +161,10 @@ remote_file.data[:headers] #=> { ... }
|
|
|
147
161
|
remote_file.data[:response] # returns the response object
|
|
148
162
|
```
|
|
149
163
|
|
|
150
|
-
Note that `Down::
|
|
151
|
-
status was 4xx or 5xx.
|
|
164
|
+
Note that a `Down::ResponseError` exception will automatically be raised if
|
|
165
|
+
response status was 4xx or 5xx.
|
|
152
166
|
|
|
153
|
-
###
|
|
167
|
+
### Down::ChunkedIO
|
|
154
168
|
|
|
155
169
|
The `Down.open` performs HTTP logic and returns an instance of
|
|
156
170
|
`Down::ChunkedIO`. However, `Down::ChunkedIO` is a generic class that can wrap
|
|
@@ -196,21 +210,23 @@ the `Down::Error` subclasses. This is Down's exception hierarchy:
|
|
|
196
210
|
|
|
197
211
|
* `Down::Error`
|
|
198
212
|
* `Down::TooLarge`
|
|
199
|
-
* `Down::
|
|
200
|
-
|
|
201
|
-
|
|
202
|
-
* `Down::
|
|
203
|
-
* `Down::
|
|
204
|
-
|
|
205
|
-
|
|
206
|
-
|
|
207
|
-
|
|
213
|
+
* `Down::InvalidUrl`
|
|
214
|
+
* `Down::TooManyRedirects`
|
|
215
|
+
* `Down::ResponseError`
|
|
216
|
+
* `Down::ClientError`
|
|
217
|
+
* `Down::NotFound`
|
|
218
|
+
* `Down::ServerError`
|
|
219
|
+
* `Down::ConnectionError`
|
|
220
|
+
* `Down::TimeoutError`
|
|
221
|
+
* `Down::SSLError`
|
|
208
222
|
|
|
209
223
|
## Backends
|
|
210
224
|
|
|
211
|
-
|
|
212
|
-
|
|
213
|
-
|
|
225
|
+
The following backends are available:
|
|
226
|
+
|
|
227
|
+
* [Down::NetHttp](#downnethttp) (default)
|
|
228
|
+
* [Down::Http](#downhttp)
|
|
229
|
+
* [Down::Wget](#downwget)
|
|
214
230
|
|
|
215
231
|
You can use the backend directly:
|
|
216
232
|
|
|
@@ -232,7 +248,10 @@ Down.download("...")
|
|
|
232
248
|
Down.open("...")
|
|
233
249
|
```
|
|
234
250
|
|
|
235
|
-
###
|
|
251
|
+
### Down::NetHttp
|
|
252
|
+
|
|
253
|
+
The `Down::NetHttp` backend implements downloads using [open-uri] and
|
|
254
|
+
[Net::HTTP].
|
|
236
255
|
|
|
237
256
|
```rb
|
|
238
257
|
gem "down", "~> 4.4"
|
|
@@ -314,6 +333,18 @@ Down::NetHttp.open("http://example.com/image.jpg",
|
|
|
314
333
|
ssl_verify_mode: OpenSSL::SSL::VERIFY_PEER)
|
|
315
334
|
```
|
|
316
335
|
|
|
336
|
+
#### URI normalization
|
|
337
|
+
|
|
338
|
+
If the URL isn't parseable by `URI.parse`, `Down::NetHttp` will
|
|
339
|
+
attempt to normalize the URL using [Addressable::URI], URI-escaping
|
|
340
|
+
any potentially unescaped characters. You can change the normalizer
|
|
341
|
+
via the `:uri_normalizer` option:
|
|
342
|
+
|
|
343
|
+
```rb
|
|
344
|
+
# this skips URL normalization
|
|
345
|
+
Down::NetHttp.download("http://example.com/image.jpg", uri_normalizer: -> (url) { url })
|
|
346
|
+
```
|
|
347
|
+
|
|
317
348
|
#### Additional options
|
|
318
349
|
|
|
319
350
|
Any additional options passed to `Down.download` will be forwarded to
|
|
@@ -334,7 +365,9 @@ net_http.download("http://example.com/image.jpg")
|
|
|
334
365
|
net_http.open("http://example.com/image.jpg")
|
|
335
366
|
```
|
|
336
367
|
|
|
337
|
-
###
|
|
368
|
+
### Down::Http
|
|
369
|
+
|
|
370
|
+
The `Down::Http` backend implements downloads using the [http.rb] gem.
|
|
338
371
|
|
|
339
372
|
```rb
|
|
340
373
|
gem "down", "~> 4.4"
|
|
@@ -350,7 +383,7 @@ io = Down::Http.open("http://nature.com/forest.jpg")
|
|
|
350
383
|
io #=> #<Down::ChunkedIO ...>
|
|
351
384
|
```
|
|
352
385
|
|
|
353
|
-
Some features that give the
|
|
386
|
+
Some features that give the http.rb backend an advantage over `open-uri` and
|
|
354
387
|
`Net::HTTP` include:
|
|
355
388
|
|
|
356
389
|
* Low memory usage (**10x less** than `open-uri`/`Net::HTTP`)
|
|
@@ -401,7 +434,10 @@ down = Down::Http.new(method: :post)
|
|
|
401
434
|
down.download("http://example.org/image.jpg")
|
|
402
435
|
```
|
|
403
436
|
|
|
404
|
-
### Wget (experimental)
|
|
437
|
+
### Down::Wget (experimental)
|
|
438
|
+
|
|
439
|
+
The `Down::Wget` backend implements downloads using the `wget` command line
|
|
440
|
+
utility.
|
|
405
441
|
|
|
406
442
|
```rb
|
|
407
443
|
gem "down", "~> 4.4"
|
|
@@ -418,9 +454,8 @@ io = Down::Wget.open("http://nature.com/forest.jpg")
|
|
|
418
454
|
io #=> #<Down::ChunkedIO ...>
|
|
419
455
|
```
|
|
420
456
|
|
|
421
|
-
|
|
422
|
-
|
|
423
|
-
interrupted due to network failures, which is very useful when you're
|
|
457
|
+
One major advantage of `wget` is that it automatically resumes downloads that
|
|
458
|
+
were interrupted due to network failures, which is very useful when you're
|
|
424
459
|
downloading large files.
|
|
425
460
|
|
|
426
461
|
However, the Wget backend should still be considered experimental, as it wasn't
|
|
@@ -447,10 +482,12 @@ wget.open("http://nature.com/forest.jpg")
|
|
|
447
482
|
|
|
448
483
|
## Supported Ruby versions
|
|
449
484
|
|
|
450
|
-
* MRI 2.2
|
|
451
485
|
* MRI 2.3
|
|
452
486
|
* MRI 2.4
|
|
453
|
-
*
|
|
487
|
+
* MRI 2.5
|
|
488
|
+
* MRI 2.6
|
|
489
|
+
* MRI 2.7
|
|
490
|
+
* JRuby 9.2
|
|
454
491
|
|
|
455
492
|
## Development
|
|
456
493
|
|
|
@@ -469,6 +506,6 @@ you'll need to have Docker installed and running.
|
|
|
469
506
|
|
|
470
507
|
[open-uri]: http://ruby-doc.org/stdlib-2.3.0/libdoc/open-uri/rdoc/OpenURI.html
|
|
471
508
|
[Net::HTTP]: https://ruby-doc.org/stdlib-2.4.1/libdoc/net/http/rdoc/Net/HTTP.html
|
|
472
|
-
[
|
|
509
|
+
[http.rb]: https://github.com/httprb/http
|
|
473
510
|
[Addressable::URI]: https://github.com/sporkmonger/addressable
|
|
474
511
|
[kennethreitz/httpbin]: https://github.com/kennethreitz/httpbin
|
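Reading the README change together with the `normalize_uri` hunk in `net_http.rb`, the normalizer appears to be consulted only when `URI(url)` raises. The following is a minimal sketch of that fallback, not the gem's code: `ESCAPE_SPACES` is a hypothetical normalizer standing in for the Addressable-based default, and `normalize_uri` is redefined here as a free-standing method for illustration.

```ruby
require "uri"

# Hypothetical normalizer (an assumption for this sketch); the gem's default
# uses Addressable::URI instead.
ESCAPE_SPACES = ->(url) { url.gsub(" ", "%20") }

# Mirrors the shape of the method in the diff: try the standard parser first,
# and call the normalizer only if parsing fails.
def normalize_uri(url, uri_normalizer:)
  URI(url)
rescue URI::InvalidURIError
  uri_normalizer.call(url)
end

parsed     = normalize_uri("http://example.com/image.jpg", uri_normalizer: ESCAPE_SPACES)
normalized = normalize_uri("http://example.com/some image.jpg", uri_normalizer: ESCAPE_SPACES)
skipped    = normalize_uri("http://example.com/some image.jpg", uri_normalizer: ->(url) { url })
```

Note that a parseable URL comes back as a `URI` object, while the fallback path returns whatever the normalizer produces (a string here); in the diff the result is then passed through `ensure_uri`.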
data/down.gemspec
CHANGED
|
@@ -4,7 +4,7 @@ Gem::Specification.new do |spec|
|
|
|
4
4
|
spec.name = "down"
|
|
5
5
|
spec.version = Down::VERSION
|
|
6
6
|
|
|
7
|
-
spec.required_ruby_version = ">= 2.
|
|
7
|
+
spec.required_ruby_version = ">= 2.3"
|
|
8
8
|
|
|
9
9
|
spec.summary = "Robust streaming downloads using Net::HTTP, HTTP.rb or wget."
|
|
10
10
|
spec.homepage = "https://github.com/janko/down"
|
|
@@ -20,8 +20,9 @@ Gem::Specification.new do |spec|
|
|
|
20
20
|
spec.add_development_dependency "minitest", "~> 5.8"
|
|
21
21
|
spec.add_development_dependency "mocha", "~> 1.5"
|
|
22
22
|
spec.add_development_dependency "rake"
|
|
23
|
-
spec.add_development_dependency "http", "~> 4.
|
|
23
|
+
spec.add_development_dependency "http", "~> 4.3"
|
|
24
24
|
spec.add_development_dependency "posix-spawn" unless RUBY_ENGINE == "jruby"
|
|
25
25
|
spec.add_development_dependency "http_parser.rb"
|
|
26
26
|
spec.add_development_dependency "docker-api"
|
|
27
|
+
spec.add_development_dependency "warning" if RUBY_VERSION >= "2.4"
|
|
27
28
|
end
|
data/lib/down.rb
CHANGED
|
@@ -6,12 +6,12 @@ require "down/net_http"
|
|
|
6
6
|
module Down
|
|
7
7
|
module_function
|
|
8
8
|
|
|
9
|
-
def download(*args, &block)
|
|
10
|
-
backend.download(*args, &block)
|
|
9
|
+
def download(*args, **options, &block)
|
|
10
|
+
backend.download(*args, **options, &block)
|
|
11
11
|
end
|
|
12
12
|
|
|
13
|
-
def open(*args, &block)
|
|
14
|
-
backend.open(*args, &block)
|
|
13
|
+
def open(*args, **options, &block)
|
|
14
|
+
backend.open(*args, **options, &block)
|
|
15
15
|
end
|
|
16
16
|
|
|
17
17
|
# Allows setting a backend via a symbol or a downloader object.
|
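The hunk above changes the delegation from `*args` to `*args, **options`, which the CHANGELOG ties to keyword argument warnings on Ruby 2.7. A minimal sketch of the same delegation shape follows; `FakeBackend` and `Delegator` are hypothetical names used only for illustration and are not part of the gem.

```ruby
# Hypothetical stand-in for a backend that accepts keyword options.
class FakeBackend
  def download(url, max_size: nil, **rest)
    { url: url, max_size: max_size, rest: rest }
  end
end

# Mirrors the delegation shape from the diff: positional arguments and
# keyword options are captured and forwarded separately.
module Delegator
  module_function

  def backend
    FakeBackend.new
  end

  def download(*args, **options, &block)
    backend.download(*args, **options, &block)
  end
end

result = Delegator.download("http://example.com/image.jpg", max_size: 5)
```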
data/lib/down/backend.rb
CHANGED
|
@@ -9,12 +9,12 @@ require "fileutils"
|
|
|
9
9
|
|
|
10
10
|
module Down
|
|
11
11
|
class Backend
|
|
12
|
-
def self.download(*args, &block)
|
|
13
|
-
new.download(*args, &block)
|
|
12
|
+
def self.download(*args, **options, &block)
|
|
13
|
+
new.download(*args, **options, &block)
|
|
14
14
|
end
|
|
15
15
|
|
|
16
|
-
def self.open(*args, &block)
|
|
17
|
-
new.open(*args, &block)
|
|
16
|
+
def self.open(*args, **options, &block)
|
|
17
|
+
new.open(*args, **options, &block)
|
|
18
18
|
end
|
|
19
19
|
|
|
20
20
|
private
|
data/lib/down/chunked_io.rb
CHANGED
|
@@ -36,6 +36,8 @@ module Down
|
|
|
36
36
|
@rewindable = rewindable
|
|
37
37
|
@buffer = nil
|
|
38
38
|
@position = 0
|
|
39
|
+
@next_chunk = nil
|
|
40
|
+
@closed = false
|
|
39
41
|
|
|
40
42
|
retrieve_chunk # fetch first chunk so that we know whether the file is empty
|
|
41
43
|
end
|
|
@@ -63,21 +65,20 @@ module Down
|
|
|
63
65
|
def read(length = nil, outbuf = nil)
|
|
64
66
|
fail IOError, "closed stream" if closed?
|
|
65
67
|
|
|
66
|
-
|
|
68
|
+
data = outbuf.clear.force_encoding(Encoding::BINARY) if outbuf
|
|
69
|
+
data ||= "".b
|
|
67
70
|
|
|
68
|
-
|
|
69
|
-
data = readpartial(remaining_length, outbuf)
|
|
70
|
-
data = data.dup unless outbuf
|
|
71
|
-
remaining_length = length - data.bytesize if length
|
|
72
|
-
rescue EOFError
|
|
73
|
-
end
|
|
71
|
+
remaining_length = length
|
|
74
72
|
|
|
75
73
|
until remaining_length == 0 || eof?
|
|
76
|
-
data << readpartial(remaining_length)
|
|
74
|
+
data << readpartial(remaining_length, buffer ||= String.new)
|
|
77
75
|
remaining_length = length - data.bytesize if length
|
|
78
76
|
end
|
|
79
77
|
|
|
80
|
-
|
|
78
|
+
buffer.clear if buffer # deallocate string
|
|
79
|
+
|
|
80
|
+
data.force_encoding(@encoding) unless length
|
|
81
|
+
data unless data.empty? && length && length > 0
|
|
81
82
|
end
|
|
82
83
|
|
|
83
84
|
# Implements IO#gets semantics. Without arguments it retrieves lines of
|
|
@@ -108,27 +109,33 @@ module Down
|
|
|
108
109
|
|
|
109
110
|
separator = "\n\n" if separator.empty?
|
|
110
111
|
|
|
111
|
-
|
|
112
|
-
data = readpartial(limit)
|
|
112
|
+
data = String.new
|
|
113
113
|
|
|
114
|
-
|
|
115
|
-
|
|
116
|
-
|
|
117
|
-
|
|
114
|
+
until data.include?(separator) || data.bytesize == limit || eof?
|
|
115
|
+
remaining_length = limit - data.bytesize if limit
|
|
116
|
+
data << readpartial(remaining_length, buffer ||= String.new)
|
|
117
|
+
end
|
|
118
|
+
|
|
119
|
+
buffer.clear if buffer # deallocate buffer
|
|
120
|
+
|
|
121
|
+
line, extra = data.split(separator, 2)
|
|
122
|
+
line << separator if data.include?(separator)
|
|
118
123
|
|
|
119
|
-
|
|
120
|
-
line << separator if data.include?(separator)
|
|
124
|
+
data.clear # deallocate data
|
|
121
125
|
|
|
126
|
+
if extra
|
|
122
127
|
if cache
|
|
123
|
-
cache.pos -= extra.
|
|
128
|
+
cache.pos -= extra.bytesize
|
|
124
129
|
else
|
|
125
|
-
|
|
130
|
+
if @buffer
|
|
131
|
+
@buffer.prepend(extra)
|
|
132
|
+
else
|
|
133
|
+
@buffer = extra
|
|
134
|
+
end
|
|
126
135
|
end
|
|
127
|
-
rescue EOFError
|
|
128
|
-
line = nil
|
|
129
136
|
end
|
|
130
137
|
|
|
131
|
-
line
|
|
138
|
+
line.force_encoding(@encoding) if line
|
|
132
139
|
end
|
|
133
140
|
|
|
134
141
|
# Implements IO#readpartial semantics. If there is any content readily
|
|
@@ -139,33 +146,33 @@ module Down
|
|
|
139
146
|
# or the next chunk. This is useful when you don't care about the size of
|
|
140
147
|
# chunks and you want to minimize string allocations.
|
|
141
148
|
#
|
|
142
|
-
# With `
|
|
149
|
+
# With `maxlen` argument returns maximum of that amount of bytes (default
|
|
150
|
+
# is 16KB).
|
|
143
151
|
#
|
|
144
152
|
# With `outbuf` argument each call will return that same string object,
|
|
145
153
|
# where the value is replaced with retrieved content.
|
|
146
154
|
#
|
|
147
155
|
# Raises EOFError if end of file is reached. Raises IOError if closed.
|
|
148
|
-
def readpartial(
|
|
156
|
+
def readpartial(maxlen = nil, outbuf = nil)
|
|
149
157
|
fail IOError, "closed stream" if closed?
|
|
150
158
|
|
|
151
|
-
|
|
159
|
+
maxlen ||= 16*1024
|
|
152
160
|
|
|
153
|
-
|
|
161
|
+
data = cache.read(maxlen, outbuf) if cache && !cache.eof?
|
|
162
|
+
data ||= outbuf.clear.force_encoding(Encoding::BINARY) if outbuf
|
|
163
|
+
data ||= "".b
|
|
154
164
|
|
|
155
|
-
if
|
|
156
|
-
data = cache.read(length, outbuf)
|
|
157
|
-
data.force_encoding(@encoding)
|
|
158
|
-
end
|
|
165
|
+
return data if maxlen == 0
|
|
159
166
|
|
|
160
|
-
if @buffer.nil? &&
|
|
167
|
+
if @buffer.nil? && data.empty?
|
|
161
168
|
fail EOFError, "end of file reached" if chunks_depleted?
|
|
162
169
|
@buffer = retrieve_chunk
|
|
163
170
|
end
|
|
164
171
|
|
|
165
|
-
remaining_length =
|
|
172
|
+
remaining_length = maxlen - data.bytesize
|
|
166
173
|
|
|
167
174
|
unless @buffer.nil? || remaining_length == 0
|
|
168
|
-
if remaining_length
|
|
175
|
+
if remaining_length < @buffer.bytesize
|
|
169
176
|
buffered_data = @buffer.byteslice(0, remaining_length)
|
|
170
177
|
@buffer = @buffer.byteslice(remaining_length..-1)
|
|
171
178
|
else
|
|
@@ -173,21 +180,46 @@ module Down
|
|
|
173
180
|
@buffer = nil
|
|
174
181
|
end
|
|
175
182
|
|
|
176
|
-
|
|
177
|
-
data << buffered_data
|
|
178
|
-
else
|
|
179
|
-
data = buffered_data
|
|
180
|
-
end
|
|
183
|
+
data << buffered_data
|
|
181
184
|
|
|
182
185
|
cache.write(buffered_data) if cache
|
|
183
186
|
|
|
184
|
-
buffered_data.clear unless buffered_data.
|
|
187
|
+
buffered_data.clear unless buffered_data.frozen?
|
|
185
188
|
end
|
|
186
189
|
|
|
187
190
|
@position += data.bytesize
|
|
188
191
|
|
|
189
|
-
data.force_encoding(Encoding::BINARY)
|
|
190
|
-
|
|
192
|
+
data.force_encoding(Encoding::BINARY)
|
|
193
|
+
end
|
|
194
|
+
|
|
195
|
+
# Implements IO#seek semantics.
|
|
196
|
+
def seek(amount, whence = IO::SEEK_SET)
|
|
197
|
+
fail Errno::ESPIPE, "Illegal seek" if cache.nil?
|
|
198
|
+
|
|
199
|
+
case whence
|
|
200
|
+
when IO::SEEK_SET, :SET
|
|
201
|
+
target_pos = amount
|
|
202
|
+
when IO::SEEK_CUR, :CUR
|
|
203
|
+
target_pos = @position + amount
|
|
204
|
+
when IO::SEEK_END, :END
|
|
205
|
+
unless chunks_depleted?
|
|
206
|
+
cache.seek(0, IO::SEEK_END)
|
|
207
|
+
IO.copy_stream(self, File::NULL)
|
|
208
|
+
end
|
|
209
|
+
|
|
210
|
+
target_pos = cache.size + amount
|
|
211
|
+
else
|
|
212
|
+
fail ArgumentError, "invalid whence: #{whence.inspect}"
|
|
213
|
+
end
|
|
214
|
+
|
|
215
|
+
if target_pos <= cache.size
|
|
216
|
+
cache.seek(target_pos)
|
|
217
|
+
else
|
|
218
|
+
cache.seek(0, IO::SEEK_END)
|
|
219
|
+
IO.copy_stream(self, File::NULL, target_pos - cache.size)
|
|
220
|
+
end
|
|
221
|
+
|
|
222
|
+
@position = cache.pos
|
|
191
223
|
end
|
|
192
224
|
|
|
193
225
|
# Implements IO#pos semantics. Returns the current position of the
|
|
@@ -195,6 +227,7 @@ module Down
|
|
|
195
227
|
def pos
|
|
196
228
|
@position
|
|
197
229
|
end
|
|
230
|
+
alias tell pos
|
|
198
231
|
|
|
199
232
|
# Implements IO#eof? semantics. Returns whether we've reached end of file.
|
|
200
233
|
# It returns true if cache is at the end and there is no more content to
|
|
@@ -272,7 +305,7 @@ module Down
|
|
|
272
305
|
def retrieve_chunk
|
|
273
306
|
chunk = @next_chunk
|
|
274
307
|
@next_chunk = chunks_fiber.resume
|
|
275
|
-
chunk
|
|
308
|
+
chunk
|
|
276
309
|
end
|
|
277
310
|
|
|
278
311
|
# Returns whether there is any content left to retrieve.
|
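The new `ChunkedIO#seek` resolves a target position from `amount` and `whence` before moving the cache. The sketch below isolates only that resolution step; `resolve_target` is a hypothetical helper, and `position` and `size` stand in for `@position` and `cache.size`. In the diff, the `SEEK_END` branch first reads any remaining chunks so that the cache size reflects the full content; that step is omitted here.

```ruby
# Hypothetical helper mirroring the `case whence` branches in the diff.
def resolve_target(amount, whence, position:, size:)
  case whence
  when IO::SEEK_SET, :SET then amount            # absolute offset
  when IO::SEEK_CUR, :CUR then position + amount # relative to current position
  when IO::SEEK_END, :END then size + amount     # relative to end of content
  else raise ArgumentError, "invalid whence: #{whence.inspect}"
  end
end

from_start   = resolve_target(10, IO::SEEK_SET, position: 4, size: 100)
from_current = resolve_target(3, :CUR, position: 4, size: 100)
from_end     = resolve_target(-5, :END, position: 4, size: 100)
```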
data/lib/down/errors.rb
CHANGED
|
@@ -7,20 +7,17 @@ module Down
|
|
|
7
7
|
# raised when the file is larger than the specified maximum size
|
|
8
8
|
class TooLarge < Error; end
|
|
9
9
|
|
|
10
|
-
# raised when the file failed to be retrieved for whatever reason
|
|
11
|
-
class NotFound < Error; end
|
|
12
|
-
|
|
13
10
|
# raised when the given URL couldn't be parsed
|
|
14
|
-
class InvalidUrl <
|
|
11
|
+
class InvalidUrl < Error; end
|
|
15
12
|
|
|
16
13
|
# raised when the number of redirects was larger than the specified maximum
|
|
17
|
-
class TooManyRedirects <
|
|
14
|
+
class TooManyRedirects < Error; end
|
|
18
15
|
|
|
19
16
|
# raised when response returned 4xx or 5xx response
|
|
20
|
-
class ResponseError <
|
|
17
|
+
class ResponseError < Error
|
|
21
18
|
attr_reader :response
|
|
22
19
|
|
|
23
|
-
def initialize(message, response
|
|
20
|
+
def initialize(message, response = nil)
|
|
24
21
|
super(message)
|
|
25
22
|
@response = response
|
|
26
23
|
end
|
|
@@ -29,15 +26,18 @@ module Down
|
|
|
29
26
|
# raised when response returned 4xx response
|
|
30
27
|
class ClientError < ResponseError; end
|
|
31
28
|
|
|
29
|
+
# raised when response returned 404 response
|
|
30
|
+
class NotFound < ClientError; end
|
|
31
|
+
|
|
32
32
|
# raised when response returned 5xx response
|
|
33
33
|
class ServerError < ResponseError; end
|
|
34
34
|
|
|
35
35
|
# raised when there was an error connecting to the server
|
|
36
|
-
class ConnectionError <
|
|
36
|
+
class ConnectionError < Error; end
|
|
37
37
|
|
|
38
38
|
# raised when connecting to the server too longer than the specified timeout
|
|
39
39
|
class TimeoutError < ConnectionError; end
|
|
40
40
|
|
|
41
41
|
# raised when an SSL error was raised
|
|
42
|
-
class SSLError <
|
|
42
|
+
class SSLError < Error; end
|
|
43
43
|
end
|
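With `NotFound` now a subclass of `ClientError`, the order of `rescue` clauses matters for callers that want to treat 404s specially. The sketch below rebuilds part of the hierarchy under a hypothetical `DownSketch` namespace so it runs without the gem; that `Error` inherits from `StandardError` is an assumption, since the diff does not show that line.

```ruby
# Hypothetical copy of part of the hierarchy shown in the diff.
module DownSketch
  class Error < StandardError; end # assumed superclass, not shown in the diff

  class ResponseError < Error
    attr_reader :response

    def initialize(message, response = nil)
      super(message)
      @response = response
    end
  end

  class ClientError < ResponseError; end
  class NotFound < ClientError; end
  class ServerError < ResponseError; end
end

# The most specific class must be rescued first; otherwise ClientError
# would also catch NotFound.
def classify
  yield
rescue DownSketch::NotFound
  :not_found
rescue DownSketch::ClientError
  :client_error
rescue DownSketch::Error
  :other
end

missing   = classify { raise DownSketch::NotFound.new("404 Not Found") }
forbidden = classify { raise DownSketch::ClientError.new("403 Forbidden") }
```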
data/lib/down/http.rb
CHANGED
|
@@ -12,12 +12,7 @@ module Down
|
|
|
12
12
|
# Provides streaming downloads implemented with HTTP.rb.
|
|
13
13
|
class Http < Backend
|
|
14
14
|
# Initializes the backend with common defaults.
|
|
15
|
-
def initialize(options
|
|
16
|
-
if options.is_a?(HTTP::Client)
|
|
17
|
-
warn "[Down] Passing an HTTP::Client object to Down::Http#initialize is deprecated and won't be supported in Down 5. Use the block initialization instead."
|
|
18
|
-
options = options.default_options.to_hash
|
|
19
|
-
end
|
|
20
|
-
|
|
15
|
+
def initialize(**options, &block)
|
|
21
16
|
@method = options.delete(:method) || :get
|
|
22
17
|
@client = HTTP
|
|
23
18
|
.headers("User-Agent" => "Down/#{Down::VERSION}")
|
|
@@ -111,9 +106,10 @@ module Down
|
|
|
111
106
|
|
|
112
107
|
# Raises non-sucessful response as a Down::ResponseError.
|
|
113
108
|
def response_error!(response)
|
|
114
|
-
args = [response.status.to_s, response
|
|
109
|
+
args = [response.status.to_s, response]
|
|
115
110
|
|
|
116
111
|
case response.code
|
|
112
|
+
when 404 then raise Down::NotFound.new(*args)
|
|
117
113
|
when 400..499 then raise Down::ClientError.new(*args)
|
|
118
114
|
when 500..599 then raise Down::ServerError.new(*args)
|
|
119
115
|
else raise Down::ResponseError.new(*args)
|
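In `response_error!`, the `when 404` branch is placed before `when 400..499`. Because `case` takes the first matching branch, the order is what makes 404 map to `Down::NotFound` rather than `Down::ClientError`. A minimal sketch, with the hypothetical helper `error_kind_for` returning symbols in place of raising the gem's exceptions:

```ruby
# Hypothetical helper; returns a symbol instead of raising.
def error_kind_for(code)
  case code
  when 404      then :not_found      # must precede the 4xx range
  when 400..499 then :client_error
  when 500..599 then :server_error
  else               :response_error
  end
end

kinds = [404, 403, 503, 302].map { |code| error_kind_for(code) }
```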
data/lib/down/net_http.rb
CHANGED
|
@@ -12,27 +12,34 @@ require "fileutils"
|
|
|
12
12
|
module Down
|
|
13
13
|
# Provides streaming downloads implemented with Net::HTTP and open-uri.
|
|
14
14
|
class NetHttp < Backend
|
|
15
|
+
URI_NORMALIZER = -> (url) do
|
|
16
|
+
addressable_uri = Addressable::URI.parse(url)
|
|
17
|
+
addressable_uri.normalize.to_s
|
|
18
|
+
end
|
|
19
|
+
|
|
15
20
|
# Initializes the backend with common defaults.
|
|
16
|
-
def initialize(options
|
|
17
|
-
@options = {
|
|
18
|
-
"User-Agent" => "Down/#{Down::VERSION}",
|
|
19
|
-
max_redirects:
|
|
20
|
-
open_timeout:
|
|
21
|
-
read_timeout:
|
|
22
|
-
|
|
21
|
+
def initialize(*args, **options)
|
|
22
|
+
@options = merge_options({
|
|
23
|
+
headers: { "User-Agent" => "Down/#{Down::VERSION}" },
|
|
24
|
+
max_redirects: 2,
|
|
25
|
+
open_timeout: 30,
|
|
26
|
+
read_timeout: 30,
|
|
27
|
+
uri_normalizer: URI_NORMALIZER,
|
|
28
|
+
}, *args, **options)
|
|
23
29
|
end
|
|
24
30
|
|
|
25
31
|
# Downloads a remote file to disk using open-uri. Accepts any open-uri
|
|
26
32
|
# options, and a few more.
|
|
27
|
-
def download(url, options
|
|
28
|
-
options = @options
|
|
33
|
+
def download(url, *args, **options)
|
|
34
|
+
options = merge_options(@options, *args, **options)
|
|
29
35
|
|
|
30
36
|
max_size = options.delete(:max_size)
|
|
31
37
|
max_redirects = options.delete(:max_redirects)
|
|
32
38
|
progress_proc = options.delete(:progress_proc)
|
|
33
39
|
content_length_proc = options.delete(:content_length_proc)
|
|
34
40
|
destination = options.delete(:destination)
|
|
35
|
-
headers = options.delete(:headers)
|
|
41
|
+
headers = options.delete(:headers)
|
|
42
|
+
uri_normalizer = options.delete(:uri_normalizer)
|
|
36
43
|
|
|
37
44
|
# Use open-uri's :content_lenth_proc or :progress_proc to raise an
|
|
38
45
|
# exception early if the file is too large.
|
|
@@ -74,7 +81,7 @@ module Down
|
|
|
74
81
|
open_uri_options.merge!(options)
|
|
75
82
|
open_uri_options.merge!(headers)
|
|
76
83
|
|
|
77
|
-
uri = ensure_uri(
|
|
84
|
+
uri = ensure_uri(normalize_uri(url, uri_normalizer: uri_normalizer))
|
|
78
85
|
|
|
79
86
|
# Handle basic authentication in the remote URL.
|
|
80
87
|
if uri.user || uri.password
|
|
@@ -95,13 +102,17 @@ module Down
|
|
|
95
102
|
|
|
96
103
|
# Starts retrieving the remote file using Net::HTTP and returns an IO-like
|
|
97
104
|
# object which downloads the response body on-demand.
|
|
98
|
-
def open(url, options
|
|
99
|
-
|
|
100
|
-
|
|
105
|
+
def open(url, *args, **options)
|
|
106
|
+
options = merge_options(@options, *args, **options)
|
|
107
|
+
|
|
108
|
+
max_redirects = options.delete(:max_redirects)
|
|
109
|
+
uri_normalizer = options.delete(:uri_normalizer)
|
|
110
|
+
|
|
111
|
+
uri = ensure_uri(normalize_uri(url, uri_normalizer: uri_normalizer))
|
|
101
112
|
|
|
102
113
|
# Create a Fiber that halts when response headers are received.
|
|
103
114
|
request = Fiber.new do
|
|
104
|
-
net_http_request(uri, options) do |response|
|
|
115
|
+
net_http_request(uri, options, follows_remaining: max_redirects) do |response|
|
|
105
116
|
Fiber.yield response
|
|
106
117
|
end
|
|
107
118
|
end
|
|
@@ -131,7 +142,7 @@ module Down
|
|
|
131
142
|
private
|
|
132
143
|
|
|
133
144
|
# Calls open-uri's URI::HTTP#open method. Additionally handles redirects.
|
|
134
|
-
def open_uri(uri, options, follows_remaining:
|
|
145
|
+
def open_uri(uri, options, follows_remaining:)
|
|
135
146
|
uri.open(options)
|
|
136
147
|
rescue OpenURI::HTTPRedirect => exception
|
|
137
148
|
raise Down::TooManyRedirects, "too many redirects" if follows_remaining == 0
|
|
@@ -186,7 +197,7 @@ module Down
|
|
|
186
197
|
end
|
|
187
198
|
|
|
188
199
|
# Makes a Net::HTTP request and follows redirects.
|
|
189
|
-
def net_http_request(uri, options, follows_remaining
|
|
200
|
+
def net_http_request(uri, options, follows_remaining:, &block)
|
|
190
201
|
http, request = create_net_http(uri, options)
|
|
191
202
|
|
|
192
203
|
begin
|
|
@@ -251,12 +262,13 @@ module Down
|
|
|
251
262
|
http.read_timeout = options[:read_timeout] if options.key?(:read_timeout)
|
|
252
263
|
http.open_timeout = options[:open_timeout] if options.key?(:open_timeout)
|
|
253
264
|
|
|
254
|
-
headers = options.
|
|
255
|
-
headers.merge!(options[:headers]) if options[:headers]
|
|
265
|
+
headers = options[:headers].to_h
|
|
256
266
|
headers["Accept-Encoding"] = "" # Net::HTTP's inflater causes FiberErrors
|
|
257
267
|
|
|
258
268
|
get = Net::HTTP::Get.new(uri.request_uri, headers)
|
|
259
|
-
|
|
269
|
+
|
|
270
|
+
user, password = options[:http_basic_authentication] || [uri.user, uri.password]
|
|
271
|
+
get.basic_auth(user, password) if user || password
|
|
260
272
|
|
|
261
273
|
[http, get]
|
|
262
274
|
end
|
|
@@ -284,9 +296,10 @@ module Down
|
|
|
284
296
|
end
|
|
285
297
|
|
|
286
298
|
# Makes sure that the URL is properly encoded.
|
|
287
|
-
def
|
|
288
|
-
|
|
289
|
-
|
|
299
|
+
def normalize_uri(url, uri_normalizer:)
|
|
300
|
+
URI(url)
|
|
301
|
+
rescue URI::InvalidURIError
|
|
302
|
+
uri_normalizer.call(url)
|
|
290
303
|
end
|
|
291
304
|
|
|
292
305
|
# When open-uri raises an exception, it doesn't expose the response object.
|
|
@@ -295,7 +308,11 @@ module Down
|
|
|
295
308
|
def rebuild_response_from_open_uri_exception(exception)
|
|
296
309
|
code, message = exception.io.status
|
|
297
310
|
|
|
298
|
-
response_class = Net::HTTPResponse::CODE_TO_OBJ.fetch(code)
|
|
311
|
+
response_class = Net::HTTPResponse::CODE_TO_OBJ.fetch(code) do |code|
|
|
312
|
+
Net::HTTPResponse::CODE_CLASS_TO_OBJ.fetch(code[0]) do
|
|
313
|
+
Net::HTTPUnknownResponse
|
|
314
|
+
end
|
|
315
|
+
end
|
|
299
316
|
response = response_class.new(nil, code, message)
|
|
300
317
|
|
|
301
318
|
exception.io.metas.each do |name, values|
|
|
@@ -310,9 +327,10 @@ module Down
|
|
|
310
327
|
code = response.code.to_i
|
|
311
328
|
message = response.message.split(" ").map(&:capitalize).join(" ")
|
|
312
329
|
|
|
313
|
-
args = ["#{code} #{message}", response
|
|
330
|
+
args = ["#{code} #{message}", response]
|
|
314
331
|
|
|
315
332
|
case response.code.to_i
|
|
333
|
+
when 404 then raise Down::NotFound.new(*args)
|
|
316
334
|
when 400..499 then raise Down::ClientError.new(*args)
|
|
317
335
|
when 500..599 then raise Down::ServerError.new(*args)
|
|
318
336
|
else raise Down::ResponseError.new(*args)
|
|
@@ -335,6 +353,24 @@ module Down
|
|
|
335
353
|
end
|
|
336
354
|
end
|
|
337
355
|
|
|
356
|
+
# Merge default and ad-hoc options, merging nested headers.
|
|
357
|
+
def merge_options(options, headers = {}, **new_options)
|
|
358
|
+
# Deprecate passing headers as top-level options, taking into account
|
|
359
|
+
# that Ruby 2.7+ accepts kwargs with string keys.
|
|
360
|
+
if headers.any?
|
|
361
|
+
warn %([Down::NetHttp] Passing headers as top-level options has been deprecated, use the :headers option instead, e.g: `Down::NetHttp.download(headers: { "Key" => "Value", ... }, ...)`)
|
|
362
|
+
new_options[:headers] = headers
|
|
363
|
+
elsif new_options.any? { |key, value| key.is_a?(String) }
|
|
364
|
+
warn %([Down::NetHttp] Passing headers as top-level options has been deprecated, use the :headers option instead, e.g: `Down::NetHttp.download(headers: { "Key" => "Value", ... }, ...)`)
|
|
365
|
+
new_options[:headers] = new_options.select { |key, value| key.is_a?(String) }
|
|
366
|
+
new_options.reject! { |key, value| key.is_a?(String) }
|
|
367
|
+
end
|
|
368
|
+
|
|
369
|
+
options.merge(new_options) do |key, value1, value2|
|
|
370
|
+
key == :headers ? value1.merge(value2) : value2
|
|
371
|
+
end
|
|
372
|
+
end
|
|
373
|
+
|
|
338
374
|
# Defines some additional attributes for the returned Tempfile (on top of what
|
|
339
375
|
# OpenURI::Meta already defines).
|
|
340
376
|
module DownloadedFile
|
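The `merge_options` method added above merges ad-hoc options over the defaults, with `:headers` merged one level deep instead of being replaced. A minimal sketch of that block form of `Hash#merge`; the option values are made up for illustration:

```ruby
# Illustrative values only; "Down/x.y.z" is a placeholder version string.
defaults  = { headers: { "User-Agent" => "Down/x.y.z" }, max_redirects: 2 }
overrides = { headers: { "Accept" => "image/*" }, max_redirects: 5 }

# The block runs only for keys present in both hashes: nested headers are
# combined, every other key takes the new value.
merged = defaults.merge(overrides) do |key, old_value, new_value|
  key == :headers ? old_value.merge(new_value) : new_value
end
```

A plain `defaults.merge(overrides)` would have dropped the default `User-Agent` header, which is presumably why the block is used.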
data/lib/down/version.rb
CHANGED
metadata
CHANGED
|
@@ -1,14 +1,14 @@
|
|
|
1
1
|
--- !ruby/object:Gem::Specification
|
|
2
2
|
name: down
|
|
3
3
|
version: !ruby/object:Gem::Version
|
|
4
|
-
version:
|
|
4
|
+
version: 5.2.0
|
|
5
5
|
platform: ruby
|
|
6
6
|
authors:
|
|
7
7
|
- Janko Marohnić
|
|
8
8
|
autorequire:
|
|
9
9
|
bindir: bin
|
|
10
10
|
cert_chain: []
|
|
11
|
-
date:
|
|
11
|
+
date: 2020-09-20 00:00:00.000000000 Z
|
|
12
12
|
dependencies:
|
|
13
13
|
- !ruby/object:Gem::Dependency
|
|
14
14
|
name: addressable
|
|
@@ -72,14 +72,14 @@ dependencies:
|
|
|
72
72
|
requirements:
|
|
73
73
|
- - "~>"
|
|
74
74
|
- !ruby/object:Gem::Version
|
|
75
|
-
version: '4.
|
|
75
|
+
version: '4.3'
|
|
76
76
|
type: :development
|
|
77
77
|
prerelease: false
|
|
78
78
|
version_requirements: !ruby/object:Gem::Requirement
|
|
79
79
|
requirements:
|
|
80
80
|
- - "~>"
|
|
81
81
|
- !ruby/object:Gem::Version
|
|
82
|
-
version: '4.
|
|
82
|
+
version: '4.3'
|
|
83
83
|
- !ruby/object:Gem::Dependency
|
|
84
84
|
name: posix-spawn
|
|
85
85
|
requirement: !ruby/object:Gem::Requirement
|
|
@@ -122,6 +122,20 @@ dependencies:
|
|
|
122
122
|
- - ">="
|
|
123
123
|
- !ruby/object:Gem::Version
|
|
124
124
|
version: '0'
|
|
125
|
+
- !ruby/object:Gem::Dependency
|
|
126
|
+
name: warning
|
|
127
|
+
requirement: !ruby/object:Gem::Requirement
|
|
128
|
+
requirements:
|
|
129
|
+
- - ">="
|
|
130
|
+
- !ruby/object:Gem::Version
|
|
131
|
+
version: '0'
|
|
132
|
+
type: :development
|
|
133
|
+
prerelease: false
|
|
134
|
+
version_requirements: !ruby/object:Gem::Requirement
|
|
135
|
+
requirements:
|
|
136
|
+
- - ">="
|
|
137
|
+
- !ruby/object:Gem::Version
|
|
138
|
+
version: '0'
|
|
125
139
|
description:
|
|
126
140
|
email:
|
|
127
141
|
- janko.marohnic@gmail.com
|
|
@@ -154,14 +168,14 @@ required_ruby_version: !ruby/object:Gem::Requirement
|
|
|
154
168
|
requirements:
|
|
155
169
|
- - ">="
|
|
156
170
|
- !ruby/object:Gem::Version
|
|
157
|
-
version: '2.
|
|
171
|
+
version: '2.3'
|
|
158
172
|
required_rubygems_version: !ruby/object:Gem::Requirement
|
|
159
173
|
requirements:
|
|
160
174
|
- - ">="
|
|
161
175
|
- !ruby/object:Gem::Version
|
|
162
176
|
version: '0'
|
|
163
177
|
requirements: []
|
|
164
|
-
rubygems_version: 3.
|
|
178
|
+
rubygems_version: 3.1.1
|
|
165
179
|
signing_key:
|
|
166
180
|
specification_version: 4
|
|
167
181
|
summary: Robust streaming downloads using Net::HTTP, HTTP.rb or wget.
|