down 4.8.1 → 5.0.0
- checksums.yaml +4 -4
- data/CHANGELOG.md +18 -0
- data/README.md +49 -24
- data/lib/down/chunked_io.rb +68 -41
- data/lib/down/errors.rb +8 -8
- data/lib/down/http.rb +1 -5
- data/lib/down/net_http.rb +1 -0
- data/lib/down/version.rb +1 -1
- metadata +2 -2
checksums.yaml
CHANGED
@@ -1,7 +1,7 @@
 ---
 SHA256:
-  metadata.gz:
-  data.tar.gz:
+  metadata.gz: a1f7a1532b638ed92acdb3bdf74211dbd022e7a0de37d1fdbc25f665337d4bf1
+  data.tar.gz: 7bbc2684d53e278376981b4dd4741c49bc0d8da0990ece28fb03776f39aea6fe
 SHA512:
-  metadata.gz:
-  data.tar.gz:
+  metadata.gz: 6a3e62293c5fa1e5b43a5c835af3804ab8307babfb573179057ec279239b72a16ff74b2b2e96df9da033a3e4e13be4af6adeb8ec00dd73a27f75942b646419de
+  data.tar.gz: d267672bb2468c8998e271ff3849908c7beead00d8b9777799e6bc0abd1ab51ccf14d5c567c16f3bb30a41eb3c57b66ffbc2755d333055fba972d327a865acb4
data/CHANGELOG.md
CHANGED
@@ -1,3 +1,21 @@
+## 5.0.0 (2019-09-26)
+
+* Change `ChunkedIO#each_chunk` to return chunks in original encoding (@janko)
+
+* Always return binary strings in `ChunkedIO#readpartial` (@janko)
+
+* Handle frozen chunks in `Down::ChunkedIO` (@janko)
+
+* Change `ChunkedIO#gets` to return lines in specified encoding (@janko)
+
+* Halve memory allocation for `ChunkedIO#gets` (@janko)
+
+* Halve memory allocation for `ChunkedIO#read` without arguments (@janko)
+
+* Drop support for `HTTP::Client` argument in `Down::HTTP.new` (@janko)
+
+* Repurpose `Down::NotFound` to be raised on `404 Not Found` response (@janko)
+
 ## 4.8.1 (2019-05-01)

 * Make `ChunkedIO#read`/`#readpartial` with length always return strings in binary encoding (@janko)
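The encoding and frozen-chunk entries in the 5.0.0 changelog rest on ordinary Ruby `String` behaviour. The following is a minimal sketch of that behaviour using core Ruby only; it does not call into Down, and both helper names are hypothetical.

```ruby
# Stand-alone illustration using core Ruby only; no Down classes are involved.
# Both helper names are hypothetical.

# Returns a copy of the string tagged as binary. force_encoding relabels the
# bytes without transcoding them, so the byte count stays the same.
def relabel_as_binary(string)
  string.dup.force_encoding(Encoding::BINARY)
end

# Empties the string to release its memory, skipping frozen strings,
# which would raise an error (FrozenError on Ruby 2.5+) if modified.
def clear_unless_frozen(string)
  string.clear unless string.frozen?
  string
end

text   = "h\u00E9llo\n"                  # UTF-8, 6 characters, 7 bytes
binary = relabel_as_binary(text)

puts binary.encoding == Encoding::BINARY              # true
puts binary.bytesize == text.bytesize                 # true
puts binary.force_encoding(Encoding::UTF_8) == text   # true

puts clear_unless_frozen("data".dup).empty?     # true
puts clear_unless_frozen("data".freeze).empty?  # false, left untouched
```

Relabelling rather than transcoding is what makes it cheap to hand out binary strings from one method and strings in another encoding from a different one, since the underlying bytes are never rewritten.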
data/README.md
CHANGED
@@ -1,7 +1,7 @@
 # Down

 Down is a utility tool for streaming, flexible and safe downloading of remote
-files. It can use [open-uri] + `Net::HTTP`, [
+files. It can use [open-uri] + `Net::HTTP`, [http.rb] or `wget` as the backend
 HTTP library.

 ## Installation
@@ -57,8 +57,12 @@ specific location on disk, you can specify the `:destination` option:

 ```rb
 Down.download("http://example.com/image.jpg", destination: "/path/to/destination")
+#=> nil
 ```

+In this case `Down.download` won't have any return value, so if you need a File
+object you'll have to create it manually.
+
 ### Basic authentication

 `Down.download` and `Down.open` will automatically detect and apply HTTP basic
@@ -103,6 +107,16 @@ remote_file.eof? #=> true
 remote_file.close # closes the HTTP connection and deletes the internal Tempfile
 ```

+The following IO methods are implemented:
+
+* `#read` & `#readpartial`
+* `#gets`
+* `#seek`
+* `#pos` & `#tell`
+* `#eof?`
+* `#rewind`
+* `#close`
+
 ### Caching

 By default the downloaded content is internally cached into a `Tempfile`, so
@@ -147,10 +161,10 @@ remote_file.data[:headers] #=> { ... }
 remote_file.data[:response] # returns the response object
 ```

-Note that `Down::
-status was 4xx or 5xx.
+Note that a `Down::ResponseError` exception will automatically be raised if
+response status was 4xx or 5xx.

-###
+### Down::ChunkedIO

 The `Down.open` performs HTTP logic and returns an instance of
 `Down::ChunkedIO`. However, `Down::ChunkedIO` is a generic class that can wrap
@@ -196,21 +210,23 @@ the `Down::Error` subclasses. This is Down's exception hierarchy:

 * `Down::Error`
 * `Down::TooLarge`
-* `Down::
-
-
-* `Down::
-* `Down::
-
-
-
-
+* `Down::InvalidUrl`
+* `Down::TooManyRedirects`
+* `Down::ResponseError`
+* `Down::ClientError`
+* `Down::NotFound`
+* `Down::ServerError`
+* `Down::ConnectionError`
+* `Down::TimeoutError`
+* `Down::SSLError`

 ## Backends

-
-
-
+The following backends are available:
+
+* [Down::NetHttp](#downnethttp) (default)
+* [Down::Http](#downhttp)
+* [Down::Wget](#downwget)

 You can use the backend directly:

@@ -232,7 +248,10 @@ Down.download("...")
 Down.open("...")
 ```

-###
+### Down::NetHttp
+
+The `Down::NetHttp` backend implements downloads using [open-uri] and
+[Net::HTTP].

 ```rb
 gem "down", "~> 4.4"
@@ -334,7 +353,9 @@ net_http.download("http://example.com/image.jpg")
 net_http.open("http://example.com/image.jpg")
 ```

-###
+### Down::Http
+
+The `Down::Http` backend implements downloads using the [http.rb] gem.

 ```rb
 gem "down", "~> 4.4"
@@ -350,7 +371,7 @@ io = Down::Http.open("http://nature.com/forest.jpg")
 io #=> #<Down::ChunkedIO ...>
 ```

-Some features that give the
+Some features that give the http.rb backend an advantage over `open-uri` and
 `Net::HTTP` include:

 * Low memory usage (**10x less** than `open-uri`/`Net::HTTP`)
@@ -401,7 +422,10 @@ down = Down::Http.new(method: :post)
 down.download("http://example.org/image.jpg")
 ```

-### Wget (experimental)
+### Down::Wget (experimental)
+
+The `Down::Wget` backend implements downloads using the `wget` command line
+utility.

 ```rb
 gem "down", "~> 4.4"
@@ -418,9 +442,8 @@ io = Down::Wget.open("http://nature.com/forest.jpg")
 io #=> #<Down::ChunkedIO ...>
 ```

-
-
-interrupted due to network failures, which is very useful when you're
+One major advantage of `wget` is that it automatically resumes downloads that
+were interrupted due to network failures, which is very useful when you're
 downloading large files.

 However, the Wget backend should still be considered experimental, as it wasn't
@@ -450,6 +473,8 @@ wget.open("http://nature.com/forest.jpg")
 * MRI 2.2
 * MRI 2.3
 * MRI 2.4
+* MRI 2.5
+* MRI 2.6
 * JRuby

 ## Development
@@ -469,6 +494,6 @@ you'll need to have Docker installed and running.

 [open-uri]: http://ruby-doc.org/stdlib-2.3.0/libdoc/open-uri/rdoc/OpenURI.html
 [Net::HTTP]: https://ruby-doc.org/stdlib-2.4.1/libdoc/net/http/rdoc/Net/HTTP.html
-[
+[http.rb]: https://github.com/httprb/http
 [Addressable::URI]: https://github.com/sporkmonger/addressable
 [kennethreitz/httpbin]: https://github.com/kennethreitz/httpbin
data/lib/down/chunked_io.rb
CHANGED
@@ -63,21 +63,18 @@ module Down
     def read(length = nil, outbuf = nil)
       fail IOError, "closed stream" if closed?

+      data = outbuf.to_s.clear.force_encoding(Encoding::BINARY)
       remaining_length = length

-      begin
-        data = readpartial(remaining_length, outbuf)
-        data = data.dup unless outbuf
-        remaining_length = length - data.bytesize if length
-      rescue EOFError
-      end
-
       until remaining_length == 0 || eof?
-        data << readpartial(remaining_length)
+        data << readpartial(remaining_length, buffer ||= String.new)
         remaining_length = length - data.bytesize if length
       end

-
+      buffer.clear if buffer # deallocate string
+
+      data.force_encoding(@encoding) unless length
+      data unless data.empty? && length && length > 0
     end

     # Implements IO#gets semantics. Without arguments it retrieves lines of
@@ -108,27 +105,33 @@ module Down

       separator = "\n\n" if separator.empty?

-
-        data = readpartial(limit)
+      data = String.new

-
-
-
-
+      until data.include?(separator) || data.bytesize == limit || eof?
+        remaining_length = limit - data.bytesize if limit
+        data << readpartial(remaining_length, buffer ||= String.new)
+      end

-
-        line << separator if data.include?(separator)
+      buffer.clear if buffer # deallocate buffer

+      line, extra = data.split(separator, 2)
+      line << separator if data.include?(separator)
+
+      data.clear # deallocate data
+
+      if extra
         if cache
-          cache.pos -= extra.
+          cache.pos -= extra.bytesize
         else
-
+          if @buffer
+            @buffer.prepend(extra)
+          else
+            @buffer = extra
+          end
         end
-      rescue EOFError
-        line = nil
       end

-      line
+      line.force_encoding(@encoding) if line
     end

     # Implements IO#readpartial semantics. If there is any content readily
@@ -145,27 +148,25 @@ module Down
     # where the value is replaced with retrieved content.
     #
     # Raises EOFError if end of file is reached. Raises IOError if closed.
-    def readpartial(
+    def readpartial(maxlen = nil, outbuf = nil)
       fail IOError, "closed stream" if closed?

-
+      maxlen ||= 16*1024

-
+      data   = cache.read(maxlen, outbuf) if cache && !cache.eof?
+      data ||= outbuf.to_s.clear

-      if
-        data = cache.read(length, outbuf)
-        data.force_encoding(@encoding)
-      end
+      return data if maxlen == 0

-      if @buffer.nil? &&
+      if @buffer.nil? && data.empty?
         fail EOFError, "end of file reached" if chunks_depleted?
         @buffer = retrieve_chunk
       end

-      remaining_length =
+      remaining_length = maxlen - data.bytesize

       unless @buffer.nil? || remaining_length == 0
-        if remaining_length
+        if remaining_length < @buffer.bytesize
           buffered_data = @buffer.byteslice(0, remaining_length)
           @buffer       = @buffer.byteslice(remaining_length..-1)
         else
@@ -173,21 +174,46 @@ module Down
           @buffer = nil
         end

-
-          data << buffered_data
-        else
-          data = buffered_data
-        end
+        data << buffered_data

         cache.write(buffered_data) if cache

-        buffered_data.clear unless buffered_data.
+        buffered_data.clear unless buffered_data.frozen?
       end

       @position += data.bytesize

-      data.force_encoding(Encoding::BINARY)
-
+      data.force_encoding(Encoding::BINARY)
+    end
+
+    # Implements IO#seek semantics.
+    def seek(amount, whence = IO::SEEK_SET)
+      fail Errno::ESPIPE, "Illegal seek" if cache.nil?
+
+      case whence
+      when IO::SEEK_SET, :SET
+        target_pos = amount
+      when IO::SEEK_CUR, :CUR
+        target_pos = @position + amount
+      when IO::SEEK_END, :END
+        unless chunks_depleted?
+          cache.seek(0, IO::SEEK_END)
+          IO.copy_stream(self, File::NULL)
+        end
+
+        target_pos = cache.size + amount
+      else
+        fail ArgumentError, "invalid whence: #{whence.inspect}"
+      end
+
+      if target_pos <= cache.size
+        cache.seek(target_pos)
+      else
+        cache.seek(0, IO::SEEK_END)
+        IO.copy_stream(self, File::NULL, target_pos - cache.size)
+      end
+
+      @position = cache.pos
     end

     # Implements IO#pos semantics. Returns the current position of the
@@ -195,6 +221,7 @@ module Down
     def pos
       @position
     end
+    alias tell pos

     # Implements IO#eof? semantics. Returns whether we've reached end of file.
     # It returns true if cache is at the end and there is no more content to
@@ -272,7 +299,7 @@ module Down
     def retrieve_chunk
       chunk = @next_chunk
       @next_chunk = chunks_fiber.resume
-      chunk
+      chunk
     end

     # Returns whether there is any content left to retrieve.
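The `whence` handling in the new `ChunkedIO#seek` can be read in isolation from the caching logic. The following is a minimal sketch of just that mapping, assuming the current position and total size are already known. The helper name `target_position` and its keyword arguments are hypothetical and not part of Down; in the diff above, the `SEEK_END` branch first retrieves any remaining chunks so that `cache.size` reflects the full content.

```ruby
# Hypothetical helper mirroring the whence handling shown in the diff above.
# position: current read position; size: total number of bytes available.
def target_position(amount, whence, position:, size:)
  case whence
  when IO::SEEK_SET, :SET then amount             # absolute offset
  when IO::SEEK_CUR, :CUR then position + amount  # relative to current position
  when IO::SEEK_END, :END then size + amount      # relative to the end
  else
    raise ArgumentError, "invalid whence: #{whence.inspect}"
  end
end

puts target_position(5,  IO::SEEK_SET, position: 10, size: 100)  # 5
puts target_position(5,  :CUR,         position: 10, size: 100)  # 15
puts target_position(-5, :END,         position: 10, size: 100)  # 95
```

Accepting both the `IO::SEEK_*` constants and the `:SET`/`:CUR`/`:END` symbols matches what core `IO#seek` accepts, so callers can treat the object like a regular file.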
data/lib/down/errors.rb
CHANGED
@@ -7,17 +7,14 @@ module Down
   # raised when the file is larger than the specified maximum size
   class TooLarge < Error; end

-  # raised when the file failed to be retrieved for whatever reason
-  class NotFound < Error; end
-
   # raised when the given URL couldn't be parsed
-  class InvalidUrl <
+  class InvalidUrl < Error; end

   # raised when the number of redirects was larger than the specified maximum
-  class TooManyRedirects <
+  class TooManyRedirects < Error; end

   # raised when response returned 4xx or 5xx response
-  class ResponseError <
+  class ResponseError < Error
     attr_reader :response

     def initialize(message, response: nil)
@@ -29,15 +26,18 @@ module Down
   # raised when response returned 4xx response
   class ClientError < ResponseError; end

+  # raised when response returned 404 response
+  class NotFound < ClientError; end
+
   # raised when response returned 5xx response
   class ServerError < ResponseError; end

   # raised when there was an error connecting to the server
-  class ConnectionError <
+  class ConnectionError < Error; end

   # raised when connecting to the server too longer than the specified timeout
   class TimeoutError < ConnectionError; end

   # raised when an SSL error was raised
-  class SSLError <
+  class SSLError < Error; end
 end
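To see what the reshuffled hierarchy means for `rescue` clauses, the sketch below rebuilds the class tree from the added lines of this diff inside a stand-in module. `DownSketch` and `classify` are hypothetical names, the `Error < StandardError` base is an assumption (that line is outside the hunks shown), and `ResponseError` is simplified to omit its `response` reader.

```ruby
# Stand-in module reproducing the class hierarchy from the diff above.
# DownSketch is a hypothetical name chosen to avoid clashing with the real gem.
module DownSketch
  class Error            < StandardError;   end  # assumed base, not shown in the diff
  class TooLarge         < Error;           end
  class InvalidUrl       < Error;           end
  class TooManyRedirects < Error;           end
  class ResponseError    < Error;           end  # real class also carries #response
  class ClientError      < ResponseError;   end
  class NotFound         < ClientError;     end
  class ServerError      < ResponseError;   end
  class ConnectionError  < Error;           end
  class TimeoutError     < ConnectionError; end
  class SSLError         < Error;           end
end

# Runs the block and reports which rescue clause handled the error.
# Clauses are tried top to bottom, so the most specific class goes first.
def classify
  yield
  :ok
rescue DownSketch::NotFound
  :not_found
rescue DownSketch::ClientError
  :client_error
rescue DownSketch::Error
  :other_down_error
end

puts classify { raise DownSketch::NotFound, "404 Not Found" }     # not_found
puts classify { raise DownSketch::ClientError, "403 Forbidden" }  # client_error
puts classify { raise DownSketch::InvalidUrl, "bad url" }         # other_down_error
puts DownSketch::NotFound.ancestors.include?(DownSketch::ResponseError)  # true
```

Since `NotFound` now sits under `ClientError`, code that previously rescued it as a catch-all may need to rescue the base error class instead; that is worth checking against the real gem when upgrading.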
data/lib/down/http.rb
CHANGED
@@ -13,11 +13,6 @@ module Down
   class Http < Backend
     # Initializes the backend with common defaults.
     def initialize(options = {}, &block)
-      if options.is_a?(HTTP::Client)
-        warn "[Down] Passing an HTTP::Client object to Down::Http#initialize is deprecated and won't be supported in Down 5. Use the block initialization instead."
-        options = options.default_options.to_hash
-      end
-
       @method = options.delete(:method) || :get
       @client = HTTP
         .headers("User-Agent" => "Down/#{Down::VERSION}")
@@ -114,6 +109,7 @@ module Down
       args = [response.status.to_s, response: response]

       case response.code
+      when 404 then raise Down::NotFound.new(*args)
       when 400..499 then raise Down::ClientError.new(*args)
       when 500..599 then raise Down::ServerError.new(*args)
       else raise Down::ResponseError.new(*args)
data/lib/down/net_http.rb
CHANGED
@@ -313,6 +313,7 @@ module Down
       args = ["#{code} #{message}", response: response]

       case response.code.to_i
+      when 404 then raise Down::NotFound.new(*args)
       when 400..499 then raise Down::ClientError.new(*args)
       when 500..599 then raise Down::ServerError.new(*args)
       else raise Down::ResponseError.new(*args)
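Both backends add the `404` branch ahead of the `400..499` range, and the order matters. The following is a minimal sketch of the same dispatch using stand-in classes so that it runs without the gem; every `Sketch*` class and the `error_class_for` helper are hypothetical names. The `Net::HTTP` variant above calls `to_i` first because `Net::HTTPResponse#code` returns a string.

```ruby
# Hypothetical stand-ins for the real exception classes, so the sketch runs
# without the gem installed.
class SketchResponseError < StandardError; end
class SketchClientError   < SketchResponseError; end
class SketchNotFound      < SketchClientError; end
class SketchServerError   < SketchResponseError; end

# Maps an HTTP status code to an error class. `case` picks the first matching
# branch, so 404 has to be listed before the 400..499 range.
def error_class_for(status)
  case status
  when 404      then SketchNotFound
  when 400..499 then SketchClientError
  when 500..599 then SketchServerError
  else               SketchResponseError
  end
end

puts error_class_for(404)  # SketchNotFound
puts error_class_for(403)  # SketchClientError
puts error_class_for(503)  # SketchServerError
puts error_class_for(302)  # SketchResponseError
```

If the `404` branch came after the range, a 404 response would match `400..499` first and the more specific class would never be chosen.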
data/lib/down/version.rb
CHANGED
metadata
CHANGED
@@ -1,14 +1,14 @@
 --- !ruby/object:Gem::Specification
 name: down
 version: !ruby/object:Gem::Version
-  version:
+  version: 5.0.0
 platform: ruby
 authors:
 - Janko Marohnić
 autorequire:
 bindir: bin
 cert_chain: []
-date: 2019-
+date: 2019-09-26 00:00:00.000000000 Z
 dependencies:
 - !ruby/object:Gem::Dependency
   name: addressable