tus-server 0.10.2 → 1.0.0

checksums.yaml CHANGED
@@ -1,7 +1,7 @@
1
1
  ---
2
2
  SHA1:
3
- metadata.gz: a8418797bd1a0d54cfe083534e48a4b014763cf1
4
- data.tar.gz: c56a965e748adbd182700f38d32f8a4a530efee8
3
+ metadata.gz: 9dfd9cb38d7ef46fe3ac684a7ee3569f66e97c84
4
+ data.tar.gz: 19b75e3d0c222d47c46432f1fd198ab9e27f56bb
5
5
  SHA512:
6
- metadata.gz: 338316cbd7d8ae243e40a15bbca5b04070b70b6973ad618da9f7524a58af2586254eb73415ec6830d26bf43d470a13eb975b3cc9cf170a915ec8aed0d2309982
7
- data.tar.gz: 053cd94a93e0b2f57914240c45c9ed50494a6191acfb62f2b1c01d4331bc6047a249d4dff65367dc8bc29bef15aad9a4f7ff0b8e1bc85edb997bc96b755b6f79
6
+ metadata.gz: 48c902ba0c5ff439869a93b3b4fa851fe17644c7a41978b25041f4955b5c0697e9a725ece84a5ee1d294acd7c631ad04b9d4348f327a64c1ca0002549b1db509
7
+ data.tar.gz: 482543723d79aeb1df06a9fb429d8f39584e4891905e795c32df660f860413ffcac4408892114a85c67e4b1c867e1d6b9462157d6f495c9399d3d641e921f6b8
data/README.md CHANGED
@@ -19,8 +19,7 @@ gem "tus-server"
19
19
 
20
20
  Tus-ruby-server provides a `Tus::Server` Roda app, which you can run in your
21
21
  `config.ru`. That way you can run `Tus::Server` both as a standalone app or as
22
- part of your main app (though it's recommended to run it as a standalone app,
23
- as explained in the "Performance considerations" section of this README).
22
+ part of your main app.
24
23
 
25
24
  ```rb
26
25
  # config.ru
@@ -33,6 +32,9 @@ end
33
32
  run YourApp
34
33
  ```
35
34
 
35
+ While this is the most flexible option, it's not optimal in terms of
36
+ performance; see the [Goliath](#goliath) section for an alternative approach.
37
+
36
38
  Now you can tell your tus client library (e.g. [tus-js-client]) to use this
37
39
  endpoint:
38
40
 
@@ -40,7 +42,7 @@ endpoint:
40
42
  // using tus-js-client
41
43
  new tus.Upload(file, {
42
44
  endpoint: "http://localhost:9292/files",
43
- chunkSize: 5*1024*1024, // 5MB
45
+ chunkSize: 5*1024*1024, // required unless using Goliath
44
46
  // ...
45
47
  })
46
48
  ```
@@ -49,6 +51,42 @@ After the upload is complete, you'll probably want to attach the uploaded file
49
51
  to a database record. [Shrine] is one file attachment library that integrates
50
52
  nicely with tus-ruby-server, see [shrine-tus-demo] for an example integration.
51
53
 
54
+ ### Goliath
55
+
56
+ [Goliath] is the ideal web server to run tus-ruby-server on, because by
57
+ utilizing [EventMachine] it's asynchronous both in reading the request body and
58
+ writing to the response body, so it's not affected by slow clients. Goliath
59
+ also allows tus-ruby-server to handle interrupted requests, by saving data that
60
+ has been uploaded until the interruption. This means that with Goliath it's
61
+ **not** mandatory for the client to chunk the upload into multiple requests in
62
+ order to achieve resumable uploads, which would be the case for most other web
63
+ servers.
64
+
65
+ Tus-ruby-server ships with Goliath integration; you just need to require it in
66
+ a Ruby file and run that file, and that will automatically start up Goliath.
67
+
68
+ ```rb
69
+ # Gemfile
70
+ gem "tus-server", "~> 1.0"
71
+ gem "goliath"
72
+ gem "async-rack", ">= 0.5.1"
73
+ ```
74
+ ```rb
75
+ # tus.rb
76
+ require "tus/server/goliath"
77
+
78
+ # any additional Tus::Server configuration you want to put in here
79
+ ```
80
+ ```sh
81
+ $ ruby tus.rb --stdout # enable logging
82
+ ```
83
+
84
+ Any options provided after the Ruby file will be passed in to the Goliath
85
+ server, see [this wiki][goliath server options] for all available options that
86
+ Goliath supports. As shown above, running tus-ruby-server on Goliath means you
87
+ have to run it separately from your main app (unless your main app is also on
88
+ Goliath).
89
+
52
90
  ## Storage
53
91
 
54
92
  ### Filesystem
@@ -207,8 +245,8 @@ Tus::Server.opts[:storage].expire_files(expiration_date)
207
245
  ## Download
208
246
 
209
247
  In addition to implementing the tus protocol, tus-ruby-server also comes with a
210
- GET endpoint for downloading the uploaded file, which streams the file directly
211
- from the storage.
248
+ GET endpoint for downloading the uploaded file, which streams the file from the
249
+ storage into the response body.
212
250
 
213
251
  The endpoint will automatically use the following `Upload-Metadata` values if
214
252
  they're available:
@@ -237,88 +275,13 @@ The following checksum algorithms are supported for the `checksum` extension:
237
275
  * MD5
238
276
  * CRC32
239
277
 
240
- ## Performance considerations
241
-
242
- ### Buffering
243
-
244
- When handling file uploads it's important not to be vulnerable to slow-write
245
- clients. That means you need to make sure that your web/application server
246
- buffers the request body locally before handing the request to the request
247
- worker.
248
-
249
- If the request body is not buffered and is read directly from the socket when
250
- it has already reached your Rack application, your application throughput will
251
- be severely impacted, because the workers will spend the majority of their time
252
- waiting for request body to be read from the socket, and in that time they
253
- won't be able to serve new requests.
254
-
255
- Puma will automatically buffer the whole request body in a Tempfile, before
256
- forwarding the request to your Rack app. Unicorn and Passenger will not do that,
257
- so it's highly recommended to put a frontend server like Nginx in front of
258
- those web servers, and configure it to buffer the request body.
259
-
260
- ### Chunking
261
-
262
- The tus protocol specifies
263
-
264
- > The Server SHOULD always attempt to store as much of the received data as possible.
265
-
266
- The tus-ruby-server Rack application supports saving partial data for if the
267
- PATCH request gets interrupted before all data has been sent, but I'm not aware
268
- of any Rack-compliant web server that will forward interrupted requests to the
269
- Rack app.
270
-
271
- This means that for resumable upload to be possible with tus-ruby-server in
272
- general, the file must be uploaded in multiple chunks; the client shouldn't
273
- rely that server will store any data if the PATCH request was interrupted.
274
-
275
- ```js
276
- // using tus-js-client
277
- new tus.Upload(file, {
278
- endpoint: "http://localhost:9292/files",
279
- chunkSize: 5*1024*1024, // required option
280
- // ...
281
- })
282
- ```
283
-
284
- ### Downloading
285
-
286
- Tus-ruby-server has a download endpoint which streams the uploaded file to the
287
- client. Unfortunately, with most classic web servers this endpoint will be
288
- vulnerable to slow-read clients, because the worker is only done once the whole
289
- response body has been received by the client. Web servers that are not
290
- vulnerable to slow-read clients include [Goliath]/[Thin] ([EventMachine]) and
291
- [Reel] ([Celluloid::IO]).
292
-
293
- So, depending on your requirements, you might want to avoid displaying the
294
- uploaded file in the browser (making the user download the file directly from
295
- the tus server), until it has been moved to a permanent storage. You might also
296
- want to consider copying finished uploads to permanent storage directly from
297
- the underlying tus storage, instead of downloading them through the app.
298
-
299
278
  ## Tests
300
279
 
301
280
  Run tests with
302
281
 
303
- ```
304
- $ rake test
305
- ```
306
-
307
- The S3 tests are excluded by default, but you can include them by setting the
308
- `$S3` environment variable.
309
-
310
- ```
311
- $ S3=1 rake test
312
- ```
313
-
314
- For running S3 tests you need to create an `.env` with the S3 credentials:
315
-
316
282
  ```sh
317
- # .env
318
- S3_BUCKET="..."
319
- S3_REGION="..."
320
- S3_ACCESS_KEY_ID="..."
321
- S3_SECRET_ACCESS_KEY="..."
283
+ $ bundle exec rake test # unit tests
284
+ $ bundle exec cucumber # acceptance tests
322
285
  ```
323
286
 
324
287
  ## Inspiration
@@ -346,8 +309,6 @@ The tus-ruby-server was inspired by [rubytus].
346
309
  [`Aws::S3::Client#initialize`]: http://docs.aws.amazon.com/sdkforruby/api/Aws/S3/Client.html#initialize-instance_method
347
310
  [`Aws::S3::Client#create_multipart_upload`]: http://docs.aws.amazon.com/sdkforruby/api/Aws/S3/Client.html#create_multipart_upload-instance_method
348
311
  [Range requests]: https://tools.ietf.org/html/rfc7233
349
- [EventMachine]: https://github.com/eventmachine/eventmachine
350
- [Reel]: https://github.com/celluloid/reel
351
312
  [Goliath]: https://github.com/postrank-labs/goliath
352
- [Thin]: https://github.com/macournoyer/thin
353
- [Celluloid::IO]: https://github.com/celluloid/celluloid-io
313
+ [EventMachine]: https://github.com/eventmachine/eventmachine
314
+ [goliath server options]: https://github.com/postrank-labs/goliath/wiki/Server
data/lib/tus/checksum.rb CHANGED
@@ -1,9 +1,7 @@
1
- require "base64"
2
- require "digest"
3
- require "zlib"
4
-
5
1
  module Tus
6
2
  class Checksum
3
+ CHUNK_SIZE = 16*1024
4
+
7
5
  attr_reader :algorithm
8
6
 
9
7
  def self.generate(algorithm, input)
@@ -47,14 +45,21 @@ module Tus
47
45
  end
48
46
 
49
47
  def generate_crc32(io)
50
- crc = 0
51
- crc = Zlib.crc32(io.read(16*1024, buffer ||= ""), crc) until io.eof?
52
- Base64.encode64(crc.to_s)
48
+ require "zlib"
49
+ require "base64"
50
+ crc = Zlib.crc32("")
51
+ while (data = io.read(CHUNK_SIZE, buffer ||= ""))
52
+ crc = Zlib.crc32(data, crc)
53
+ end
54
+ Base64.strict_encode64(crc.to_s)
53
55
  end
54
56
 
55
57
  def digest(name, io)
58
+ require "digest"
56
59
  digest = Digest.const_get(name).new
57
- digest.update(io.read(16*1024, buffer ||= "")) until io.eof?
60
+ while (data = io.read(CHUNK_SIZE, buffer ||= ""))
61
+ digest.update(data)
62
+ end
58
63
  digest.base64digest
59
64
  end
60
65
  end
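Note on the checksum hunks above: the `until io.eof?` loops are replaced with `while (data = io.read(...))`, which relies only on `IO#read(length)` returning `nil` at end of input, and `Base64.encode64` becomes `Base64.strict_encode64`, which omits the trailing newline. The following is a minimal standalone sketch of that pattern, not the gem's code; the method name `chunked_crc32` is hypothetical.

```ruby
require "zlib"
require "base64"
require "stringio"

CHUNK_SIZE = 16 * 1024 # same value as the constant added in the diff

# Computes a Base64-encoded CRC32 by reading the IO in fixed-size chunks.
# IO#read(length) returns nil at EOF, so the loop needs neither #eof? nor #size.
def chunked_crc32(io)
  crc = Zlib.crc32("") # CRC32 of empty input, used as the starting value
  while (data = io.read(CHUNK_SIZE))
    crc = Zlib.crc32(data, crc) # fold the next chunk into the running CRC
  end
  Base64.strict_encode64(crc.to_s)
end

payload = "a" * (40 * 1024) # forces three reads: 16KB, 16KB, 8KB
puts chunked_crc32(StringIO.new(payload))
```

Because the loop never asks the input for its size, it should also work for inputs such as pipes, where the total length is not known up front.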
data/lib/tus/errors.rb CHANGED
@@ -1,4 +1,5 @@
1
1
  module Tus
2
- Error = Class.new(StandardError)
3
- NotFound = Class.new(Error)
2
+ Error = Class.new(StandardError)
3
+ NotFound = Class.new(Error)
4
+ MaxSizeExceeded = Class.new(Error)
4
5
  end
data/lib/tus/input.rb CHANGED
@@ -1,31 +1,33 @@
1
+ require "tus/errors"
2
+
1
3
  module Tus
2
4
  class Input
3
- def initialize(input)
5
+ def initialize(input, limit: nil)
4
6
  @input = input
5
- @bytes_read = 0
7
+ @limit = limit
8
+ @pos = 0
6
9
  end
7
10
 
8
- def read(*args)
9
- result = @input.read(*args)
10
- @bytes_read += result.bytesize if result.is_a?(String)
11
- result
11
+ def read(length = nil, outbuf = nil)
12
+ data = @input.read(length, outbuf)
13
+
14
+ @pos += data.bytesize if data
15
+ raise MaxSizeExceeded if @limit && @pos > @limit
16
+
17
+ data
18
+ rescue => exception
19
+ raise unless exception.class.name == "Unicorn::ClientShutdown"
20
+ outbuf = outbuf.to_s.clear
21
+ outbuf unless length
12
22
  end
13
23
 
14
- def eof?
15
- @bytes_read == size
24
+ def pos
25
+ @pos
16
26
  end
17
27
 
18
28
  def rewind
19
29
  @input.rewind
20
- @bytes_read = 0
21
- end
22
-
23
- def size
24
- if defined?(Rack::Lint) && @input.is_a?(Rack::Lint::InputWrapper)
25
- @input.instance_variable_get("@input").size
26
- else
27
- @input.size
28
- end
30
+ @pos = 0
29
31
  end
30
32
 
31
33
  def close
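Note on the `Tus::Input` hunk above: the wrapper no longer asks the underlying input for its `size`; it counts bytes as they are read and raises once an optional limit is crossed. Below is a minimal sketch of that idea under stated assumptions: `LimitedInput` and `LimitExceeded` are hypothetical stand-ins for `Tus::Input` and `Tus::MaxSizeExceeded`, and the `Unicorn::ClientShutdown` handling from the diff is left out.

```ruby
require "stringio"

# Hypothetical stand-in for Tus::MaxSizeExceeded.
class LimitExceeded < StandardError; end

# Hypothetical stand-in for Tus::Input: tracks the read position and
# enforces an optional byte limit.
class LimitedInput
  attr_reader :pos

  def initialize(io, limit: nil)
    @io    = io
    @limit = limit
    @pos   = 0
  end

  # Same contract as IO#read: returns nil at EOF when a length is given.
  def read(length = nil, outbuf = nil)
    data = @io.read(length, outbuf)
    @pos += data.bytesize if data
    raise LimitExceeded if @limit && @pos > @limit
    data
  end
end

input = LimitedInput.new(StringIO.new("x" * 10), limit: 8)
input.read(5) # within the limit
begin
  input.read(5) # position becomes 10, which is past the limit of 8
rescue LimitExceeded
  puts "limit exceeded after #{input.pos} bytes"
end
```

Note that the position is updated before the limit check, so after the exception `pos` still reports how many bytes were actually consumed; the server hunk further down appears to rely on this when it passes `input.pos` to `validate_content_length!`.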
@@ -0,0 +1,68 @@
1
+ require "tus/server"
2
+ require "goliath"
3
+
4
+ class Tus::Server::Goliath < Goliath::API
5
+ # Called as soon as request headers are parsed.
6
+ def on_headers(env, headers)
7
+ # the write end of the pipe is written in #on_body, and the read end is read by Tus::Server
8
+ env["tus.input-reader"], env["tus.input-writer"] = IO.pipe
9
+ # use a thread so that request is being processed in parallel
10
+ env["tus.request-thread"] = Thread.new do
11
+ call_tus_server env.merge("rack.input" => env["tus.input-reader"])
12
+ end
13
+ end
14
+
15
+ # Called on each request body chunk received from the client.
16
+ def on_body(env, data)
17
+ # append data to the write end of the pipe if open, otherwise do nothing
18
+ env["tus.input-writer"].write(data) unless env["tus.input-writer"].closed?
19
+ rescue Errno::EPIPE
20
+ # read end of the pipe has been closed, so we close the write end as well
21
+ env["tus.input-writer"].close
22
+ end
23
+
24
+ # Called at the end of the request (after #response is called), but also on
25
+ # client disconnect (in which case #response isn't called), so we want to do
26
+ # the same finalization in both methods.
27
+ def on_close(env)
28
+ finalize(env)
29
+ end
30
+
31
+ # Called after all the data has been received from the client.
32
+ def response(env)
33
+ status, headers, body = finalize(env)
34
+
35
+ env[STREAM_START].call(status, headers)
36
+
37
+ operation = proc { body.each { |chunk| env.stream_send(chunk) } }
38
+ callback = proc { env.stream_close }
39
+
40
+ EM.defer(operation, callback) # use an outside thread pool for streaming
41
+
42
+ nil
43
+ end
44
+
45
+ private
46
+
47
+ # Calls the actual Roda application with the slightly modified env hash.
48
+ def call_tus_server(env)
49
+ Tus::Server.call env.merge(
50
+ "rack.url_scheme" => (env["options"][:ssl] ? "https" : "http"), # https://github.com/postrank-labs/goliath/issues/210
51
+ "async.callback" => nil, # prevent Roda from calling EventMachine when streaming
52
+ )
53
+ end
54
+
55
+ # This method needs to be idempotent, because it can be called twice (on
56
+ # normal requests both #response and #on_close will be called, and on client
57
+ # disconnect only #on_close will be called).
58
+ def finalize(env)
59
+ # closing the write end of the pipe will mark EOF on the read end
60
+ env["tus.input-writer"].close unless env["tus.input-writer"].closed?
61
+ # wait for the request to finish
62
+ result = env["tus.request-thread"].value
63
+ # close read end of the pipe, since nothing is going to read from it anymore
64
+ env["tus.input-reader"].close unless env["tus.input-reader"].closed?
65
+ # return rack response
66
+ result
67
+ end
68
+ end
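Note on the Goliath adapter above: the request body is handed to `Tus::Server` through a pipe, with the app running in a separate thread that reads while `#on_body` writes, and closing the write end signals EOF. A small standalone sketch of that producer/consumer arrangement follows; the method name `stream_through_pipe` is hypothetical and no Goliath or EventMachine code is involved.

```ruby
# Feeds chunks through a pipe to a consumer thread and returns what the
# consumer read. The consumer plays the role of the Rack app reading
# "rack.input"; the main thread plays the role of #on_body.
def stream_through_pipe(chunks)
  reader, writer = IO.pipe

  # The consumer starts before any data arrives and blocks in #read until
  # the write end is closed.
  consumer = Thread.new { reader.read }

  chunks.each { |chunk| writer.write(chunk) }
  writer.close # marks EOF on the read end

  consumer.value # joins the thread and returns its result
ensure
  reader.close if reader && !reader.closed?
  writer.close if writer && !writer.closed?
end

puts stream_through_pipe(["hello ", "world"])
```

The `closed?` guards in the `ensure` clause serve the same purpose as those in `finalize` above: cleanup may run after the write end has already been closed.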
data/lib/tus/server.rb CHANGED
@@ -32,7 +32,6 @@ module Tus
32
32
  plugin :request_headers
33
33
  plugin :not_allowed
34
34
  plugin :streaming
35
- plugin :error_handler
36
35
 
37
36
  route do |r|
38
37
  if request.headers["X-HTTP-Method-Override"]
@@ -74,6 +73,8 @@ module Tus
74
73
  )
75
74
 
76
75
  if info.concatenation?
76
+ validate_partial_uploads!(info.partial_uploads)
77
+
77
78
  length = storage.concatenate(uid, info.partial_uploads, info.to_h)
78
79
  info["Upload-Length"] = length.to_s
79
80
  info["Upload-Offset"] = length.to_s
@@ -102,7 +103,11 @@ module Tus
102
103
  no_content!
103
104
  end
104
105
 
105
- info = Tus::Info.new(storage.read_info(uid))
106
+ begin
107
+ info = Tus::Info.new(storage.read_info(uid))
108
+ rescue Tus::NotFound
109
+ error!(404, "Upload Not Found")
110
+ end
106
111
 
107
112
  r.head do
108
113
  response.headers.update(info.headers)
@@ -112,23 +117,27 @@ module Tus
112
117
  end
113
118
 
114
119
  r.patch do
115
- input = Tus::Input.new(request.body)
116
-
117
120
  if info.defer_length? && request.headers["Upload-Length"]
118
121
  validate_upload_length!
119
122
 
120
- info["Upload-Length"] = request.headers["Upload-Length"]
123
+ info["Upload-Length"] = request.headers["Upload-Length"]
121
124
  info["Upload-Defer-Length"] = nil
122
125
  end
123
126
 
127
+ input = get_input(info)
128
+
124
129
  validate_content_type!
125
- validate_content_length!(info.offset, info.length)
126
- validate_upload_offset!(info.offset)
130
+ validate_upload_offset!(info)
131
+ validate_content_length!(request.content_length.to_i, info) if request.content_length
127
132
  validate_upload_checksum!(input) if request.headers["Upload-Checksum"]
128
133
 
129
- storage.patch_file(uid, input, info.to_h)
134
+ begin
135
+ bytes_uploaded = storage.patch_file(uid, input, info.to_h)
136
+ rescue Tus::MaxSizeExceeded
137
+ validate_content_length!(input.pos, info)
138
+ end
130
139
 
131
- info["Upload-Offset"] = (info.offset + input.size).to_s
140
+ info["Upload-Offset"] = (info.offset + bytes_uploaded).to_s
132
141
  info["Upload-Expires"] = (Time.now + expiration_time).httpdate
133
142
 
134
143
  if info.offset == info.length # last chunk
@@ -142,13 +151,13 @@ module Tus
142
151
  end
143
152
 
144
153
  r.get do
145
- validate_upload_finished!(info.length, info.offset)
154
+ validate_upload_finished!(info)
146
155
  range = handle_range_request!(info.length)
147
156
 
148
157
  response.headers["Content-Length"] = (range.end - range.begin + 1).to_s
149
158
 
150
159
  metadata = info.metadata
151
- response.headers["Content-Disposition"] = opts[:disposition]
160
+ response.headers["Content-Disposition"] = opts[:disposition]
152
161
  response.headers["Content-Disposition"] += "; filename=\"#{metadata["filename"]}\"" if metadata["filename"]
153
162
  response.headers["Content-Type"] = metadata["content_type"] || "application/octet-stream"
154
163
 
@@ -167,9 +176,12 @@ module Tus
167
176
  end
168
177
  end
169
178
 
170
- error do |exception|
171
- not_found! if exception.is_a?(Tus::NotFound)
172
- raise
179
+ def get_input(info)
180
+ offset = info.offset
181
+ total = info.length || max_size
182
+ limit = total - offset if total
183
+
184
+ Tus::Input.new(request.body, limit: limit)
173
185
  end
174
186
 
175
187
  def validate_content_type!
@@ -197,29 +209,29 @@ module Tus
197
209
  end
198
210
  end
199
211
 
200
- def validate_upload_offset!(current_offset)
212
+ def validate_upload_offset!(info)
201
213
  upload_offset = request.headers["Upload-Offset"]
202
214
 
203
215
  error!(400, "Missing Upload-Offset header") if upload_offset.to_s == ""
204
216
  error!(400, "Invalid Upload-Offset header") if upload_offset =~ /\D/
205
217
  error!(400, "Invalid Upload-Offset header") if upload_offset.to_i < 0
206
218
 
207
- if upload_offset.to_i != current_offset
219
+ if upload_offset.to_i != info.offset
208
220
  error!(409, "Upload-Offset header doesn't match current offset")
209
221
  end
210
222
  end
211
223
 
212
- def validate_content_length!(current_offset, length)
213
- if length
214
- error!(403, "Cannot modify completed upload") if current_offset == length
215
- error!(413, "Size of this chunk surpasses Upload-Length") if Integer(request.content_length) + current_offset > length
224
+ def validate_content_length!(size, info)
225
+ if info.length
226
+ error!(403, "Cannot modify completed upload") if info.offset == info.length
227
+ error!(413, "Size of this chunk surpasses Upload-Length") if info.offset + size > info.length
216
228
  elsif max_size
217
- error!(413, "Size of this chunk surpasses Tus-Max-Size") if Integer(request.content_length) + current_offset > max_size
229
+ error!(413, "Size of this chunk surpasses Tus-Max-Size") if info.offset + size > max_size
218
230
  end
219
231
  end
220
232
 
221
- def validate_upload_finished!(length, current_offset)
222
- error!(403, "Cannot download unfinished upload") unless length && current_offset && length == current_offset
233
+ def validate_upload_finished!(info)
234
+ error!(403, "Cannot download unfinished upload") unless info.length == info.offset
223
235
  end
224
236
 
225
237
  def validate_upload_metadata!
@@ -249,6 +261,30 @@ module Tus
249
261
  end
250
262
  end
251
263
 
264
+ def validate_partial_uploads!(part_uids)
265
+ queue = Queue.new
266
+ part_uids.each { |part_uid| queue << part_uid }
267
+
268
+ threads = 10.times.map do
269
+ Thread.new do
270
+ results = []
271
+ loop do
272
+ part_uid = queue.deq(true) rescue break
273
+ part_info = storage.read_info(part_uid)
274
+ results << part_info["Upload-Concat"]
275
+ end
276
+ results
277
+ end
278
+ end
279
+
280
+ upload_concat_values = threads.flat_map(&:value)
281
+ unless upload_concat_values.all? { |value| value == "partial" }
282
+ error!(400, "One or more uploads were not partial")
283
+ end
284
+ rescue Tus::NotFound
285
+ error!(404, "One or more partial uploads were not found")
286
+ end
287
+
252
288
  def validate_upload_checksum!(input)
253
289
  algorithm, checksum = request.headers["Upload-Checksum"].split(" ")
254
290
 
@@ -256,7 +292,7 @@ module Tus
256
292
  error!(400, "Invalid Upload-Checksum header") unless SUPPORTED_CHECKSUM_ALGORITHMS.include?(algorithm)
257
293
 
258
294
  generated_checksum = Tus::Checksum.generate(algorithm, input)
259
- error!(460, "Checksum from Upload-Checksum header doesn't match generated") if generated_checksum != checksum
295
+ error!(460, "Upload-Checksum value doesn't match generated checksum") if generated_checksum != checksum
260
296
  end
261
297
 
262
298
  # "Range" header handling logic copied from Rack::File
@@ -312,10 +348,6 @@ module Tus
312
348
  request.halt
313
349
  end
314
350
 
315
- def not_found!(message = "Upload not found")
316
- error!(404, message)
317
- end
318
-
319
351
  def error!(status, message)
320
352
  response.status = status
321
353
  response.write(message) unless request.head?
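Note on `validate_partial_uploads!` above: it drains a `Queue` with a fixed pool of ten threads and gathers the per-thread results through `Thread#value`, which also re-raises any exception raised inside a thread (this is what lets the method-level `rescue Tus::NotFound` work). Here is a generic sketch of that pattern; `parallel_map` is a hypothetical helper, not part of the gem.

```ruby
# Runs the block for every item on a small pool of threads and returns the
# collected results. Result order is not guaranteed.
def parallel_map(items, thread_count: 10)
  queue = Queue.new
  items.each { |item| queue << item }

  threads = thread_count.times.map do
    Thread.new do
      results = []
      loop do
        begin
          item = queue.pop(true) # non-blocking pop raises ThreadError when empty
        rescue ThreadError
          break
        end
        results << yield(item)
      end
      results
    end
  end

  # Thread#value waits for each thread and re-raises its exception, if any.
  threads.flat_map(&:value)
end

puts parallel_map(%w[a b c]) { |uid| uid.upcase }.sort.inspect
```

Filling the queue before starting the threads is what makes the non-blocking pop safe here: an empty queue can only mean that all work has been handed out.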
@@ -49,35 +49,28 @@ module Tus
49
49
  end
50
50
 
51
51
  def patch_file(uid, input, info = {})
52
- exists!(uid)
53
-
54
52
  file_path(uid).open("ab") { |file| IO.copy_stream(input, file) }
55
53
  end
56
54
 
57
55
  def read_info(uid)
58
- exists!(uid)
56
+ raise Tus::NotFound if !file_path(uid).exist?
59
57
 
60
58
  JSON.parse(info_path(uid).binread)
61
59
  end
62
60
 
63
61
  def update_info(uid, info)
64
- exists!(uid)
65
-
66
62
  info_path(uid).binwrite(JSON.generate(info))
67
63
  end
68
64
 
69
65
  def get_file(uid, info = {}, range: nil)
70
- exists!(uid)
71
-
72
66
  file = file_path(uid).open("rb")
73
- range ||= 0..(file.size - 1)
74
- length = range.end - range.begin + 1
67
+ length = range ? range.size : file.size
75
68
 
76
69
  # Create an Enumerator which will yield chunks of the requested file
77
70
  # content, allowing tus server to efficiently stream requested content
78
71
  # to the client.
79
72
  chunks = Enumerator.new do |yielder|
80
- file.seek(range.begin)
73
+ file.seek(range.begin) if range
81
74
  remaining_length = length
82
75
 
83
76
  while remaining_length > 0
@@ -89,11 +82,7 @@ module Tus
89
82
 
90
83
  # We return a response object that responds to #each, #length and #close,
91
84
  # which the tus server can return directly as the Rack response.
92
- Response.new(
93
- chunks: chunks,
94
- length: length,
95
- close: ->{file.close},
96
- )
85
+ Response.new(chunks: chunks, length: length, close: -> { file.close })
97
86
  end
98
87
 
99
88
  def delete_file(uid, info = {})
@@ -116,12 +105,8 @@ module Tus
116
105
  FileUtils.rm_f paths
117
106
  end
118
107
 
119
- def exists!(uid)
120
- raise Tus::NotFound if !file_path(uid).exist?
121
- end
122
-
123
108
  def file_path(uid)
124
- directory.join("#{uid}.file")
109
+ directory.join("#{uid}")
125
110
  end
126
111
 
127
112
  def info_path(uid)
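Note on the filesystem `get_file` hunk above: `range.end - range.begin + 1` becomes `range.size`, and the seek is skipped when no range was requested. The body of the streaming loop is not visible in the hunk, so the sketch below is an assumption about its general shape rather than the gem's code; `each_range_chunk` and its parameters are hypothetical names.

```ruby
require "stringio"

# Returns an Enumerator that yields the requested byte range (or the whole
# IO when range is nil) in pieces of at most chunk_size bytes.
def each_range_chunk(io, size:, range: nil, chunk_size: 16 * 1024)
  length = range ? range.size : size # (2..5).size is 4, same as end - begin + 1

  Enumerator.new do |yielder|
    io.seek(range.begin) if range
    remaining = length

    while remaining > 0
      data = io.read([chunk_size, remaining].min) or break # stop early at EOF
      yielder << data
      remaining -= data.bytesize
    end
  end
end

io = StringIO.new("0123456789")
puts each_range_chunk(io, size: 10, range: 2..5, chunk_size: 3).to_a.inspect
```

Since the Enumerator only reads when it is iterated, nothing is loaded into memory until the response body is actually streamed.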
@@ -8,6 +8,8 @@ require "digest"
8
8
  module Tus
9
9
  module Storage
10
10
  class Gridfs
11
+ BATCH_SIZE = 5 * 1024 * 1024
12
+
11
13
  attr_reader :client, :prefix, :bucket, :chunk_size
12
14
 
13
15
  def initialize(client:, prefix: "fs", chunk_size: 256*1024)
@@ -47,7 +49,10 @@ module Tus
47
49
  grid_infos.inject(0) do |offset, grid_info|
48
50
  result = chunks_collection
49
51
  .find(files_id: grid_info[:_id])
50
- .update_many("$set" => {files_id: grid_file.id}, "$inc" => {n: offset})
52
+ .update_many(
53
+ "$set" => { files_id: grid_file.id },
54
+ "$inc" => { n: offset },
55
+ )
51
56
 
52
57
  offset += result.modified_count
53
58
  end
@@ -60,47 +65,72 @@ module Tus
60
65
  end
61
66
 
62
67
  def patch_file(uid, input, info = {})
63
- grid_info = find_grid_info!(uid)
68
+ grid_info = files_collection.find(filename: uid).first
69
+ current_length = grid_info[:length]
70
+ chunk_size = grid_info[:chunkSize]
71
+ bytes_saved = 0
72
+
73
+ bytes_saved += patch_last_chunk(input, grid_info) if current_length % chunk_size != 0
74
+
75
+ chunks_enumerator = Enumerator.new do |yielder|
76
+ while (data = input.read(chunk_size))
77
+ yielder << data
78
+ end
79
+ end
64
80
 
65
- patch_last_chunk(input, grid_info)
81
+ chunks_in_batch = (BATCH_SIZE.to_f / chunk_size).ceil
82
+ chunks_offset = chunks_collection.count(files_id: grid_info[:_id]) - 1
66
83
 
67
- grid_chunks = split_into_grid_chunks(input, grid_info)
68
- chunks_collection.insert_many(grid_chunks)
69
- grid_chunks.each { |grid_chunk| grid_chunk.data.data.clear } # deallocate strings
84
+ chunks_enumerator.each_slice(chunks_in_batch) do |chunks|
85
+ grid_chunks = chunks.map do |data|
86
+ Mongo::Grid::File::Chunk.new(
87
+ data: BSON::Binary.new(data),
88
+ files_id: grid_info[:_id],
89
+ n: chunks_offset += 1,
90
+ )
91
+ end
92
+
93
+ chunks_collection.insert_many(grid_chunks)
94
+
95
+ # Update the total length and refresh the upload date on each update,
96
+ # which are used in #get_file, #concatenate and #expire_files.
97
+ files_collection.find(filename: uid).update_one(
98
+ "$inc" => { length: chunks.map(&:bytesize).inject(0, :+) },
99
+ "$set" => { uploadDate: Time.now.utc },
100
+ )
101
+ bytes_saved += chunks.map(&:bytesize).inject(0, :+)
102
+
103
+ chunks.each(&:clear) # deallocate strings
104
+ end
70
105
 
71
- # Update the total length and refresh the upload date on each update,
72
- # which are used in #get_file, #concatenate and #expire_files.
73
- files_collection.find(filename: uid).update_one("$set" => {
74
- length: grid_info[:length] + input.size,
75
- uploadDate: Time.now.utc,
76
- })
106
+ bytes_saved
77
107
  end
78
108
 
79
109
  def read_info(uid)
80
- grid_info = find_grid_info!(uid)
110
+ grid_info = files_collection.find(filename: uid).first or raise Tus::NotFound
81
111
 
82
112
  grid_info[:metadata]
83
113
  end
84
114
 
85
115
  def update_info(uid, info)
86
- grid_info = find_grid_info!(uid)
116
+ grid_info = files_collection.find(filename: uid).first
87
117
 
88
118
  files_collection.update_one({filename: uid}, {"$set" => {metadata: info}})
89
119
  end
90
120
 
91
121
  def get_file(uid, info = {}, range: nil)
92
- grid_info = find_grid_info!(uid)
122
+ grid_info = files_collection.find(filename: uid).first
93
123
 
94
- range ||= 0..(grid_info[:length] - 1)
95
- length = range.end - range.begin + 1
124
+ length = range ? range.size : grid_info[:length]
96
125
 
97
- chunk_start = range.begin / grid_info[:chunkSize]
98
- chunk_stop = range.end / grid_info[:chunkSize]
126
+ filter = { files_id: grid_info[:_id] }
99
127
 
100
- filter = {
101
- files_id: grid_info[:_id],
102
- n: {"$gte" => chunk_start, "$lte" => chunk_stop}
103
- }
128
+ if range
129
+ chunk_start = range.begin / grid_info[:chunkSize]
130
+ chunk_stop = range.end / grid_info[:chunkSize]
131
+
132
+ filter[:n] = {"$gte" => chunk_start, "$lte" => chunk_stop}
133
+ end
104
134
 
105
135
  # Query only the subset of chunks specified by the range query. We
106
136
  # cannot use Mongo::FsBucket#open_download_stream here because it
@@ -137,11 +167,7 @@ module Tus
137
167
 
138
168
  # We return a response object that responds to #each, #length and #close,
139
169
  # which the tus server can return directly as the Rack response.
140
- Response.new(
141
- chunks: chunks,
142
- length: length,
143
- close: ->{chunks_view.close_query},
144
- )
170
+ Response.new(chunks: chunks, length: length, close: ->{chunks_view.close_query})
145
171
  end
146
172
 
147
173
  def delete_file(uid, info = {})
@@ -168,29 +194,19 @@ module Tus
168
194
  grid_file
169
195
  end
170
196
 
171
- def split_into_grid_chunks(io, grid_info)
172
- grid_info[:md5] = Digest::MD5.new # hack for `Chunk.split` updating MD5
173
- grid_info = Mongo::Grid::File::Info.new(Mongo::Options::Mapper.transform(grid_info, Mongo::Grid::File::Info::MAPPINGS.invert))
174
- offset = chunks_collection.count(files_id: grid_info.id)
175
-
176
- Mongo::Grid::File::Chunk.split(io, grid_info, offset)
177
- end
178
-
179
197
  def patch_last_chunk(input, grid_info)
180
- if grid_info[:length] % grid_info[:chunkSize] != 0
181
- last_chunk = chunks_collection.find(files_id: grid_info[:_id]).sort(n: -1).limit(1).first
182
- data = last_chunk[:data].data
183
- data << input.read(grid_info[:chunkSize] - data.length)
198
+ last_chunk = chunks_collection.find(files_id: grid_info[:_id]).sort(n: -1).limit(1).first
199
+ data = last_chunk[:data].data
200
+ patch = input.read(grid_info[:chunkSize] - data.bytesize)
201
+ data << patch
184
202
 
185
- chunks_collection.find(files_id: grid_info[:_id], n: last_chunk[:n])
186
- .update_one("$set" => {data: BSON::Binary.new(data)})
203
+ chunks_collection.find(files_id: grid_info[:_id], n: last_chunk[:n])
204
+ .update_one("$set" => { data: BSON::Binary.new(data) })
187
205
 
188
- data.clear # deallocate string
189
- end
190
- end
206
+ files_collection.find(_id: grid_info[:_id])
207
+ .update_one("$inc" => { length: patch.bytesize })
191
208
 
192
- def find_grid_info!(uid)
193
- files_collection.find(filename: uid).first or raise Tus::NotFound
209
+ patch.bytesize
194
210
  end
195
211
 
196
212
  def validate_parts!(grid_infos, part_uids)
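Note on the Gridfs `patch_file` hunk above: instead of splitting the whole input into chunks at once, the new code wraps the reads in an `Enumerator` and inserts the chunks in batches via `each_slice`, so only about one batch is held in memory at a time. A standalone sketch of that batching follows; `each_batch` is a hypothetical helper, and the block stands in for the `insert_many` and `update_one` calls.

```ruby
require "stringio"

# Reads io in chunk_size pieces and yields them in batches of roughly
# batch_size bytes. Returns the total number of bytes read.
def each_batch(io, chunk_size:, batch_size:)
  chunks = Enumerator.new do |yielder|
    while (data = io.read(chunk_size))
      yielder << data
    end
  end

  chunks_in_batch = (batch_size.to_f / chunk_size).ceil
  bytes_read = 0

  chunks.each_slice(chunks_in_batch) do |batch|
    yield batch # the caller persists the batch here
    bytes_read += batch.sum(&:bytesize)
  end

  bytes_read
end

total = each_batch(StringIO.new("a" * 10), chunk_size: 2, batch_size: 5) do |batch|
  puts "persisting #{batch.size} chunks"
end
puts "#{total} bytes"
```

Counting bytes per batch, as the diff does with `bytes_saved`, means that if reading fails partway through, the data from earlier batches has already been stored and accounted for.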
@@ -1,18 +1,19 @@
1
1
  require "aws-sdk"
2
2
 
3
3
  require "tus/info"
4
- require "tus/checksum"
5
4
  require "tus/errors"
6
5
 
7
6
  require "json"
8
- require "cgi/util"
7
+ require "cgi"
8
+ require "fiber"
9
+ require "stringio"
9
10
 
10
11
  Aws.eager_autoload!(services: ["S3"])
11
12
 
12
13
  module Tus
13
14
  module Storage
14
15
  class S3
15
- MIN_PART_SIZE = 5 * 1024 * 1024
16
+ MIN_PART_SIZE = 5 * 1024 * 1024 # 5MB is the minimum part size for S3 multipart uploads
16
17
 
17
18
  attr_reader :client, :bucket, :prefix, :upload_options
18
19
 
@@ -20,7 +21,7 @@ module Tus
20
21
  resource = Aws::S3::Resource.new(**client_options)
21
22
 
22
23
  @client = resource.client
23
- @bucket = resource.bucket(bucket)
24
+ @bucket = resource.bucket(bucket) or fail(ArgumentError, "the :bucket option was nil")
24
25
  @prefix = prefix
25
26
  @upload_options = upload_options
26
27
  @thread_count = thread_count
@@ -33,8 +34,12 @@ module Tus
33
34
  options[:content_type] = tus_info.metadata["content_type"]
34
35
 
35
36
  if filename = tus_info.metadata["filename"]
37
+ # Aws-sdk doesn't sign non-ASCII characters correctly, and browsers
38
+ # will automatically URI-decode filenames.
39
+ filename = CGI.escape(filename).gsub("+", " ")
40
+
36
41
  options[:content_disposition] ||= "inline"
37
- options[:content_disposition] += "; filename=\"#{CGI.escape(filename).gsub("+", " ")}\""
42
+ options[:content_disposition] += "; filename=\"#{filename}\""
38
43
  end
39
44
 
40
45
  multipart_upload = object(uid).initiate_multipart_upload(options)
@@ -51,9 +56,7 @@ module Tus
51
56
  objects = part_uids.map { |part_uid| object(part_uid) }
52
57
  parts = copy_parts(objects, multipart_upload)
53
58
 
54
- parts.each do |part|
55
- info["multipart_parts"] << { "part_number" => part[:part_number], "etag" => part[:etag] }
56
- end
59
+ info["multipart_parts"].concat parts
57
60
 
58
61
  finalize_file(uid, info)
59
62
 
@@ -68,21 +71,44 @@ module Tus
  end
 
  def patch_file(uid, input, info = {})
- upload_id = info["multipart_id"]
- part_number = info["multipart_parts"].count + 1
+ tus_info = Tus::Info.new(info)
 
- multipart_upload = object(uid).multipart_upload(upload_id)
- multipart_part = multipart_upload.part(part_number)
- md5 = Tus::Checksum.new("md5").generate(input)
+ upload_id = info["multipart_id"]
+ part_offset = info["multipart_parts"].count
+ bytes_uploaded = 0
 
- response = multipart_part.upload(body: input, content_md5: md5)
+ jobs = []
+ chunk = StringIO.new(input.read(MIN_PART_SIZE).to_s)
 
- info["multipart_parts"] << {
- "part_number" => part_number,
- "etag" => response.etag[/"(.+)"/, 1],
- }
- rescue Aws::S3::Errors::NoSuchUpload
- raise Tus::NotFound
+ loop do
+ next_chunk = StringIO.new(input.read(MIN_PART_SIZE).to_s)
+
+ # merge next chunk into previous if it's smaller than minimum chunk size
+ if next_chunk.size < MIN_PART_SIZE
+ chunk = StringIO.new(chunk.string + next_chunk.string)
+ next_chunk.close
+ next_chunk = nil
+ end
+
+ # abort if chunk is smaller than 5MB and is not the last chunk
+ if chunk.size < MIN_PART_SIZE
+ break if (tus_info.length && tus_info.offset) &&
+ chunk.size + tus_info.offset < tus_info.length
+ end
+
+ thread = upload_part_thread(chunk, uid, upload_id, part_offset += 1)
+ jobs << [thread, chunk]
+
+ chunk = next_chunk or break
+ end
+
+ jobs.each do |thread, body|
+ info["multipart_parts"] << thread.value
+ bytes_uploaded += body.size
+ body.close
+ end
+
+ bytes_uploaded
  end
 
  def finalize_file(uid, info = {})
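The new `patch_file` loop above reads the input in `MIN_PART_SIZE` pieces and folds a short trailing piece into the previous part. The following is a minimal sketch of that splitting step only, assuming a hypothetical `split_into_parts` helper and a tiny part size for illustration. Threading, the S3 calls and the early abort for incomplete chunks are deliberately left out.

```ruby
require "stringio"

# Hypothetical helper (not part of tus-ruby-server): splits an IO into
# strings of at least min_part_size bytes, except when the whole input
# is shorter than that.
def split_into_parts(input, min_part_size)
  parts = []
  chunk = input.read(min_part_size).to_s

  loop do
    # IO#read returns nil at EOF, which #to_s turns into an empty string
    next_chunk = input.read(min_part_size).to_s

    # A short (or empty) trailing piece is merged into the previous chunk
    if next_chunk.bytesize < min_part_size
      chunk += next_chunk
      next_chunk = nil
    end

    parts << chunk
    chunk = next_chunk or break
  end

  parts
end

p split_into_parts(StringIO.new("abcdefghijk"), 4)
```

With an 11-byte input and a 4-byte minimum, the trailing 3 bytes do not become a part of their own; they are appended to the second part.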
@@ -92,7 +118,7 @@ module Tus
  end
 
  multipart_upload = object(uid).multipart_upload(upload_id)
- multipart_upload.complete(multipart_upload: {parts: parts})
+ multipart_upload.complete(multipart_upload: { parts: parts })
 
  info.delete("multipart_id")
  info.delete("multipart_parts")
@@ -110,26 +136,15 @@ module Tus
  end
 
  def get_file(uid, info = {}, range: nil)
- object = object(uid)
- range = "bytes=#{range.begin}-#{range.end}" if range
-
- raw_chunks = object.enum_for(:get, range: range)
-
- # Start the request to be notified if the object doesn't exist, and to
- # get Aws::S3::Object#content_length.
- first_chunk = raw_chunks.next
+ tus_info = Tus::Info.new(info)
 
- chunks = Enumerator.new do |yielder|
- yielder << first_chunk
- loop { yielder << raw_chunks.next }
- end
+ length = range ? range.size : tus_info.length
+ range = "bytes=#{range.begin}-#{range.end}" if range
+ chunks = object(uid).enum_for(:get, range: range)
 
- Response.new(
- chunks: chunks,
- length: object.content_length,
- )
- rescue Aws::S3::Errors::NoSuchKey
- raise Tus::NotFound
+ # We return a response object that responds to #each, #length and #close,
+ # which the tus server can return directly as the Rack response.
+ Response.new(chunks: chunks, length: length)
  end
 
  def delete_file(uid, info = {})
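In the new `get_file`, the response length is derived from the requested range (or from the stored upload length) instead of from the S3 object. A small sketch of that computation, assuming a hypothetical `range_request` helper and a plain integer in place of `tus_info.length`:

```ruby
# Hypothetical helper (not part of tus-ruby-server): returns the Range
# header value and the expected body length for an optional byte range.
def range_request(range, total_length)
  # An inclusive Ruby range 0..99 covers 100 bytes, like "bytes=0-99"
  length = range ? range.size : total_length
  header = "bytes=#{range.begin}-#{range.end}" if range
  [header, length]
end

p range_request(0..99, 1000)
p range_request(nil, 1000)
```

Both Ruby's `..` ranges and HTTP byte ranges are inclusive at both ends, so `range.size` matches the number of bytes requested.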
@@ -151,7 +166,9 @@ module Tus
  delete(old_objects)
 
  bucket.multipart_uploads.each do |multipart_upload|
- next unless multipart_upload.initiated <= expiration_date
+ # no need to check multipart uploads initiated before expiration date
+ next if multipart_upload.initiated > expiration_date
+
  most_recent_part = multipart_upload.parts.sort_by(&:last_modified).last
  if most_recent_part.nil? || most_recent_part.last_modified <= expiration_date
  abort_multipart_upload(multipart_upload)
@@ -161,10 +178,23 @@ module Tus
 
  private
 
+ def upload_part_thread(body, key, upload_id, part_number)
+ Thread.new { upload_part(body, key, upload_id, part_number) }
+ end
+
+ def upload_part(body, key, upload_id, part_number)
+ multipart_upload = object(key).multipart_upload(upload_id)
+ multipart_part = multipart_upload.part(part_number)
+
+ response = multipart_part.upload(body: body)
+
+ { "part_number" => part_number, "etag" => response.etag }
+ end
+
  def delete(objects)
  # S3 can delete maximum of 1000 objects in a single request
  objects.each_slice(1000) do |objects_batch|
- delete_params = {objects: objects_batch.map { |object| {key: object.key} }}
+ delete_params = { objects: objects_batch.map { |object| { key: object.key } } }
  bucket.delete_objects(delete: delete_params)
  end
  end
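The added `upload_part_thread` returns a `Thread`, and `patch_file` later collects each part hash with `thread.value`. A small sketch of that pattern, with a hypothetical `fake_upload_part` standing in for the S3 call (no network access, made-up etags):

```ruby
# Hypothetical stand-in for the S3 part upload; the etag is invented.
def fake_upload_part(body, part_number)
  { "part_number" => part_number, "etag" => "etag-#{body.size}" }
end

bodies = ["aaaa", "bb", "cccccc"]

# Start one thread per part, numbering parts from 1.
threads = bodies.each_with_index.map do |body, index|
  Thread.new { fake_upload_part(body, index + 1) }
end

# Thread#value waits for each thread and returns what its block returned.
parts = threads.map(&:value)
p parts
```

Because `Thread#value` is called on the threads in the order they were started, the collected parts stay in part-number order regardless of which upload finishes first. `Thread#value` also re-raises an exception that terminated the thread.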
@@ -187,7 +217,7 @@ module Tus
 
  threads = @thread_count.times.map { copy_part_thread(queue) }
 
- threads.flat_map(&:value).sort_by { |part| part[:part_number] }
+ threads.flat_map(&:value).sort_by { |part| part["part_number"] }
  end
 
  def compute_parts(objects, multipart_upload)
@@ -204,7 +234,6 @@ module Tus
 
  def copy_part_thread(queue)
  Thread.new do
- Thread.current.abort_on_exception = true
  begin
  results = []
  loop do
@@ -212,9 +241,9 @@ module Tus
  results << copy_part(part)
  end
  results
- rescue => error
+ rescue
  queue.clear
- raise error
+ raise
  end
  end
  end
@@ -222,7 +251,7 @@ module Tus
  def copy_part(part)
  response = client.upload_part_copy(part)
 
- { part_number: part[:part_number], etag: response.copy_part_result.etag }
+ { "part_number" => part[:part_number], "etag" => response.copy_part_result.etag }
  end
 
  def object(key)
@@ -239,12 +268,28 @@ module Tus
  @length
  end
 
- def each(&block)
- @chunks.each(&block)
+ def each
+ return enum_for(__method__) unless block_given?
+
+ while (chunk = chunks_fiber.resume)
+ yield chunk
+ end
  end
 
  def close
- # aws-sdk doesn't provide an API to terminate the HTTP connection
+ chunks_fiber.resume(:close) if chunks_fiber.alive?
+ end
+
+ private
+
+ def chunks_fiber
+ @chunks_fiber ||= Fiber.new do
+ @chunks.each do |chunk|
+ action = Fiber.yield chunk
+ break if action == :close
+ end
+ nil
+ end
  end
  end
  end
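The Fiber-based `each`/`close` pair added to the response class above can be exercised without S3. This sketch assumes a hypothetical `StreamingBody` class that copies the three methods from the diff, and an `Enumerator` that records which chunks were actually produced.

```ruby
require "fiber" # only needed for Fiber#alive? on older Rubies

# Hypothetical stand-in for the Response class in the diff above.
class StreamingBody
  def initialize(chunks)
    @chunks = chunks
  end

  def each
    return enum_for(__method__) unless block_given?

    # Each resume runs the source until it hands over one more chunk;
    # the fiber returns nil once the source is exhausted.
    while (chunk = chunks_fiber.resume)
      yield chunk
    end
  end

  def close
    # Resuming with :close makes the fiber break out of the source loop.
    chunks_fiber.resume(:close) if chunks_fiber.alive?
  end

  private

  def chunks_fiber
    @chunks_fiber ||= Fiber.new do
      @chunks.each do |chunk|
        action = Fiber.yield chunk
        break if action == :close
      end
      nil
    end
  end
end

fetched = []
source = Enumerator.new do |yielder|
  %w[a b c].each { |chunk| fetched << chunk; yielder << chunk }
end

body = StreamingBody.new(source)
seen = []
body.each do |chunk|
  seen << chunk
  break if seen.size == 2 # stop reading early, as a disconnecting client would
end
body.close
```

After the early `break`, `close` resumes the suspended fiber with `:close`, so the source loop exits and the third chunk is never requested.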
data/tus-server.gemspec CHANGED
@@ -1,6 +1,6 @@
  Gem::Specification.new do |gem|
  gem.name = "tus-server"
- gem.version = "0.10.2"
+ gem.version = "1.0.0"
 
  gem.required_ruby_version = ">= 2.1"
 
@@ -19,7 +19,10 @@ Gem::Specification.new do |gem|
  gem.add_development_dependency "rake", "~> 11.1"
  gem.add_development_dependency "minitest", "~> 5.8"
  gem.add_development_dependency "rack-test_app"
+ gem.add_development_dependency "cucumber"
+ gem.add_development_dependency "unicorn"
  gem.add_development_dependency "mongo"
  gem.add_development_dependency "aws-sdk", "~> 2.0"
- gem.add_development_dependency "dotenv"
+ gem.add_development_dependency "goliath"
+ gem.add_development_dependency "async-rack", ">= 0.5.1"
  end
metadata CHANGED
@@ -1,14 +1,14 @@
  --- !ruby/object:Gem::Specification
  name: tus-server
  version: !ruby/object:Gem::Version
- version: 0.10.2
+ version: 1.0.0
  platform: ruby
  authors:
  - Janko Marohnić
  autorequire:
  bindir: bin
  cert_chain: []
- date: 2017-04-19 00:00:00.000000000 Z
+ date: 2017-07-17 00:00:00.000000000 Z
  dependencies:
  - !ruby/object:Gem::Dependency
  name: roda
@@ -66,6 +66,34 @@ dependencies:
  - - ">="
  - !ruby/object:Gem::Version
  version: '0'
+ - !ruby/object:Gem::Dependency
+ name: cucumber
+ requirement: !ruby/object:Gem::Requirement
+ requirements:
+ - - ">="
+ - !ruby/object:Gem::Version
+ version: '0'
+ type: :development
+ prerelease: false
+ version_requirements: !ruby/object:Gem::Requirement
+ requirements:
+ - - ">="
+ - !ruby/object:Gem::Version
+ version: '0'
+ - !ruby/object:Gem::Dependency
+ name: unicorn
+ requirement: !ruby/object:Gem::Requirement
+ requirements:
+ - - ">="
+ - !ruby/object:Gem::Version
+ version: '0'
+ type: :development
+ prerelease: false
+ version_requirements: !ruby/object:Gem::Requirement
+ requirements:
+ - - ">="
+ - !ruby/object:Gem::Version
+ version: '0'
  - !ruby/object:Gem::Dependency
  name: mongo
  requirement: !ruby/object:Gem::Requirement
@@ -95,7 +123,7 @@ dependencies:
  - !ruby/object:Gem::Version
  version: '2.0'
  - !ruby/object:Gem::Dependency
- name: dotenv
+ name: goliath
  requirement: !ruby/object:Gem::Requirement
  requirements:
  - - ">="
@@ -108,6 +136,20 @@ dependencies:
  - - ">="
  - !ruby/object:Gem::Version
  version: '0'
+ - !ruby/object:Gem::Dependency
+ name: async-rack
+ requirement: !ruby/object:Gem::Requirement
+ requirements:
+ - - ">="
+ - !ruby/object:Gem::Version
+ version: 0.5.1
+ type: :development
+ prerelease: false
+ version_requirements: !ruby/object:Gem::Requirement
+ requirements:
+ - - ">="
+ - !ruby/object:Gem::Version
+ version: 0.5.1
  description:
  email:
  - janko.marohnic@gmail.com
@@ -123,6 +165,7 @@ files:
  - lib/tus/info.rb
  - lib/tus/input.rb
  - lib/tus/server.rb
+ - lib/tus/server/goliath.rb
  - lib/tus/storage/filesystem.rb
  - lib/tus/storage/gridfs.rb
  - lib/tus/storage/s3.rb
@@ -147,7 +190,7 @@ required_rubygems_version: !ruby/object:Gem::Requirement
  version: '0'
  requirements: []
  rubyforge_project:
- rubygems_version: 2.5.1
+ rubygems_version: 2.6.11
  signing_key:
  specification_version: 4
  summary: Ruby server implementation of tus.io, the open protocol for resumable file