tus-server 0.2.0 → 0.9.0

checksums.yaml CHANGED
@@ -1,7 +1,7 @@
1
1
  ---
2
2
  SHA1:
3
- metadata.gz: d611f90e1f935c7caa84d040fe50cc3a9c49e6f6
4
- data.tar.gz: a4095bfb0a87f0241336ec3ce78cb4f5b4442609
3
+ metadata.gz: 06a1d11d97acd07210d2292e19e57be4c1c13071
4
+ data.tar.gz: 3b985f19d49493356931d50fdc1556b00e2c9688
5
5
  SHA512:
6
- metadata.gz: 12860dfff41aa4178d926618915e7cdbc6198599b139eb5a9dc2117ca4dbfed165fda92ff63620d70389562b222227831f3d7ee070f2b12562bb555424c75462
7
- data.tar.gz: db9fcba3e6d4f40dffc90cab9d8bea3921727fb60ebba7cd8926b872ce305cedc9766eadcdeafee2f2e040f394b8555d21dc3b2bbd307a3d5d0e55a17f8ccf20
6
+ metadata.gz: 8f25c106249b1bad3a8ee3787ee3d93dbff14b2b1453a55c1d74cddab8a195ad0014eed28c0ddf19f15caaa6fde610ddca0fce42ca4739cc1af1e60939d03091
7
+ data.tar.gz: 2675c30b3548d8a6a00976695152129c9b5dd58349e09c9eb2f8b153a10252b39b4aaf2d3d05777b28fc1e7831b56c9e84f533d55c2e434f017347f7dd1f6d3d
data/README.md CHANGED
@@ -17,7 +17,10 @@ gem "tus-server"
17
17
 
18
18
  ## Usage
19
19
 
20
- You can run the `Tus::Server` in your `config.ru`:
20
+ Tus-ruby-server provides a `Tus::Server` Roda app, which you can run in your
21
+ `config.ru`. That way you can run `Tus::Server` either as a standalone app or as
22
+ part of your main app (though it's recommended to run it as a standalone app,
23
+ as explained in the "Performance considerations" section of this README).
21
24
 
22
25
  ```rb
23
26
  # config.ru
@@ -37,88 +40,190 @@ endpoint:
37
40
  // using tus-js-client
38
41
  new tus.Upload(file, {
39
42
  endpoint: "http://localhost:9292/files",
40
- chunkSize: 15*1024*1024, // 15 MB
43
+ chunkSize: 5*1024*1024, // 5MB
41
44
  // ...
42
45
  })
43
46
  ```
44
47
 
45
- After upload is complete, you'll probably want to attach the uploaded files to
46
- database records. [Shrine] is one file attachment library that supports this,
47
- see [shrine-tus-demo] on how you can integrate the two.
48
+ After the upload is complete, you'll probably want to attach the uploaded file
49
+ to a database record. [Shrine] is one file attachment library that integrates
50
+ nicely with tus-ruby-server; see [shrine-tus-demo] for an example integration.
48
51
 
49
- ### Metadata
52
+ ## Storage
50
53
 
51
- As per tus protocol, you can assign custom metadata when creating a file using
52
- the `Upload-Metadata` header. When retrieving the file via a GET request,
53
- tus-ruby-server will use
54
+ ### Filesystem
54
55
 
55
- * `content_type` -- for setting the `Content-Type` header
56
- * `filename` -- for setting the `Content-Disposition` header
57
-
58
- Both of these are optional, and will be used if available.
59
-
60
- ### Storage
61
-
62
- By default `Tus::Server` saves partial and complete files on the filesystem,
63
- inside the `data/` directory. You can easily change the directory:
56
+ By default `Tus::Server` stores uploaded files to disk, in the `data/`
57
+ directory. You can configure a different directory:
64
58
 
65
59
  ```rb
66
- require "tus/server"
60
+ require "tus/storage/filesystem"
67
61
 
68
62
  Tus::Server.opts[:storage] = Tus::Storage::Filesystem.new("public/cache")
69
63
  ```
70
64
 
71
- The downside of storing files on the filesystem is that it isn't distributed,
72
- so for resumable uploads to work you have to host the tus application on a
73
- single server.
65
+ One downside of filesystem storage is that by default it doesn't work if you
66
+ want to run tus-ruby-server on multiple servers; you'd have to set up a shared
67
+ filesystem between the servers. Another downside is that you have to make sure
68
+ your servers have enough disk space. Also, if you're using Heroku, you cannot
69
+ store files on the filesystem as they won't persist.
70
+
71
+ All these are reasons why you might store uploaded data on a different storage,
72
+ and luckily tus-ruby-server ships with two more storages.
74
73
 
75
- However, tus-ruby-server also ships with MongoDB [GridFS] storage, which among
76
- other things is convenient for a multi-server setup. It requires the [Mongo]
77
- gem:
74
+ ### MongoDB GridFS
75
+
76
+ MongoDB has a specification for storing and retrieving large files, called
77
+ "[GridFS]". Tus-ruby-server ships with `Tus::Storage::Gridfs` that you can
78
+ use, which uses the [Mongo] gem.
78
79
 
79
80
  ```rb
80
- gem "mongo"
81
+ gem "mongo", ">= 2.2.2", "< 3"
81
82
  ```
82
83
 
83
84
  ```rb
84
- require "tus/server"
85
85
  require "tus/storage/gridfs"
86
86
 
87
87
  client = Mongo::Client.new("mongodb://127.0.0.1:27017/mydb")
88
88
  Tus::Server.opts[:storage] = Tus::Storage::Gridfs.new(client: client)
89
89
  ```
90
90
 
91
- You can also write your own storage, you just need to implement the same
92
- public interface that `Tus::Storage::Filesystem` and `Tus::Storage::Gridfs` do.
91
+ The Gridfs specification requires that all chunks are of equal size, except the
92
+ last chunk. `Tus::Storage::Gridfs` will by default automatically make the
93
+ Gridfs chunk size equal to the size of the first uploaded chunk. This means
94
+ that all of the uploaded chunks need to be of equal size (except the last
95
+ chunk).
96
+
97
+ If you don't want the Gridfs chunk size to be equal to the size of the uploaded
98
+ chunks, you can hardcode the chunk size that will be used for all uploads.
99
+
100
+ ```rb
101
+ Tus::Storage::Gridfs.new(client: client, chunk_size: 256*1024) # 256 KB
102
+ ```
103
+
104
+ Just note that in this case the size of each uploaded chunk (except the last
105
+ one) needs to be a multiple of the `:chunk_size`.
106
+
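The multiple-of-`:chunk_size` rule above can be checked on the client before uploading. The following is only a minimal sketch; the helper name `valid_gridfs_chunks?` is hypothetical and is not part of tus-ruby-server.

```ruby
# Hypothetical helper, not part of tus-ruby-server: checks that every uploaded
# chunk except the last one is a positive multiple of the GridFS chunk size.
def valid_gridfs_chunks?(upload_chunk_sizes, gridfs_chunk_size)
  *leading, _last = upload_chunk_sizes # the last chunk may be any size
  leading.all? { |size| size > 0 && (size % gridfs_chunk_size).zero? }
end

gridfs_chunk_size = 256 * 1024 # 256 KB, matching the :chunk_size example

# 512 KB chunks are a multiple of 256 KB, 300 KB chunks are not
ok  = valid_gridfs_chunks?([512 * 1024, 512 * 1024, 100], gridfs_chunk_size)
bad = valid_gridfs_chunks?([300 * 1024, 300 * 1024, 100], gridfs_chunk_size)
```

Here `ok` is `true` and `bad` is `false`, so a client using 300 KB chunks would need either a different chunk size or a different `:chunk_size` on the storage.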
107
+ ### Amazon S3
108
+
109
+ Amazon S3 is probably one of the most popular services for storing files, and
110
+ tus-ruby-server ships with `Tus::Storage::S3` which utilizes S3's multipart API
111
+ to upload files, and depends on the [aws-sdk] gem.
112
+
113
+ ```rb
114
+ gem "aws-sdk", "~> 2.1"
115
+ ```
93
116
 
94
- ### Maximum size
117
+ ```rb
118
+ require "tus/storage/s3"
119
+
120
+ Tus::Server.opts[:storage] = Tus::Storage::S3.new(
121
+ access_key_id: "abc",
122
+ secret_access_key: "xyz",
123
+ region: "eu-west-1",
124
+ bucket: "my-app",
125
+ )
126
+ ```
127
+
128
+ It might seem at first that using a remote storage like Amazon S3 will slow
129
+ down the overall upload, but the time it takes for the client to upload the
130
+ file to the Rack app is in general *much* longer than the time for the server
131
+ to upload that chunk to S3, because of the differences in the Internet
132
+ connection speed between the user's computer and server.
133
+
134
+ One thing to note is that S3's multipart API requires each chunk except the last
135
+ one to be 5MB or larger, so that is the minimum chunk size that you can specify
136
+ on your tus client if you want to use the S3 storage.
137
+
138
+ If you want files to be stored in a certain subdirectory, you can specify
139
+ a `:prefix` in the storage configuration.
140
+
141
+ ```rb
142
+ Tus::Storage::S3.new(prefix: "tus", **options)
143
+ ```
144
+
145
+ You can also specify additional options that will be forwarded to
146
+ [`Aws::S3::Client#create_multipart_upload`] using `:upload_options`.
147
+
148
+ ```rb
149
+ Tus::Storage::S3.new(upload_options: {content_disposition: "attachment"}, **options)
150
+ ```
151
+
152
+ All other options will be forwarded to [`Aws::S3::Client#initialize`], so you
153
+ can for example change the `:endpoint` to use S3's accelerate host:
154
+
155
+ ```rb
156
+ Tus::Storage::S3.new(endpoint: "https://s3-accelerate.amazonaws.com", **options)
157
+ ```
158
+
159
+ ### Other storages
160
+
161
+ If none of these storages suit you, you can write your own; you just need to
162
+ implement the same public interface:
163
+
164
+ ```rb
165
+ def create_file(uid, info = {}) ... end
166
+ def concatenate(uid, part_uids, info = {}) ... end
167
+ def patch_file(uid, io, info = {}) ... end
168
+ def update_info(uid, info) ... end
169
+ def read_info(uid) ... end
170
+ def get_file(uid, info = {}, range: nil) ... end
171
+ def delete_file(uid, info = {}) ... end
172
+ def expire_files(expiration_date) ... end
173
+ ```
174
+
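To make the shape of that interface more concrete, here is a rough in-memory sketch. It is illustrative only: the class name `MemoryStorage`, its local `NotFound` error (a stand-in for `Tus::NotFound`), and all return values are assumptions, and the real storages may behave differently in details such as what `get_file` returns.

```ruby
require "stringio"
require "time"

# Hypothetical in-memory storage, for illustration only; data is lost on restart.
class MemoryStorage
  NotFound = Class.new(StandardError) # stand-in for Tus::NotFound

  def initialize
    @files = {} # uid => { content: String, info: Hash }
  end

  def create_file(uid, info = {})
    @files[uid] = { content: String.new, info: info }
  end

  # Joins the partial uploads into a new file and removes the parts.
  def concatenate(uid, part_uids, info = {})
    content = part_uids.map { |part_uid| fetch(part_uid)[:content] }.join
    @files[uid] = { content: content, info: info }
    part_uids.each { |part_uid| @files.delete(part_uid) }
    content.bytesize
  end

  # Appends the request body to the file and returns the number of bytes written.
  def patch_file(uid, io, info = {})
    data = io.read.to_s.b
    fetch(uid)[:content] << data
    data.bytesize
  end

  def update_info(uid, info)
    fetch(uid)[:info] = info
  end

  def read_info(uid)
    fetch(uid)[:info]
  end

  # Returns an IO over the whole content, or over the given byte range.
  def get_file(uid, info = {}, range: nil)
    content = fetch(uid)[:content]
    StringIO.new(range ? content.byteslice(range) : content)
  end

  def delete_file(uid, info = {})
    @files.delete(uid)
  end

  # Deletes files whose "Upload-Expires" value is at or before the given time.
  def expire_files(expiration_date)
    @files.delete_if do |_uid, file|
      expires = file[:info]["Upload-Expires"]
      expires && Time.parse(expires) <= expiration_date
    end
  end

  private

  def fetch(uid)
    @files.fetch(uid) { raise NotFound, "file #{uid} not found" }
  end
end
```

A storage like this would only be suitable for experiments or tests, since nothing is persisted and a multi-process server would not share the hash.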
175
+ ## Maximum size
95
176
 
96
177
  By default the maximum size for an uploaded file is 1GB, but you can change
97
178
  that:
98
179
 
99
180
  ```rb
100
- require "tus/server"
101
-
102
181
  Tus::Server.opts[:max_size] = 5 * 1024*1024*1024 # 5GB
103
- # or
104
182
  Tus::Server.opts[:max_size] = nil # no limit
105
183
  ```
106
184
 
107
- ### Expiration
185
+ ## Expiration
108
186
 
109
- The expiration date is automatically set on each created file, and is refreshed
110
- on each PATCH request. By default the expiration date is 1 week from the last
111
- POST or PATCH request, and the interval of checking expired files is 1 hour,
112
- but this can be changed:
187
+ Tus-ruby-server automatically adds expiration dates to each uploaded file, and
188
+ refreshes this date on each PATCH request. By default files expire 7 days after
189
+ they were last updated, but you can modify `:expiration_time`:
113
190
 
114
191
  ```rb
115
- require "tus/server"
192
+ Tus::Server.opts[:expiration_time] = 2*24*60*60 # 2 days
193
+ ```
194
+
195
+ Tus-ruby-server won't automatically delete expired files, but each storage
196
+ knows how to expire old files, so you just have to set up a recurring task
197
+ that will call `#expire_files`.
198
+
199
+ ```rb
200
+ expiration_date = Time.now.utc - Tus::Server.opts[:expiration_time]
201
+ Tus::Server.opts[:storage].expire_files(expiration_date)
202
+ ```
203
+
204
+ ## Download
205
+
206
+ In addition to implementing the tus protocol, tus-ruby-server also comes with
207
+ an endpoint for downloading the uploaded file, streaming the file directly from
208
+ the storage.
116
209
 
117
- Tus::Server.opts[:expiration_time] = 14*24*60*60 # 2 weeks
118
- Tus::Server.opts[:expiration_interval] = 24*60*60 # 1 day
210
+ The endpoint will automatically use the following `Upload-Metadata` values if
211
+ they're available:
212
+
213
+ * `content_type` -- used in the `Content-Type` response header
214
+ * `filename` -- used in the `Content-Disposition` response header
215
+
216
+ The `Content-Disposition` header will be set to "inline" by default, but you
217
+ can change it to "attachment" if you want the browser to always force download:
218
+
219
+ ```rb
220
+ Tus::Server.opts[:disposition] = "attachment"
119
221
  ```
120
222
 
121
- ### Checksum
223
+ The download endpoint supports [Range requests], so you can use the tus
224
+ file URL as `src` in `<video>` and `<audio>` HTML tags.
225
+
226
+ ## Checksum
122
227
 
123
228
  The following checksum algorithms are supported for the `checksum` extension:
124
229
 
@@ -129,25 +234,64 @@ The following checksum algorithms are supported for the `checksum` extension:
129
234
  * MD5
130
235
  * CRC32
131
236
 
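As an illustration of what a client sends for the `checksum` extension, the sketch below builds an `Upload-Checksum` header value in the `<algorithm> <base64 digest>` form used by the tus protocol. The helper name `upload_checksum` is hypothetical, and only the `Digest`-based algorithms are covered (CRC32 would need `Zlib` instead).

```ruby
require "digest"
require "stringio"

# Hypothetical client-side helper: returns "<algorithm> <base64 digest>".
# The IO is read in 16 KB pieces so large files are not loaded into memory.
def upload_checksum(io, algorithm: "sha1")
  digest = Digest.const_get(algorithm.upcase).new
  while (chunk = io.read(16 * 1024))
    digest.update(chunk)
  end
  io.rewind # leave the IO ready for the actual upload
  "#{algorithm} #{digest.base64digest}"
end

header_value = upload_checksum(StringIO.new("hello world"))
```

The resulting string would be sent as the `Upload-Checksum` request header alongside the PATCH body it was computed from.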
132
- ## Limitations
237
+ ## Performance considerations
238
+
239
+ ### Buffering
240
+
241
+ When handling file uploads it's important not to be vulnerable to slow-write
242
+ clients. That means you need to make sure that your web/application server
243
+ buffers the request body locally before handing the request to the request
244
+ worker.
245
+
246
+ If the request body is not buffered and is read directly from the socket when
247
+ it has already reached your Rack application, your application throughput will
248
+ be severely impacted, because the workers will spend the majority of their time
249
+ waiting for the request body to be read from the socket, and in that time they
250
+ won't be able to serve new requests.
133
251
 
134
- Since tus-ruby-server is built using a Rack-based web framework (Roda), if a
135
- PATCH request gets interrupted, none of the received data will be stored. It's
136
- recommended to configure your client tus library not to use a single PATCH
137
- request for large files, but rather to split it into multiple chunks. You can
138
- do that for [tus-js-client] by specifying a maximum chunk size:
252
+ Puma will automatically buffer the whole request body in a Tempfile, before
253
+ forwarding the request to your Rack app. Unicorn and Passenger will not do that,
254
+ so it's highly recommended to put a frontend server like Nginx in front of
255
+ those web servers, and configure it to buffer the request body.
256
+
257
+ ### Chunking
258
+
259
+ The tus protocol specifies:
260
+
261
+ > The Server SHOULD always attempt to store as much of the received data as possible.
262
+
263
+ The tus-ruby-server Rack application supports saving partial data if the
264
+ PATCH request gets interrupted before all data has been sent, but I'm not aware
265
+ of any Rack-compliant web server that will forward interrupted requests to the
266
+ Rack app.
267
+
268
+ This means that for resumable upload to be possible with tus-ruby-server in
269
+ general, the file must be uploaded in multiple chunks; the client shouldn't
270
+ rely on the server storing any data if the PATCH request is interrupted.
139
271
 
140
272
  ```js
273
+ // using tus-js-client
141
274
  new tus.Upload(file, {
142
275
  endpoint: "http://localhost:9292/files",
143
- chunkSize: 15*1024*1024, // 15 MB
276
+ chunkSize: 5*1024*1024, // required option
144
277
  // ...
145
278
  })
146
279
  ```
147
280
 
148
- Tus-server also currently doesn't support the `checksum-trailer` extension,
149
- which would allow sending the checksum header *after* the data has been sent,
150
- using [trailing headers].
281
+ ### Downloading
282
+
283
+ Tus-ruby-server has a download endpoint which streams the uploaded file to the
284
+ client. Unfortunately, with most classic web servers this endpoint will be
285
+ vulnerable to slow-read clients, because the worker is only done once the whole
286
+ response body has been received by the client. Web servers that are not
287
+ vulnerable to slow-read clients include [Goliath]/[Thin] ([EventMachine]) and
288
+ [Reel] ([Celluloid::IO]).
289
+
290
+ So, depending on your requirements, you might want to avoid displaying the
291
+ uploaded file in the browser (making the user download the file directly from
292
+ the tus server), until it has been moved to a permanent storage. You might also
293
+ want to consider copying finished uploads to permanent storage directly from
294
+ the underlying tus storage, instead of downloading it through the app.
151
295
 
152
296
  ## Inspiration
153
297
 
@@ -170,3 +314,12 @@ The tus-ruby-server was inspired by [rubytus].
170
314
  [Shrine]: https://github.com/janko-m/shrine
171
315
  [trailing headers]: https://tools.ietf.org/html/rfc7230#section-4.1.2
172
316
  [rubytus]: https://github.com/picocandy/rubytus
317
+ [aws-sdk]: https://github.com/aws/aws-sdk-ruby
318
+ [`Aws::S3::Client#initialize`]: http://docs.aws.amazon.com/sdkforruby/api/Aws/S3/Client.html#initialize-instance_method
319
+ [`Aws::S3::Client#create_multipart_upload`]: http://docs.aws.amazon.com/sdkforruby/api/Aws/S3/Client.html#create_multipart_upload-instance_method
320
+ [Range requests]: https://tools.ietf.org/html/rfc7233
321
+ [EventMachine]: https://github.com/eventmachine/eventmachine
322
+ [Reel]: https://github.com/celluloid/reel
323
+ [Goliath]: https://github.com/postrank-labs/goliath
324
+ [Thin]: https://github.com/macournoyer/thin
325
+ [Celluloid::IO]: https://github.com/celluloid/celluloid-io
data/lib/tus/checksum.rb CHANGED
@@ -6,43 +6,56 @@ module Tus
6
6
  class Checksum
7
7
  attr_reader :algorithm
8
8
 
9
+ def self.generate(algorithm, input)
10
+ new(algorithm).generate(input)
11
+ end
12
+
9
13
  def initialize(algorithm)
10
14
  @algorithm = algorithm
11
15
  end
12
16
 
13
- def match?(checksum, content)
14
- checksum = Base64.decode64(checksum)
15
- generate(content) == checksum
17
+ def match?(checksum, io)
18
+ generate(io) == checksum
16
19
  end
17
20
 
18
- def generate(content)
19
- send("generate_#{algorithm}", content)
21
+ def generate(io)
22
+ hash = send("generate_#{algorithm}", io)
23
+ io.rewind
24
+ hash
20
25
  end
21
26
 
22
27
  private
23
28
 
24
- def generate_sha1(content)
25
- Digest::SHA1.hexdigest(content)
29
+ def generate_sha1(io)
30
+ digest(:SHA1, io)
31
+ end
32
+
33
+ def generate_sha256(io)
34
+ digest(:SHA256, io)
26
35
  end
27
36
 
28
- def generate_sha256(content)
29
- Digest::SHA256.hexdigest(content)
37
+ def generate_sha384(io)
38
+ digest(:SHA384, io)
30
39
  end
31
40
 
32
- def generate_sha384(content)
33
- Digest::SHA384.hexdigest(content)
41
+ def generate_sha512(io)
42
+ digest(:SHA512, io)
34
43
  end
35
44
 
36
- def generate_sha512(content)
37
- Digest::SHA512.hexdigest(content)
45
+ def generate_md5(io)
46
+ digest(:MD5, io)
38
47
  end
39
48
 
40
- def generate_md5(content)
41
- Digest::MD5.hexdigest(content)
49
+ def generate_crc32(io)
50
+ crc = nil
51
+ crc = Zlib.crc32(io.read(16*1024, buffer ||= "").to_s, crc) until io.eof?
52
+ Base64.encode64(crc.to_s)
42
53
  end
43
54
 
44
- def generate_crc32(content)
45
- Zlib.crc32(content).to_s
55
+ def digest(name, io)
56
+ digest = Digest.const_get(name).new
57
+ digest.update(io.read(16*1024, buffer ||= "").to_s) until io.eof?
58
+ digest.base64digest
46
59
  end
47
60
  end
48
61
  end
data/lib/tus/errors.rb ADDED
@@ -0,0 +1,4 @@
1
+ module Tus
2
+ Error = Class.new(StandardError)
3
+ NotFound = Class.new(Error)
4
+ end
data/lib/tus/info.rb CHANGED
@@ -3,6 +3,15 @@ require "time"
3
3
 
4
4
  module Tus
5
5
  class Info
6
+ HEADERS = %w[
7
+ Upload-Length
8
+ Upload-Offset
9
+ Upload-Defer-Length
10
+ Upload-Metadata
11
+ Upload-Concat
12
+ Upload-Expires
13
+ ]
14
+
6
15
  def initialize(hash)
7
16
  @hash = hash
8
17
  end
@@ -16,11 +25,15 @@ module Tus
16
25
  end
17
26
 
18
27
  def to_h
19
- @hash.reject { |key, value| value.nil? }
28
+ @hash
29
+ end
30
+
31
+ def headers
32
+ @hash.select { |key, value| HEADERS.include?(key) && !value.nil? }
20
33
  end
21
34
 
22
35
  def length
23
- Integer(@hash["Upload-Length"])
36
+ Integer(@hash["Upload-Length"]) if @hash["Upload-Length"]
24
37
  end
25
38
 
26
39
  def offset
@@ -35,7 +48,7 @@ module Tus
35
48
  Time.parse(@hash["Upload-Expires"])
36
49
  end
37
50
 
38
- def final_upload?
51
+ def concatenation?
39
52
  @hash["Upload-Concat"].to_s.start_with?("final")
40
53
  end
41
54