tus-server 0.9.1 → 0.10.0

checksums.yaml CHANGED
@@ -1,7 +1,7 @@
  ---
  SHA1:
-   metadata.gz: 9a311eaa9b3d1a3081b3d0c2ade38825b8f392c9
-   data.tar.gz: 8bce1af5d6da40478f2f7fda210f38f43c65f679
+   metadata.gz: 4970687c1cfd81a01844c3626f4ee8e879410772
+   data.tar.gz: 017e147d7dbf79a40fb9ab1ada4a9c87b4490f81
  SHA512:
-   metadata.gz: de7a065c14ed3b11b3ea50c67738d720bdca114af6c0ae6ef5692bad6bdb9389e761df105f037ada860518627d46448e3bbe76fef78ed7043b6f9af4031c46f1
-   data.tar.gz: 6fffa14a3bff247d01002e3a457521983d6d60e063235e500a5f7ad5965e685be3493e75d783affe3871a6d8fc47500253a1d7127117e60bc04a0fe45c299ad3
+   metadata.gz: 7b937ef67a26f46eeaa45e241bf593684ef99420d8ff41007bcf734130b891733f71a096c61a00c02bc805fbcf4aa1932260f4c85f5c4ac09fb097a8f53d4ef7
+   data.tar.gz: 24ee3bc3aab8dc3b4bf427dc5a65477efdf8b2dbb03edefeb8f23ee77ee0adf31c5e3d022b26223a36c0d2f69125177c958cae712a6f18feff7cde5e6a76d51d
data/README.md CHANGED
@@ -62,8 +62,16 @@ require "tus/storage/filesystem"
  Tus::Server.opts[:storage] = Tus::Storage::Filesystem.new("public/cache")
  ```
 
- One downside of filesystem storage is that by default it doesn't work if you
- want to run tus-ruby-servers on multiple servers, you'd have to set up a shared
+ If the configured directory doesn't exist, it will automatically be created.
+ By default the UNIX permissions applied will be 0644 for files and 0755 for
+ directories, but you can set different permissions:
+
+ ```rb
+ Tus::Storage::Filesystem.new("data", permissions: 0600, directory_permissions: 0777)
+ ```
+
+ One downside of filesystem storage is that it doesn't work by default if you
+ want to run tus-ruby-server on multiple servers; you'd have to set up a shared
  filesystem between the servers. Another downside is that you have to make sure
  your servers have enough disk space. Also, if you're using Heroku, you cannot
  store files on the filesystem as they won't persist.
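The `permissions:` values above are octal UNIX modes. As an illustrative sketch (not part of the diff, assuming a POSIX filesystem), this is what applying such a mode to a freshly created file looks like with plain Ruby stdlib:

```ruby
require "pathname"
require "tmpdir"

dir = Pathname(Dir.mktmpdir)
file = dir.join("upload.file")

# Create an empty file, then apply the octal mode a hypothetical
# Tus::Storage::Filesystem.new("data", permissions: 0600) would use.
file.binwrite("")
file.chmod(0600)

# chmod is not affected by the process umask, so the mode sticks.
puts format("%o", file.stat.mode & 0777)
```

Octal `0600` means read/write for the owner and no access for group or others, which is why it is a common choice for upload caches.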
@@ -88,21 +96,17 @@ client = Mongo::Client.new("mongodb://127.0.0.1:27017/mydb")
  Tus::Server.opts[:storage] = Tus::Storage::Gridfs.new(client: client)
  ```
 
- The Gridfs specification requires that all chunks are of equal size, except the
- last chunk. `Tus::Storage::Gridfs` will by default automatically make the
- Gridfs chunk size equal to the size of the first uploaded chunk. This means
- that all of the uploaded chunks need to be of equal size (except the last
- chunk).
-
- If you don't want the Gridfs chunk size to be equal to the size of the uploaded
- chunks, you can hardcode the chunk size that will be used for all uploads.
+ By default MongoDB Gridfs stores files in chunks of 256KB, but you can change
+ that with the `:chunk_size` option:
 
  ```rb
- Tus::Storage::Gridfs.new(client: client, chunk_size: 256*1024) # 256 KB
+ Tus::Storage::Gridfs.new(client: client, chunk_size: 1*1024*1024) # 1 MB
  ```
 
- Just note that in this case the size of each uploaded chunk (except the last
- one) needs to be a multiple of the `:chunk_size`.
+ Note that if you're using the [concatenation] tus feature with Gridfs, all
+ partial uploads except the last one are required to fill in their Gridfs
+ chunks, meaning the length of each partial upload needs to be a multiple of
+ the `:chunk_size` number.
 
  ### Amazon S3
 
@@ -125,15 +129,9 @@ Tus::Server.opts[:storage] = Tus::Storage::S3.new(
  )
  ```
 
- It might seem at first that using a remote storage like Amazon S3 will slow
- down the overall upload, but the time it takes for the client to upload the
- file to the Rack app is in general *much* longer than the time for the server
- to upload that chunk to S3, because of the differences in the Internet
- connection speed between the user's computer and server.
-
- One thing to note is that S3's multipart API requires each chunk except the last
- one to be 5MB or larger, so that is the minimum chunk size that you can specify
- on your tus client if you want to use the S3 storage.
+ One thing to note is that S3's multipart API requires each chunk except the
+ last to be **5MB or larger**, so that is the minimum chunk size that you can
+ specify on your tus client if you want to use the S3 storage.
 
  If you want files to be stored in a certain subdirectory, you can specify
  a `:prefix` in the storage configuration.
@@ -174,12 +172,11 @@ def expire_files(expiration_date) ... end
 
  ## Maximum size
 
- By default the maximum size for an uploaded file is 1GB, but you can change
- that:
+ By default the size of files the tus server will accept is unlimited, but you
+ can configure the maximum file size:
 
  ```rb
  Tus::Server.opts[:max_size] = 5 * 1024*1024*1024 # 5GB
- Tus::Server.opts[:max_size] = nil # no limit
  ```
 
  ## Expiration
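The new unlimited default pairs with the `elsif max_size` change in `server.rb`: when `opts[:max_size]` is `nil` and the upload declared no length, the size check is skipped entirely. A sketch of that guard with simplified names (not the server's actual method):

```ruby
# Returns whether a chunk of content_length bytes may be appended at
# current_offset, given the configured max_size (nil means unlimited).
def chunk_allowed?(content_length, current_offset, max_size)
  return true if max_size.nil? # unlimited, the new default
  content_length + current_offset <= max_size
end

puts chunk_allowed?(100, 0, nil)      # no limit configured
puts chunk_allowed?(100, 950, 1000)   # would surpass Tus-Max-Size
```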
data/lib/tus/input.rb CHANGED
@@ -21,7 +21,11 @@ module Tus
      end
 
      def size
-       @input.size
+       if defined?(Rack::Lint) && @input.is_a?(Rack::Lint::InputWrapper)
+         @input.instance_variable_get("@input").size
+       else
+         @input.size
+       end
      end
 
     def close
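The change above reaches through `Rack::Lint::InputWrapper` because the wrapper only exposes the methods the Rack spec requires, not the underlying IO's `#size`. A toy stand-in (not Rack's real classes) demonstrating the unwrapping technique:

```ruby
require "stringio"

# Plays the role of Rack::Lint::InputWrapper: delegates reads but
# deliberately does not expose the wrapped IO's #size.
class Wrapper
  def initialize(input)
    @input = input
  end

  def read(*args)
    @input.read(*args)
  end
end

wrapped = Wrapper.new(StringIO.new("hello"))

puts wrapped.respond_to?(:size)                    # the wrapper hides #size
puts wrapped.instance_variable_get("@input").size  # reaching inside recovers it
```

Reading an instance variable by name is fragile (it depends on Rack's internals not renaming `@input`), which is why the diff also guards with `defined?(Rack::Lint)` and an `is_a?` check.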
data/lib/tus/server.rb CHANGED
@@ -22,11 +22,12 @@ module Tus
      SUPPORTED_CHECKSUM_ALGORITHMS = %w[sha1 sha256 sha384 sha512 md5 crc32]
      RESUMABLE_CONTENT_TYPE = "application/offset+octet-stream"
 
-     opts[:max_size] = 1024*1024*1024
+     opts[:max_size] = nil
      opts[:expiration_time] = 7*24*60*60
      opts[:disposition] = "inline"
 
      plugin :all_verbs
+     plugin :default_headers, {"Content-Type" => ""}
      plugin :delete_empty_headers
      plugin :request_headers
      plugin :not_allowed
@@ -130,6 +131,10 @@ module Tus
          info["Upload-Offset"] = (info.offset + input.size).to_s
          info["Upload-Expires"] = (Time.now + expiration_time).httpdate
 
+         if info.offset == info.length # last chunk
+           storage.finalize_file(uid, info.to_h) if storage.respond_to?(:finalize_file)
+         end
+
          storage.update_info(uid, info.to_h)
          response.headers.update(info.headers)
@@ -144,8 +149,8 @@ module Tus
 
          metadata = info.metadata
          response.headers["Content-Disposition"] = opts[:disposition]
-         response.headers["Content-Disposition"] << "; filename=\"#{metadata["filename"]}\"" if metadata["filename"]
-         response.headers["Content-Type"] = metadata["content_type"] if metadata["content_type"]
+         response.headers["Content-Disposition"] += "; filename=\"#{metadata["filename"]}\"" if metadata["filename"]
+         response.headers["Content-Type"] = metadata["content_type"] || "application/octet-stream"
 
          response = storage.get_file(uid, info.to_h, range: range)
 
@@ -208,7 +213,7 @@ module Tus
        if length
          error!(403, "Cannot modify completed upload") if current_offset == length
          error!(413, "Size of this chunk surpasses Upload-Length") if Integer(request.content_length) + current_offset > length
-       else
+       elsif max_size
          error!(413, "Size of this chunk surpasses Tus-Max-Size") if Integer(request.content_length) + current_offset > max_size
        end
      end
@@ -314,6 +319,7 @@ module Tus
      def error!(status, message)
        response.status = status
        response.write(message) unless request.head?
+       response.headers["Content-Type"] = "text/plain"
        request.halt
      end
 
data/lib/tus/storage/filesystem.rb CHANGED
@@ -9,76 +9,91 @@ module Tus
      class Filesystem
        attr_reader :directory
 
-       def initialize(directory)
-         @directory = Pathname(directory)
+       def initialize(directory, permissions: 0644, directory_permissions: 0755)
+         @directory = Pathname(directory)
+         @permissions = permissions
+         @directory_permissions = directory_permissions
 
          create_directory! unless @directory.exist?
        end
 
        def create_file(uid, info = {})
-         file_path(uid).open("wb") { |file| file.write("") }
+         file_path(uid).binwrite("")
+         file_path(uid).chmod(@permissions)
+
+         info_path(uid).binwrite("{}")
+         info_path(uid).chmod(@permissions)
        end
 
        def concatenate(uid, part_uids, info = {})
+         create_file(uid, info)
+
          file_path(uid).open("wb") do |file|
-           begin
-             part_uids.each do |part_uid|
+           part_uids.each do |part_uid|
+             # Rather than checking upfront whether all parts exist, we use
+             # exception flow to account for the possibility of parts being
+             # deleted during concatenation.
+             begin
                IO.copy_stream(file_path(part_uid), file)
+             rescue Errno::ENOENT
+               raise Tus::Error, "some parts for concatenation are missing"
              end
-           rescue Errno::ENOENT
-             raise Tus::Error, "some parts for concatenation are missing"
            end
          end
 
+         # Delete parts after concatenation.
          delete(part_uids)
 
-         # server requires us to return the size of the concatenated file
+         # Tus server requires us to return the size of the concatenated file.
          file_path(uid).size
        end
 
-       def patch_file(uid, io, info = {})
-         raise Tus::NotFound if !file_path(uid).exist?
+       def patch_file(uid, input, info = {})
+         exists!(uid)
 
-         file_path(uid).open("ab") { |file| IO.copy_stream(io, file) }
+         file_path(uid).open("ab") { |file| IO.copy_stream(input, file) }
        end
 
        def read_info(uid)
-         raise Tus::NotFound if !file_path(uid).exist?
+         exists!(uid)
 
-         begin
-           data = info_path(uid).binread
-         rescue Errno::ENOENT
-           data = "{}"
-         end
-
-         JSON.parse(data)
+         JSON.parse(info_path(uid).binread)
        end
 
        def update_info(uid, info)
-         info_path(uid).open("wb") { |file| file.write(info.to_json) }
+         exists!(uid)
+
+         info_path(uid).binwrite(JSON.generate(info))
        end
 
        def get_file(uid, info = {}, range: nil)
-         raise Tus::NotFound if !file_path(uid).exist?
+         exists!(uid)
 
          file = file_path(uid).open("rb")
-         range ||= 0..file.size-1
+         range ||= 0..(file.size - 1)
+         length = range.end - range.begin + 1
 
+         # Create an Enumerator which will yield chunks of the requested file
+         # content, allowing tus server to efficiently stream requested content
+         # to the client.
          chunks = Enumerator.new do |yielder|
            file.seek(range.begin)
-           remaining_length = range.end - range.begin + 1
-           buffer = ""
+           remaining_length = length
 
            while remaining_length > 0
-             chunk = file.read([16*1024, remaining_length].min, buffer)
-             break unless chunk
-             remaining_length -= chunk.length
-
+             chunk = file.read([16*1024, remaining_length].min, buffer ||= "") or break
+             remaining_length -= chunk.bytesize
              yielder << chunk
            end
          end
 
-         Response.new(chunks: chunks, close: ->{file.close})
+         # We return a response object that responds to #each, #length and #close,
+         # which the tus server can return directly as the Rack response.
+         Response.new(
+           chunks: chunks,
+           length: length,
+           close: ->{file.close},
+         )
        end
 
        def delete_file(uid, info = {})
@@ -86,11 +101,9 @@ module Tus
        end
 
        def expire_files(expiration_date)
-         uids = []
-
-         Pathname.glob(directory.join("*.file")).each do |pathname|
-           uids << pathname.basename(".*") if pathname.mtime <= expiration_date
-         end
+         uids = directory.children
+           .select { |pathname| pathname.mtime <= expiration_date }
+           .map { |pathname| pathname.basename(".*").to_s }
 
          delete(uids)
        end
@@ -99,9 +112,14 @@ module Tus
 
        def delete(uids)
          paths = uids.flat_map { |uid| [file_path(uid), info_path(uid)] }
+
          FileUtils.rm_f paths
        end
 
+       def exists!(uid)
+         raise Tus::NotFound if !file_path(uid).exist?
+       end
+
        def file_path(uid)
          directory.join("#{uid}.file")
        end
@@ -112,13 +130,18 @@ module Tus
 
        def create_directory!
          directory.mkpath
-         directory.chmod(0755)
+         directory.chmod(@directory_permissions)
        end
 
        class Response
-         def initialize(chunks:, close:)
+         def initialize(chunks:, close:, length:)
            @chunks = chunks
            @close = close
+           @length = length
+         end
+
+         def length
+           @length
          end
 
          def each(&block)
data/lib/tus/storage/gridfs.rb CHANGED
@@ -10,7 +10,7 @@ module Tus
      class Gridfs
        attr_reader :client, :prefix, :bucket, :chunk_size
 
-       def initialize(client:, prefix: "fs", chunk_size: nil)
+       def initialize(client:, prefix: "fs", chunk_size: 256*1024)
          @client = client
          @prefix = prefix
          @bucket = @client.database.fs(bucket_name: @prefix)
@@ -19,143 +19,115 @@ module Tus
        end
 
        def create_file(uid, info = {})
-         tus_info = Tus::Info.new(info)
-         content_type = tus_info.metadata["content_type"]
+         content_type = Tus::Info.new(info).metadata["content_type"]
 
-         file = Mongo::Grid::File.new("",
+         create_grid_file(
            filename: uid,
-           metadata: {},
-           chunk_size: chunk_size,
            content_type: content_type,
          )
-
-         bucket.insert_one(file)
        end
 
        def concatenate(uid, part_uids, info = {})
-         file_infos = bucket.files_collection.find(filename: {"$in" => part_uids}).to_a
-         file_infos.sort_by! { |file_info| part_uids.index(file_info[:filename]) }
-
-         if file_infos.count != part_uids.count
-           raise Tus::Error, "some parts for concatenation are missing"
-         end
-
-         chunk_sizes = file_infos.map { |file_info| file_info[:chunkSize] }
-         if chunk_sizes[0..-2].uniq.count > 1
-           raise Tus::Error, "some parts have different chunk sizes, so they cannot be concatenated"
-         end
+         grid_infos = files_collection.find(filename: {"$in" => part_uids}).to_a
+         grid_infos.sort_by! { |grid_info| part_uids.index(grid_info[:filename]) }
 
-         if chunk_sizes.uniq != [chunk_sizes.last] && bucket.chunks_collection.find(files_id: file_infos.last[:_id]).count > 1
-           raise Tus::Error, "last part has different chunk size and is composed of more than one chunk"
-         end
+         validate_parts!(grid_infos, part_uids)
 
-         length = file_infos.inject(0) { |sum, file_info| sum + file_info[:length] }
-         chunk_size = file_infos.first[:chunkSize]
-         tus_info = Tus::Info.new(info)
-         content_type = tus_info.metadata["content_type"]
+         length = grid_infos.map { |doc| doc[:length] }.reduce(0, :+)
+         content_type = Tus::Info.new(info).metadata["content_type"]
 
-         file = Mongo::Grid::File.new("",
+         grid_file = create_grid_file(
            filename: uid,
-           metadata: {},
-           chunk_size: chunk_size,
            length: length,
            content_type: content_type,
          )
 
-         bucket.insert_one(file)
-
-         file_infos.inject(0) do |offset, file_info|
-           result = bucket.chunks_collection
-             .find(files_id: file_info[:_id])
-             .update_many("$set" => {files_id: file.id}, "$inc" => {n: offset})
+         # Update the chunks belonging to parts so that they point to the new file.
+         grid_infos.inject(0) do |offset, grid_info|
+           result = chunks_collection
+             .find(files_id: grid_info[:_id])
+             .update_many("$set" => {files_id: grid_file.id}, "$inc" => {n: offset})
 
            offset += result.modified_count
          end
 
-         bucket.files_collection.delete_many(filename: {"$in" => part_uids})
+         # Delete the parts after concatenation.
+         files_collection.delete_many(filename: {"$in" => part_uids})
 
-         # server requires us to return the size of the concatenated file
+         # Tus server requires us to return the size of the concatenated file.
          length
        end
 
-       def patch_file(uid, io, info = {})
-         file_info = bucket.files_collection.find(filename: uid).first
-         raise Tus::NotFound if file_info.nil?
-
-         file_info[:md5] = Digest::MD5.new # hack for `Chunk.split` updating MD5
-         file_info[:chunkSize] ||= io.size
-         file_info = Mongo::Grid::File::Info.new(Mongo::Options::Mapper.transform(file_info, Mongo::Grid::File::Info::MAPPINGS.invert))
-
-         tus_info = Tus::Info.new(info)
-         last_chunk = (tus_info.length && io.size == tus_info.remaining_length)
-
-         if io.size % file_info.chunk_size != 0 && !last_chunk
-           raise Tus::Error,
-             "Input has length #{io.size} but expected it to be a multiple of " \
-             "chunk size #{file_info.chunk_size} or for it to be the last chunk"
-         end
+       def patch_file(uid, input, info = {})
+         grid_info = find_grid_info!(uid)
 
-         offset = bucket.chunks_collection.find(files_id: file_info.id).count
-         chunks = Mongo::Grid::File::Chunk.split(io, file_info, offset)
+         patch_last_chunk(input, grid_info)
 
-         bucket.chunks_collection.insert_many(chunks)
-         chunks.each { |chunk| chunk.data.data.clear } # deallocate strings
+         grid_chunks = split_into_grid_chunks(input, grid_info)
+         chunks_collection.insert_many(grid_chunks)
+         grid_chunks.each { |grid_chunk| grid_chunk.data.data.clear } # deallocate strings
 
-         bucket.files_collection.find(filename: uid).update_one("$set" => {
-           length: file_info.length + io.size,
+         # Update the total length and refresh the upload date on each update,
+         # which are used in #get_file, #concatenate and #expire_files.
+         files_collection.find(filename: uid).update_one("$set" => {
+           length: grid_info[:length] + input.size,
            uploadDate: Time.now.utc,
-           chunkSize: file_info.chunk_size,
          })
        end
 
        def read_info(uid)
-         file_info = bucket.files_collection.find(filename: uid).first
-         raise Tus::NotFound if file_info.nil?
+         grid_info = find_grid_info!(uid)
 
-         file_info.fetch("metadata")
+         grid_info[:metadata]
        end
 
        def update_info(uid, info)
-         bucket.files_collection.find(filename: uid)
-           .update_one("$set" => {metadata: info})
+         grid_info = find_grid_info!(uid)
+
+         files_collection.update_one({filename: uid}, {"$set" => {metadata: info}})
        end
 
        def get_file(uid, info = {}, range: nil)
-         file_info = bucket.files_collection.find(filename: uid).first
-         raise Tus::NotFound if file_info.nil?
+         grid_info = find_grid_info!(uid)
 
-         filter = {files_id: file_info[:_id]}
+         range ||= 0..(grid_info[:length] - 1)
+         length = range.end - range.begin + 1
 
-         if range
-           chunk_start = range.begin / file_info[:chunkSize] if range.begin
-           chunk_stop = range.end / file_info[:chunkSize] if range.end
+         chunk_start = range.begin / grid_info[:chunkSize]
+         chunk_stop = range.end / grid_info[:chunkSize]
 
-           filter[:n] = {}
-           filter[:n].update("$gte" => chunk_start) if chunk_start
-           filter[:n].update("$lte" => chunk_stop) if chunk_stop
-         end
+         filter = {
+           files_id: grid_info[:_id],
+           n: {"$gte" => chunk_start, "$lte" => chunk_stop}
+         }
 
-         chunks_view = bucket.chunks_collection.find(filter).read(bucket.read_preference).sort(n: 1)
+         # Query only the subset of chunks specified by the range query. We
+         # cannot use Mongo::FsBucket#open_download_stream here because it
+         # doesn't support changing the filter.
+         chunks_view = chunks_collection.find(filter).sort(n: 1)
 
+         # Create an Enumerator which will yield chunks of the requested file
+         # content, allowing tus server to efficiently stream requested content
+         # to the client.
          chunks = Enumerator.new do |yielder|
            chunks_view.each do |document|
              data = document[:data].data
 
              if document[:n] == chunk_start && document[:n] == chunk_stop
-               byte_start = range.begin % file_info[:chunkSize]
-               byte_stop = range.end % file_info[:chunkSize]
+               byte_start = range.begin % grid_info[:chunkSize]
+               byte_stop = range.end % grid_info[:chunkSize]
              elsif document[:n] == chunk_start
-               byte_start = range.begin % file_info[:chunkSize]
-               byte_stop = file_info[:chunkSize] - 1
+               byte_start = range.begin % grid_info[:chunkSize]
+               byte_stop = grid_info[:chunkSize] - 1
              elsif document[:n] == chunk_stop
                byte_start = 0
-               byte_stop = range.end % file_info[:chunkSize]
+               byte_stop = range.end % grid_info[:chunkSize]
              end
 
+             # If we're on the first or last chunk, return a subset of the chunk
+             # specified by the given range, otherwise return the full chunk.
              if byte_start && byte_stop
-               partial_data = data[byte_start..byte_stop]
-               yielder << partial_data
-               partial_data.clear # deallocate chunk string
+               yielder << data[byte_start..byte_stop]
              else
                yielder << data
@@ -164,26 +136,100 @@ module Tus
            end
          end
 
-         Response.new(chunks: chunks, close: ->{chunks_view.close_query})
+         # We return a response object that responds to #each, #length and #close,
+         # which the tus server can return directly as the Rack response.
+         Response.new(
+           chunks: chunks,
+           length: length,
+           close: ->{chunks_view.close_query},
+         )
        end
 
        def delete_file(uid, info = {})
-         file_info = bucket.files_collection.find(filename: uid).first
-         bucket.delete(file_info.fetch("_id")) if file_info
+         grid_info = files_collection.find(filename: uid).first
+         bucket.delete(grid_info[:_id]) if grid_info
        end
 
        def expire_files(expiration_date)
-         file_infos = bucket.files_collection.find(uploadDate: {"$lte" => expiration_date}).to_a
-         file_info_ids = file_infos.map { |info| info[:_id] }
+         grid_infos = files_collection.find(uploadDate: {"$lte" => expiration_date}).to_a
+         grid_info_ids = grid_infos.map { |info| info[:_id] }
+
+         files_collection.delete_many(_id: {"$in" => grid_info_ids})
+         chunks_collection.delete_many(files_id: {"$in" => grid_info_ids})
+       end
+
+       private
+
+       def create_grid_file(**options)
+         file_options = {metadata: {}, chunk_size: chunk_size}.merge(options)
+         grid_file = Mongo::Grid::File.new("", file_options)
+
+         bucket.insert_one(grid_file)
+
+         grid_file
+       end
+
+       def split_into_grid_chunks(io, grid_info)
+         grid_info[:md5] = Digest::MD5.new # hack for `Chunk.split` updating MD5
+         grid_info = Mongo::Grid::File::Info.new(Mongo::Options::Mapper.transform(grid_info, Mongo::Grid::File::Info::MAPPINGS.invert))
+         offset = chunks_collection.count(files_id: grid_info.id)
 
-         bucket.files_collection.delete_many(_id: {"$in" => file_info_ids})
-         bucket.chunks_collection.delete_many(files_id: {"$in" => file_info_ids})
+         Mongo::Grid::File::Chunk.split(io, grid_info, offset)
+       end
+
+       def patch_last_chunk(input, grid_info)
+         if grid_info[:length] % grid_info[:chunkSize] != 0
+           last_chunk = chunks_collection.find(files_id: grid_info[:_id]).sort(n: -1).limit(1).first
+           data = last_chunk[:data].data
+           data << input.read(grid_info[:chunkSize] - data.length)
+
+           chunks_collection.find(files_id: grid_info[:_id], n: last_chunk[:n])
+             .update_one("$set" => {data: BSON::Binary.new(data)})
+
+           data.clear # deallocate string
+         end
+       end
+
+       def find_grid_info!(uid)
+         files_collection.find(filename: uid).first or raise Tus::NotFound
+       end
+
+       def validate_parts!(grid_infos, part_uids)
+         validate_parts_presence!(grid_infos, part_uids)
+         validate_parts_full_chunks!(grid_infos)
+       end
+
+       def validate_parts_presence!(grid_infos, part_uids)
+         if grid_infos.count != part_uids.count
+           raise Tus::Error, "some parts for concatenation are missing"
+         end
+       end
+
+       def validate_parts_full_chunks!(grid_infos)
+         grid_infos.each do |grid_info|
+           if grid_info[:length] % grid_info[:chunkSize] != 0 && grid_info != grid_infos.last
+             raise Tus::Error, "cannot concatenate parts which aren't evenly distributed across chunks"
+           end
+         end
+       end
+
+       def files_collection
+         bucket.files_collection
+       end
+
+       def chunks_collection
+         bucket.chunks_collection
        end
 
        class Response
-         def initialize(chunks:, close:)
+         def initialize(chunks:, close:, length:)
            @chunks = chunks
            @close = close
+           @length = length
+         end
+
+         def length
+           @length
          end
 
          def each(&block)
data/lib/tus/storage/s3.rb CHANGED
@@ -16,13 +16,14 @@ module Tus
 
        attr_reader :client, :bucket, :prefix, :upload_options
 
-       def initialize(bucket:, prefix: nil, upload_options: {}, **client_options)
+       def initialize(bucket:, prefix: nil, upload_options: {}, thread_count: 10, **client_options)
          resource = Aws::S3::Resource.new(**client_options)
 
-         @client = resource.client
-         @bucket = resource.bucket(bucket)
-         @prefix = prefix
+         @client         = resource.client
+         @bucket         = resource.bucket(bucket)
+         @prefix         = prefix
          @upload_options = upload_options
+         @thread_count   = thread_count
        end
 
        def create_file(uid, info = {})
@@ -40,73 +41,33 @@ module Tus
 
          info["multipart_id"] = multipart_upload.id
          info["multipart_parts"] = []
+
+         multipart_upload
        end
 
        def concatenate(uid, part_uids, info = {})
-         create_file(uid, info)
-
-         multipart_upload = object(uid).multipart_upload(info["multipart_id"])
-
-         queue = Queue.new
-         part_uids.each_with_index do |part_uid, idx|
-           queue << {
-             copy_source: [bucket.name, object(part_uid).key].join("/"),
-             part_number: idx + 1
-           }
-         end
-
-         threads = 10.times.map do
-           Thread.new do
-             Thread.current.abort_on_exception = true
-             completed = []
-
-             begin
-               loop do
-                 multipart_copy_task = queue.deq(true) rescue break
+         multipart_upload = create_file(uid, info)
 
-                 part_number = multipart_copy_task[:part_number]
-                 copy_source = multipart_copy_task[:copy_source]
+         objects = part_uids.map { |part_uid| object(part_uid) }
+         parts = copy_parts(objects, multipart_upload)
 
-                 part = multipart_upload.part(part_number)
-                 response = part.copy_from(copy_source: copy_source)
-
-                 completed << {
-                   part_number: part_number,
-                   etag: response.copy_part_result.etag,
-                 }
-               end
-
-               completed
-             rescue
-               queue.clear
-               raise
-             end
-           end
+         parts.each do |part|
+           info["multipart_parts"] << { "part_number" => part[:part_number], "etag" => part[:etag] }
          end
 
-         parts = threads.flat_map(&:value).sort_by { |part| part[:part_number] }
-
-         multipart_upload.complete(multipart_upload: {parts: parts})
+         finalize_file(uid, info)
 
          delete(part_uids.flat_map { |part_uid| [object(part_uid), object("#{part_uid}.info")] })
 
-         info.delete("multipart_id")
-         info.delete("multipart_parts")
-
-         client.head_object(bucket: bucket.name, key: object(uid).key).content_length
-       rescue
+         # Tus server requires us to return the size of the concatenated file.
+         object = client.head_object(bucket: bucket.name, key: object(uid).key)
+         object.content_length
+       rescue => error
          abort_multipart_upload(multipart_upload) if multipart_upload
-         raise
+         raise error
        end
 
        def patch_file(uid, io, info = {})
-         tus_info = Tus::Info.new(info)
-         last_chunk = (tus_info.length && io.size == tus_info.remaining_length)
-
-         if io.size < MIN_PART_SIZE && !last_chunk
-           raise Tus::Error, "Chunk size cannot be smaller than 5MB"
-         end
-
          upload_id = info["multipart_id"]
          part_number = info["multipart_parts"].count + 1
 
@@ -114,30 +75,27 @@ module Tus
          multipart_part = multipart_upload.part(part_number)
          md5 = Tus::Checksum.new("md5").generate(io)
 
-         begin
-           response = multipart_part.upload(body: io, content_md5: md5)
-         rescue Aws::S3::Errors::NoSuchUpload
-           raise Tus::NotFound
-         end
+         response = multipart_part.upload(body: io, content_md5: md5)
 
          info["multipart_parts"] << {
            "part_number" => part_number,
            "etag" => response.etag[/"(.+)"/, 1],
          }
+       rescue Aws::S3::Errors::NoSuchUpload
+         raise Tus::NotFound
+       end
 
-         # finalize the multipart upload if this chunk was the last part
-         if last_chunk
-           multipart_upload.complete(
-             multipart_upload: {
-               parts: info["multipart_parts"].map do |part|
-                 {part_number: part["part_number"], etag: part["etag"]}
-               end
-             }
-           )
-
-           info.delete("multipart_id")
-           info.delete("multipart_parts")
+       def finalize_file(uid, info = {})
+         upload_id = info["multipart_id"]
+         parts = info["multipart_parts"].map do |part|
+           { part_number: part["part_number"], etag: part["etag"] }
          end
+
+         multipart_upload = object(uid).multipart_upload(upload_id)
+         multipart_upload.complete(multipart_upload: {parts: parts})
+
+         info.delete("multipart_id")
+         info.delete("multipart_parts")
        end
 
        def read_info(uid)
@@ -152,29 +110,31 @@ module Tus
        end
 
        def get_file(uid, info = {}, range: nil)
-         if range
-           range = "bytes=#{range.begin}-#{range.end}"
-         end
+         object = object(uid)
+         range = "bytes=#{range.begin}-#{range.end}" if range
 
          raw_chunks = Enumerator.new do |yielder|
-           object(uid).get(range: range) do |chunk|
+           object.get(range: range) do |chunk|
              yielder << chunk
              chunk.clear # deallocate string
            end
          end
 
-         begin
-           first_chunk = raw_chunks.next
-         rescue Aws::S3::Errors::NoSuchKey
-           raise Tus::NotFound
-         end
+         # Start the request to be notified if the object doesn't exist, and to
+         # get Aws::S3::Object#content_length.
+         first_chunk = raw_chunks.next
 
          chunks = Enumerator.new do |yielder|
            yielder << first_chunk
            loop { yielder << raw_chunks.next }
          end
 
-         Response.new(chunks: chunks)
+         Response.new(
+           chunks: chunks,
+           length: object.content_length,
+         )
+       rescue Aws::S3::Errors::NoSuchKey
+         raise Tus::NotFound
        end
 
        def delete_file(uid, info = {})
@@ -226,18 +186,71 @@ module Tus
          # multipart upload was successfully aborted or doesn't exist
        end
 
+       def copy_parts(objects, multipart_upload)
+         parts = compute_parts(objects, multipart_upload)
+         queue = parts.inject(Queue.new) { |queue, part| queue << part }
+
+         threads = @thread_count.times.map { copy_part_thread(queue) }
+
+         threads.flat_map(&:value).sort_by { |part| part[:part_number] }
+       end
+
+       def compute_parts(objects, multipart_upload)
+         objects.map.with_index do |object, idx|
+           {
+             bucket: multipart_upload.bucket_name,
+             key: multipart_upload.object_key,
+             upload_id: multipart_upload.id,
+             copy_source: [object.bucket_name, object.key].join("/"),
+             part_number: idx + 1,
+           }
+         end
+       end
+
+       def copy_part_thread(queue)
+         Thread.new do
+           Thread.current.abort_on_exception = true
+           begin
+             results = []
+             loop do
+               part = queue.deq(true) rescue break
+               results << copy_part(part)
+             end
+             results
+           rescue => error
+             queue.clear
+             raise error
+           end
+         end
+       end
+
+       def copy_part(part)
+         response = client.upload_part_copy(part)
+
+         { part_number: part[:part_number], etag: response.copy_part_result.etag }
+       end
+
        def object(key)
          bucket.object([*prefix, key].join("/"))
        end
 
        class Response
-         def initialize(chunks:)
+         def initialize(chunks:, length:)
            @chunks = chunks
+           @length = length
+         end
+
+         def length
+           @length
          end
 
          def each(&block)
            @chunks.each(&block)
          end
+
+         def close
+           # aws-sdk doesn't provide an API to terminate the HTTP connection
+         end
        end
      end
    end
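The `copy_parts` refactoring in the S3 storage keeps the same concurrency pattern as before: enqueue all work items, then let a fixed number of worker threads drain the queue with a non-blocking `deq` until it raises. A self-contained sketch of that pattern with the S3 call replaced by a pure computation:

```ruby
# Fill a queue with work items (here: numbers to square).
queue = Queue.new
(1..10).each { |n| queue << n }

# Workers drain the queue; deq(true) is non-blocking and raises ThreadError
# when the queue is empty, which the inline rescue turns into a loop break.
threads = 4.times.map do
  Thread.new do
    results = []
    loop do
      n = queue.deq(true) rescue break
      results << n * n
    end
    results
  end
end

# Collect each thread's results and restore a deterministic order,
# just like sort_by { |part| part[:part_number] } in the diff.
squares = threads.flat_map(&:value).sort
puts squares.inspect
```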
data/tus-server.gemspec CHANGED
@@ -1,6 +1,6 @@
  Gem::Specification.new do |gem|
    gem.name = "tus-server"
-   gem.version = "0.9.1"
+   gem.version = "0.10.0"
 
    gem.required_ruby_version = ">= 2.1"
 
metadata CHANGED
@@ -1,14 +1,14 @@
  --- !ruby/object:Gem::Specification
  name: tus-server
  version: !ruby/object:Gem::Version
-   version: 0.9.1
+   version: 0.10.0
  platform: ruby
  authors:
  - Janko Marohnić
  autorequire:
  bindir: bin
  cert_chain: []
- date: 2017-03-24 00:00:00.000000000 Z
+ date: 2017-03-27 00:00:00.000000000 Z
  dependencies:
  - !ruby/object:Gem::Dependency
    name: roda