tus-server 0.9.1 → 0.10.0

checksums.yaml CHANGED
@@ -1,7 +1,7 @@
  ---
  SHA1:
-   metadata.gz: 9a311eaa9b3d1a3081b3d0c2ade38825b8f392c9
-   data.tar.gz: 8bce1af5d6da40478f2f7fda210f38f43c65f679
+   metadata.gz: 4970687c1cfd81a01844c3626f4ee8e879410772
+   data.tar.gz: 017e147d7dbf79a40fb9ab1ada4a9c87b4490f81
  SHA512:
-   metadata.gz: de7a065c14ed3b11b3ea50c67738d720bdca114af6c0ae6ef5692bad6bdb9389e761df105f037ada860518627d46448e3bbe76fef78ed7043b6f9af4031c46f1
-   data.tar.gz: 6fffa14a3bff247d01002e3a457521983d6d60e063235e500a5f7ad5965e685be3493e75d783affe3871a6d8fc47500253a1d7127117e60bc04a0fe45c299ad3
+   metadata.gz: 7b937ef67a26f46eeaa45e241bf593684ef99420d8ff41007bcf734130b891733f71a096c61a00c02bc805fbcf4aa1932260f4c85f5c4ac09fb097a8f53d4ef7
+   data.tar.gz: 24ee3bc3aab8dc3b4bf427dc5a65477efdf8b2dbb03edefeb8f23ee77ee0adf31c5e3d022b26223a36c0d2f69125177c958cae712a6f18feff7cde5e6a76d51d
data/README.md CHANGED
@@ -62,8 +62,16 @@ require "tus/storage/filesystem"
  Tus::Server.opts[:storage] = Tus::Storage::Filesystem.new("public/cache")
  ```
 
- One downside of filesystem storage is that by default it doesn't work if you
- want to run tus-ruby-servers on multiple servers, you'd have to set up a shared
+ If the configured directory doesn't exist, it will automatically be created.
+ By default the UNIX permissions applied will be 0644 for files and 0755 for
+ directories, but you can set different permissions:
+
+ ```rb
+ Tus::Storage::Filesystem.new("data", permissions: 0600, directory_permissions: 0777)
+ ```
+
+ One downside of filesystem storage is that it doesn't work by default if you
+ want to run tus-ruby-server on multiple servers; you'd have to set up a shared
  filesystem between the servers. Another downside is that you have to make sure
  your servers have enough disk space. Also, if you're using Heroku, you cannot
  store files on the filesystem as they won't persist.
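The `permissions:` values above are octal UNIX modes. As an illustrative sketch (not part of the diff, assuming a POSIX filesystem), this is what applying such a mode to a freshly created file looks like with plain Ruby stdlib:

```ruby
require "pathname"
require "tmpdir"

dir = Pathname(Dir.mktmpdir)
file = dir.join("upload.file")

# Create an empty file, then apply the octal mode a hypothetical
# Tus::Storage::Filesystem.new("data", permissions: 0600) would use.
file.binwrite("")
file.chmod(0600)

# chmod is not affected by the process umask, so the mode sticks.
puts format("%o", file.stat.mode & 0777)
```

Octal `0600` means read/write for the owner and no access for group or others, which is why it is a common choice for upload caches.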
@@ -88,21 +96,17 @@ client = Mongo::Client.new("mongodb://127.0.0.1:27017/mydb")
  Tus::Server.opts[:storage] = Tus::Storage::Gridfs.new(client: client)
  ```
 
- The Gridfs specification requires that all chunks are of equal size, except the
- last chunk. `Tus::Storage::Gridfs` will by default automatically make the
- Gridfs chunk size equal to the size of the first uploaded chunk. This means
- that all of the uploaded chunks need to be of equal size (except the last
- chunk).
-
- If you don't want the Gridfs chunk size to be equal to the size of the uploaded
- chunks, you can hardcode the chunk size that will be used for all uploads.
+ By default MongoDB Gridfs stores files in chunks of 256KB, but you can change
+ that with the `:chunk_size` option:
 
  ```rb
- Tus::Storage::Gridfs.new(client: client, chunk_size: 256*1024) # 256 KB
+ Tus::Storage::Gridfs.new(client: client, chunk_size: 1*1024*1024) # 1 MB
  ```
 
- Just note that in this case the size of each uploaded chunk (except the last
- one) needs to be a multiple of the `:chunk_size`.
+ Note that if you're using the [concatenation] tus feature with Gridfs, all
+ partial uploads except the last one are required to fill in their Gridfs
+ chunks, meaning the length of each partial upload needs to be a multiple of
+ the `:chunk_size` number.
 
  ### Amazon S3
 
@@ -125,15 +129,9 @@ Tus::Server.opts[:storage] = Tus::Storage::S3.new(
  )
  ```
 
- It might seem at first that using a remote storage like Amazon S3 will slow
- down the overall upload, but the time it takes for the client to upload the
- file to the Rack app is in general *much* longer than the time for the server
- to upload that chunk to S3, because of the differences in the Internet
- connection speed between the user's computer and server.
-
- One thing to note is that S3's multipart API requires each chunk except the last
- one to be 5MB or larger, so that is the minimum chunk size that you can specify
- on your tus client if you want to use the S3 storage.
+ One thing to note is that S3's multipart API requires each chunk except the
+ last to be **5MB or larger**, so that is the minimum chunk size that you can
+ specify on your tus client if you want to use the S3 storage.
 
  If you want files to be stored in a certain subdirectory, you can specify
  a `:prefix` in the storage configuration.
@@ -174,12 +172,11 @@ def expire_files(expiration_date) ... end
 
  ## Maximum size
 
- By default the maximum size for an uploaded file is 1GB, but you can change
- that:
+ By default the size of files the tus server will accept is unlimited, but you
+ can configure the maximum file size:
 
  ```rb
  Tus::Server.opts[:max_size] = 5 * 1024*1024*1024 # 5GB
- Tus::Server.opts[:max_size] = nil # no limit
  ```
 
  ## Expiration
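The new unlimited default pairs with the `elsif max_size` change in `server.rb`: when `opts[:max_size]` is `nil` and the upload declared no length, the size check is skipped entirely. A sketch of that guard with simplified names (not the server's actual method):

```ruby
# Returns whether a chunk of content_length bytes may be appended at
# current_offset, given the configured max_size (nil means unlimited).
def chunk_allowed?(content_length, current_offset, max_size)
  return true if max_size.nil? # unlimited, the new default
  content_length + current_offset <= max_size
end

puts chunk_allowed?(100, 0, nil)      # no limit configured
puts chunk_allowed?(100, 950, 1000)   # would surpass Tus-Max-Size
```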
data/lib/tus/input.rb CHANGED
@@ -21,7 +21,11 @@ module Tus
      end
 
      def size
-       @input.size
+       if defined?(Rack::Lint) && @input.is_a?(Rack::Lint::InputWrapper)
+         @input.instance_variable_get("@input").size
+       else
+         @input.size
+       end
      end
 
     def close
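The change above reaches through `Rack::Lint::InputWrapper` because the wrapper only exposes the methods the Rack spec requires, not the underlying IO's `#size`. A toy stand-in (not Rack's real classes) demonstrating the unwrapping technique:

```ruby
require "stringio"

# Plays the role of Rack::Lint::InputWrapper: delegates reads but
# deliberately does not expose the wrapped IO's #size.
class Wrapper
  def initialize(input)
    @input = input
  end

  def read(*args)
    @input.read(*args)
  end
end

wrapped = Wrapper.new(StringIO.new("hello"))

puts wrapped.respond_to?(:size)                    # the wrapper hides #size
puts wrapped.instance_variable_get("@input").size  # reaching inside recovers it
```

Reading an instance variable by name is fragile (it depends on Rack's internals not renaming `@input`), which is why the diff also guards with `defined?(Rack::Lint)` and an `is_a?` check.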
data/lib/tus/server.rb CHANGED
@@ -22,11 +22,12 @@ module Tus
      SUPPORTED_CHECKSUM_ALGORITHMS = %w[sha1 sha256 sha384 sha512 md5 crc32]
      RESUMABLE_CONTENT_TYPE = "application/offset+octet-stream"
 
-     opts[:max_size] = 1024*1024*1024
+     opts[:max_size] = nil
      opts[:expiration_time] = 7*24*60*60
      opts[:disposition] = "inline"
 
      plugin :all_verbs
+     plugin :default_headers, {"Content-Type" => ""}
      plugin :delete_empty_headers
      plugin :request_headers
      plugin :not_allowed
@@ -130,6 +131,10 @@ module Tus
          info["Upload-Offset"] = (info.offset + input.size).to_s
          info["Upload-Expires"] = (Time.now + expiration_time).httpdate
 
+         if info.offset == info.length # last chunk
+           storage.finalize_file(uid, info.to_h) if storage.respond_to?(:finalize_file)
+         end
+
          storage.update_info(uid, info.to_h)
          response.headers.update(info.headers)
@@ -144,8 +149,8 @@ module Tus
 
          metadata = info.metadata
          response.headers["Content-Disposition"] = opts[:disposition]
-         response.headers["Content-Disposition"] << "; filename=\"#{metadata["filename"]}\"" if metadata["filename"]
-         response.headers["Content-Type"] = metadata["content_type"] if metadata["content_type"]
+         response.headers["Content-Disposition"] += "; filename=\"#{metadata["filename"]}\"" if metadata["filename"]
+         response.headers["Content-Type"] = metadata["content_type"] || "application/octet-stream"
 
          response = storage.get_file(uid, info.to_h, range: range)
 
@@ -208,7 +213,7 @@ module Tus
        if length
          error!(403, "Cannot modify completed upload") if current_offset == length
          error!(413, "Size of this chunk surpasses Upload-Length") if Integer(request.content_length) + current_offset > length
-       else
+       elsif max_size
          error!(413, "Size of this chunk surpasses Tus-Max-Size") if Integer(request.content_length) + current_offset > max_size
        end
      end
@@ -314,6 +319,7 @@ module Tus
      def error!(status, message)
        response.status = status
        response.write(message) unless request.head?
+       response.headers["Content-Type"] = "text/plain"
        request.halt
      end
 
data/lib/tus/storage/filesystem.rb CHANGED
@@ -9,76 +9,91 @@ module Tus
      class Filesystem
        attr_reader :directory
 
-       def initialize(directory)
-         @directory = Pathname(directory)
+       def initialize(directory, permissions: 0644, directory_permissions: 0755)
+         @directory = Pathname(directory)
+         @permissions = permissions
+         @directory_permissions = directory_permissions
 
          create_directory! unless @directory.exist?
        end
 
        def create_file(uid, info = {})
-         file_path(uid).open("wb") { |file| file.write("") }
+         file_path(uid).binwrite("")
+         file_path(uid).chmod(@permissions)
+
+         info_path(uid).binwrite("{}")
+         info_path(uid).chmod(@permissions)
        end
 
        def concatenate(uid, part_uids, info = {})
+         create_file(uid, info)
+
          file_path(uid).open("wb") do |file|
-           begin
-             part_uids.each do |part_uid|
+           part_uids.each do |part_uid|
+             # Rather than checking upfront whether all parts exist, we use
+             # exception flow to account for the possibility of parts being
+             # deleted during concatenation.
+             begin
                IO.copy_stream(file_path(part_uid), file)
+             rescue Errno::ENOENT
+               raise Tus::Error, "some parts for concatenation are missing"
              end
-           rescue Errno::ENOENT
-             raise Tus::Error, "some parts for concatenation are missing"
            end
          end
 
+         # Delete parts after concatenation.
          delete(part_uids)
 
-         # server requires us to return the size of the concatenated file
+         # Tus server requires us to return the size of the concatenated file.
          file_path(uid).size
        end
 
-       def patch_file(uid, io, info = {})
-         raise Tus::NotFound if !file_path(uid).exist?
+       def patch_file(uid, input, info = {})
+         exists!(uid)
 
-         file_path(uid).open("ab") { |file| IO.copy_stream(io, file) }
+         file_path(uid).open("ab") { |file| IO.copy_stream(input, file) }
        end
 
        def read_info(uid)
-         raise Tus::NotFound if !file_path(uid).exist?
+         exists!(uid)
 
-         begin
-           data = info_path(uid).binread
-         rescue Errno::ENOENT
-           data = "{}"
-         end
-
-         JSON.parse(data)
+         JSON.parse(info_path(uid).binread)
        end
 
        def update_info(uid, info)
-         info_path(uid).open("wb") { |file| file.write(info.to_json) }
+         exists!(uid)
+
+         info_path(uid).binwrite(JSON.generate(info))
        end
 
        def get_file(uid, info = {}, range: nil)
-         raise Tus::NotFound if !file_path(uid).exist?
+         exists!(uid)
 
          file = file_path(uid).open("rb")
-         range ||= 0..file.size-1
+         range ||= 0..(file.size - 1)
+         length = range.end - range.begin + 1
 
+         # Create an Enumerator which will yield chunks of the requested file
+         # content, allowing tus server to efficiently stream requested content
+         # to the client.
          chunks = Enumerator.new do |yielder|
            file.seek(range.begin)
-           remaining_length = range.end - range.begin + 1
-           buffer = ""
+           remaining_length = length
 
            while remaining_length > 0
-             chunk = file.read([16*1024, remaining_length].min, buffer)
-             break unless chunk
-             remaining_length -= chunk.length
-
+             chunk = file.read([16*1024, remaining_length].min, buffer ||= "") or break
+             remaining_length -= chunk.bytesize
              yielder << chunk
            end
          end
 
-         Response.new(chunks: chunks, close: ->{file.close})
+         # We return a response object that responds to #each, #length and #close,
+         # which the tus server can return directly as the Rack response.
+         Response.new(
+           chunks: chunks,
+           length: length,
+           close: ->{file.close},
+         )
        end
 
        def delete_file(uid, info = {})
@@ -86,11 +101,9 @@ module Tus
        end
 
        def expire_files(expiration_date)
-         uids = []
-
-         Pathname.glob(directory.join("*.file")).each do |pathname|
-           uids << pathname.basename(".*") if pathname.mtime <= expiration_date
-         end
+         uids = directory.children
+           .select { |pathname| pathname.mtime <= expiration_date }
+           .map { |pathname| pathname.basename(".*").to_s }
 
          delete(uids)
        end
@@ -99,9 +112,14 @@ module Tus
 
        def delete(uids)
          paths = uids.flat_map { |uid| [file_path(uid), info_path(uid)] }
+
          FileUtils.rm_f paths
        end
 
+       def exists!(uid)
+         raise Tus::NotFound if !file_path(uid).exist?
+       end
+
        def file_path(uid)
          directory.join("#{uid}.file")
        end
@@ -112,13 +130,18 @@ module Tus
 
        def create_directory!
          directory.mkpath
-         directory.chmod(0755)
+         directory.chmod(@directory_permissions)
        end
 
        class Response
-         def initialize(chunks:, close:)
+         def initialize(chunks:, close:, length:)
            @chunks = chunks
            @close = close
+           @length = length
+         end
+
+         def length
+           @length
          end
 
          def each(&block)
data/lib/tus/storage/gridfs.rb CHANGED
@@ -10,7 +10,7 @@ module Tus
      class Gridfs
        attr_reader :client, :prefix, :bucket, :chunk_size
 
-       def initialize(client:, prefix: "fs", chunk_size: nil)
+       def initialize(client:, prefix: "fs", chunk_size: 256*1024)
          @client = client
          @prefix = prefix
          @bucket = @client.database.fs(bucket_name: @prefix)
@@ -19,143 +19,115 @@ module Tus
        end
 
        def create_file(uid, info = {})
-         tus_info = Tus::Info.new(info)
-         content_type = tus_info.metadata["content_type"]
+         content_type = Tus::Info.new(info).metadata["content_type"]
 
-         file = Mongo::Grid::File.new("",
+         create_grid_file(
            filename: uid,
-           metadata: {},
-           chunk_size: chunk_size,
            content_type: content_type,
          )
-
-         bucket.insert_one(file)
        end
 
        def concatenate(uid, part_uids, info = {})
-         file_infos = bucket.files_collection.find(filename: {"$in" => part_uids}).to_a
-         file_infos.sort_by! { |file_info| part_uids.index(file_info[:filename]) }
-
-         if file_infos.count != part_uids.count
-           raise Tus::Error, "some parts for concatenation are missing"
-         end
-
-         chunk_sizes = file_infos.map { |file_info| file_info[:chunkSize] }
-         if chunk_sizes[0..-2].uniq.count > 1
-           raise Tus::Error, "some parts have different chunk sizes, so they cannot be concatenated"
-         end
+         grid_infos = files_collection.find(filename: {"$in" => part_uids}).to_a
+         grid_infos.sort_by! { |grid_info| part_uids.index(grid_info[:filename]) }
 
-         if chunk_sizes.uniq != [chunk_sizes.last] && bucket.chunks_collection.find(files_id: file_infos.last[:_id]).count > 1
-           raise Tus::Error, "last part has different chunk size and is composed of more than one chunk"
-         end
+         validate_parts!(grid_infos, part_uids)
 
-         length = file_infos.inject(0) { |sum, file_info| sum + file_info[:length] }
-         chunk_size = file_infos.first[:chunkSize]
-         tus_info = Tus::Info.new(info)
-         content_type = tus_info.metadata["content_type"]
+         length = grid_infos.map { |doc| doc[:length] }.reduce(0, :+)
+         content_type = Tus::Info.new(info).metadata["content_type"]
 
-         file = Mongo::Grid::File.new("",
+         grid_file = create_grid_file(
            filename: uid,
-           metadata: {},
-           chunk_size: chunk_size,
            length: length,
            content_type: content_type,
          )
 
-         bucket.insert_one(file)
-
-         file_infos.inject(0) do |offset, file_info|
-           result = bucket.chunks_collection
-             .find(files_id: file_info[:_id])
-             .update_many("$set" => {files_id: file.id}, "$inc" => {n: offset})
+         # Update the chunks belonging to parts so that they point to the new file.
+         grid_infos.inject(0) do |offset, grid_info|
+           result = chunks_collection
+             .find(files_id: grid_info[:_id])
+             .update_many("$set" => {files_id: grid_file.id}, "$inc" => {n: offset})
 
            offset += result.modified_count
          end
 
-         bucket.files_collection.delete_many(filename: {"$in" => part_uids})
+         # Delete the parts after concatenation.
+         files_collection.delete_many(filename: {"$in" => part_uids})
 
-         # server requires us to return the size of the concatenated file
+         # Tus server requires us to return the size of the concatenated file.
          length
        end
 
-       def patch_file(uid, io, info = {})
-         file_info = bucket.files_collection.find(filename: uid).first
-         raise Tus::NotFound if file_info.nil?
-
-         file_info[:md5] = Digest::MD5.new # hack for `Chunk.split` updating MD5
-         file_info[:chunkSize] ||= io.size
-         file_info = Mongo::Grid::File::Info.new(Mongo::Options::Mapper.transform(file_info, Mongo::Grid::File::Info::MAPPINGS.invert))
-
-         tus_info = Tus::Info.new(info)
-         last_chunk = (tus_info.length && io.size == tus_info.remaining_length)
-
-         if io.size % file_info.chunk_size != 0 && !last_chunk
-           raise Tus::Error,
-             "Input has length #{io.size} but expected it to be a multiple of " \
-             "chunk size #{file_info.chunk_size} or for it to be the last chunk"
-         end
+       def patch_file(uid, input, info = {})
+         grid_info = find_grid_info!(uid)
 
-         offset = bucket.chunks_collection.find(files_id: file_info.id).count
-         chunks = Mongo::Grid::File::Chunk.split(io, file_info, offset)
+         patch_last_chunk(input, grid_info)
 
-         bucket.chunks_collection.insert_many(chunks)
-         chunks.each { |chunk| chunk.data.data.clear } # deallocate strings
+         grid_chunks = split_into_grid_chunks(input, grid_info)
+         chunks_collection.insert_many(grid_chunks)
+         grid_chunks.each { |grid_chunk| grid_chunk.data.data.clear } # deallocate strings
 
-         bucket.files_collection.find(filename: uid).update_one("$set" => {
-           length: file_info.length + io.size,
+         # Update the total length and refresh the upload date on each update,
+         # which are used in #get_file, #concatenate and #expire_files.
+         files_collection.find(filename: uid).update_one("$set" => {
+           length: grid_info[:length] + input.size,
            uploadDate: Time.now.utc,
-           chunkSize: file_info.chunk_size,
          })
        end
 
        def read_info(uid)
-         file_info = bucket.files_collection.find(filename: uid).first
-         raise Tus::NotFound if file_info.nil?
+         grid_info = find_grid_info!(uid)
 
-         file_info.fetch("metadata")
+         grid_info[:metadata]
        end
 
        def update_info(uid, info)
-         bucket.files_collection.find(filename: uid)
-           .update_one("$set" => {metadata: info})
+         grid_info = find_grid_info!(uid)
+
+         files_collection.update_one({filename: uid}, {"$set" => {metadata: info}})
        end
 
        def get_file(uid, info = {}, range: nil)
-         file_info = bucket.files_collection.find(filename: uid).first
-         raise Tus::NotFound if file_info.nil?
+         grid_info = find_grid_info!(uid)
 
-         filter = {files_id: file_info[:_id]}
+         range ||= 0..(grid_info[:length] - 1)
+         length = range.end - range.begin + 1
 
-         if range
-           chunk_start = range.begin / file_info[:chunkSize] if range.begin
-           chunk_stop = range.end / file_info[:chunkSize] if range.end
+         chunk_start = range.begin / grid_info[:chunkSize]
+         chunk_stop = range.end / grid_info[:chunkSize]
 
-           filter[:n] = {}
-           filter[:n].update("$gte" => chunk_start) if chunk_start
-           filter[:n].update("$lte" => chunk_stop) if chunk_stop
-         end
+         filter = {
+           files_id: grid_info[:_id],
+           n: {"$gte" => chunk_start, "$lte" => chunk_stop}
+         }
 
-         chunks_view = bucket.chunks_collection.find(filter).read(bucket.read_preference).sort(n: 1)
+         # Query only the subset of chunks specified by the range query. We
+         # cannot use Mongo::FsBucket#open_download_stream here because it
+         # doesn't support changing the filter.
+         chunks_view = chunks_collection.find(filter).sort(n: 1)
 
+         # Create an Enumerator which will yield chunks of the requested file
+         # content, allowing tus server to efficiently stream requested content
+         # to the client.
          chunks = Enumerator.new do |yielder|
            chunks_view.each do |document|
              data = document[:data].data
 
              if document[:n] == chunk_start && document[:n] == chunk_stop
-               byte_start = range.begin % file_info[:chunkSize]
-               byte_stop = range.end % file_info[:chunkSize]
+               byte_start = range.begin % grid_info[:chunkSize]
+               byte_stop = range.end % grid_info[:chunkSize]
              elsif document[:n] == chunk_start
-               byte_start = range.begin % file_info[:chunkSize]
-               byte_stop = file_info[:chunkSize] - 1
+               byte_start = range.begin % grid_info[:chunkSize]
+               byte_stop = grid_info[:chunkSize] - 1
              elsif document[:n] == chunk_stop
                byte_start = 0
-               byte_stop = range.end % file_info[:chunkSize]
+               byte_stop = range.end % grid_info[:chunkSize]
              end
 
+             # If we're on the first or last chunk, return a subset of the chunk
+             # specified by the given range, otherwise return the full chunk.
              if byte_start && byte_stop
-               partial_data = data[byte_start..byte_stop]
-               yielder << partial_data
-               partial_data.clear # deallocate chunk string
+               yielder << data[byte_start..byte_stop]
              else
                yielder << data
@@ -164,26 +136,100 @@ module Tus
            end
          end
 
-         Response.new(chunks: chunks, close: ->{chunks_view.close_query})
+         # We return a response object that responds to #each, #length and #close,
+         # which the tus server can return directly as the Rack response.
+         Response.new(
+           chunks: chunks,
+           length: length,
+           close: ->{chunks_view.close_query},
+         )
        end
 
        def delete_file(uid, info = {})
-         file_info = bucket.files_collection.find(filename: uid).first
-         bucket.delete(file_info.fetch("_id")) if file_info
+         grid_info = files_collection.find(filename: uid).first
+         bucket.delete(grid_info[:_id]) if grid_info
        end
 
        def expire_files(expiration_date)
-         file_infos = bucket.files_collection.find(uploadDate: {"$lte" => expiration_date}).to_a
-         file_info_ids = file_infos.map { |info| info[:_id] }
+         grid_infos = files_collection.find(uploadDate: {"$lte" => expiration_date}).to_a
+         grid_info_ids = grid_infos.map { |info| info[:_id] }
+
+         files_collection.delete_many(_id: {"$in" => grid_info_ids})
+         chunks_collection.delete_many(files_id: {"$in" => grid_info_ids})
+       end
+
+       private
+
+       def create_grid_file(**options)
+         file_options = {metadata: {}, chunk_size: chunk_size}.merge(options)
+         grid_file = Mongo::Grid::File.new("", file_options)
+
+         bucket.insert_one(grid_file)
+
+         grid_file
+       end
+
+       def split_into_grid_chunks(io, grid_info)
+         grid_info[:md5] = Digest::MD5.new # hack for `Chunk.split` updating MD5
+         grid_info = Mongo::Grid::File::Info.new(Mongo::Options::Mapper.transform(grid_info, Mongo::Grid::File::Info::MAPPINGS.invert))
+         offset = chunks_collection.count(files_id: grid_info.id)
 
-         bucket.files_collection.delete_many(_id: {"$in" => file_info_ids})
-         bucket.chunks_collection.delete_many(files_id: {"$in" => file_info_ids})
+         Mongo::Grid::File::Chunk.split(io, grid_info, offset)
+       end
+
+       def patch_last_chunk(input, grid_info)
+         if grid_info[:length] % grid_info[:chunkSize] != 0
+           last_chunk = chunks_collection.find(files_id: grid_info[:_id]).sort(n: -1).limit(1).first
+           data = last_chunk[:data].data
+           data << input.read(grid_info[:chunkSize] - data.length)
+
+           chunks_collection.find(files_id: grid_info[:_id], n: last_chunk[:n])
+             .update_one("$set" => {data: BSON::Binary.new(data)})
+
+           data.clear # deallocate string
+         end
+       end
+
+       def find_grid_info!(uid)
+         files_collection.find(filename: uid).first or raise Tus::NotFound
+       end
+
+       def validate_parts!(grid_infos, part_uids)
+         validate_parts_presence!(grid_infos, part_uids)
+         validate_parts_full_chunks!(grid_infos)
+       end
+
+       def validate_parts_presence!(grid_infos, part_uids)
+         if grid_infos.count != part_uids.count
+           raise Tus::Error, "some parts for concatenation are missing"
+         end
+       end
+
+       def validate_parts_full_chunks!(grid_infos)
+         grid_infos.each do |grid_info|
+           if grid_info[:length] % grid_info[:chunkSize] != 0 && grid_info != grid_infos.last
+             raise Tus::Error, "cannot concatenate parts which aren't evenly distributed across chunks"
+           end
+         end
+       end
+
+       def files_collection
+         bucket.files_collection
+       end
+
+       def chunks_collection
+         bucket.chunks_collection
        end
 
        class Response
-         def initialize(chunks:, close:)
+         def initialize(chunks:, close:, length:)
            @chunks = chunks
            @close = close
+           @length = length
+         end
+
+         def length
+           @length
          end
 
          def each(&block)
data/lib/tus/storage/s3.rb CHANGED
@@ -16,13 +16,14 @@ module Tus
 
        attr_reader :client, :bucket, :prefix, :upload_options
 
-       def initialize(bucket:, prefix: nil, upload_options: {}, **client_options)
+       def initialize(bucket:, prefix: nil, upload_options: {}, thread_count: 10, **client_options)
          resource = Aws::S3::Resource.new(**client_options)
 
-         @client = resource.client
-         @bucket = resource.bucket(bucket)
-         @prefix = prefix
+         @client         = resource.client
+         @bucket         = resource.bucket(bucket)
+         @prefix         = prefix
          @upload_options = upload_options
+         @thread_count   = thread_count
        end
 
        def create_file(uid, info = {})
@@ -40,73 +41,33 @@ module Tus
 
          info["multipart_id"] = multipart_upload.id
          info["multipart_parts"] = []
+
+         multipart_upload
        end
 
        def concatenate(uid, part_uids, info = {})
-         create_file(uid, info)
-
-         multipart_upload = object(uid).multipart_upload(info["multipart_id"])
-
-         queue = Queue.new
-         part_uids.each_with_index do |part_uid, idx|
-           queue << {
-             copy_source: [bucket.name, object(part_uid).key].join("/"),
-             part_number: idx + 1
-           }
-         end
-
-         threads = 10.times.map do
-           Thread.new do
-             Thread.current.abort_on_exception = true
-             completed = []
-
-             begin
-               loop do
-                 multipart_copy_task = queue.deq(true) rescue break
+         multipart_upload = create_file(uid, info)
 
-                 part_number = multipart_copy_task[:part_number]
-                 copy_source = multipart_copy_task[:copy_source]
+         objects = part_uids.map { |part_uid| object(part_uid) }
+         parts = copy_parts(objects, multipart_upload)
 
-                 part = multipart_upload.part(part_number)
-                 response = part.copy_from(copy_source: copy_source)
-
-                 completed << {
-                   part_number: part_number,
-                   etag: response.copy_part_result.etag,
-                 }
-               end
-
-               completed
-             rescue
-               queue.clear
-               raise
-             end
-           end
+         parts.each do |part|
+           info["multipart_parts"] << { "part_number" => part[:part_number], "etag" => part[:etag] }
          end
 
-         parts = threads.flat_map(&:value).sort_by { |part| part[:part_number] }
-
-         multipart_upload.complete(multipart_upload: {parts: parts})
+         finalize_file(uid, info)
 
          delete(part_uids.flat_map { |part_uid| [object(part_uid), object("#{part_uid}.info")] })
 
-         info.delete("multipart_id")
-         info.delete("multipart_parts")
-
-         client.head_object(bucket: bucket.name, key: object(uid).key).content_length
-       rescue
+         # Tus server requires us to return the size of the concatenated file.
+         object = client.head_object(bucket: bucket.name, key: object(uid).key)
+         object.content_length
+       rescue => error
          abort_multipart_upload(multipart_upload) if multipart_upload
-         raise
+         raise error
        end
 
        def patch_file(uid, io, info = {})
-         tus_info = Tus::Info.new(info)
-         last_chunk = (tus_info.length && io.size == tus_info.remaining_length)
-
-         if io.size < MIN_PART_SIZE && !last_chunk
-           raise Tus::Error, "Chunk size cannot be smaller than 5MB"
-         end
-
          upload_id = info["multipart_id"]
          part_number = info["multipart_parts"].count + 1
 
@@ -114,30 +75,27 @@ module Tus
          multipart_part = multipart_upload.part(part_number)
          md5 = Tus::Checksum.new("md5").generate(io)
 
-         begin
-           response = multipart_part.upload(body: io, content_md5: md5)
-         rescue Aws::S3::Errors::NoSuchUpload
-           raise Tus::NotFound
-         end
+         response = multipart_part.upload(body: io, content_md5: md5)
 
          info["multipart_parts"] << {
            "part_number" => part_number,
            "etag" => response.etag[/"(.+)"/, 1],
          }
+       rescue Aws::S3::Errors::NoSuchUpload
+         raise Tus::NotFound
+       end
 
-         # finalize the multipart upload if this chunk was the last part
-         if last_chunk
-           multipart_upload.complete(
-             multipart_upload: {
-               parts: info["multipart_parts"].map do |part|
-                 {part_number: part["part_number"], etag: part["etag"]}
-               end
-             }
-           )
-
-           info.delete("multipart_id")
-           info.delete("multipart_parts")
+       def finalize_file(uid, info = {})
+         upload_id = info["multipart_id"]
+         parts = info["multipart_parts"].map do |part|
+           { part_number: part["part_number"], etag: part["etag"] }
          end
+
+         multipart_upload = object(uid).multipart_upload(upload_id)
+         multipart_upload.complete(multipart_upload: {parts: parts})
+
+         info.delete("multipart_id")
+         info.delete("multipart_parts")
        end
 
        def read_info(uid)
@@ -152,29 +110,31 @@ module Tus
        end
 
        def get_file(uid, info = {}, range: nil)
-         if range
-           range = "bytes=#{range.begin}-#{range.end}"
-         end
+         object = object(uid)
+         range = "bytes=#{range.begin}-#{range.end}" if range
 
          raw_chunks = Enumerator.new do |yielder|
-           object(uid).get(range: range) do |chunk|
+           object.get(range: range) do |chunk|
              yielder << chunk
              chunk.clear # deallocate string
            end
          end
 
-         begin
-           first_chunk = raw_chunks.next
-         rescue Aws::S3::Errors::NoSuchKey
-           raise Tus::NotFound
-         end
+         # Start the request to be notified if the object doesn't exist, and to
+         # get Aws::S3::Object#content_length.
+         first_chunk = raw_chunks.next
 
          chunks = Enumerator.new do |yielder|
            yielder << first_chunk
            loop { yielder << raw_chunks.next }
          end
 
-         Response.new(chunks: chunks)
+         Response.new(
+           chunks: chunks,
+           length: object.content_length,
+         )
+       rescue Aws::S3::Errors::NoSuchKey
+         raise Tus::NotFound
        end
 
        def delete_file(uid, info = {})
@@ -226,18 +186,71 @@ module Tus
          # multipart upload was successfully aborted or doesn't exist
        end
 
+       def copy_parts(objects, multipart_upload)
+         parts = compute_parts(objects, multipart_upload)
+         queue = parts.inject(Queue.new) { |queue, part| queue << part }
+
+         threads = @thread_count.times.map { copy_part_thread(queue) }
+
+         threads.flat_map(&:value).sort_by { |part| part[:part_number] }
+       end
+
+       def compute_parts(objects, multipart_upload)
+         objects.map.with_index do |object, idx|
+           {
+             bucket: multipart_upload.bucket_name,
+             key: multipart_upload.object_key,
+             upload_id: multipart_upload.id,
+             copy_source: [object.bucket_name, object.key].join("/"),
+             part_number: idx + 1,
+           }
+         end
+       end
+
+       def copy_part_thread(queue)
+         Thread.new do
+           Thread.current.abort_on_exception = true
+           begin
+             results = []
+             loop do
+               part = queue.deq(true) rescue break
+               results << copy_part(part)
+             end
+             results
+           rescue => error
+             queue.clear
+             raise error
+           end
+         end
+       end
+
+       def copy_part(part)
+         response = client.upload_part_copy(part)
+
+         { part_number: part[:part_number], etag: response.copy_part_result.etag }
+       end
+
        def object(key)
          bucket.object([*prefix, key].join("/"))
        end
 
        class Response
-         def initialize(chunks:)
+         def initialize(chunks:, length:)
            @chunks = chunks
+           @length = length
+         end
+
+         def length
+           @length
          end
 
          def each(&block)
            @chunks.each(&block)
          end
+
+         def close
+           # aws-sdk doesn't provide an API to terminate the HTTP connection
+         end
        end
      end
    end
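The `copy_parts` refactoring in the S3 storage keeps the same concurrency pattern as before: enqueue all work items, then let a fixed number of worker threads drain the queue with a non-blocking `deq` until it raises. A self-contained sketch of that pattern with the S3 call replaced by a pure computation:

```ruby
# Fill a queue with work items (here: numbers to square).
queue = Queue.new
(1..10).each { |n| queue << n }

# Workers drain the queue; deq(true) is non-blocking and raises ThreadError
# when the queue is empty, which the inline rescue turns into a loop break.
threads = 4.times.map do
  Thread.new do
    results = []
    loop do
      n = queue.deq(true) rescue break
      results << n * n
    end
    results
  end
end

# Collect each thread's results and restore a deterministic order,
# just like sort_by { |part| part[:part_number] } in the diff.
squares = threads.flat_map(&:value).sort
puts squares.inspect
```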
data/tus-server.gemspec CHANGED
@@ -1,6 +1,6 @@
  Gem::Specification.new do |gem|
    gem.name = "tus-server"
-   gem.version = "0.9.1"
+   gem.version = "0.10.0"
 
    gem.required_ruby_version = ">= 2.1"
 
metadata CHANGED
@@ -1,14 +1,14 @@
  --- !ruby/object:Gem::Specification
  name: tus-server
  version: !ruby/object:Gem::Version
-   version: 0.9.1
+   version: 0.10.0
  platform: ruby
  authors:
  - Janko Marohnić
  autorequire:
  bindir: bin
  cert_chain: []
- date: 2017-03-24 00:00:00.000000000 Z
+ date: 2017-03-27 00:00:00.000000000 Z
  dependencies:
  - !ruby/object:Gem::Dependency
    name: roda