swineherd-fs 0.0.2
- data/Gemfile +2 -0
- data/LICENSE +188 -0
- data/README.textile +66 -0
- data/VERSION +1 -0
- data/lib/swineherd-fs/hadoopfilesystem.rb +249 -0
- data/lib/swineherd-fs/localfilesystem.rb +81 -0
- data/lib/swineherd-fs/s3filesystem.rb +311 -0
- data/lib/swineherd-fs.rb +91 -0
- data/rspec.watchr +19 -0
- data/spec/filesystem_spec.rb +186 -0
- data/spec/spec_helper.rb +2 -0
- data/swineherd-fs.gemspec +23 -0
- metadata +121 -0
data/Gemfile
ADDED
data/LICENSE
ADDED
@@ -0,0 +1,188 @@
Copyright 2011 Infochimps, Inc

Apache License Version 2.0, January 2004, http://www.apache.org/licenses/

TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION

1. Definitions.

"License" shall mean the terms and conditions for use, reproduction,
and distribution as defined by Sections 1 through 9 of this document.

"Licensor" shall mean the copyright owner or entity authorized by
the copyright owner that is granting the License.

"Legal Entity" shall mean the union of the acting entity and all
other entities that control, are controlled by, or are under common
control with that entity. For the purposes of this definition,
"control" means (i) the power, direct or indirect, to cause the
direction or management of such entity, whether by contract or
otherwise, or (ii) ownership of fifty percent (50%) or more of the
outstanding shares, or (iii) beneficial ownership of such entity.

"You" (or "Your") shall mean an individual or Legal Entity
exercising permissions granted by this License.

"Source" form shall mean the preferred form for making modifications,
including but not limited to software source code, documentation
source, and configuration files.

"Object" form shall mean any form resulting from mechanical
transformation or translation of a Source form, including but
not limited to compiled object code, generated documentation,
and conversions to other media types.

"Work" shall mean the work of authorship, whether in Source or
Object form, made available under the License, as indicated by a
copyright notice that is included in or attached to the work
(an example is provided in the Appendix below).

"Derivative Works" shall mean any work, whether in Source or Object
form, that is based on (or derived from) the Work and for which the
editorial revisions, annotations, elaborations, or other modifications
represent, as a whole, an original work of authorship. For the purposes
of this License, Derivative Works shall not include works that remain
separable from, or merely link (or bind by name) to the interfaces of,
the Work and Derivative Works thereof.

"Contribution" shall mean any work of authorship, including
the original version of the Work and any modifications or additions
to that Work or Derivative Works thereof, that is intentionally
submitted to Licensor for inclusion in the Work by the copyright owner
or by an individual or Legal Entity authorized to submit on behalf of
the copyright owner. For the purposes of this definition, "submitted"
means any form of electronic, verbal, or written communication sent
to the Licensor or its representatives, including but not limited to
communication on electronic mailing lists, source code control systems,
and issue tracking systems that are managed by, or on behalf of, the
Licensor for the purpose of discussing and improving the Work, but
excluding communication that is conspicuously marked or otherwise
designated in writing by the copyright owner as "Not a Contribution."

"Contributor" shall mean Licensor and any individual or Legal Entity
on behalf of whom a Contribution has been received by Licensor and
subsequently incorporated within the Work.

2. Grant of Copyright License. Subject to the terms and conditions of
this License, each Contributor hereby grants to You a perpetual,
worldwide, non-exclusive, no-charge, royalty-free, irrevocable
copyright license to reproduce, prepare Derivative Works of,
publicly display, publicly perform, sublicense, and distribute the
Work and such Derivative Works in Source or Object form.

3. Grant of Patent License. Subject to the terms and conditions of
this License, each Contributor hereby grants to You a perpetual,
worldwide, non-exclusive, no-charge, royalty-free, irrevocable
(except as stated in this section) patent license to make, have made,
use, offer to sell, sell, import, and otherwise transfer the Work,
where such license applies only to those patent claims licensable
by such Contributor that are necessarily infringed by their
Contribution(s) alone or by combination of their Contribution(s)
with the Work to which such Contribution(s) was submitted. If You
institute patent litigation against any entity (including a
cross-claim or counterclaim in a lawsuit) alleging that the Work
or a Contribution incorporated within the Work constitutes direct
or contributory patent infringement, then any patent licenses
granted to You under this License for that Work shall terminate
as of the date such litigation is filed.

4. Redistribution. You may reproduce and distribute copies of the
Work or Derivative Works thereof in any medium, with or without
modifications, and in Source or Object form, provided that You
meet the following conditions:

(a) You must give any other recipients of the Work or
Derivative Works a copy of this License; and

(b) You must cause any modified files to carry prominent notices
stating that You changed the files; and

(c) You must retain, in the Source form of any Derivative Works
that You distribute, all copyright, patent, trademark, and
attribution notices from the Source form of the Work,
excluding those notices that do not pertain to any part of
the Derivative Works; and

(d) If the Work includes a "NOTICE" text file as part of its
distribution, then any Derivative Works that You distribute must
include a readable copy of the attribution notices contained
within such NOTICE file, excluding those notices that do not
pertain to any part of the Derivative Works, in at least one
of the following places: within a NOTICE text file distributed
as part of the Derivative Works; within the Source form or
documentation, if provided along with the Derivative Works; or,
within a display generated by the Derivative Works, if and
wherever such third-party notices normally appear. The contents
of the NOTICE file are for informational purposes only and
do not modify the License. You may add Your own attribution
notices within Derivative Works that You distribute, alongside
or as an addendum to the NOTICE text from the Work, provided
that such additional attribution notices cannot be construed
as modifying the License.

You may add Your own copyright statement to Your modifications and
may provide additional or different license terms and conditions
for use, reproduction, or distribution of Your modifications, or
for any such Derivative Works as a whole, provided Your use,
reproduction, and distribution of the Work otherwise complies with
the conditions stated in this License.

5. Submission of Contributions. Unless You explicitly state otherwise,
any Contribution intentionally submitted for inclusion in the Work
by You to the Licensor shall be under the terms and conditions of
this License, without any additional terms or conditions.
Notwithstanding the above, nothing herein shall supersede or modify
the terms of any separate license agreement you may have executed
with Licensor regarding such Contributions.

6. Trademarks. This License does not grant permission to use the trade
names, trademarks, service marks, or product names of the Licensor,
except as required for reasonable and customary use in describing the
origin of the Work and reproducing the content of the NOTICE file.

7. Disclaimer of Warranty. Unless required by applicable law or
agreed to in writing, Licensor provides the Work (and each
Contributor provides its Contributions) on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or
implied, including, without limitation, any warranties or conditions
of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A
PARTICULAR PURPOSE. You are solely responsible for determining the
appropriateness of using or redistributing the Work and assume any
risks associated with Your exercise of permissions under this License.

8. Limitation of Liability. In no event and under no legal theory,
whether in tort (including negligence), contract, or otherwise,
unless required by applicable law (such as deliberate and grossly
negligent acts) or agreed to in writing, shall any Contributor be
liable to You for damages, including any direct, indirect, special,
incidental, or consequential damages of any character arising as a
result of this License or out of the use or inability to use the
Work (including but not limited to damages for loss of goodwill,
work stoppage, computer failure or malfunction, or any and all
other commercial damages or losses), even if such Contributor
has been advised of the possibility of such damages.

9. Accepting Warranty or Additional Liability. While redistributing
the Work or Derivative Works thereof, You may choose to offer,
and charge a fee for, acceptance of support, warranty, indemnity,
or other liability obligations and/or rights consistent with this
License. However, in accepting such obligations, You may act only
on Your own behalf and on Your sole responsibility, not on behalf
of any other Contributor, and only if You agree to indemnify,
defend, and hold each Contributor harmless for any liability
incurred by, or claims asserted against, such Contributor by reason
of your accepting any such warranty or additional liability.

END OF TERMS AND CONDITIONS

Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at

http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
data/README.textile
ADDED
@@ -0,0 +1,66 @@
h1. Swineherd-fs

* @file@ - Local file system. Only thoroughly tested on Ubuntu Linux.
* @hdfs@ - Hadoop distributed file system. Uses the Apache Hadoop 0.20 API. Requires JRuby.
* @s3@ - Amazon Simple Storage Service (S3).
* @ftp@ - FTP (not yet implemented)

All filesystem abstractions implement the following core functions, many taken from the UNIX filesystem:

* @mv@
* @cp@
* @cp_r@
* @rm@
* @rm_r@
* @open@
* @exists?@
* @directory?@
* @ls@
* @ls_r@
* @mkdir_p@

Note: Since S3 is just a key-value store, it is difficult to preserve the notion of a directory. There can be no empty directories, so @mkdir_p@ has little to do: it currently only ensures that the bucket exists. This also means the @directory?@ test only succeeds for a non-empty directory, which clashes with the UNIX notion of a directory.

Additionally, the S3 and HDFS abstractions implement functions for moving files to and from the local filesystem:

* @copy_to_local@
* @copy_from_local@

Note: For these methods the destination and source path, respectively, are assumed to be local, so they do not have to be prefixed with a file scheme.

The @Swineherd::FileSystem@ module implements a generic filesystem abstraction using schemed filepaths (hdfs://, s3://, file://).

Currently only the following methods are supported by @Swineherd::FileSystem@:

* @cp@
* @exists?@

For example, instead of doing the following:<pre><code>hdfs = Swineherd::HadoopFileSystem.new
localfs = Swineherd::LocalFileSystem.new
hdfs.copy_to_local('foo/bar/baz.txt', 'foo/bar/baz.txt') unless localfs.exists? 'foo/bar/baz.txt'
</code></pre>

You can do:<pre><code>fs = Swineherd::FileSystem
fs.cp('hdfs://foo/bar/baz.txt','foo/bar/baz.txt') unless fs.exists?('foo/bar/baz.txt')
</code></pre>

Note: A path without a scheme is treated as a path on the local filesystem; use the explicit file:// scheme for clarity if you prefer. The following are equivalent:

<pre><code>fs.exists?('foo/bar/baz.txt')
fs.exists?('file://foo/bar/baz.txt')
</code></pre>

h4. Config

* In order to use the @S3FileSystem@, Swineherd requires AWS S3 access credentials.

* In @~/.swineherd.yaml@ or @/etc/swineherd.yaml@:

<pre><code>aws:
  access_key: my_access_key
  secret_key: my_secret_key
</code></pre>

* Or just pass them in when creating the instance:

<pre><code>s3 = Swineherd::S3FileSystem.new(:aws_access_key => "my_access_key", :aws_secret_key => "my_secret_key")</code></pre>
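The Config section above can be exercised end to end with the S3 filesystem as it is defined later in this listing. Below is a minimal, illustrative sketch (not one of the gem's files); the bucket name, key, and local path are hypothetical placeholders:

    require 'swineherd-fs'

    s3 = Swineherd::S3FileSystem.new(:aws_access_key => "my_access_key",
                                     :aws_secret_key => "my_secret_key")

    s3.mkdir_p("s3://example-bucket")                                     # only ensures the bucket exists
    s3.copy_from_local("/tmp/baz.txt", "s3://example-bucket/foo/baz.txt") # source path is local, no scheme needed
    s3.exists?("s3://example-bucket/foo/baz.txt")                         # => true
    s3.ls("s3://example-bucket/foo")                                      # => ["example-bucket/foo/baz.txt"]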
data/VERSION
ADDED
@@ -0,0 +1 @@
0.0.2
data/lib/swineherd-fs/hadoopfilesystem.rb
ADDED
@@ -0,0 +1,249 @@
module Swineherd

  #
  # Methods for dealing with the Hadoop distributed file system (hdfs). This class
  # requires that you run with JRuby as it makes use of the native Java Hadoop
  # libraries.
  #
  class HadoopFileSystem

    attr_accessor :conf, :hdfs

    def initialize *args
      set_hadoop_environment if running_jruby?

      @conf = Java::org.apache.hadoop.conf.Configuration.new

      if Swineherd.config[:aws]
        @conf.set("fs.s3.awsAccessKeyId",  Swineherd.config[:aws][:access_key])
        @conf.set("fs.s3.awsSecretAccessKey",  Swineherd.config[:aws][:secret_key])

        @conf.set("fs.s3n.awsAccessKeyId", Swineherd.config[:aws][:access_key])
        @conf.set("fs.s3n.awsSecretAccessKey", Swineherd.config[:aws][:secret_key])
      end

      @hdfs = Java::org.apache.hadoop.fs.FileSystem.get(@conf)
    end

    def open path, mode="r", &blk
      HadoopFile.new(path, mode, self, &blk)
    end

    def size path
      ls_r(path).inject(0){|sz,filepath| sz += @hdfs.get_file_status(Path.new(filepath)).get_len}
    end

    def ls path
      (@hdfs.list_status(Path.new(path)) || []).map{|path| path.get_path.to_s}
    end

    # list directories recursively, similar to unix 'ls -R'
    def ls_r path
      ls(path).inject([]){|rec_paths,path| rec_paths << path; rec_paths << ls(path) unless file?(path); rec_paths}.flatten
    end

    def rm path
      begin
        @hdfs.delete(Path.new(path), false)
      rescue java.io.IOException => e
        raise Errno::EISDIR, e.message
      end
    end

    def rm_r path
      @hdfs.delete(Path.new(path), true)
    end

    def exists? path
      @hdfs.exists(Path.new(path))
    end

    def directory? path
      exists?(path) && @hdfs.get_file_status(Path.new(path)).is_dir?
    end

    def file? path
      exists?(path) && @hdfs.isFile(Path.new(path))
    end

    def mv srcpath, dstpath
      @hdfs.rename(Path.new(srcpath), Path.new(dstpath))
    end

    # supports s3://, s3n://, and hdfs:// in @srcpath@ and @dstpath@
    def cp srcpath, dstpath
      @src_fs  = Java::org.apache.hadoop.fs.FileSystem.get(Java::JavaNet::URI.create(srcpath), @conf)
      @dest_fs = Java::org.apache.hadoop.fs.FileSystem.get(Java::JavaNet::URI.create(dstpath), @conf)
      FileUtil.copy(@src_fs, Path.new(srcpath), @dest_fs, Path.new(dstpath), false, @conf)
    end

    def cp_r srcpath, dstpath
      cp srcpath, dstpath
    end

    def mkdir_p path
      @hdfs.mkdirs(Path.new(path))
    end

    #
    # Copy hdfs file to local filesystem
    #
    def copy_to_local srcfile, dstfile
      @hdfs.copy_to_local_file(Path.new(srcfile), Path.new(dstfile))
    end
    # alias :get :copy_to_local

    #
    # Copy local file to hdfs filesystem
    #
    def copy_from_local srcfile, dstfile
      @hdfs.copy_from_local_file(Path.new(srcfile), Path.new(dstfile))
    end
    # alias :put :copy_from_local

    #
    # Merge all part files in a directory into one file.
    #
    def merge srcdir, dstfile
      FileUtil.copy_merge(@hdfs, Path.new(srcdir), @hdfs, Path.new(dstfile), false, @conf, "")
    end

    #
    # This is hackety. Use with caution.
    #
    def stream input, output
      input_fs_scheme  = (Java::JavaNet::URI.create(input).scheme  || "file") + "://"
      output_fs_scheme = (Java::JavaNet::URI.create(output).scheme || "file") + "://"
      system("#{@hadoop_home}/bin/hadoop \\
        jar #{@hadoop_home}/contrib/streaming/hadoop-*streaming*.jar \\
        -D mapred.job.name=\"Stream { #{input_fs_scheme}(#{File.basename(input)}) -> #{output_fs_scheme}(#{File.basename(output)}) }\" \\
        -D mapred.min.split.size=1000000000 \\
        -D mapred.reduce.tasks=0 \\
        -mapper \"/bin/cat\" \\
        -input \"#{input}\" \\
        -output \"#{output}\"")
    end

    #
    # BZIP
    #
    def bzip input, output
      system("#{@hadoop_home}/bin/hadoop \\
        jar #{@hadoop_home}/contrib/streaming/hadoop-*streaming*.jar \\
        -D mapred.output.compress=true \\
        -D mapred.output.compression.codec=org.apache.hadoop.io.compress.BZip2Codec \\
        -D mapred.reduce.tasks=1 \\
        -mapper \"/bin/cat\" \\
        -reducer \"/bin/cat\" \\
        -input \"#{input}\" \\
        -output \"#{output}\"")
    end

    #
    # Merges many input files into :reduce_tasks amount of output files
    #
    def dist_merge inputs, output, options = {}
      options[:reduce_tasks]     ||= 25
      options[:partition_fields] ||= 2
      options[:sort_fields]      ||= 2
      options[:field_separator]  ||= '/t'
      names = inputs.map{|inp| File.basename(inp)}.join(',')
      cmd = "#{@hadoop_home}/bin/hadoop \\
        jar #{@hadoop_home}/contrib/streaming/hadoop-*streaming*.jar \\
        -D mapred.job.name=\"Swineherd Merge (#{names} -> #{output})\" \\
        -D num.key.fields.for.partition=\"#{options[:partition_fields]}\" \\
        -D stream.num.map.output.key.fields=\"#{options[:sort_fields]}\" \\
        -D mapred.text.key.partitioner.options=\"-k1,#{options[:partition_fields]}\" \\
        -D stream.map.output.field.separator=\"'#{options[:field_separator]}'\" \\
        -D mapred.min.split.size=1000000000 \\
        -D mapred.reduce.tasks=#{options[:reduce_tasks]} \\
        -partitioner org.apache.hadoop.mapred.lib.KeyFieldBasedPartitioner \\
        -mapper \"/bin/cat\" \\
        -reducer \"/usr/bin/uniq\" \\
        -input \"#{inputs.join(',')}\" \\
        -output \"#{output}\""
      puts cmd
      system cmd
    end

    class HadoopFile
      attr_accessor :handle

      #
      # In order to open input and output streams we must pass around the hadoop fs object itself
      #
      def initialize path, mode, fs, &blk
        raise Errno::EISDIR, "#{path} is a directory" if fs.directory?(path)
        @path = Path.new(path)
        case mode
        when "r"
          @handle = fs.hdfs.open(@path).to_io(&blk)
        when "w"
          @handle = fs.hdfs.create(@path).to_io.to_outputstream
          if block_given?
            yield self
            self.close
          end
        end
      end

      def path
        @path.toString()
      end

      def read
        @handle.read
      end

      def write string
        @handle.write(string.to_java_string.get_bytes)
      end

      def close
        @handle.close
      end

    end

    private

    # Check that we are running with jruby, check for hadoop home.
    def running_jruby?
      begin
        require 'java'
      rescue LoadError => e
        raise "\nJava not found, are you sure you're running with JRuby?\n" + e.message
      end
      @hadoop_home = ENV['HADOOP_HOME']
      raise "\nHadoop installation not found, try setting $HADOOP_HOME\n" unless @hadoop_home && (File.exist? @hadoop_home)
      true
    end

    #
    # Place hadoop jars in class path, require appropriate jars, set hadoop conf
    #
    def set_classpath
      hadoop_conf = (ENV['HADOOP_CONF_DIR'] || File.join(@hadoop_home, 'conf'))
      hadoop_conf += "/" unless hadoop_conf.end_with? "/"
      $CLASSPATH << hadoop_conf unless $CLASSPATH.include?(hadoop_conf)
    end

    def import_classes
      Dir["#{@hadoop_home}/hadoop*.jar", "#{@hadoop_home}/lib/*.jar"].each{|jar| require jar}
      ['org.apache.hadoop.fs.Path',
       'org.apache.hadoop.fs.FileUtil',
       'org.apache.hadoop.mapreduce.lib.input.FileInputFormat',
       'org.apache.hadoop.mapreduce.lib.output.FileOutputFormat',
       'org.apache.hadoop.fs.FSDataOutputStream',
       'org.apache.hadoop.fs.FSDataInputStream'].map{|j_class| java_import(j_class) }
    end

    def set_hadoop_environment
      set_classpath
      import_classes
    end

  end
end
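For orientation, here is a small usage sketch for the class above (illustrative only, not part of the gem's files): it assumes JRuby, $HADOOP_HOME pointing at a Hadoop 0.20 install, and hypothetical HDFS and local paths.

    require 'swineherd-fs'

    hdfs = Swineherd::HadoopFileSystem.new
    hdfs.mkdir_p("/tmp/swineherd_example")
    hdfs.open("/tmp/swineherd_example/greeting.txt", "w") { |f| f.write("hello from hdfs") }
    hdfs.open("/tmp/swineherd_example/greeting.txt").read                            # => "hello from hdfs"
    hdfs.copy_to_local("/tmp/swineherd_example/greeting.txt", "/tmp/greeting.txt")   # destination is a local path
    hdfs.rm_r("/tmp/swineherd_example")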
data/lib/swineherd-fs/localfilesystem.rb
ADDED
@@ -0,0 +1,81 @@
module Swineherd
  class LocalFileSystem
    #include Swineherd::BaseFileSystem

    def initialize *args
    end

    def open path, mode="r", &blk
      File.open(path, mode, &blk)
    end

    # Globs for files at @path@, append '**/*' to glob recursively
    def size path
      Dir[path].inject(0){|s,f| s += File.size(f)}
    end

    # A leaky abstraction, should be called rm_rf if it calls rm_rf
    def rm_r path
      FileUtils.rm_rf path
    end

    def rm path
      FileUtils.rm path
    end

    def exists? path
      File.exists?(path)
    end

    def directory? path
      File.directory? path
    end

    def mv srcpath, dstpath
      FileUtils.mv(srcpath, dstpath)
    end

    def cp srcpath, dstpath
      FileUtils.cp(srcpath, dstpath)
    end

    def cp_r srcpath, dstpath
      FileUtils.cp_r(srcpath, dstpath)
    end

    def mkdir_p path
      FileUtils.mkdir_p path
    end

    # List directory contents, similar to unix `ls`
    # Dir[@path@/*] to return files in immediate directory of @path@
    def ls path
      if exists?(path)
        if !directory?(path)
          [path]
        else
          path += '/' unless path =~ /\/$/
          Dir[path + '*']
        end
      else
        raise Errno::ENOENT, "No such file or directory - #{path}"
      end
    end

    # Recursively list directory contents
    # Dir[@path@/**/*], similar to unix `ls -R`
    def ls_r path
      if exists?(path)
        if !directory?(path)
          [path]
        else
          path += '/' unless path =~ /\/$/
          Dir[path + '**/*']
        end
      else
        raise Errno::ENOENT, "No such file or directory - #{path}"
      end
    end

  end
end
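A quick sketch of how the glob-based helpers above behave (illustrative only; the paths are hypothetical):

    localfs = Swineherd::LocalFileSystem.new
    localfs.mkdir_p("/tmp/swineherd_example/b")
    localfs.open("/tmp/swineherd_example/d.txt", "w") { |f| f.write("foo") }
    localfs.ls("/tmp/swineherd_example")           # immediate children: .../b and .../d.txt
    localfs.ls_r("/tmp/swineherd_example")         # recursive listing via Dir[path + '**/*']
    localfs.size("/tmp/swineherd_example/d.txt")   # => 3 (size takes a glob, here matching one file)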
data/lib/swineherd-fs/s3filesystem.rb
ADDED
@@ -0,0 +1,311 @@
module Swineherd

  #
  # Methods for interacting with Amazon's Simple Storage Service (S3).
  #
  class S3FileSystem

    attr_accessor :s3

    def initialize options={}
      aws_access_key = options[:aws_access_key] || (Swineherd.config[:aws] && Swineherd.config[:aws][:access_key])
      aws_secret_key = options[:aws_secret_key] || (Swineherd.config[:aws] && Swineherd.config[:aws][:secret_key])
      raise "Missing AWS keys" unless aws_access_key && aws_secret_key
      @s3 = RightAws::S3.new(aws_access_key, aws_secret_key, :logger => Logger.new(nil)) #FIXME: Just wanted it to shut up
    end

    def open path, mode="r", &blk
      S3File.new(path, mode, self, &blk)
    end

    def size path
      if directory?(path)
        ls_r(path).inject(0){|sum,file| sum += filesize(file)}
      else
        filesize(path)
      end
    end

    def rm path
      bkt,key = split_path(path)
      if key.empty? || directory?(path)
        raise Errno::EISDIR, "#{path} is a directory or bucket, use rm_r or rm_bucket"
      else
        @s3.interface.delete(bkt, key)
      end
    end

    # rm_r - Remove recursively. Does not delete buckets, use rm_bucket
    # params: @path@ - Path of file or folder to delete
    # returns: Array - Array of paths which were deleted
    def rm_r path
      bkt,key = split_path(path)
      if key.empty?
        # only the bucket was passed in
      else
        if directory?(path)
          @s3.interface.delete_folder(bkt, key).flatten
        else
          @s3.interface.delete(bkt, key)
          [path]
        end
      end
    end

    def rm_bucket bucket_name
      @s3.interface.force_delete_bucket(bucket_name)
    end

    def exists? path
      bucket,key = split_path(path)
      begin
        if key.empty? # only a bucket was passed in, check if it exists
          # FIXME: there may be a better way to test, relying on error to be raised here
          @s3.interface.bucket_location(bucket) && true
        elsif file?(path) # simply test for existence of the file
          true
        else # treat as directory and see if there are files beneath it
          # if it's not a file, it is harmless to add '/'.
          # the prefix search may return files with the same root extension,
          # ie. foo.txt and foo.txt.bak, if we leave off the trailing slash
          key += "/" unless key =~ /\/$/
          @s3.interface.list_bucket(bucket, :prefix => key).size > 0
        end
      rescue RightAws::AwsError => error
        if error.message =~ /nosuchbucket/i
          false
        elsif error.message =~ /not found/i
          false
        else
          raise
        end
      end
    end

    def directory? path
      exists?(path) && !file?(path)
    end

    def file? path
      bucket,key = split_path(path)
      begin
        return false if (key.nil? || key.empty?) # buckets are not files
        # FIXME: there may be a better way to test, relying on error to be raised
        @s3.interface.head(bucket, key) && true
      rescue RightAws::AwsError => error
        if error.message =~ /nosuchbucket/i
          false
        elsif error.message =~ /not found/i
          false
        else
          raise
        end
      end
    end

    def mv srcpath, dstpath
      src_bucket,src_key_path = split_path(srcpath)
      dst_bucket,dst_key_path = split_path(dstpath)
      mkdir_p(dstpath) unless exists?(dstpath)
      if directory? srcpath
        paths_to_copy = ls_r(srcpath)
        common_dir = common_directory(paths_to_copy)
        paths_to_copy.each do |path|
          bkt,key = split_path(path)
          src_key = key
          dst_key = File.join(dst_key_path, path.gsub(common_dir, ''))
          @s3.interface.move(src_bucket, src_key, dst_bucket, dst_key)
        end
      else
        @s3.interface.move(src_bucket, src_key_path, dst_bucket, dst_key_path)
      end
    end

    def cp srcpath, dstpath
      src_bucket,src_key_path = split_path(srcpath)
      dst_bucket,dst_key_path = split_path(dstpath)
      mkdir_p(dstpath) unless exists?(dstpath)
      if src_key_path.empty? || directory?(srcpath)
        raise Errno::EISDIR, "#{srcpath} is a directory or bucket, use cp_r"
      else
        @s3.interface.copy(src_bucket, src_key_path, dst_bucket, dst_key_path)
      end
    end

    # mv is just a special case of cp_r...this is a waste
    def cp_r srcpath, dstpath
      src_bucket,src_key_path = split_path(srcpath)
      dst_bucket,dst_key_path = split_path(dstpath)
      mkdir_p(dstpath) unless exists?(dstpath)
      if directory? srcpath
        paths_to_copy = ls_r(srcpath)
        common_dir = common_directory(paths_to_copy)
        paths_to_copy.each do |path|
          bkt,key = split_path(path)
          src_key = key
          dst_key = File.join(dst_key_path, path.gsub(common_dir, ''))
          @s3.interface.copy(src_bucket, src_key, dst_bucket, dst_key)
        end
      else
        @s3.interface.copy(src_bucket, src_key_path, dst_bucket, dst_key_path)
      end
    end

    # This is a bit funny, there's actually no need to create a 'path' since
    # s3 is nothing more than a glorified key-value store. When you create a
    # 'file' (key) the 'path' will be created for you. All we do here is create
    # the bucket unless it already exists.
    def mkdir_p path
      bkt,key = split_path(path)
      @s3.interface.create_bucket(bkt) unless exists? path
    end

    def ls path
      if exists?(path)
        bkt,prefix = split_path(path)
        prefix += '/' if directory?(path) && !(prefix =~ /\/$/) && !prefix.empty?
        contents = []
        @s3.interface.incrementally_list_bucket(bkt, {'prefix' => prefix, :delimiter => '/'}) do |res|
          contents += res[:common_prefixes].map{|c| File.join(bkt, c)}
          contents += res[:contents].map{|c| File.join(bkt, c[:key])}
        end
        contents
      else
        raise Errno::ENOENT, "No such file or directory - #{path}"
      end
    end

    def ls_r path
      if(file?(path))
        [path]
      else
        ls(path).inject([]){|paths,path| paths << path if directory?(path); paths << ls_r(path)}.flatten
      end
    end

    # FIXME: Not implemented for directories
    # @srcpath@ is assumed to be on the local filesystem
    def copy_from_local srcpath, destpath
      bucket,key = split_path(destpath)
      if File.exists?(srcpath)
        if File.directory?(srcpath)
          raise "NotYetImplemented"
        else
          @s3.interface.put(bucket, key, File.open(srcpath))
        end
      else
        raise Errno::ENOENT, "No such file or directory - #{srcpath}"
      end
    end
    # alias :put :copy_from_local

    # FIXME: Not implemented for directories
    def copy_to_local srcpath, dstpath
      src_bucket,src_key_path = split_path(srcpath)
      dstfile = File.new(dstpath, 'w')
      @s3.interface.get(src_bucket, src_key_path) do |chunk|
        dstfile.write(chunk)
      end
      dstfile.close
    end
    # alias :get :copy_to_local

    def bucket path
      #URI.parse(path).path.split('/').reject{|x| x.empty?}.first
      split_path(path).first
    end

    def key_for path
      #File.join(URI.parse(path).path.split('/').reject{|x| x.empty?}[1..-1])
      split_path(path).last
    end

    def split_path path
      uri = URI.parse(path)
      base_uri = ""
      base_uri << uri.host if uri.scheme
      base_uri << uri.path
      path = base_uri.split('/').reject{|x| x.empty?}
      [path[0], path[1..-1].join("/")]
    end

    private

    # FIXME: This is dense
    def common_directory paths
      dirs     = paths.map{|path| path.split('/')}
      min_size = dirs.map{|splits| splits.size}.min
      dirs     = dirs.map{|splits| splits[0...min_size]}
      uncommon_idx = dirs.transpose.each_with_index.find{|dirnames, idx| dirnames.uniq.length > 1}.last
      dirs[0][0...uncommon_idx].join('/')
    end

    def filesize filepath
      bucket,key = split_path(filepath)
      header = @s3.interface.head(bucket, key)
      header['content-length'].to_i
    end

    class S3File
      attr_accessor :path, :handle, :fs

      #
      # In order to open input and output streams we must pass around the s3 fs object itself
      #
      def initialize path, mode, fs, &blk
        @fs   = fs
        @path = path
        case mode
        when "r" then
          # raise "#{fs.type(path)} is not a readable file - #{path}" unless fs.type(path) == "file"
        when "w" then
          # raise "Path #{path} is a directory." unless (fs.type(path) == "file") || (fs.type(path) == "unknown")
          @handle = Tempfile.new('s3filestream')
          if block_given?
            yield self
            close
          end
        end
      end

      #
      # Faster than iterating
      #
      def read
        bucket,key = fs.split_path(path)
        fs.s3.interface.get_object(bucket, key)
      end

      #
      # This is a little hackety. That is, once you call (.each) on the object the full object starts
      # downloading...
      #
      def readline
        bucket,key = fs.split_path(path)
        @handle ||= fs.s3.interface.get_object(bucket, key).each
        begin
          @handle.next
        rescue StopIteration, NoMethodError
          @handle = nil
          raise EOFError.new("end of file reached")
        end
      end

      def write string
        @handle.write(string)
      end

      def close
        bucket,key = fs.split_path(path)
        if @handle
          @handle.read
          fs.s3.interface.put(bucket, key, File.open(@handle.path, 'r'))
          @handle.close
        end
        @handle = nil
      end

    end

  end
end
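Most of the methods above funnel through split_path, so its behaviour is worth a quick sketch (illustrative only, not part of the gem; the bucket, keys, and placeholder credentials are hypothetical):

    s3 = Swineherd::S3FileSystem.new(:aws_access_key => "my_access_key",
                                     :aws_secret_key => "my_secret_key")

    s3.split_path("s3://example-bucket/foo/bar.txt")  # => ["example-bucket", "foo/bar.txt"]
    s3.split_path("example-bucket/foo/bar.txt")       # => ["example-bucket", "foo/bar.txt"] (scheme optional)
    s3.split_path("s3://example-bucket")              # => ["example-bucket", ""] (bare bucket, empty key)
    s3.bucket("s3://example-bucket/foo/bar.txt")      # => "example-bucket"
    s3.key_for("s3://example-bucket/foo/bar.txt")     # => "foo/bar.txt"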
data/lib/swineherd-fs.rb
ADDED
@@ -0,0 +1,91 @@
require 'configliere' ; Configliere.use(:commandline, :env_var, :define, :config_file)
require 'logger'

require 'fileutils'
require 'tempfile'
require 'right_aws'

require 'swineherd-fs/localfilesystem'
require 'swineherd-fs/s3filesystem'
require 'swineherd-fs/hadoopfilesystem'

# Merge in system and user settings
SYSTEM_CONFIG_PATH = "/etc/swineherd.yaml" unless defined?(SYSTEM_CONFIG_PATH)
USER_CONFIG_PATH   = File.join(ENV['HOME'], '.swineherd.yaml') unless defined?(USER_CONFIG_PATH)

module Swineherd

  def self.config
    return @config if @config
    config = Configliere::Param.new
    config.read SYSTEM_CONFIG_PATH if File.exists? SYSTEM_CONFIG_PATH
    config.read USER_CONFIG_PATH   if File.exists? USER_CONFIG_PATH
    @config ||= config
  end

  def self.logger
    return @log if @log
    @log ||= Logger.new(config[:log_file] || STDOUT)
    @log.formatter = proc { |severity, datetime, progname, msg|
      "[#{severity.upcase}] #{msg}\n"
    }
    @log
  end

  def self.logger= logger
    @log = logger
  end

  module FileSystem

    HDFS_SCHEME_REGEXP = /^hdfs:\/\//
    S3_SCHEME_REGEXP   = /^s3n?:\/\//

    FILESYSTEMS = {
      'file' => Swineherd::LocalFileSystem,
      'hdfs' => Swineherd::HadoopFileSystem,
      's3'   => Swineherd::S3FileSystem,
      's3n'  => Swineherd::S3FileSystem
    }

    # A factory function that returns an instance of the requested class
    def self.get scheme, *args
      begin
        FILESYSTEMS[scheme.to_s].new *args
      rescue NoMethodError => e
        raise "Filesystem with scheme #{scheme} does not exist.\n #{e.message}"
      end
    end

    def self.exists?(path)
      fs = self.get(scheme_for(path))
      Swineherd.logger.info "#exists? - #{fs.class} for '#{path}'"
      fs.exists?(path)
    end

    def self.cp(srcpath, destpath)
      src_fs  = scheme_for(srcpath)
      dest_fs = scheme_for(destpath)
      Swineherd.logger.info "#cp - #{src_fs} --> #{dest_fs}"
      if(src_fs.eql?(dest_fs))
        self.get(src_fs).cp(srcpath, destpath)
      elsif src_fs.eql?(:file)
        self.get(dest_fs).copy_from_local(srcpath, destpath)
      elsif dest_fs.eql?(:file)
        self.get(src_fs).copy_to_local(srcpath, destpath)
      else # cp between s3/s3n and hdfs can be handled by Hadoop::FileUtil in HadoopFileSystem
        self.get(:hdfs).cp(srcpath, destpath)
      end
    end

    private

    # defaults to local filesystem :file
    def self.scheme_for(path)
      scheme = URI.parse(path).scheme
      (scheme && scheme.to_sym) || :file
    end

  end

end
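Putting the factory and the scheme dispatch together, here is a minimal sketch of how Swineherd::FileSystem routes calls (illustrative only, not part of the gem; the paths and bucket are hypothetical):

    fs = Swineherd::FileSystem

    fs.exists?("/tmp/baz.txt")                                    # no scheme, defaults to the local filesystem
    fs.cp("/tmp/baz.txt", "hdfs://tmp/baz.txt")                   # local -> hdfs goes through copy_from_local
    fs.cp("hdfs://tmp/baz.txt", "/tmp/baz_copy.txt")              # hdfs -> local goes through copy_to_local
    fs.cp("hdfs://tmp/baz.txt", "s3n://example-bucket/baz.txt")   # two remote schemes delegate to HadoopFileSystem#cp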
data/rspec.watchr
ADDED
@@ -0,0 +1,19 @@
# -*- ruby -*-

def run_spec(file)
  unless File.exist?(file)
    puts "#{file} does not exist"
    return
  end
  puts "Running #{file}"
  system "rspec #{file}"
end

watch("spec/.*/*_spec\.rb") do |match|
  run_spec match[0]
end

watch("lib/swineherd-fs/(.*)\.rb") do |match|
  file = %{spec/#{match[1]}_spec.rb}
  run_spec file if File.exists?(file)
end
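This watchr script re-runs the matching spec whenever a spec file or a lib/swineherd-fs source file changes. Presumably it is driven with the watchr gem's command-line runner from the project root, e.g. `watchr rspec.watchr` (this invocation is an assumption; the gem does not document it).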
data/spec/filesystem_spec.rb
ADDED
@@ -0,0 +1,186 @@
require 'spec_helper'
FS_SPEC_ROOT   = File.dirname(__FILE__)
S3_TEST_BUCKET = 'swineherd-fs-test-bucket' # You'll have to set this to something else

shared_examples_for "an abstract filesystem" do

  let(:test_filename){ File.join(test_dirname, "filename.txt") }
  let(:test_string){ "foobarbaz" }

  let(:files){ ['d.txt','b/c.txt'].map{|f| File.join(test_dirname, f)} }
  let(:dirs){ %w(b).map{|d| File.join(test_dirname, d)} }

  it "implements #exists?" do
    fs.mkdir_p(test_dirname)
    expect{ fs.open(test_filename,'w'){|f| f.write(test_string)} }.to change{ fs.exists?(test_filename) }.from(false).to(true)
  end

  it "implements #directory?" do
    fs.mkdir_p(test_dirname)
    fs.open(test_filename, 'w'){|f| f.write(test_string)}
    fs.directory?(test_filename).should eql false
    fs.directory?(test_dirname).should eql true
  end

  it "implements #rm on files" do
    fs.mkdir_p(test_dirname)
    fs.open(test_filename, 'w'){|f| f.write(test_string)}
    expect{ fs.rm(test_filename) }.to change{ fs.exists?(test_filename) }.from(true).to(false)
  end

  it "raises error on #rm of non-empty directory" do
    fs.mkdir_p(test_dirname)
    fs.open(test_filename, 'w'){|f| f.write(test_string)}
    expect{ fs.rm(test_dirname) }.to raise_error
  end

  it "implements #rm_r" do
    fs.mkdir_p(test_dirname)
    fs.open(test_filename,'w'){|f| f.write(test_string)}
    expect{ fs.rm_r(test_dirname) }.to change{ fs.exists?(test_dirname) && fs.exists?(test_filename) }.from(true).to(false)
  end

  it "implements #ls" do
    dirs.each{ |dir| fs.mkdir_p(dir) }
    files.each{|filename| fs.open(filename,"w"){|f| f.write(test_string) }}
    fs.ls(test_dirname).length.should eql 2
  end

  it "implements #ls_r" do
    dirs.each{ |dir| fs.mkdir_p(dir) }
    files.each{|filename| fs.open(filename,"w"){|f| f.write(test_string) }}
    fs.ls_r(test_dirname).length.should eql 3
  end

  it "implements #size" do
    fs.mkdir_p(test_dirname)
    fs.open(test_filename,'w'){|f| f.write(test_string)}
    test_string.length.should eql(fs.size(test_filename))
  end

  it "implements #mkdir_p" do
    expect{ fs.mkdir_p(test_dirname) }.to change{ fs.directory?(test_dirname) }.from(false).to(true)
  end

  it "implements #mv" do
    fs.mkdir_p(test_dirname)
    fs.open(test_filename, 'w'){|f| f.write(test_string)}
    filename2 = File.join(test_dirname, "new_file.txt")
    expect{ fs.mv(test_filename, filename2) }.to change{ fs.exists?(filename2) }.from(false).to(true)
    fs.exists?(test_filename).should eql false
    fs.open(filename2,"r").read.should eql test_string
  end

  it "implements #cp" do
    fs.mkdir_p(test_dirname)
    fs.open(test_filename, 'w'){|f| f.write(test_string)}
    filename2 = File.join(test_dirname, "new_file.txt")
    expect{ fs.cp(test_filename, filename2) }.to change{ fs.exists?(filename2) }.from(false).to(true)
    fs.open(test_filename,"r").read.should eql fs.open(filename2,"r").read
  end

  it "implements #cp_r"

  it "implements #open" do
    fs.mkdir_p(test_dirname)
    expect{
      file = fs.open(test_filename, 'w')
      file.write(test_string)
      file.close
    }.to change{ fs.exists?(test_filename) }.from(false).to(true)
  end

  it "implements #open with &blk" do
    fs.mkdir_p(test_dirname)
    expect{ fs.open(test_filename, 'w'){|f| f.write(test_string)} }.to change{ fs.exists?(test_filename) }.from(false).to(true)
  end

  describe "with a new file" do

    it "implements path" do
      fs.mkdir_p(test_dirname)
      file = fs.open(test_filename,'w')
      file.path.should eql test_filename
    end

    it "implements write" do
      fs.mkdir_p(test_dirname)
      fs.open(test_filename,'w'){|f| f.write(test_string)}
    end

    it "should not allow write after close" do
      fs.mkdir_p(test_dirname)
      file = fs.open(test_filename,'w')
      file.write(test_string)
      file.close
      lambda{ file.write(test_string) }.should raise_error
    end

    it "implements read" do
      fs.mkdir_p(test_dirname)
      fs.open(test_filename,'w'){|f| f.write(test_string)}
      fs.open(test_filename,'r').read.should eql test_string
    end

  end

  after do
    fs.rm_r(test_dirname) if fs.exists?(test_dirname)
  end

end

describe Swineherd::FileSystem do
  let(:fs){ Swineherd::FileSystem }
  let(:test_dirname){ FS_SPEC_ROOT + "/tmp/test_dir" }
  let(:test_filename){ File.join(test_dirname, "filename.txt") }
  let(:test_string){ "foobarbaz" }

  it "implements #cp" do
    localfs = Swineherd::LocalFileSystem.new
    s3_fs   = Swineherd::S3FileSystem.new
    localfs.mkdir_p(test_dirname)
    localfs.open(test_filename, 'w'){|f| f.write(test_string)}
    s3_filename = "s3://" + S3_TEST_BUCKET + "/new_file.txt"
    expect{ fs.cp(test_filename, s3_filename) }.to change{ fs.exists?(s3_filename) }.from(false).to(true)
    localfs.rm_r(test_dirname) if localfs.exists?(test_dirname)
    s3_fs.rm(s3_filename)
  end
end

describe Swineherd::LocalFileSystem do

  it_behaves_like "an abstract filesystem" do
    let(:fs){ Swineherd::LocalFileSystem.new }
    let(:test_dirname){ FS_SPEC_ROOT + "/tmp/test_dir" }
  end

end

describe Swineherd::S3FileSystem do

  # mkdir_p won't pass because there is no concept of a directory on s3

  it_behaves_like "an abstract filesystem" do
    let(:fs){ Swineherd::S3FileSystem.new }
    let(:test_dirname){ S3_TEST_BUCKET + "/tmp/test_dir" }
  end

  describe "an S3FileSystem" do
    let(:fs){ Swineherd::S3FileSystem.new }

    it "should return false for #file? on a bucket" do
      fs.file?(S3_TEST_BUCKET).should eql false
    end

  end
end

describe Swineherd::HadoopFileSystem do

  it_behaves_like "an abstract filesystem" do
    let(:fs){ Swineherd::HadoopFileSystem.new }
    let(:test_dirname){ "/tmp/test_dir" }
  end

end
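A note on running this spec, inferred from the files above rather than stated anywhere in the gem: the shared examples hit real backends, so the S3 examples need valid AWS credentials and a bucket you control (change S3_TEST_BUCKET accordingly), and the HadoopFileSystem examples need JRuby with $HADOOP_HOME set. With those in place, `rspec spec/filesystem_spec.rb` should exercise all three filesystems.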
data/swineherd-fs.gemspec
ADDED
@@ -0,0 +1,23 @@
# -*- encoding: utf-8 -*-

Gem::Specification.new do |s|
  s.name        = %q{swineherd-fs}
  s.version     = "0.0.2"
  s.authors     = ["David Snyder", "Jacob Perkins"]
  s.date        = %q{2012-01-20}
  s.description = %q{A filesystem abstraction for Amazon S3 and Hadoop HDFS}
  s.summary     = %q{A filesystem abstraction for Amazon S3 and Hadoop HDFS}
  s.email       = %q{"david@infochimps.com"}
  s.homepage    = %q{http://github.com/infochimps-labs/swineherd-fs}

  s.files = ["LICENSE", "VERSION", "Gemfile", "swineherd-fs.gemspec", "rspec.watchr", "README.textile", "lib/swineherd-fs.rb", "lib/swineherd-fs/localfilesystem.rb", "lib/swineherd-fs/s3filesystem.rb", "lib/swineherd-fs/hadoopfilesystem.rb", "spec/spec_helper.rb", "spec/filesystem_spec.rb"]
  s.test_files = ["spec/spec_helper.rb", "spec/filesystem_spec.rb"]
  s.require_paths = ["lib"]

  s.add_development_dependency("rspec")
  s.add_development_dependency("watchr")
  s.add_runtime_dependency(%q<configliere>, [">= 0"])
  s.add_runtime_dependency(%q<right_aws>, [">= 0"])
  s.add_runtime_dependency(%q<jruby-openssl>, [">= 0"])
end
metadata
ADDED
@@ -0,0 +1,121 @@
--- !ruby/object:Gem::Specification
name: swineherd-fs
version: !ruby/object:Gem::Version
  prerelease:
  version: 0.0.2
platform: ruby
authors:
- David Snyder
- Jacob Perkins
autorequire:
bindir: bin
cert_chain: []

date: 2012-01-20 00:00:00 Z
dependencies:
- !ruby/object:Gem::Dependency
  name: rspec
  prerelease: false
  requirement: &id001 !ruby/object:Gem::Requirement
    none: false
    requirements:
    - - ">="
      - !ruby/object:Gem::Version
        version: "0"
  type: :development
  version_requirements: *id001
- !ruby/object:Gem::Dependency
  name: watchr
  prerelease: false
  requirement: &id002 !ruby/object:Gem::Requirement
    none: false
    requirements:
    - - ">="
      - !ruby/object:Gem::Version
        version: "0"
  type: :development
  version_requirements: *id002
- !ruby/object:Gem::Dependency
  name: configliere
  prerelease: false
  requirement: &id003 !ruby/object:Gem::Requirement
    none: false
    requirements:
    - - ">="
      - !ruby/object:Gem::Version
        version: "0"
  type: :runtime
  version_requirements: *id003
- !ruby/object:Gem::Dependency
  name: right_aws
  prerelease: false
  requirement: &id004 !ruby/object:Gem::Requirement
    none: false
    requirements:
    - - ">="
      - !ruby/object:Gem::Version
        version: "0"
  type: :runtime
  version_requirements: *id004
- !ruby/object:Gem::Dependency
  name: jruby-openssl
  prerelease: false
  requirement: &id005 !ruby/object:Gem::Requirement
    none: false
    requirements:
    - - ">="
      - !ruby/object:Gem::Version
        version: "0"
  type: :runtime
  version_requirements: *id005
description: A filesystem abstraction for Amazon S3 and Hadoop HDFS
email: "\"david@infochimps.com\""
executables: []

extensions: []

extra_rdoc_files: []

files:
- LICENSE
- VERSION
- Gemfile
- swineherd-fs.gemspec
- rspec.watchr
- README.textile
- lib/swineherd-fs.rb
- lib/swineherd-fs/localfilesystem.rb
- lib/swineherd-fs/s3filesystem.rb
- lib/swineherd-fs/hadoopfilesystem.rb
- spec/spec_helper.rb
- spec/filesystem_spec.rb
homepage: http://github.com/infochimps-labs/swineherd-fs
licenses: []

post_install_message:
rdoc_options: []

require_paths:
- lib
required_ruby_version: !ruby/object:Gem::Requirement
  none: false
  requirements:
  - - ">="
    - !ruby/object:Gem::Version
      version: "0"
required_rubygems_version: !ruby/object:Gem::Requirement
  none: false
  requirements:
  - - ">="
    - !ruby/object:Gem::Version
      version: "0"
requirements: []

rubyforge_project:
rubygems_version: 1.8.15
signing_key:
specification_version: 3
summary: A filesystem abstraction for Amazon S3 and Hadoop HDFS
test_files:
- spec/spec_helper.rb
- spec/filesystem_spec.rb