neptune 0.0.4 → 0.0.5

Files changed (3)
  1. data/README +13 -1
  2. data/lib/neptune.rb +89 -2
  3. metadata +5 -5
data/README CHANGED
@@ -3,7 +3,7 @@ Neptune: A Domain Specific Language for Deploying HPC
 Software on Cloud Platforms
 
 Neptune provides programmers with a simple interface
-by which they can deploy MPI, X10, and MapReduce jobs
+by which they can deploy MPI, X10, MapReduce, UPC, and Erlang jobs
 to without needing to know the particulars of the underlying
 cloud platform. You only need to give Neptune your code,
 tell it how many machines to run on and where to put the output:
@@ -28,6 +28,10 @@ in it! Neptune will run 'make' on it (you can specify which target
 to make as well) and return to you a folder containing the standard
 out and standard error of the make command.
 
+By default, Neptune jobs store their outputs in the underlying database
+that AppScale is running over. As of Neptune 0.0.5, job outputs can
+also be stored in Amazon S3, Eucalyptus Walrus, and Google Storage.
+
 Sample Neptune job scripts can be found in samples. Test scripts will
 be added to the 'test' folder soon.
 
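The hash-based job syntax the README describes can be made concrete with a minimal sketch. The `neptune` method below is a hypothetical stand-in for the gem's real entry point, shown only to illustrate the calling convention; the parameter keys (`:nodes_to_use`, `:output`, `:storage`) mirror those handled in `lib/neptune.rb`, but the body is a simplification, not the gem's implementation:

```ruby
# Hypothetical stand-in for neptune(): it copies each :key into
# job_data["@key"], as the real method does internally, and applies
# the "output must be given" check from lib/neptune.rb.
def neptune(params)
  job_data = {}
  params.each { |key, val| job_data["@#{key}"] = val }
  raise "Job output must be specified" if job_data["@output"].to_s.empty?
  job_data
end

# A job script then reads like the README's description: code, node
# count, output location, and (as of 0.0.5) an optional storage backend.
job = neptune(:type         => "mpi",
              :code         => "~/ring/compiled/Ring",
              :nodes_to_use => 4,
              :output       => "/mpi/ring-output",
              :storage      => "s3")
puts job["@storage"]   # => "s3"
```

In a real deployment the call would submit the job to AppScale; here the return value simply echoes the normalized parameters.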
@@ -53,6 +57,14 @@ in for a link to that as it becomes available.
 
 Version History:
 
+March 18, 2011 - 0.0.5 released, adding support for storage outside
+of AppScale to be used. Tested and working with Amazon S3 and Google
+Storage
+
+February 10, 2011 - 0.0.4 released, adding UPC and Erlang support,
+and restructuring syntax to pass in hashes to method calls instead
+of passing in blocks
+
 February 4, 2011 - 0.0.3 released, allowing users to use
 Neptune properly as a gem within Ruby code
 
data/lib/neptune.rb CHANGED
@@ -19,9 +19,17 @@ $VERBOSE = nil
 #MR_RUN_JOB_REQUIRED = %w{ }
 #MR_REQUIRED = %w{ output }
 
+# A list of Neptune jobs that do not require nodes to be spawned
+# up for computation
+NO_NODES_NEEDED = ["acl", "output", "compile"]
+
+# A list of storage mechanisms that we can use to store and retrieve
+# data to for Neptune jobs.
+ALLOWED_STORAGE_TYPES = ["appdb", "gstorage", "s3"]
+
 # A list of jobs that require some kind of work to be done before
 # the actual computation can be performed.
-NEED_PREPROCESSING = ["compile", "mapreduce", "mpi"]
+NEED_PREPROCESSING = ["compile", "erlang", "mapreduce", "mpi"]
 
 # A set of methods and constants that we've monkey-patched to enable Neptune
 # support. In the future, it is likely that the only exposed / monkey-patched
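The new `ALLOWED_STORAGE_TYPES` constant drives the storage validation added later in `neptune()`. In isolation, that check, the `gstorage`-as-S3 aliasing, and the `appdb` default can be sketched as follows; `resolve_storage` is a hypothetical helper name, not part of the gem:

```ruby
ALLOWED_STORAGE_TYPES = ["appdb", "gstorage", "s3"]

# Hypothetical helper mirroring the storage logic in neptune():
# unknown backends are rejected, a missing backend defaults to "appdb",
# and Google Storage is accessed via the same library as S3, so it is
# simply relabeled "s3".
def resolve_storage(requested)
  return "appdb" if requested.nil?
  unless ALLOWED_STORAGE_TYPES.include?(requested)
    raise ArgumentError, "Supported storage types are " +
      "#{ALLOWED_STORAGE_TYPES.join(', ')} - we do not support #{requested}."
  end
  requested == "gstorage" ? "s3" : requested
end
```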
@@ -66,6 +74,19 @@ def preprocess_compile(job_data)
   job_data["@code"] = dest
 end
 
+def preprocess_erlang(job_data)
+  source_code = File.expand_path(job_data["@code"])
+  unless File.exists?(source_code)
+    file_not_found = "The specified code, #{job_data['@code']}," +
+      " didn't exist. Please specify one that exists and try again"
+    abort(file_not_found)
+  end
+  dest_code = "/tmp/"
+
+  keyname = job_data["@keyname"]
+  CommonFunctions.scp_to_shadow(source_code, dest_code, keyname)
+end
+
 # This preprocessing method handles copying data for regular
 # Hadoop MapReduce and Hadoop MapReduce Streaming. For the former
 # case, we copy over just the JAR the user has given us, and
@@ -97,12 +118,31 @@ end
 # code to the master node in AppScale - this node will
 # then copy it to whoever will run the MPI job.
 def preprocess_mpi(job_data)
+  if job_data["@procs_to_use"]
+    p = job_data["@procs_to_use"]
+    n = job_data["@nodes_to_use"]
+    if p < n
+      not_enough_procs = "When specifying both :procs_to_use and :nodes_to_use" +
+        ", :procs_to_use must be at least as large as :nodes_to_use. Please " +
+        "change this and try again. You specified :procs_to_use = #{p} and" +
+        ":nodes_to_use = #{n}."
+      abort(not_enough_procs)
+    end
+  end
+
   source_code = File.expand_path(job_data["@code"])
   unless File.exists?(source_code)
-    file_not_found = "The specified code, #{job_data['@code']}," +
+    file_not_found = "The specified code, #{source_code}," +
       " didn't exist. Please specify one that exists and try again"
     abort(file_not_found)
   end
+
+  unless File.file?(source_code)
+    should_be_file = "The specified code, #{source_code}, was not a file - " +
+      " it was a directory or symbolic link. Please specify a file and try again."
+    abort(should_be_file)
+  end
+
   dest_code = "/tmp/thempicode"
 
   keyname = job_data["@keyname"]
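The new guard at the top of `preprocess_mpi` enforces that an MPI job never requests fewer processes than machines, since each node must run at least one MPI process. The rule in isolation, extracted into a hypothetical `check_mpi_counts` helper:

```ruby
# Hypothetical extraction of the new preprocess_mpi guard: when both
# :procs_to_use and :nodes_to_use are given, procs must be >= nodes.
# A nil procs count means the option was not given, so nothing to check.
def check_mpi_counts(procs, nodes)
  return true if procs.nil?
  if procs < nodes
    raise ArgumentError, ":procs_to_use (#{procs}) must be at least as " +
      "large as :nodes_to_use (#{nodes})"
  end
  true
end
```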
@@ -145,6 +185,10 @@ def neptune(params)
   job_data["@job"] = nil
   job_data["@keyname"] = keyname || "appscale"
 
+  if job_data["@nodes_to_use"].class == Hash
+    job_data["@nodes_to_use"] = job_data["@nodes_to_use"].to_a.flatten
+  end
+
   if (job_data["@output"].nil? or job_data["@output"] == "")
     abort("Job output must be specified")
   end
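The new `Hash` branch above lets `:nodes_to_use` be given as a hash (presumably mapping a cloud to a node count) and flattens it into an alternating array before it is sent on. The transformation itself is plain Ruby:

```ruby
# Hash#to_a turns the hash into [[key, value], ...] pairs; Array#flatten
# then yields the alternating [cloud, count, cloud, count] array.
# The cloud names here are made-up examples.
nodes = { "cloud1" => 2, "cloud2" => 2 }
flattened = nodes.to_a.flatten
p flattened   # => ["cloud1", 2, "cloud2", 2]
```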
@@ -153,6 +197,49 @@ def neptune(params)
     abort("Job output must begin with a slash ('/')")
   end
 
+  if job_data["@storage"]
+    storage = job_data["@storage"]
+    unless ALLOWED_STORAGE_TYPES.include?(storage)
+      msg = "Supported storage types are #{ALLOWED_STORAGE_TYPES.join(', ')}" +
+        " - we do not support #{storage}."
+      abort(msg)
+    end
+
+    # Our implementation for storing / retrieving via Google Storage uses
+    # the same library as we do for S3 - so just tell it that it's S3
+    if storage == "gstorage"
+      storage = "s3"
+      job_data["@storage"] = "s3"
+    end
+
+    if storage == "s3"
+      ["EC2_ACCESS_KEY", "EC2_SECRET_KEY", "S3_URL"].each { |item|
+        unless job_data["@#{item}"]
+          if ENV[item]
+            puts "Using #{item} from environment"
+            job_data["@#{item}"] = ENV[item]
+          else
+            msg = "When storing data to S3, #{item} must be specified or be in " +
+              "your environment. Please do so and try again."
+            abort(msg)
+          end
+        end
+      }
+
+      # the rightscale gems won't take the s3 url if it has http or https on
+      # the front, so rip it off first - it also doesn't like a trailing slash
+      s3_url = job_data["@S3_URL"]
+      puts "s3 url is now #{s3_url}"
+      if s3_url =~ /\Ahttp[s]?:\/\/(.*)\/\Z/
+        s3_url = $1
+      end
+      puts "s3 url is now #{s3_url}"
+      job_data["@S3_URL"] = s3_url
+    end
+  else
+    job_data["@storage"] = "appdb"
+  end
+
   #if job_data["@can_run_on"].class == Range
   #  job_data["@can_run_on"] = job_data["@can_run_on"].to_a
   #elsif job_data["@can_run_on"].class == Fixnum
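The S3 URL normalization above can be exercised on its own. Note that the pattern only fires when the URL has both a scheme and a trailing slash, so a URL missing either is left untouched; `strip_s3_url` is a hypothetical name wrapping the diff's regex:

```ruby
# Strip "http://" / "https://" and the trailing slash, since (per the
# diff's comment) the rightscale S3 gems want a bare host. The anchored
# pattern requires the trailing slash, so a URL without one passes
# through unchanged.
def strip_s3_url(s3_url)
  s3_url = $1 if s3_url =~ /\Ahttp[s]?:\/\/(.*)\/\Z/
  s3_url
end

puts strip_s3_url("https://s3.amazonaws.com/")   # => s3.amazonaws.com
```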
metadata CHANGED
@@ -1,13 +1,13 @@
 --- !ruby/object:Gem::Specification
 name: neptune
 version: !ruby/object:Gem::Version
-  hash: 23
+  hash: 21
   prerelease:
   segments:
   - 0
   - 0
-  - 4
-  version: 0.0.4
+  - 5
+  version: 0.0.5
 platform: ruby
 authors:
 - Chris Bunch
@@ -15,7 +15,7 @@ autorequire: neptune
 bindir: bin
 cert_chain: []
 
-date: 2011-02-09 00:00:00 -08:00
+date: 2011-03-18 00:00:00 -07:00
 default_executable: neptune
 dependencies: []
 
@@ -108,7 +108,7 @@ files:
 - README
 - LICENSE
 has_rdoc: true
-homepage: http://appscale.cs.ucsb.edu
+homepage: http://neptune-lang.org
 licenses: []
 
 post_install_message: