neptune 0.2.0 → 0.2.1

@@ -135,13 +135,8 @@
 
  <div id="description">
 
- <p>A set of methods and constants that we’ve monkey-patched to enable
- Neptune support. In the future, it is likely that the only exposed /
- monkey-patched method should be job, while the others could probably be
- folded into either a Neptune-specific class or into <a
- href="CommonFunctions.html">CommonFunctions</a>. TODO(cbunch): This
- doesn’t look like it does anything - run the integration test and confirm
- one way or the other.</p>
+ <p>Since we’re monkeypatching <a href="Object.html">Object</a> to add
+ neptune() and babel(), a short blurb is necessary here to make rdoc happy.</p>
 
  </div>
 
@@ -255,13 +250,20 @@ handles this.</p>
  <div class="method-source-code"
  id="babel-source">
  <pre>
- <span class="ruby-comment"># File lib/babel.rb, line 42</span>
+ <span class="ruby-comment"># File lib/babel.rb, line 43</span>
  def babel(params)
  <span class="ruby-comment"># Since this whole function should run asynchronously, we run it as a future.</span>
  <span class="ruby-comment"># It automatically starts running in a new thread, and attempting to get the</span>
  <span class="ruby-comment"># value of what this returns causes it to block until the job completes.</span>
  future {
+ if params[:storage]
+ params[:is_remote] = true
+ else
+ params[:is_remote] = false
+ end
+
  job_data = <span class="ruby-constant">BabelHelper</span>.convert_from_neptune_params(params)
+
  <span class="ruby-constant">NeptuneHelper</span>.validate_storage_params(job_data) <span class="ruby-comment"># adds in S3 storage params</span>
 
  <span class="ruby-comment"># :code is the only required parameter - everything else can use default vals</span>
@@ -280,12 +282,7 @@ def babel(params)
  end
 
  <span class="ruby-constant">BabelHelper</span>.run_job(job_data)
- <span class="ruby-comment"># So actually retrieving the job's output is done via a promise, so only if</span>
- <span class="ruby-comment"># the user actually uses the value do we actually go and poll for output.</span>
- <span class="ruby-comment"># The running of the job is done above, outside of the promise, so</span>
- <span class="ruby-comment"># the job is always run, regardless of whether or not we get its output.</span>
  <span class="ruby-constant">BabelHelper</span>.wait_and_get_output(job_data)
- <span class="ruby-comment"># promise { BabelHelper.wait_and_get_output(job_data) }</span>
  }
  end</pre>
  </div>
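The comments in babel() describe a future: work that starts in a new thread right away and blocks only when its value is requested. A minimal sketch of that behaviour follows, assuming a hypothetical `SimpleFuture` class built on Ruby's `Thread`; the `future` helper the gem actually calls is not shown in this diff and may behave differently.

```ruby
# Hypothetical stand-in for the `future` helper used by babel(); the real
# helper comes from code that this diff does not show.
class SimpleFuture
  def initialize(&block)
    # The work starts immediately in a background thread.
    @thread = Thread.new(&block)
  end

  # Blocks until the background work finishes, then returns its result.
  def value
    @thread.value
  end
end

job = SimpleFuture.new do
  sleep 0.1 # pretend to run a job and wait for its output
  "job output"
end

puts "job started, doing other work..."
puts job.value
```

With the promise removed in this version, the whole body (running the job and waiting for its output) sits inside the one future, so the caller only blocks when it reads the result.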
@@ -323,14 +320,14 @@ vice-versa).</p>
  <div class="method-source-code"
  id="neptune-source">
  <pre>
- <span class="ruby-comment"># File lib/neptune.rb, line 485</span>
+ <span class="ruby-comment"># File lib/neptune.rb, line 58</span>
  def neptune(params)
- <span class="ruby-constant">Kernel</span>.puts <span class="ruby-string">&quot;Received a request to run a job.&quot;</span>
- <span class="ruby-constant">Kernel</span>.puts params[:type]
+ <span class="ruby-comment"># Kernel.puts &quot;Received a request to run a job.&quot;</span>
+ <span class="ruby-comment"># Kernel.puts params[:type]</span>
 
  job_data = <span class="ruby-constant">NeptuneHelper</span>.get_job_data(params)
  <span class="ruby-constant">NeptuneHelper</span>.validate_storage_params(job_data)
- <span class="ruby-constant">Kernel</span>.puts &quot;job data = #{job_data.inspect}&quot;
+ <span class="ruby-comment"># Kernel.puts &quot;job data = #{job_data.inspect}&quot;</span>
  keyname = job_data[<span class="ruby-string">&quot;@keyname&quot;</span>]
 
  shadow_ip = <span class="ruby-constant">CommonFunctions</span>.get_from_yaml(keyname, :shadow)
@@ -24,7 +24,7 @@
  <div id="metadata">
  <dl>
  <dt class="modified-date">Last Modified</dt>
- <dd class="modified-date">Fri Feb 10 19:53:13 -0800 2012</dd>
+ <dd class="modified-date">Sun Feb 12 16:12:33 -0800 2012</dd>
 
 
  <dt class="requires">Requires</dt>
@@ -1,7 +1,7 @@
- Sat, 11 Feb 2012 13:24:19 -0800
- bin/neptune Fri, 10 Feb 2012 19:53:13 -0800
- lib/babel.rb Sat, 11 Feb 2012 13:20:56 -0800
- lib/neptune.rb Sat, 11 Feb 2012 11:16:25 -0800
- lib/custom_exceptions.rb Sat, 31 Dec 2011 13:13:50 -0800
- lib/app_controller_client.rb Wed, 25 Jan 2012 12:07:13 -0800
- lib/common_functions.rb Sat, 11 Feb 2012 13:20:31 -0800
+ Sun, 12 Feb 2012 16:40:58 -0800
+ bin/neptune Sun, 12 Feb 2012 16:12:33 -0800
+ lib/babel.rb Sun, 12 Feb 2012 16:35:05 -0800
+ lib/neptune.rb Sun, 12 Feb 2012 16:34:30 -0800
+ lib/custom_exceptions.rb Sun, 12 Feb 2012 16:18:29 -0800
+ lib/app_controller_client.rb Sun, 12 Feb 2012 16:13:32 -0800
+ lib/common_functions.rb Sun, 12 Feb 2012 16:18:14 -0800
@@ -24,7 +24,7 @@
  <div id="metadata">
  <dl>
  <dt class="modified-date">Last Modified</dt>
- <dd class="modified-date">Wed Jan 25 12:07:13 -0800 2012</dd>
+ <dd class="modified-date">Sun Feb 12 16:13:32 -0800 2012</dd>
 
 
  <dt class="requires">Requires</dt>
@@ -24,7 +24,7 @@
  <div id="metadata">
  <dl>
  <dt class="modified-date">Last Modified</dt>
- <dd class="modified-date">Sat Feb 11 13:20:56 -0800 2012</dd>
+ <dd class="modified-date">Sun Feb 12 16:35:05 -0800 2012</dd>
 
 
  <dt class="requires">Requires</dt>
@@ -24,7 +24,7 @@
  <div id="metadata">
  <dl>
  <dt class="modified-date">Last Modified</dt>
- <dd class="modified-date">Sat Feb 11 13:20:31 -0800 2012</dd>
+ <dd class="modified-date">Sun Feb 12 16:18:14 -0800 2012</dd>
 
 
  <dt class="requires">Requires</dt>
@@ -24,7 +24,7 @@
  <div id="metadata">
  <dl>
  <dt class="modified-date">Last Modified</dt>
- <dd class="modified-date">Sat Dec 31 13:13:50 -0800 2011</dd>
+ <dd class="modified-date">Sun Feb 12 16:18:29 -0800 2012</dd>
 
 
  <dt class="requires">Requires</dt>
@@ -24,7 +24,7 @@
  <div id="metadata">
  <dl>
  <dt class="modified-date">Last Modified</dt>
- <dd class="modified-date">Sat Feb 11 11:16:25 -0800 2012</dd>
+ <dd class="modified-date">Sun Feb 12 16:34:30 -0800 2012</dd>
 
 
  <dt class="requires">Requires</dt>
@@ -12,12 +12,27 @@ require 'timeout'
  # long calls unless necessary.
  NO_TIMEOUT = -1
 
+
  # A client that uses SOAP messages to communicate with the underlying cloud
  # platform (here, AppScale). This client is similar to that used in the AppScale
  # Tools, but with non-Neptune SOAP calls removed.
  class AppControllerClient
- attr_accessor :conn, :ip, :secret
+
+
+ # The SOAP client that we use to communicate with the AppController.
+ attr_accessor :conn
+
+
+ # The IP address of the AppController that we will be connecting to.
+ attr_accessor :ip
 
+
+ # The secret string that is used to authenticate this client with
+ # AppControllers. It is initially generated by appscale-run-instances and can
+ # be found on the machine that ran that tool, or on any AppScale machine.
+ attr_accessor :secret
+
+
  # A constructor that requires both the IP address of the machine to communicate
  # with as well as the secret (string) needed to perform communication.
  # AppControllers will reject SOAP calls if this secret (basically a password)
@@ -38,6 +53,7 @@ class AppControllerClient
  @conn.add_method("neptune_does_file_exist", "file", "job_data", "secret")
  end
 
+
  # A helper method to make SOAP calls for us. This method is mainly here to
  # reduce code duplication: all SOAP calls expect a certain timeout and can
  # tolerate certain exceptions, so we consolidate this code into this method.
@@ -73,6 +89,7 @@ class AppControllerClient
  end
  end
 
+
  # Initiates the start of a Neptune job, whether it be a HPC job (MPI, X10,
  # or MapReduce), or a scaling job (e.g., for AppScale itself). This method
  # should not be used for retrieving the output of a job or getting / setting
@@ -90,6 +107,7 @@ class AppControllerClient
  return result
  end
 
+
  # Stores a file stored on the user's local file system in the underlying
  # database. The user can specify to use either the underlying database
  # that AppScale is using, or alternative storage mechanisms (as of writing,
@@ -104,6 +122,7 @@ class AppControllerClient
  return result
  end
 
+
  # Retrieves the output of a Neptune job, stored in an underlying
  # database. Within AppScale, a special application runs, referred to as the
  # Repository, which provides a key-value interface to Neptune job data.
@@ -123,6 +142,7 @@ class AppControllerClient
  return result
  end
 
+
  # Returns the ACL associated with the named piece of data stored
  # in the underlying cloud platform. Right now, data can only be
  # public or private, but future versions will add individual user
@@ -137,6 +157,7 @@ class AppControllerClient
  return result
  end
 
+
  # Sets the ACL of a specified pieces of data stored in the underlying
  # cloud platform. As is the case with get_acl, ACLs can be either
  # public or private right now, but this will be expanded upon in
@@ -151,6 +172,9 @@ class AppControllerClient
  return result
  end
 
+
+ # Instructs the AppController to fetch the code specified and compile it.
+ # The result should then be placed in a location specified in the job data.
  def compile_code(job_data)
  result = ""
  make_call(NO_TIMEOUT, false) {
@@ -160,6 +184,10 @@ class AppControllerClient
  return result
  end
 
+
+ # Asks the AppController for a list of all the Babel engines (each of which
+ # is a queue to store jobs and something that executes tasks) that are
+ # supported for the given credentials.
  def get_supported_babel_engines(job_data)
  result = []
  make_call(NO_TIMEOUT, false) {
@@ -168,6 +196,10 @@ class AppControllerClient
  return result
  end
 
+
+ # Asks the AppController to see if the given file exists in the remote
+ # datastore. If extra credentials are needed for this operation, they are
+ # searched for within the job data.
  def does_file_exist?(file, job_data)
  result = false
  make_call(NO_TIMEOUT, false) {
@@ -35,6 +35,7 @@ SLEEP_TIME = 5 # seconds
  # job requests.
  MAX_SLEEP_TIME = 60 # seconds
 
+
  # Babel provides a nice wrapper around Neptune jobs. Instead of making users
  # write multiple Neptune jobs to actually run code (e.g., putting input in the
  # datastore, run the job, get the output back), Babel automatically handles
@@ -44,7 +45,14 @@ def babel(params)
  # It automatically starts running in a new thread, and attempting to get the
  # value of what this returns causes it to block until the job completes.
  future {
+ if params[:storage]
+ params[:is_remote] = true
+ else
+ params[:is_remote] = false
+ end
+
  job_data = BabelHelper.convert_from_neptune_params(params)
+
  NeptuneHelper.validate_storage_params(job_data) # adds in S3 storage params
 
  # :code is the only required parameter - everything else can use default vals
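The branch added at the top of the future sets `:is_remote` to a strict boolean depending on whether the caller supplied `:storage`. A minimal sketch of the same rule in isolation follows; `mark_remote` is a hypothetical helper introduced only for illustration and is not part of the gem.

```ruby
# Hypothetical helper (not in the neptune gem): derives :is_remote from the
# presence of :storage, mirroring the if/else added to babel().
def mark_remote(params)
  params[:is_remote] = params[:storage] ? true : false
  params
end

with_storage = mark_remote({:code => "/bucket/code.rb", :storage => "s3"})
without_storage = mark_remote({:code => "/tmp/code.rb"})

puts with_storage[:is_remote].inspect
puts without_storage[:is_remote].inspect
```

Writing the flag as a literal `true` or `false` (rather than copying the `:storage` value) keeps later checks independent of what kind of storage was named.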
@@ -63,23 +71,20 @@ def babel(params)
  end
 
  BabelHelper.run_job(job_data)
- # So actually retrieving the job's output is done via a promise, so only if
- # the user actually uses the value do we actually go and poll for output.
- # The running of the job is done above, outside of the promise, so
- # the job is always run, regardless of whether or not we get its output.
  BabelHelper.wait_and_get_output(job_data)
- # promise { BabelHelper.wait_and_get_output(job_data) }
  }
  end
 
 
  # This module provides convenience functions for babel().
  module BabelHelper
+
+
  # If the user fails to give us an output location, this function will generate
  # one for them, based on either the location of their code (for remotely
  # specified code), or a babel parameter (for locally specified code).
  def self.generate_output_location(job_data)
- if job_data["@is_remote"]
+ if job_data["@storage"]
  # We already know the bucket name - the same one that the user
  # has told us their code is located in.
  prefix = job_data["@code"].scan(/\/(.*?)\//)[0].to_s
@@ -90,6 +95,7 @@ module BabelHelper
  return "/#{prefix}/babel/temp-#{CommonFunctions.get_random_alphanumeric()}"
  end
 
+
  # Provides a common way for callers to get the name of the bucket that
  # should be used for Neptune jobs where the code is stored locally.
  def self.get_bucket_for_local_data(job_data)
@@ -107,6 +113,7 @@ module BabelHelper
  return bucket_name
  end
 
+
  # For jobs where the code is stored remotely, this method ensures that
  # the code and any possible inputs actually do exist, before attempting to
  # use them for computation.
@@ -128,16 +135,18 @@ module BabelHelper
  }
  end
 
+
  # To avoid accidentally overwriting outputs from previous jobs, we first
  # check to make sure an output file doesn't exist before starting a new job
  # with the given name.
  def self.ensure_output_does_not_exist(job_data)
  file = job_data["@output"]
  controller = self.get_appcontroller(job_data)
- puts job_data.inspect
+ # Kernel.puts job_data.inspect
  NeptuneHelper.require_file_to_not_exist(file, job_data, controller)
  end
 
+
  # Returns an AppControllerClient for the given job data.
  def self.get_appcontroller(job_data)
  keyname = job_data["@keyname"] || "appscale"
@@ -146,6 +155,7 @@ module BabelHelper
  return AppControllerClient.new(shadow_ip, secret)
  end
 
+
  # Stores the user's code (and the directory it's in, and directories in the
  # same directory as the user's code, since there could be libraries used)
  # in the remote datastore.
@@ -157,6 +167,7 @@ module BabelHelper
  return job_data["@code"]
  end
 
+
  # If any input files are specified, they are copied to the remote datastore
  # via Neptune 'input' jobs. Inputs are assumed to be files on the local
  # filesystem if they begin with a slash, and job_data gets updated with
@@ -176,6 +187,7 @@ module BabelHelper
  return job_data
  end
 
+
  # If the user gives us local code or local inputs, this function will
  # run a Neptune 'input' job to store the data remotely.
  def self.put_file(local_path, job_data)
@@ -191,6 +203,7 @@ module BabelHelper
  return input_data[:remote]
  end
 
+
  # Neptune internally uses job_data with keys of the form @name, but since the
  # user has given them to us in the form :name, we convert it here.
  # TODO(cgb): It looks like this conversion to/from may be unnecessary since
@@ -204,6 +217,7 @@ module BabelHelper
  return job_data
  end
 
+
  # Neptune input jobs expect keys of the form :name, but since we've already
  # converted them to the form @name, this function reverses that conversion.
  def self.convert_to_neptune_params(job_data)
@@ -216,12 +230,17 @@ module BabelHelper
 
  return neptune_params
  end
-
+
+
  # Constructs a Neptune job to run the user's code as a Babel job (task queue)
  # from the given parameters.
  def self.run_job(job_data)
  run_data = self.convert_to_neptune_params(job_data)
- run_data[:type] = "babel"
+
+ # Default to babel as the job type, if the user doesn't specify one.
+ if run_data[:type].nil? or run_data[:type].empty?
+ run_data[:type] = "babel"
+ end
 
  # TODO(cgb): Once AppScale+Babel gets support for RabbitMQ, change this to
  # exec tasks over it, instead of locally.
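In 0.2.0 run_job always overwrote `:type` with "babel"; in 0.2.1 it only fills in "babel" when the caller left `:type` nil or empty. A minimal sketch of that defaulting rule follows, using a hypothetical `default_job_type` helper that stands in for the relevant lines of `BabelHelper.run_job`; the job type "mpi" is only an illustrative value.

```ruby
# Hypothetical stand-in for the defaulting step added to BabelHelper.run_job.
def default_job_type(run_data)
  # Only fill in a type when the caller did not provide a usable one.
  if run_data[:type].nil? || run_data[:type].empty?
    run_data[:type] = "babel"
  end
  run_data
end

puts default_job_type({:code => "/bucket/code.rb"})[:type]
puts default_job_type({:code => "/bucket/code.rb", :type => ""})[:type]
puts default_job_type({:code => "/bucket/code.rb", :type => "mpi"})[:type]
```

The nil check comes first so that `empty?` is never called on nil.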
@@ -233,6 +252,7 @@ module BabelHelper
  return Kernel.neptune(run_data)
  end
 
+
  # Constructs a Neptune job to get the output of a Babel job. If the job is not
  # yet finished, this function waits until it does, and then returns the output
  # of the job.
@@ -11,12 +11,15 @@ require 'yaml'
 
  require 'custom_exceptions'
 
+
  # A helper module that aggregates functions that are not part of Neptune's
  # core functionality. Specifically, this module contains methods to scp
  # files to other machines and the ability to read YAML files, which are
  # often needed to determine which machine should be used for computation
  # or to copy over code and input files.
  module CommonFunctions
+
+
  # Executes a command and returns the result. Is needed to get around
  # Flexmock's inability to mock out Kernel:` (the standard shell exec
  # method).
@@ -24,6 +27,7 @@ module CommonFunctions
  return `#{cmd}`
  end
 
+
  # Returns a random string composed of alphanumeric characters, as long
  # as the user requests.
  def self.get_random_alphanumeric(length=10)
@@ -32,28 +36,27 @@ module CommonFunctions
  possibleLength = possible.length
 
  length.times { |index|
- random << possible[rand(possibleLength)]
+ random << possible[Kernel.rand(possibleLength)]
  }
 
  return random
  end
 
+
  # Copies a file to the Shadow node (head node) within AppScale.
  # The caller specifies
  # the local file location, the destination where the file should be
  # placed, and the name of the key to use. The keyname is typically
  # specified by the Neptune job given, but defaults to ''appscale''
  # if not provided.
- def self.scp_to_shadow(local_file_loc,
- remote_file_loc,
- keyname,
- is_dir=false)
-
+ def self.scp_to_shadow(local_file_loc, remote_file_loc, keyname, is_dir=false)
  shadow_ip = CommonFunctions.get_from_yaml(keyname, :shadow)
  ssh_key = File.expand_path("~/.appscale/#{keyname}.key")
- CommonFunctions.scp_file(local_file_loc, remote_file_loc, shadow_ip, ssh_key, is_dir)
+ CommonFunctions.scp_file(local_file_loc, remote_file_loc, shadow_ip,
+ ssh_key, is_dir)
  end
-
+
+
  # Performs the actual remote copying of files: given the IP address
  # and other information from scp_to_shadow, attempts to use scp
  # to copy the file over. Aborts if the scp fails, which can occur
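This release replaces bare `rand` and `sleep` calls with `Kernel.rand` and `Kernel.sleep`. The diff does not say why; one plausible reason, given the module's existing comment about mocking shell execution, is that an explicit receiver gives tests a single place to stub. The sketch below illustrates that idea under stated assumptions: `random_alphanumeric` is a standalone approximation of `get_random_alphanumeric` (the character set is assumed, since the diff does not show it), and the stub is hand-rolled rather than taken from any mocking library.

```ruby
# Standalone approximation of CommonFunctions.get_random_alphanumeric.
# The character set below is an assumption; the diff does not show it.
def random_alphanumeric(length = 10)
  possible = ("0".."9").to_a + ("a".."z").to_a + ("A".."Z").to_a
  random = String.new
  length.times { random << possible[Kernel.rand(possible.length)] }
  random
end

# Hand-rolled stub: temporarily replace Kernel.rand so the result becomes
# predictable, then restore the original behaviour.
original_rand = Kernel.method(:rand)
Kernel.define_singleton_method(:rand) { |*_args| 0 }
begin
  stubbed = random_alphanumeric(5)
ensure
  Kernel.define_singleton_method(:rand) { |*args| original_rand.call(*args) }
end

puts stubbed
puts random_alphanumeric(8)
```

Because the method calls `Kernel.rand` with an explicit receiver, replacing that one singleton method is enough to control every character it picks.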
@@ -62,9 +65,8 @@ module CommonFunctions
  # actually a directory, we append the -r flag to scp as well.
  def self.scp_file(local_file_loc, remote_file_loc, target_ip, public_key_loc,
  is_dir=false)
- cmd = ""
- local_file_loc = File.expand_path(local_file_loc)
 
+ local_file_loc = File.expand_path(local_file_loc)
  ssh_args = "-o StrictHostkeyChecking=no 2>&1"
  ssh_args << " -r " if is_dir
 
@@ -85,7 +87,7 @@ module CommonFunctions
 
  loop {
  break if File.exists?(retval_loc)
- sleep(5)
+ Kernel.sleep(5)
  }
 
  retval = (File.open(retval_loc) { |f| f.read }).chomp
@@ -96,6 +98,7 @@ module CommonFunctions
  return cmd
  end
 
+
  # Given the AppScale keyname, reads the associated YAML file and returns
  # the contents of the given tag. The required flag (default value is true)
  # indicates whether a value must exist for this tag: if set to true, this
@@ -132,6 +135,7 @@ module CommonFunctions
  return value
  end
 
+
  # Returns the secret key needed for communication with AppScale's
  # Shadow node. This method is a nice frontend to the get_from_yaml
  # function, as the secret is stored in a YAML file.