pbs 1.1.4 → 2.0.0

checksums.yaml CHANGED
@@ -1,7 +1,7 @@
1
1
  ---
2
2
  SHA1:
3
- metadata.gz: 3af819b077ee91da66818d771629351f3a82f157
4
- data.tar.gz: c732e8ea53bd0640b941421644e0555a1dcd2180
3
+ metadata.gz: 6b983a1e056e48ac1fb0973f565910ecaa7b59e8
4
+ data.tar.gz: 72ea5ac4e9b96137f9d234f44ea9912bb4aef298
5
5
  SHA512:
6
- metadata.gz: 45f4ba1bbc056a036a22276669d9a5db484b9b5308578ca4b4bdd90af1080ed9e1309cbaacc7b41a20e8e8c0b830244bbaedaa067f8fc9f38da832f4446c1d3a
7
- data.tar.gz: b88452f9007a1c6972808764f074adddfea6da6ca36d04f615228bfcffc680937cc3c601d618c8f29fc8f4590f8976d07d052229294704462c504c2660d9ec5d
6
+ metadata.gz: afe01c15267f13df46e452dab396cf568af6751fe4a92648f174b2d3a2af30443b902115938a28c9e4e7e37b0e436e1d37f13f616c3eff0a61c68f1912956c64
7
+ data.tar.gz: ef15244b9433ffff530d12da0a90a01a3e7d017c127028005f3831322f7cfac1a7c84c07cdf0c9ffdfc64d8ba3c47f53ae62cec06bd1d37d7141cfdf287e0a61
@@ -0,0 +1,8 @@
1
+ ## Unreleased
2
+
3
+ ## 2.0.0 (2016-08-05)
4
+
5
+ Features:
6
+
7
+ - initial release of v2.0.0
8
+ - added a changelog
data/README.md CHANGED
@@ -2,33 +2,217 @@
2
2
 
3
3
  ## Description
4
4
 
5
- Trimmed down Ruby wrapper for the Torque C Library utilizing Ruby-FFI.
6
-
7
- ## Requirements
8
-
9
- At minimum you will need:
10
- * Ruby 2.0
11
- * Ruby-FFI gem
12
- * Torque >= 4.2.10
5
+ Ruby wrapper for the Torque C Library utilizing Ruby-FFI. This has been
6
+ successfully tested with Torque 4.2.10 and greater. Your mileage may vary.
13
7
 
14
8
  ## Installation
15
9
 
16
10
  Add this to your application's Gemfile:
17
11
 
18
12
  ```ruby
19
- gem 'pbs'
13
+ gem 'pbs'
20
14
  ```
21
15
 
22
16
  And then execute:
23
17
 
24
18
  ```bash
25
- $ bundle install
19
+ $ bundle install
26
20
  ```
27
21
 
28
22
  ## Usage
29
23
 
30
- Most useful features are outlined in the `examples/simplejob.rb` provided. To run this simple example, type:
24
+ All communication with a specific batch server is handled through the `Batch`
25
+ object. You can generate this object for a given batch server and Torque client
26
+ installation as:
31
27
 
32
- ```bash
33
- $ ruby -Ilib examples/simplejob.rb
28
+ ```ruby
29
+ # Create new batch object for OSC's Oakley batch server
30
+ oakley = PBS::Batch.new(host: 'oak-batch.osc.edu', prefix: '/usr/local/torque/default')
31
+
32
+ # Get status information for this batch server
33
+ # see http://linux.die.net/man/7/pbs_server_attributes
34
+ oakley.get_status
35
+ #=>
36
+ #{
37
+ # "oak-batch.osc.edu:15001" => {
38
+ # :server_state => "Idle",
39
+ # :total_jobs => "2514",
40
+ # :default_queue => "batch",
41
+ # ...
42
+ # }
43
+ #}
44
+
45
+ # Get status information, filtered to specific attributes
46
+ oakley.get_status(filters: [:server_state, :total_jobs])
47
+ #=>
48
+ #{
49
+ # "oak-batch.osc.edu:15001" => {
50
+ # :server_state => "Idle",
51
+ # :total_jobs => "2514"
52
+ # }
53
+ #}
54
+ ```
55
+
56
+ You can also query information about the nodes, queues, and jobs running on
57
+ this batch server:
58
+
59
+ ```ruby
60
+ # Get list of nodes from batch server (b is a PBS::Batch object such as oakley above)
61
+ b.get_nodes
62
+ #=>
63
+ #{
64
+ # "n0003" => {
65
+ # :state => "free",
66
+ # :power_state => "Running",
67
+ # :np => "12",
68
+ # ...
69
+ # },
70
+ # "n0004" => {
71
+ # :state => "free",
72
+ # :power_state => "Running",
73
+ # :np => "12",
74
+ # ...
75
+ # }, ...
76
+ #}
77
+
78
+ # To get info about a single node
79
+ b.get_node("n0003")
80
+ #=> { ... }
81
+
82
+ # Get list of queues from batch server
83
+ # see http://linux.die.net/man/7/pbs_queue_attributes
84
+ b.get_queues
85
+ #=>
86
+ #{
87
+ # "batch" => {
88
+ # :queue_type => "Route",
89
+ # :total_jobs => "2",
90
+ # :enabled => "True",
91
+ # ...
92
+ # },
93
+ # "serial" => {
94
+ # :queue_type => "Execution",
95
+ # :total_jobs => "2386",
96
+ # :enabled => "True",
97
+ # ...
98
+ # }, ...
99
+ #}
100
+
101
+ # To get info about a single queue
102
+ b.get_queue("serial")
103
+ #=> { ... }
104
+
105
+ # Get list of jobs from batch server
106
+ # see http://linux.die.net/man/7/pbs_job_attributes
107
+ b.get_jobs
108
+ #=>
109
+ #{
110
+ # "6621251.oak-batch.osc.edu" => {
111
+ # :Job_Name => "FEA_solver",
112
+ # :Job_Owner => "bob@oakley01.osc.edu",
113
+ # :job_state => "Q",
114
+ # ...
115
+ # },
116
+ # "6621252.oak-batch.osc.edu" => {
117
+ # :Job_Name => "CFD_solver",
118
+ # :Job_Owner => "sally@oakley02.osc.edu",
119
+ # :job_state => "R",
120
+ # ...
121
+ # }, ...
122
+ #}
123
+
124
+ # To get info about a single job
125
+ b.get_job("6621251.oak-batch.osc.edu")
126
+ #=> { ... }
127
+ ```
128
+
129
+ ### Simple Job Submission
130
+
131
+ To submit a script to the batch server:
132
+
133
+ ```ruby
134
+ # Simple job submission
135
+ job_id = b.submit_script("/path/to/script")
136
+ #=> "7166037.oak-batch.osc.edu"
137
+
138
+ # Get job information for this job
139
+ b.get_job(job_id)
140
+ #=> { ... }
141
+
142
+ # Hold this job
143
+ b.hold_job(job_id)
144
+
145
+ # Release this job
146
+ b.release_job(job_id)
147
+
148
+ # Delete this job
149
+ b.delete_job(job_id)
150
+ ```
151
+
152
+ To submit a string to the batch server:
153
+
154
+ ```ruby
155
+ # Submit a string to the batch server
156
+ job_id = b.submit_string("sleep 60")
157
+ ```
158
+
159
+ The above command writes the string to a temporary file on the local disk and
160
+ submits that file to the batch server before the temporary file is cleaned up.
161
+
162
+ ### Advanced Job Submission
163
+
164
+ You can programmatically define the PBS directives of your choosing. They will
165
+ override any directives set within the batch script.
166
+
167
+ Define headers:
168
+
169
+ ```ruby
170
+ # Define headers:
171
+ # -N job_name
172
+ # -j oe
173
+ # -o /path/to/output
174
+ headers = {
175
+ PBS::ATTR[:N] => "job_name",
176
+ PBS::ATTR[:j] => "oe",
177
+ PBS::ATTR[:o] => "/path/to/output"
178
+ }
179
+
180
+ # or you can directly call the key
181
+ headers = {
182
+ Job_Name: "job_name",
183
+ Join_Path: "oe",
184
+ Output_Path: "/path/to/output"
185
+ }
186
+ ```
187
+
188
+ Define resources (directives that begin with `-l`):
189
+
190
+ ```ruby
191
+ # Define resources:
192
+ # -l nodes=1:ppn=12
193
+ # -l walltime=05:00:00
194
+ resources = {
195
+ nodes: "1:ppn=12",
196
+ walltime: "05:00:00"
197
+ }
198
+ ```
199
+
200
+ Define environment variables (directives that begin with `-v`):
201
+
202
+ ```ruby
203
+ # Define environment variables that will be exposed to batch job
204
+ envvars = {
205
+ TOKEN: 'a8dsjf873js0k',
206
+ USE_GUI: 1
207
+ }
208
+ ```
209
+
210
+ Submit job with these directives:
211
+
212
+ ```ruby
213
+ # Advanced job submission
214
+ job_id = b.submit_script("/path/to/script", headers: headers, resources: resources, envvars: envvars)
215
+
216
+ # Get job info
217
+ b.get_job(job_id)
34
218
  ```
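The README above maps headers, resources, and environment variables onto `-N`/`-j`/`-o`, `-l`, and `-v` directives. The following is a minimal sketch of that mapping, modeled on the `qsub_submit` method added in this release. The helper name `build_qsub_params` and the constant `ATTR_SUBSET` are hypothetical, and `ATTR_SUBSET` contains only the three flag/attribute pairs shown in the README rather than the full `PBS::ATTR` table.

```ruby
# Hypothetical subset of PBS::ATTR: only the three pairs shown in the README.
ATTR_SUBSET = { N: :Job_Name, j: :Join_Path, o: :Output_Path }.freeze

# Hypothetical helper that mirrors the argument-building steps of qsub_submit.
# It only builds the argument list; it does not run qsub.
def build_qsub_params(host:, queue: nil, headers: {}, resources: {}, envvars: {})
  # a nil queue yields "@host", which targets the server's default queue
  params = ["-q", "#{queue}@#{host}"]
  # each resource becomes its own "-l key=value" pair
  params += resources.flat_map { |k, v| ["-l", "#{k}=#{v}"] }
  # all environment variables share one comma-separated "-v" argument
  params += ["-v", envvars.map { |k, v| "#{k}=#{v}" }.join(",")] unless envvars.empty?
  # headers with a one-letter flag use it; anything else is passed through "-W"
  params += headers.flat_map do |k, v|
    flag = ATTR_SUBSET.key(k)
    flag && flag.length == 1 ? ["-#{flag}", v.to_s] : ["-W", "#{k}=#{v}"]
  end
  params
end

params = build_qsub_params(
  host: "oak-batch.osc.edu",
  headers: { Job_Name: "job_name", Join_Path: "oe" },
  resources: { nodes: "1:ppn=12", walltime: "05:00:00" },
  envvars: { TOKEN: "a8dsjf873js0k", USE_GUI: 1 }
)
puts params.join(" ")
```

Keeping the argument list as an array (rather than one interpolated string) means it can be handed to `Open3.capture3` without going through a shell.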
data/lib/pbs.rb CHANGED
@@ -1,38 +1,9 @@
1
- require 'yaml'
2
- require 'socket'
3
-
4
1
  require_relative 'pbs/error'
5
2
  require_relative 'pbs/attributes'
6
3
  require_relative 'pbs/torque'
7
- require_relative 'pbs/conn'
8
- require_relative 'pbs/query'
9
- require_relative 'pbs/job'
4
+ require_relative 'pbs/batch'
10
5
  require_relative 'pbs/version'
11
6
 
7
+ # The main namespace for pbs
12
8
  module PBS
13
- # Path to the batch config yaml file describing the batch servers for
14
- # local batch schedulers.
15
- # @return [String] Path to the batch config yaml file.
16
- def self.default_batch_config_path
17
- default_config = File.expand_path("../../config/batch.yml", __FILE__)
18
- host_config = File.expand_path("../../config/#{Socket.gethostname}.yml", __FILE__)
19
- File.file?(host_config) ? host_config : default_config
20
- end
21
-
22
- # Get the path to the batch config yaml file.
23
- def self.batch_config_path
24
- @batch_config_path ||= self.default_batch_config_path
25
- end
26
-
27
- # Set the path to the batch config yaml file.
28
- # @param path [String] The path to the batch config yaml file.
29
- def self.batch_config_path=(path)
30
- @batch_config_path = File.expand_path(path)
31
- end
32
-
33
- # Hash generated from reading the batch config yaml file.
34
- # @return [Hash] Batch configuration generated from config yaml file.
35
- def self.batch_config
36
- YAML.load_file(batch_config_path)
37
- end
38
9
  end
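With `PBS.batch_config` removed, callers that relied on a YAML file now have to build their batch objects themselves. The sketch below shows one possible way to do that. The YAML layout (cluster name mapped to `host` and `prefix`) is an assumption made for illustration and is not taken from the gem, and `BatchStandIn` is a hypothetical stand-in that only copies the keyword signature of the new `PBS::Batch#initialize` so the example runs without Torque installed.

```ruby
require 'yaml'
require 'pathname'

# Hypothetical stand-in with the same keyword signature as PBS::Batch#initialize.
class BatchStandIn
  attr_reader :host, :prefix

  def initialize(host:, prefix: nil, **_)
    @host = host.to_s
    @prefix = Pathname.new(prefix) if prefix
  end
end

# Assumed config layout; the real layout of the old batch.yml is not shown in this diff.
yaml = <<~YAML
  oakley:
    host: oak-batch.osc.edu
    prefix: /usr/local/torque/default
    note: extra keys are swallowed by the **_ parameter
YAML

config = YAML.safe_load(yaml)

# Build one batch object per cluster entry, converting string keys to keywords.
batches = config.map do |name, opts|
  [name, BatchStandIn.new(**opts.transform_keys(&:to_sym))]
end.to_h

puts batches["oakley"].host
```

The trailing `**_` in the initializer is what lets a config hash carry extra keys without raising an `ArgumentError`.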
@@ -1,12 +1,11 @@
1
- # Maintains a constant Hash of defined PBS attribute types
2
- #
3
- # Includes:
4
- # Attribute names used by user commands
5
- # Additional job and general attribute names
6
- # Additional queue attribute names
7
- # Additional server attribute names
8
- # Additional node attribute names
9
1
  module PBS
2
+ # Maintains a constant Hash of defined PBS attribute types
3
+ # Includes:
4
+ # Attribute names used by user commands
5
+ # Additional job and general attribute names
6
+ # Additional queue attribute names
7
+ # Additional server attribute names
8
+ # Additional node attribute names
10
9
  ATTR = {
11
10
  # Attribute names used by user commands
12
11
  a: :Execution_Time,
@@ -0,0 +1,334 @@
1
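+ require 'open3'
+ require 'pathname' # Pathname wraps the Torque prefix in #initialize
+ require 'tempfile' # Tempfile backs #submit_string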
2
+
3
+ module PBS
4
+ # Object used for simplified communication with a batch server
5
+ class Batch
6
+ # The host of the Torque batch server
7
+ # @example OSC's Oakley batch server
8
+ # my_conn.host #=> "oak-batch.osc.edu"
9
+ # @return [String] the batch server host
10
+ attr_reader :host
11
+
12
+ # The path to the Torque client installation
13
+ # @example For Torque 5.0.0
14
+ # my_conn.prefix.to_s #=> "/usr/local/torque/5.0.0"
15
+ # @return [Pathname, nil] path to torque installation
16
+ attr_reader :prefix
17
+
18
+ # @param host [#to_s] the batch server host
19
+ # @param prefix [#to_s, nil] path to torque installation
20
+ def initialize(host:, prefix: nil, **_)
21
+ @host = host.to_s
22
+ @prefix = Pathname.new(prefix) if prefix
23
+ end
24
+
25
+ # Convert object to hash
26
+ # @return [Hash] the hash describing this object
27
+ def to_h
28
+ {host: host, prefix: prefix}
29
+ end
30
+
31
+ # The comparison operator
32
+ # @param other [#to_h] batch server to compare against
33
+ # @return [Boolean] how batch servers compare
34
+ def ==(other)
35
+ to_h == other.to_h
36
+ end
37
+
38
+ # Checks whether two batch server objects are completely identical to each
39
+ # other
40
+ # @param other [Batch] batch server to compare against
41
+ # @return [Boolean] whether same objects
42
+ def eql?(other)
43
+ self.class == other.class && self == other
44
+ end
45
+
46
+ # Generates a hash value for this object
47
+ # @return [Fixnum] hash value of object
48
+ def hash
49
+ [self.class, to_h].hash
50
+ end
51
+
52
+ # Creates a connection to batch server and calls block in context of this
53
+ # connection
54
+ # @yieldparam cid [Fixnum] connection id from established batch server connection
55
+ # @yieldreturn the final value of the block
56
+ def connect(&block)
57
+ Torque.lib = prefix ? prefix.join('lib', 'libtorque.so') : nil
58
+ cid = Torque.pbs_connect(host)
59
+ Torque.raise_error(cid.abs) if cid < 0 # raise error if negative connection id
60
+ begin
61
+ value = yield cid
62
+ ensure
63
+ Torque.pbs_disconnect(cid) # always close connection
64
+ end
65
+ Torque.check_for_error # check for errors at end
66
+ value
67
+ end
68
+
69
+ # Get a hash with status info for this batch server
70
+ # @example Status info for OSC Oakley batch server
71
+ # my_conn.get_status
72
+ # #=>
73
+ # #{
74
+ # # "oak-batch.osc.edu:15001" => {
75
+ # # :server_state => "Idle",
76
+ # # ...
77
+ # # }
78
+ # #}
79
+ # @param filters [Array<Symbol>] list of attribs to filter on
80
+ # @return [Hash] status info for batch server
81
+ def get_status(filters: [])
82
+ connect do |cid|
83
+ filters = PBS::Torque::Attrl.from_list filters
84
+ batch_status = Torque.pbs_statserver cid, filters, nil
85
+ batch_status.to_h.tap { Torque.pbs_statfree batch_status }
86
+ end
87
+ end
88
+
89
+ # Get a list of hashes of the queues on the batch server
90
+ # @example Status info for OSC Oakley queues
91
+ # my_conn.get_queues
92
+ # #=>
93
+ # #{
94
+ # # "parallel" => {
95
+ # # :queue_type => "Execution",
96
+ # # ...
97
+ # # },
98
+ # # "serial" => {
99
+ # # :queue_type => "Execution",
100
+ # # ...
101
+ # # },
102
+ # # ...
103
+ # #}
104
+ # @param id [#to_s] the id of requested information
105
+ # @param filters [Array<Symbol>] list of attribs to filter on
106
+ # @return [Hash] hash of details for the queues
107
+ def get_queues(id: '', filters: [])
108
+ connect do |cid|
109
+ filters = PBS::Torque::Attrl.from_list(filters)
110
+ batch_status = Torque.pbs_statque cid, id.to_s, filters, nil
111
+ batch_status.to_h.tap { Torque.pbs_statfree batch_status }
112
+ end
113
+ end
114
+
115
+ # Get info for given batch server's queue
116
+ # @example Status info for OSC Oakley's parallel queue
117
+ # my_conn.get_queue("parallel")
118
+ # #=>
119
+ # #{
120
+ # # "parallel" => {
121
+ # # :queue_type => "Execution",
122
+ # # ...
123
+ # # }
124
+ # #}
125
+ # @param (see #get_queues)
126
+ # @return [Hash] status info for the queue
127
+ def get_queue(id, **kwargs)
128
+ get_queues(id: id, **kwargs)
129
+ end
130
+
131
+
132
+ # Get a list of hashes of the nodes on the batch server
133
+ # @example Status info for OSC Oakley nodes
134
+ # my_conn.get_nodes
135
+ # #=>
136
+ # #{
137
+ # # "n0001" => {
138
+ # # :np => "12",
139
+ # # ...
140
+ # # },
141
+ # # "n0002" => {
142
+ # # :np => "12",
143
+ # # ...
144
+ # # },
145
+ # # ...
146
+ # #}
147
+ # @param id [#to_s] the id of requested information
148
+ # @param filters [Array<Symbol>] list of attribs to filter on
149
+ # @return [Hash] hash of details for nodes
150
+ def get_nodes(id: '', filters: [])
151
+ connect do |cid|
152
+ filters = PBS::Torque::Attrl.from_list(filters)
153
+ batch_status = Torque.pbs_statnode cid, id.to_s, filters, nil
154
+ batch_status.to_h.tap { Torque.pbs_statfree batch_status }
155
+ end
156
+ end
157
+
158
+ # Get info for given batch server's node
159
+ # @example Status info for OSC Oakley's 'n0001' node
160
+ # my_conn.get_node('n0001')
161
+ # #=>
162
+ # #{
163
+ # # "n0001" => {
164
+ # # :np => "12",
165
+ # # ...
166
+ # # }
167
+ # #}
168
+ # @param (see #get_nodes)
169
+ # @return [Hash] status info for the node
170
+ def get_node(id, **kwargs)
171
+ get_nodes(id: id, **kwargs)
172
+ end
173
+
174
+ # Get a list of hashes of the jobs on the batch server
175
+ # @example Status info for OSC Oakley jobs
176
+ # my_conn.get_jobs
177
+ # #=>
178
+ # #{
179
+ # # "10219837.oak-batch.osc.edu" => {
180
+ # # :Job_Owner => "bob@oakley02.osc.edu",
181
+ # # :Job_Name => "CFD_Solver",
182
+ # # ...
183
+ # # },
184
+ # # "10219838.oak-batch.osc.edu" => {
185
+ # # :Job_Owner => "sally@oakley01.osc.edu",
186
+ # # :Job_Name => "FEA_Solver",
187
+ # # ...
188
+ # # },
189
+ # # ...
190
+ # #}
191
+ # @param id [#to_s] the id of requested information
192
+ # @param filters [Array<Symbol>] list of attribs to filter on
193
+ # @return [Hash] hash of details for jobs
194
+ def get_jobs(id: '', filters: [])
195
+ connect do |cid|
196
+ filters = PBS::Torque::Attrl.from_list(filters)
197
+ batch_status = Torque.pbs_statjob cid, id.to_s, filters, nil
198
+ batch_status.to_h.tap { Torque.pbs_statfree batch_status }
199
+ end
200
+ end
201
+
202
+ # Get info for given batch server's job
203
+ # @example Status info for OSC Oakley's '10219837.oak-batch.osc.edu' job
204
+ # my_conn.get_job('10219837.oak-batch.osc.edu')
205
+ # #=>
206
+ # #{
207
+ # # "10219837.oak-batch.osc.edu" => {
208
+ # # :Job_Owner => "bob@oakley02.osc.edu",
209
+ # # :Job_Name => "CFD_Solver",
210
+ # # ...
211
+ # # }
212
+ # #}
213
+ # @param (see #get_jobs)
214
+ # @return [Hash] hash with details of job
215
+ def get_job(id, **kwargs)
216
+ get_jobs(id: id, **kwargs)
217
+ end
218
+
219
+ # Put specified job on hold
220
+ # Possible hold types:
221
+ # :u => Available to the owner of the job, the batch operator and the batch administrator
222
+ # :o => Available to the batch operator and the batch administrator
223
+ # :s => Available to the batch administrator
224
+ # @example Put job '10219837.oak-batch.osc.edu' on hold
225
+ # my_conn.hold_job('10219837.oak-batch.osc.edu')
226
+ # @param id [#to_s] the id of the job
227
+ # @param type [Symbol] type of hold to be applied
228
+ # @return [void]
229
+ def hold_job(id, type: :u)
230
+ connect do |cid|
231
+ Torque.pbs_holdjob cid, id.to_s, type.to_s, nil
232
+ end
233
+ end
234
+
235
+ # Release a specified job that is on hold
236
+ # Possible hold types:
237
+ # :u => Available to the owner of the job, the batch operator and the batch administrator
238
+ # :o => Available to the batch operator and the batch administrator
239
+ # :s => Available to the batch administrator
240
+ # @example Release job '10219837.oak-batch.osc.edu' from hold
241
+ # my_conn.release_job('10219837.oak-batch.osc.edu')
242
+ # @param id [#to_s] the id of the job
243
+ # @param type [Symbol] type of hold to be removed
244
+ # @return [void]
245
+ def release_job(id, type: :u)
246
+ connect do |cid|
247
+ Torque.pbs_rlsjob cid, id.to_s, type.to_s, nil
248
+ end
249
+ end
250
+
251
+ # Delete a specified job from batch server
252
+ # @example Delete job '10219837.oak-batch.osc.edu' from batch
253
+ # my_conn.delete_job('10219837.oak-batch.osc.edu')
254
+ # @param id [#to_s] the id of the job
255
+ # @return [void]
256
+ def delete_job(id)
257
+ connect do |cid|
258
+ Torque.pbs_deljob cid, id.to_s, nil
259
+ end
260
+ end
261
+
262
+ # Submit a script to the batch server
263
+ # @example Submit a script with a few PBS directives
264
+ # my_conn.submit_script("/path/to/script",
265
+ # headers: {
266
+ # Job_Name: "myjob",
267
+ # Join_Path: "oe"
268
+ # },
269
+ # resources: {
270
+ # nodes: "4:ppn=12",
271
+ # walltime: "12:00:00"
272
+ # },
273
+ # envvars: {
274
+ # TOKEN: "asd90f9sd8g90hk34"
275
+ # }
276
+ # )
277
+ # #=> "6621251.oak-batch.osc.edu"
278
+ # @param script [#to_s] path to the script
279
+ # @param queue [#to_s] queue to submit script to
280
+ # @param headers [Hash] pbs headers
281
+ # @param resources [Hash] pbs resources
282
+ # @param envvars [Hash] pbs environment variables
283
+ # @param qsub [Boolean] whether to submit with the qsub binary (true) or the Torque library (false)
284
+ # @return [String] the id of the job that was created
285
+ def submit_script(script, queue: nil, headers: {}, resources: {}, envvars: {}, qsub: true)
286
+ send(qsub ? :qsub_submit : :pbs_submit, script, queue, headers, resources, envvars)
287
+ end
288
+
289
+ # Submit a script expanded into a string to the batch server
290
+ # @param string [#to_s] script as a string
291
+ # @param (see #submit_script)
292
+ # @return [String] the id of the job that was created
293
+ def submit_string(string, **kwargs)
294
+ Tempfile.open('qsub.') do |f|
295
+ f.write string.to_s
296
+ f.close
297
+ submit_script(f.path, **kwargs)
298
+ end
299
+ end
300
+
301
+ private
302
+ # Submit a script using Torque library
303
+ def pbs_submit(script, queue, headers, resources, envvars)
304
+ attribs = headers.dup
305
+ attribs[ATTR[:l]] = resources.dup unless resources.empty?
306
+ attribs[ATTR[:v]] = envvars.map{|k,v| "#{k}=#{v}"}.join(",") unless envvars.empty?
307
+
308
+ connect do |cid|
309
+ attropl = Torque::Attropl.from_hash attribs
310
+ Torque.pbs_submit cid, attropl, script.to_s, queue.to_s, nil
311
+ end
312
+ end
313
+
314
+ # Submit a script using Torque binary
315
+ # NB: The binary includes many useful filters and is preferred
316
+ def qsub_submit(script, queue, headers, resources, envvars)
317
+ params = ["-q", "#{queue}@#{host}"]
318
+ params += resources.map{|k,v| ["-l", "#{k}=#{v}"]}.flatten unless resources.empty?
319
+ params += ["-v", envvars.map{|k,v| "#{k}=#{v}"}.join(",")] unless envvars.empty?
320
+ params += headers.map do |k,v|
321
+ if param = ATTR.key(k) and param.length == 1
322
+ ["-#{param}", "#{v}"]
323
+ else
324
+ ["-W", "#{k}=#{v}"]
325
+ end
326
+ end.flatten
327
+ params << script.to_s
328
+
329
+ # prefix may be nil; fall back to the qsub found on the PATH in that case
+ o, e, s = Open3.capture3(prefix ? prefix.join("bin", "qsub").to_s : "qsub", *params)
330
+ raise PBS::Error, e unless s.success?
331
+ o.chomp
332
+ end
333
+ end
334
+ end
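Every public method in `Batch` goes through `connect`, which opens a connection, yields its id, and disconnects in an `ensure` clause. The sketch below isolates that pattern so its behaviour on failure is easy to see. `FakeTorque` and `with_connection` are hypothetical names: `FakeTorque` only records calls and stands in for the FFI bindings, which are not exercised here.

```ruby
# Hypothetical stand-in for the Torque FFI bindings; it only logs calls.
module FakeTorque
  @log = []

  class << self
    attr_reader :log

    def pbs_connect(host)
      @log << "connect #{host}"
      1 # a non-negative connection id means success
    end

    def pbs_disconnect(cid)
      @log << "disconnect #{cid}"
    end
  end
end

# Same shape as Batch#connect: open, yield the id, always close.
def with_connection(host)
  cid = FakeTorque.pbs_connect(host)
  raise "connection failed (#{cid.abs})" if cid < 0
  begin
    yield cid
  ensure
    FakeTorque.pbs_disconnect(cid) # runs even when the block raises
  end
end

# Normal use: the block's value is returned to the caller.
result = with_connection("oak-batch.osc.edu") { |cid| cid * 10 }

# Failing use: the error propagates, but the connection is still closed.
begin
  with_connection("oak-batch.osc.edu") { |_cid| raise ArgumentError, "boom" }
rescue ArgumentError
  # swallowed here only so the example can continue
end

puts FakeTorque.log
```

Because each call opens and closes its own connection, no connection state is kept on the object between calls, at the cost of one connect/disconnect round trip per request.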