pbs 1.1.4 → 2.0.0

checksums.yaml CHANGED
@@ -1,7 +1,7 @@
1
1
  ---
2
2
  SHA1:
3
- metadata.gz: 3af819b077ee91da66818d771629351f3a82f157
4
- data.tar.gz: c732e8ea53bd0640b941421644e0555a1dcd2180
3
+ metadata.gz: 6b983a1e056e48ac1fb0973f565910ecaa7b59e8
4
+ data.tar.gz: 72ea5ac4e9b96137f9d234f44ea9912bb4aef298
5
5
  SHA512:
6
- metadata.gz: 45f4ba1bbc056a036a22276669d9a5db484b9b5308578ca4b4bdd90af1080ed9e1309cbaacc7b41a20e8e8c0b830244bbaedaa067f8fc9f38da832f4446c1d3a
7
- data.tar.gz: b88452f9007a1c6972808764f074adddfea6da6ca36d04f615228bfcffc680937cc3c601d618c8f29fc8f4590f8976d07d052229294704462c504c2660d9ec5d
6
+ metadata.gz: afe01c15267f13df46e452dab396cf568af6751fe4a92648f174b2d3a2af30443b902115938a28c9e4e7e37b0e436e1d37f13f616c3eff0a61c68f1912956c64
7
+ data.tar.gz: ef15244b9433ffff530d12da0a90a01a3e7d017c127028005f3831322f7cfac1a7c84c07cdf0c9ffdfc64d8ba3c47f53ae62cec06bd1d37d7141cfdf287e0a61
@@ -0,0 +1,8 @@
1
+ ## Unreleased
2
+
3
+ ## 2.0.0 (2016-08-05)
4
+
5
+ Features:
6
+
7
+ - initial release of v2.0.0
8
+ - added a changelog
data/README.md CHANGED
@@ -2,33 +2,217 @@
2
2
 
3
3
  ## Description
4
4
 
5
- Trimmed down Ruby wrapper for the Torque C Library utilizing Ruby-FFI.
6
-
7
- ## Requirements
8
-
9
- At minimum you will need:
10
- * Ruby 2.0
11
- * Ruby-FFI gem
12
- * Torque >= 4.2.10
5
+ Ruby wrapper for the Torque C Library utilizing Ruby-FFI. This has been
6
+ successfully tested with Torque 4.2.10 and greater. Your mileage may vary.
13
7
 
14
8
  ## Installation
15
9
 
16
10
  Add this to your application's Gemfile:
17
11
 
18
12
  ```ruby
19
- gem 'pbs'
13
+ gem 'pbs'
20
14
  ```
21
15
 
22
16
  And then execute:
23
17
 
24
18
  ```bash
25
- $ bundle install
19
+ $ bundle install
26
20
  ```
27
21
 
28
22
  ## Usage
29
23
 
30
- Most useful features are outlined in the `examples/simplejob.rb` provided. To run this simple example, type:
24
+ All communication with a specific batch server is handled through the `Batch`
25
+ object. You can generate this object for a given batch server and Torque client
26
+ installation as:
31
27
 
32
- ```bash
33
- $ ruby -Ilib examples/simplejob.rb
28
+ ```ruby
29
+ # Create new batch object for OSC's Oakley batch server
30
+ oakley = PBS::Batch.new(host: 'oak-batch.osc.edu', prefix: '/usr/local/torque/default')
31
+
32
+ # Get status information for this batch server
33
+ # see http://linux.die.net/man/7/pbs_server_attributes
34
+ oakley.get_status
35
+ #=>
36
+ #{
37
+ # "oak-batch.osc.edu:15001" => {
38
+ # :server_state => "Idle",
39
+ # :total_jobs => "2514",
40
+ # :default_queue => "batch",
41
+ # ...
42
+ # }
43
+ #}
44
+
45
+ # Get status information, filtered to specific attributes
46
+ oakley.get_status(filters: [:server_state, :total_jobs])
47
+ #=>
48
+ #{
49
+ # "oak-batch.osc.edu:15001" => {
50
+ # :server_state => "Idle",
51
+ # :total_jobs => "2514"
52
+ # }
53
+ #}
54
+ ```
55
+
56
+ You can also query information about the nodes, queues, and jobs running on
57
+ this batch server:
58
+
59
+ ```ruby
60
+ # Get list of nodes from batch server (`b` is a PBS::Batch object, such as `oakley` above)
61
+ b.get_nodes
62
+ #=>
63
+ #{
64
+ # "n0003" => {
65
+ # :state => "free",
66
+ # :power_state => "Running",
67
+ # :np => "12",
68
+ # ...
69
+ # },
70
+ # "n0004" => {
71
+ # :state => "free",
72
+ # :power_state => "Running",
73
+ # :np => "12",
74
+ # ...
75
+ # }, ...
76
+ #}
77
+
78
+ # To get info about a single node
79
+ b.get_node("n0003")
80
+ #=> { ... }
81
+
82
+ # Get list of queues from batch server
83
+ # see http://linux.die.net/man/7/pbs_queue_attributes
84
+ b.get_queues
85
+ #=>
86
+ #{
87
+ # "batch" => {
88
+ # :queue_type => "Route",
89
+ # :total_jobs => "2",
90
+ # :enabled => "True",
91
+ # ...
92
+ # },
93
+ # "serial" => {
94
+ # :queue_type => "Execution",
95
+ # :total_jobs => "2386",
96
+ # :enabled => "True",
97
+ # ...
98
+ # }, ...
99
+ #}
100
+
101
+ # To get info about a single queue
102
+ b.get_queue("serial")
103
+ #=> { ... }
104
+
105
+ # Get list of jobs from batch server
106
+ # see http://linux.die.net/man/7/pbs_job_attributes
107
+ b.get_jobs
108
+ #=>
109
+ #{
110
+ # "6621251.oak-batch.osc.edu" => {
111
+ # :Job_Name => "FEA_solver",
112
+ # :Job_Owner => "bob@oakley01.osc.edu",
113
+ # :job_state => "Q",
114
+ # ...
115
+ # },
116
+ # "6621252.oak-batch.osc.edu" => {
117
+ # :Job_Name => "CFD_solver",
118
+ # :Job_Owner => "sally@oakley02.osc.edu",
119
+ # :job_state => "R",
120
+ # ...
121
+ # }, ...
122
+ #}
123
+
124
+ # To get info about a single job
125
+ b.get_job("6621251.oak-batch.osc.edu")
126
+ #=> { ... }
127
+ ```
128
+
129
+ ### Simple Job Submission
130
+
131
+ To submit a script to the batch server:
132
+
133
+ ```ruby
134
+ # Simple job submission
135
+ job_id = b.submit_script("/path/to/script")
136
+ #=> "7166037.oak-batch.osc.edu"
137
+
138
+ # Get job information for this job
139
+ b.get_job(job_id)
140
+ #=> { ... }
141
+
142
+ # Hold this job
143
+ b.hold_job(job_id)
144
+
145
+ # Release this job
146
+ b.release_job(job_id)
147
+
148
+ # Delete this job
149
+ b.delete_job(job_id)
150
+ ```
151
+
152
+ To submit a string to the batch server:
153
+
154
+ ```ruby
155
+ # Submit a string to the batch server
156
+ job_id = b.submit_string("sleep 60")
157
+ ```
158
+
159
+ The above command writes the string to a temporary file on the local disk,
160
+ submits that file to the batch server, and then removes it.
161
+
162
+ ### Advanced Job Submission
163
+
164
+ You can programmatically define the PBS directives of your choosing. They will
165
+ override any set within the batch script.
166
+
167
+ Define headers:
168
+
169
+ ```ruby
170
+ # Define headers:
171
+ # -N job_name
172
+ # -j oe
173
+ # -o /path/to/output
174
+ headers = {
175
+ PBS::ATTR[:N] => "job_name",
176
+ PBS::ATTR[:j] => "oe",
177
+ PBS::ATTR[:o] => "/path/to/output"
178
+ }
179
+
180
+ # or you can directly call the key
181
+ headers = {
182
+ Job_Name: "job_name",
183
+ Join_Path: "oe",
184
+ Output_Path: "/path/to/output"
185
+ }
186
+ ```
187
+
188
+ Define resources (directives that begin with `-l`):
189
+
190
+ ```ruby
191
+ # Define resources:
192
+ # -l nodes=1:ppn=12
193
+ # -l walltime=05:00:00
194
+ resources = {
195
+ nodes: "1:ppn=12",
196
+ walltime: "05:00:00"
197
+ }
198
+ ```
199
+
200
+ Define environment variables (directive that begins with `-v`):
201
+
202
+ ```ruby
203
+ # Define environment variables that will be exposed to batch job
204
+ envvars = {
205
+ TOKEN: 'a8dsjf873js0k',
206
+ USE_GUI: 1
207
+ }
208
+ ```
209
+
210
+ Submit job with these directives:
211
+
212
+ ```ruby
213
+ # Advanced job submission
214
+ job_id = b.submit_script("/path/to/script", headers: headers, resources: resources, envvars: envvars)
215
+
216
+ # Get job info
217
+ b.get_job(job_id)
34
218
  ```
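
As a rough illustration of how the `headers`, `resources`, and `envvars` hashes from the README could translate into `qsub` flags, the sketch below mirrors the argument construction of `qsub_submit` in `lib/pbs/batch.rb` from this diff. `ATTR_SUBSET` and `build_qsub_args` are hypothetical names that are not part of the gem, and the subset holds only the three mappings the README demonstrates.

```ruby
# Subset of PBS::ATTR, limited to the three mappings shown in the README.
ATTR_SUBSET = {
  N: :Job_Name,
  j: :Join_Path,
  o: :Output_Path
}.freeze

# Hypothetical helper that builds the qsub argument list in the same order
# as qsub_submit: queue, resources, environment variables, headers, script.
def build_qsub_args(script, host:, queue: nil, headers: {}, resources: {}, envvars: {})
  args = ["-q", "#{queue}@#{host}"]
  # Each resource becomes its own "-l key=value" pair
  resources.each { |k, v| args.push("-l", "#{k}=#{v}") }
  # All environment variables share a single comma-separated "-v" value
  args.push("-v", envvars.map { |k, v| "#{k}=#{v}" }.join(",")) unless envvars.empty?
  headers.each do |k, v|
    flag = ATTR_SUBSET.key(k)
    if flag && flag.length == 1
      args.push("-#{flag}", v.to_s)   # known single-letter flag
    else
      args.push("-W", "#{k}=#{v}")    # anything else is passed through -W
    end
  end
  args << script.to_s
end

args = build_qsub_args(
  "/path/to/script",
  host: "oak-batch.osc.edu",
  queue: "serial",
  headers: { Job_Name: "job_name", Join_Path: "oe" },
  resources: { nodes: "1:ppn=12", walltime: "05:00:00" },
  envvars: { TOKEN: "a8dsjf873js0k", USE_GUI: 1 }
)
puts args.join(" ")
```

Headers without a single-letter entry in the lookup table fall through to `-W`, which is why the sketch keeps the reverse lookup (`key`) rather than a direct hash access.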
data/lib/pbs.rb CHANGED
@@ -1,38 +1,9 @@
1
- require 'yaml'
2
- require 'socket'
3
-
4
1
  require_relative 'pbs/error'
5
2
  require_relative 'pbs/attributes'
6
3
  require_relative 'pbs/torque'
7
- require_relative 'pbs/conn'
8
- require_relative 'pbs/query'
9
- require_relative 'pbs/job'
4
+ require_relative 'pbs/batch'
10
5
  require_relative 'pbs/version'
11
6
 
7
+ # The main namespace for pbs
12
8
  module PBS
13
- # Path to the batch config yaml file describing the batch servers for
14
- # local batch schedulers.
15
- # @return [String] Path to the batch config yaml file.
16
- def self.default_batch_config_path
17
- default_config = File.expand_path("../../config/batch.yml", __FILE__)
18
- host_config = File.expand_path("../../config/#{Socket.gethostname}.yml", __FILE__)
19
- File.file?(host_config) ? host_config : default_config
20
- end
21
-
22
- # Get the path to the batch config yaml file.
23
- def self.batch_config_path
24
- @batch_config_path ||= self.default_batch_config_path
25
- end
26
-
27
- # Set the path to the batch config yaml file.
28
- # @param path [String] The path to the batch config yaml file.
29
- def self.batch_config_path=(path)
30
- @batch_config_path = File.expand_path(path)
31
- end
32
-
33
- # Hash generated from reading the batch config yaml file.
34
- # @return [Hash] Batch configuration generated from config yaml file.
35
- def self.batch_config
36
- YAML.load_file(batch_config_path)
37
- end
38
9
  end
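
Since this hunk removes the YAML-based `PBS.batch_config` helpers, callers that still want file-driven configuration would need to load it themselves. The following is a minimal sketch under stated assumptions: the YAML layout, the `example` entry, and the `batch_kwargs` helper are all hypothetical (the removed `config/batch.yml` format is not shown in this diff). It relies only on the fact, visible in `batch.rb`, that `PBS::Batch.new` takes `host:` and `prefix:` and ignores other keywords.

```ruby
require 'yaml'

# Hypothetical config layout; not taken from the gem.
CONFIG_YAML = <<~YAML
  oakley:
    host: oak-batch.osc.edu
    prefix: /usr/local/torque/default
  example:
    host: batch.example.org
    note: extra keys are ignored by PBS::Batch.new
YAML

# Convert each server entry into a hash of symbol keys that can be
# splatted into PBS::Batch.new(host:, prefix: nil, **_).
def batch_kwargs(yaml_text)
  YAML.safe_load(yaml_text).each_with_object({}) do |(name, attrs), servers|
    servers[name] = attrs.each_with_object({}) { |(k, v), h| h[k.to_sym] = v }
  end
end

servers = batch_kwargs(CONFIG_YAML)
# With the gem loaded, one could then write:
#   oakley = PBS::Batch.new(**servers["oakley"])
```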
data/lib/pbs/attributes.rb CHANGED
@@ -1,12 +1,11 @@
1
- # Maintains a constant Hash of defined PBS attribute types
2
- #
3
- # Includes:
4
- # Attribute names used by user commands
5
- # Additional job and general attribute names
6
- # Additional queue attribute names
7
- # Additional server attribute names
8
- # Additional node attribute names
9
1
  module PBS
2
+ # Maintains a constant Hash of defined PBS attribute types
3
+ # Includes:
4
+ # Attribute names used by user commands
5
+ # Additional job and general attribute names
6
+ # Additional queue attribute names
7
+ # Additional server attribute names
8
+ # Additional node attribute names
10
9
  ATTR = {
11
10
  # Attribute names used by user commands
12
11
  a: :Execution_Time,
data/lib/pbs/batch.rb ADDED
@@ -0,0 +1,334 @@
1
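+ require 'open3'
+ require 'pathname' # Pathname is used in #initialize
+ require 'tempfile' # Tempfile is used in #submit_string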
2
+
3
+ module PBS
4
+ # Object used for simplified communication with a batch server
5
+ class Batch
6
+ # The host of the Torque batch server
7
+ # @example OSC's Oakley batch server
8
+ # my_conn.host #=> "oak-batch.osc.edu"
9
+ # @return [String] the batch server host
10
+ attr_reader :host
11
+
12
+ # The path to the Torque client installation
13
+ # @example For Torque 5.0.0
14
+ # my_conn.prefix.to_s #=> "/usr/local/torque/5.0.0"
15
+ # @return [Pathname, nil] path to torque installation
16
+ attr_reader :prefix
17
+
18
+ # @param host [#to_s] the batch server host
19
+ # @param prefix [#to_s, nil] path to torque installation
20
+ def initialize(host:, prefix: nil, **_)
21
+ @host = host.to_s
22
+ @prefix = Pathname.new(prefix) if prefix
23
+ end
24
+
25
+ # Convert object to hash
26
+ # @return [Hash] the hash describing this object
27
+ def to_h
28
+ {host: host, prefix: prefix}
29
+ end
30
+
31
+ # The comparison operator
32
+ # @param other [#to_h] batch server to compare against
33
+ # @return [Boolean] whether the batch servers are equal
34
+ def ==(other)
35
+ to_h == other.to_h
36
+ end
37
+
38
+ # Checks whether two batch server objects are completely identical to each
39
+ # other
40
+ # @param other [Batch] batch server to compare against
41
+ # @return [Boolean] whether same objects
42
+ def eql?(other)
43
+ self.class == other.class && self == other
44
+ end
45
+
46
+ # Generates a hash value for this object
47
+ # @return [Fixnum] hash value of object
48
+ def hash
49
+ [self.class, to_h].hash
50
+ end
51
+
52
+ # Creates a connection to batch server and calls block in context of this
53
+ # connection
54
+ # @yieldparam cid [Fixnum] connection id from established batch server connection
55
+ # @yieldreturn the final value of the block
56
+ def connect(&block)
57
+ Torque.lib = prefix ? prefix.join('lib', 'libtorque.so') : nil
58
+ cid = Torque.pbs_connect(host)
59
+ Torque.raise_error(cid.abs) if cid < 0 # raise error if negative connection id
60
+ begin
61
+ value = yield cid
62
+ ensure
63
+ Torque.pbs_disconnect(cid) # always close connection
64
+ end
65
+ Torque.check_for_error # check for errors at end
66
+ value
67
+ end
68
+
69
+ # Get a hash with status info for this batch server
70
+ # @example Status info for OSC Oakley batch server
71
+ # my_conn.get_status
72
+ # #=>
73
+ # #{
74
+ # # "oak-batch.osc.edu:15001" => {
75
+ # # :server_state => "Idle",
76
+ # # ...
77
+ # # }
78
+ # #}
79
+ # @param filters [Array<Symbol>] list of attribs to filter on
80
+ # @return [Hash] status info for batch server
81
+ def get_status(filters: [])
82
+ connect do |cid|
83
+ filters = PBS::Torque::Attrl.from_list filters
84
+ batch_status = Torque.pbs_statserver cid, filters, nil
85
+ batch_status.to_h.tap { Torque.pbs_statfree batch_status }
86
+ end
87
+ end
88
+
89
+ # Get a hash of the queues on the batch server, keyed by queue name
90
+ # @example Status info for OSC Oakley queues
91
+ # my_conn.get_queues
92
+ # #=>
93
+ # #{
94
+ # # "parallel" => {
95
+ # # :queue_type => "Execution",
96
+ # # ...
97
+ # # },
98
+ # # "serial" => {
99
+ # # :queue_type => "Execution",
100
+ # # ...
101
+ # # },
102
+ # # ...
103
+ # #}
104
+ # @param id [#to_s] the id of requested information
105
+ # @param filters [Array<Symbol>] list of attribs to filter on
106
+ # @return [Hash] hash of details for the queues
107
+ def get_queues(id: '', filters: [])
108
+ connect do |cid|
109
+ filters = PBS::Torque::Attrl.from_list(filters)
110
+ batch_status = Torque.pbs_statque cid, id.to_s, filters, nil
111
+ batch_status.to_h.tap { Torque.pbs_statfree batch_status }
112
+ end
113
+ end
114
+
115
+ # Get info for given batch server's queue
116
+ # @example Status info for OSC Oakley's parallel queue
117
+ # my_conn.get_queue("parallel")
118
+ # #=>
119
+ # #{
120
+ # # "parallel" => {
121
+ # # :queue_type => "Execution",
122
+ # # ...
123
+ # # }
124
+ # #}
125
+ # @param (see #get_queues)
126
+ # @return [Hash] status info for the queue
127
+ def get_queue(id, **kwargs)
128
+ get_queues(id: id, **kwargs)
129
+ end
130
+
131
+
132
+ # Get a hash of the nodes on the batch server, keyed by node name
133
+ # @example Status info for OSC Oakley nodes
134
+ # my_conn.get_nodes
135
+ # #=>
136
+ # #{
137
+ # # "n0001" => {
138
+ # # :np => "12",
139
+ # # ...
140
+ # # },
141
+ # # "n0002" => {
142
+ # # :np => "12",
143
+ # # ...
144
+ # # },
145
+ # # ...
146
+ # #}
147
+ # @param id [#to_s] the id of requested information
148
+ # @param filters [Array<Symbol>] list of attribs to filter on
149
+ # @return [Hash] hash of details for nodes
150
+ def get_nodes(id: '', filters: [])
151
+ connect do |cid|
152
+ filters = PBS::Torque::Attrl.from_list(filters)
153
+ batch_status = Torque.pbs_statnode cid, id.to_s, filters, nil
154
+ batch_status.to_h.tap { Torque.pbs_statfree batch_status }
155
+ end
156
+ end
157
+
158
+ # Get info for given batch server's node
159
+ # @example Status info for OSC Oakley's 'n0001' node
160
+ # my_conn.get_node('n0001')
161
+ # #=>
162
+ # #{
163
+ # # "n0001" => {
164
+ # # :np => "12",
165
+ # # ...
166
+ # # }
167
+ # #}
168
+ # @param (see #get_nodes)
169
+ # @return [Hash] status info for the node
170
+ def get_node(id, **kwargs)
171
+ get_nodes(id: id, **kwargs)
172
+ end
173
+
174
+ # Get a hash of the jobs on the batch server, keyed by job id
175
+ # @example Status info for OSC Oakley jobs
176
+ # my_conn.get_jobs
177
+ # #=>
178
+ # #{
179
+ # # "10219837.oak-batch.osc.edu" => {
180
+ # # :Job_Owner => "bob@oakley02.osc.edu",
181
+ # # :Job_Name => "CFD_Solver",
182
+ # # ...
183
+ # # },
184
+ # # "10219838.oak-batch.osc.edu" => {
185
+ # # :Job_Owner => "sally@oakley01.osc.edu",
186
+ # # :Job_Name => "FEA_Solver",
187
+ # # ...
188
+ # # },
189
+ # # ...
190
+ # #}
191
+ # @param id [#to_s] the id of requested information
192
+ # @param filters [Array<Symbol>] list of attribs to filter on
193
+ # @return [Hash] hash of details for jobs
194
+ def get_jobs(id: '', filters: [])
195
+ connect do |cid|
196
+ filters = PBS::Torque::Attrl.from_list(filters)
197
+ batch_status = Torque.pbs_statjob cid, id.to_s, filters, nil
198
+ batch_status.to_h.tap { Torque.pbs_statfree batch_status }
199
+ end
200
+ end
201
+
202
+ # Get info for given batch server's job
203
+ # @example Status info for OSC Oakley's '10219837.oak-batch.osc.edu' job
204
+ # my_conn.get_job('10219837.oak-batch.osc.edu')
205
+ # #=>
206
+ # #{
207
+ # # "10219837.oak-batch.osc.edu" => {
208
+ # # :Job_Owner => "bob@oakley02.osc.edu",
209
+ # # :Job_Name => "CFD_Solver",
210
+ # # ...
211
+ # # }
212
+ # #}
213
+ # @param (see #get_jobs)
214
+ # @return [Hash] hash with details of job
215
+ def get_job(id, **kwargs)
216
+ get_jobs(id: id, **kwargs)
217
+ end
218
+
219
+ # Put specified job on hold
220
+ # Possible hold types:
221
+ # :u => Available to the owner of the job, the batch operator and the batch administrator
222
+ # :o => Available to the batch operator and the batch administrator
223
+ # :s => Available to the batch administrator
224
+ # @example Put job '10219837.oak-batch.osc.edu' on hold
225
+ # my_conn.hold_job('10219837.oak-batch.osc.edu')
226
+ # @param id [#to_s] the id of the job
227
+ # @param type [Symbol] type of hold to be applied
228
+ # @return [void]
229
+ def hold_job(id, type: :u)
230
+ connect do |cid|
231
+ Torque.pbs_holdjob cid, id.to_s, type.to_s, nil
232
+ end
233
+ end
234
+
235
+ # Release a specified job that is on hold
236
+ # Possible hold types:
237
+ # :u => Available to the owner of the job, the batch operator and the batch administrator
238
+ # :o => Available to the batch operator and the batch administrator
239
+ # :s => Available to the batch administrator
240
+ # @example Release job '10219837.oak-batch.osc.edu' from hold
241
+ # my_conn.release_job('10219837.oak-batch.osc.edu')
242
+ # @param id [#to_s] the id of the job
243
+ # @param type [Symbol] type of hold to be removed
244
+ # @return [void]
245
+ def release_job(id, type: :u)
246
+ connect do |cid|
247
+ Torque.pbs_rlsjob cid, id.to_s, type.to_s, nil
248
+ end
249
+ end
250
+
251
+ # Delete a specified job from batch server
252
+ # @example Delete job '10219837.oak-batch.osc.edu' from batch
253
+ # my_conn.delete_job('10219837.oak-batch.osc.edu')
254
+ # @param id [#to_s] the id of the job
255
+ # @return [void]
256
+ def delete_job(id)
257
+ connect do |cid|
258
+ Torque.pbs_deljob cid, id.to_s, nil
259
+ end
260
+ end
261
+
262
+ # Submit a script to the batch server
263
+ # @example Submit a script with a few PBS directives
264
+ # my_conn.submit_script("/path/to/script",
265
+ # headers: {
266
+ # Job_Name: "myjob",
267
+ # Join_Path: "oe"
268
+ # },
269
+ # resources: {
270
+ # nodes: "4:ppn=12",
271
+ # walltime: "12:00:00"
272
+ # },
273
+ # envvars: {
274
+ # TOKEN: "asd90f9sd8g90hk34"
275
+ # }
276
+ # )
277
+ # #=> "6621251.oak-batch.osc.edu"
278
+ # @param script [#to_s] path to the script
279
+ # @param queue [#to_s] queue to submit script to
280
+ # @param headers [Hash] pbs headers
281
+ # @param resources [Hash] pbs resources
282
+ # @param envvars [Hash] pbs environment variables
283
+ # @param qsub [Boolean] whether to submit with the qsub binary (true) or the Torque library (false)
284
+ # @return [String] the id of the job that was created
285
+ def submit_script(script, queue: nil, headers: {}, resources: {}, envvars: {}, qsub: true)
286
+ send(qsub ? :qsub_submit : :pbs_submit, script, queue, headers, resources, envvars)
287
+ end
288
+
289
+ # Submit a script expanded into a string to the batch server
290
+ # @param string [#to_s] script as a string
291
+ # @param (see #submit_script)
292
+ # @return [String] the id of the job that was created
293
+ def submit_string(string, **kwargs)
294
+ Tempfile.open('qsub.') do |f|
295
+ f.write string.to_s
296
+ f.close
297
+ submit_script(f.path, **kwargs)
298
+ end
299
+ end
300
+
301
+ private
302
+ # Submit a script using Torque library
303
+ def pbs_submit(script, queue, headers, resources, envvars)
304
+ attribs = headers.dup
305
+ attribs[ATTR[:l]] = resources.dup unless resources.empty?
306
+ attribs[ATTR[:v]] = envvars.map{|k,v| "#{k}=#{v}"}.join(",") unless envvars.empty?
307
+
308
+ connect do |cid|
309
+ attropl = Torque::Attropl.from_hash attribs
310
+ Torque.pbs_submit cid, attropl, script.to_s, queue.to_s, nil
311
+ end
312
+ end
313
+
314
+ # Submit a script using Torque binary
315
+ # NB: The binary includes many useful filters and is preferred
316
+ def qsub_submit(script, queue, headers, resources, envvars)
317
+ params = ["-q", "#{queue}@#{host}"]
318
+ params += resources.map{|k,v| ["-l", "#{k}=#{v}"]}.flatten unless resources.empty?
319
+ params += ["-v", envvars.map{|k,v| "#{k}=#{v}"}.join(",")] unless envvars.empty?
320
+ params += headers.map do |k,v|
321
+ if param = ATTR.key(k) and param.length == 1
322
+ ["-#{param}", "#{v}"]
323
+ else
324
+ ["-W", "#{k}=#{v}"]
325
+ end
326
+ end.flatten
327
+ params << script.to_s
328
+
329
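+ # Fall back to a PATH lookup of qsub when no prefix was given (prefix may be nil)
+ qsub = prefix ? prefix.join("bin", "qsub").to_s : "qsub"
+ o, e, s = Open3.capture3(qsub, *params)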
330
+ raise PBS::Error, e unless s.success?
331
+ o.chomp
332
+ end
333
+ end
334
+ end
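
The `connect` method above follows a common "open, yield, always close" pattern: every public call opens a fresh connection, runs its block, and disconnects in an `ensure` clause. Below is a minimal sketch of that pattern in isolation. `FakeTorque` and `with_connection` are hypothetical stand-ins so the example runs without libtorque, and the sketch leaves out the trailing `Torque.check_for_error` call.

```ruby
# Hypothetical stand-in for the Torque FFI layer; it only records calls.
class FakeTorque
  attr_reader :log

  def initialize
    @log = []
    @next_id = 0
  end

  # Returns a positive connection id, as a successful pbs_connect would
  def connect(host)
    @log << [:connect, host]
    @next_id += 1
  end

  def disconnect(cid)
    @log << [:disconnect, cid]
  end
end

# Open a connection, yield its id to the block, and always disconnect,
# even when the block raises. Returns the block's value.
def with_connection(torque, host)
  cid = torque.connect(host)
  raise "connection failed with code #{cid.abs}" if cid < 0
  begin
    yield cid
  ensure
    torque.disconnect(cid)
  end
end

torque = FakeTorque.new
value = with_connection(torque, "oak-batch.osc.edu") { |cid| "status via connection #{cid}" }

# A failing block still triggers the disconnect
begin
  with_connection(torque, "oak-batch.osc.edu") { |_cid| raise "boom" }
rescue RuntimeError
  # error propagates to the caller after the connection is closed
end
```

One consequence of this design is that no connection state outlives a single call, at the cost of a connect and disconnect round trip for every query.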