once-only 0.2.1 → 0.2.2
- data/README.md +59 -15
- data/VERSION +1 -1
- data/bin/once-only +34 -5
- data/lib/once-only/check.rb +29 -3
- metadata +3 -3
data/README.md
CHANGED
@@ -2,14 +2,46 @@
 
 [![Build Status](https://secure.travis-ci.org/pjotrp/once-only.png)](http://travis-ci.org/pjotrp/once-only)
 
-Relax with PBS!
+Relax with PBS!
+
+No worries about running jobs concurrently from the command line (also
+on multi-core). Once-only is inspired by the Lisp once-only function,
+which wraps another function and calculates a result only once, based
+on the same inputs. Simply prepend your command with once-only:
+
+When running
+
+```bash
+once-only -d cluster00073 --pbs --in output.best.dnd ~/opt/paml/bin/codeml ~/paml7-8.ctl
+```
+
+This is what you want to see when same the job was executed before
+
+```bash
+**STATUS** Job 00073codemla4817 already completed!
+```
+
+This is what you see when a job is running
+
+```bash
+**STATUS** Job 00073codemla4817 is locked!
+```
+
+With PBS, this is what you want to see when a job is already in the queue
+
+```bash
+**STATUS** Job 00073codemla4817 already in queue!
+```
+
+Features
 
 * Computations only happen once
-* A completed job does not get submitted again to PBS
+* A completed job does not get submitted again (to PBS)
 * A job already in the queue does not get submitted again to PBS
 * A completed job in the PBS queue does not run again
+* A running job is locked
 * Guarantee independently executed jobs
-* Do not worry about submitting serial jobs
+* Do not worry about submitting serial jobs multiple times
 
 and coming
 
@@ -19,12 +51,18 @@ and coming
 Once-only makes a program or script run only *once*, provided the inputs don't
 change (in a functional style!). This is very useful when running a range of
 jobs on a compute cluster or GRID. It may even be useful in the context of
-webservices.
-
-
-
-
-
+webservices.
+
+Once-only makes it relaxed to run many jobs on compute clusters! A
+mistake, interruption, or even a parameter tweak, does not mean
+everything has to be run again. When running jobs serially you can
+just batch submit them after getting the first results. Any missed
+jobs can be run later again. This way you can get better utilisation
+of your cores or a cluster. You can even use it as a poor-mans PBS on
+your multi-core machine, or over NFS by firing up scripts
+concurrently.
+
+Examples:
 
 Instead of running a tool or script directly, such as
 
@@ -89,12 +127,6 @@ md5sum on the one-only has file, for example
 grep MD5 bio-table-ce4ceee0d2ee08ef235662c35b8238ad47fed030.txt |awk 'BEGIN { FS = "[ \t\n]+" }{ print $2,"",$3 }'|md5sum -c
 ```
 
-Once-only is inspired by the Lisp once-only function, which wraps another
-function and calculates a result only once, based on the same inputs. It is
-also inspired by the NixOS software deployment system, which guarantees
-packages are uniquely deployed, based on the source code inputs and the
-configuration at compile time.
-
 ## Installation
 
 Note: once-only is written in Ruby, but you don't need to understand
@@ -234,6 +266,18 @@ Note that files that come with a path will be stripped of their path
 before execution. When files are very large you may want to consider
 the --scratch option.
 
+### Precalculated hashes
+
+The --precalc option allows for using precalculated hash values. The
+extension says what hash to use. Example:
+
+```sh
+once-only --precalc hash.md5 /bin/cat ~/.bashrc
+```
+
+Once-only will pick up the values from 'hash.md5' and use those after
+making sure the time stamp of the hash file is most recent.
+
 ### Use the scratch disk with --scratch (nyi)
 
 watch this page
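The README text above does not spell out what `hash.md5` contains. Judging from the parser added to `check.rb` in this release (it splits each line on whitespace and reads the hash first and the file name second), an md5sum-style layout appears to be expected. A minimal sketch for producing such a file from Ruby follows; `write_md5_file` is a hypothetical helper and not part of once-only.

```ruby
require 'digest'
require 'tmpdir'

# Hypothetical helper (not part of once-only): write one md5sum-style line
# ("<hex digest>  <file name>") per input file into hash_file. File names are
# stored without their directory, so they resolve relative to the hash file.
def write_md5_file(hash_file, files)
  File.open(hash_file, 'w') do |out|
    files.each do |fn|
      out.puts "#{Digest::MD5.file(fn).hexdigest}  #{File.basename(fn)}"
    end
  end
end

# Small demonstration in a temporary directory
Dir.mktmpdir do |dir|
  input = File.join(dir, 'input.txt')
  File.write(input, "hello\n")
  hash_file = File.join(dir, 'hash.md5')
  write_md5_file(hash_file, [input])
  puts File.read(hash_file)
end
```

Because the parser splits on whitespace, file names that contain spaces would probably not survive the round trip. The hash file also has to be written after the data files, since the time stamp check only accepts a hash file that is newer than the file it describes.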
data/VERSION
CHANGED
@@ -1 +1 @@
-0.2.1
+0.2.2
data/bin/once-only
CHANGED
@@ -19,10 +19,12 @@ Usage:
     --skip-regex regex   skip making checksumes of filenames that match the regex (multiple allowed)
     --skip-glob regex    skip making checksumes of filenames that match the glob (multiple allowed)
     --include|--in file  include input filename for making the checksums (file should exist)
+    --precalc file       use precalculated Hash values (extension .md5)
     -v                   increase verbosity
     -q                   run quietly
     --debug              give debug information
     --dry-run            do not execute command
+    --ignore-lock        ignore locked files (they expire normally after 5 hours)
     --force              force execute command
 
 Examples:
@@ -67,7 +69,7 @@ def exit_error errval = 1, msg = nil
 end
 
 def parse_args(args)
-  options = { :skip => [], :skip_regex => [], :skip_glob => [], :include => [] }
+  options = { :precalc => [], :skip => [], :skip_regex => [], :skip_glob => [], :include => [] }
 
   consume = lambda { |args|
     if not args[0]
@@ -112,6 +114,10 @@ def parse_args(args)
       when '--copy'
         options[:copy] = true
         consume.call(args[1..-1])
+      when '--precalc'
+        p args
+        options[:precalc] << args[1]
+        consume.call(args[2..-1])
       when '-h', '--help'
         print USAGE
         exit 1
@@ -127,6 +133,9 @@ def parse_args(args)
       when '--dry-run'
         options[:dry_run] = true
         consume.call(args[1..-1])
+      when '--ignore-lock'
+        options[:ignore_lock] = true
+        consume.call(args[1..-1])
       when '--force'
         options[:force] = true
         consume.call(args[1..-1])
@@ -158,6 +167,10 @@ once_only_args = OnceOnly::Check.drop_pbs_option(once_only_args)
 once_only_args = OnceOnly::Check.drop_dir_option(once_only_args)
 once_only_command = once_only_args.join(' ')
 
+# --- Fetch the pre-calculated checksums
+precalc = OnceOnly::Check.precalculated_checksums(options[:precalc])
+
+# --- Calculate the checksums for the items in the list
 command = args.join(' ')
 command_sorted = args.sort.join(' ')
 command_sha1 = OnceOnly::Check::calc_checksum(command_sorted)
@@ -173,6 +186,7 @@ base_dir = Dir.pwd
 executable = args[0]
 args = args[1..-1] if options[:skip_exe]
 
+# Handle the file list
 file_list = OnceOnly::Check::get_file_list(args)
 options[:skip_regex].each { |regex|
   file_list = OnceOnly::Check::filter_file_list(file_list,regex)
@@ -186,16 +200,29 @@ OnceOnly::Check::check_files_exist(options[:include])
 file_list += options[:include]
 file_list = file_list.uniq
 
-checksums = OnceOnly::Check::calc_file_checksums(file_list)
+checksums = OnceOnly::Check::calc_file_checksums(file_list,precalc)
 checksums.push ['SHA1',command_sha1,command_sorted] if not options[:skip_cli]
 
 # ---- Create filenames
 once_only_filename = OnceOnly::Check::make_once_filename(checksums,File.basename(executable))
 $stderr.print "Check file name ",once_only_filename,"\n" if options[:verbose]
 error_filename = once_only_filename + '.err'
-tag_filename = once_only_filename + '.run'
 $stderr.print "**STATUS** Job file exists ",once_only_filename,"!\n" if options[:debug] and File.exist?(once_only_filename)
 
+# ---- The 'run' file is used to prepare for a job
+tag_filename = once_only_filename + '.run'
+
+# ---- The 'lock' file is used when the job is running
+lock_filename = once_only_filename + '.lock'
+if File.exist?(lock_filename) and not options[:force] and not options[:ignore_lock]
+  $stderr.print "**STATUS** Job is locked with #{lock_filename} '#{original_commands}'!\n" if not options[:quiet]
+  if File.mtime(lock_filename) < Time.now - 18000
+    $stderr.print "**STATUS ** Lock is stale, retrying now\n"
+  else
+    exit 0
+  end
+end
+
 # ---- Create job name
 dirname = File.basename(Dir.pwd).rjust(8,"-") # make sure it is long enough
 
@@ -207,7 +234,7 @@ if options[:copy]
   copy_dir = base_dir + '/' + File.basename(once_only_filename,".txt")
 end
 
-if options[:force] or not File.exist?(once_only_filename)
+if options[:force] or not File.exist?(once_only_filename)
   $stderr.print "Running #{command}\n" if not options[:quiet]
   OnceOnly::Check::write_file(tag_filename,checksums)
   if options[:pbs]
@@ -240,6 +267,7 @@ if options[:force] or not File.exist?(once_only_filename)
   else
     # --- Run on command line
     if !options[:dry_run]
+      File.open(lock_filename, "w") {}
       success =
         if options[:copy]
           exit_error(1,"Directory #{copy_dir} already exists!") if File.directory?(copy_dir)
@@ -274,6 +302,7 @@ if options[:force] or not File.exist?(once_only_filename)
           system(command)
         end
       Dir.chdir(base_dir) if options[:copy]
+      File.unlink(lock_filename)
       if not success
         OnceOnly::Check::write_file(error_filename,checksums)
         File.unlink(tag_filename) if File.exist?(tag_filename)
@@ -283,7 +312,7 @@ if options[:force] or not File.exist?(once_only_filename)
         File.unlink(error_filename) if File.exist?(error_filename)
         OnceOnly::Check::write_file(once_only_filename,checksums)
         File.unlink(tag_filename) if File.exist?(tag_filename)
-
+      end
     end
   end
 else
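The lock handling added to `bin/once-only` in this release boils down to three states: no lock file, a fresh lock (the job is running), and a lock older than 18000 seconds that is treated as stale. A minimal sketch of that logic follows; `lock_state`, `with_lock` and `LOCK_MAX_AGE` are hypothetical names that do not exist in once-only. The `ensure` clause is an assumption on top of the diff: it removes the lock even when the wrapped command raises, whereas the code above only unlinks after `system` returns.

```ruby
require 'tmpdir'

LOCK_MAX_AGE = 18000 # seconds (5 hours), the threshold used in the diff above

# Hypothetical helper: classify a lock file as :free, :locked or :stale.
def lock_state(lock_filename, now = Time.now, max_age = LOCK_MAX_AGE)
  return :free unless File.exist?(lock_filename)
  File.mtime(lock_filename) < now - max_age ? :stale : :locked
end

# Hypothetical helper: create the lock, run the block, and remove the lock
# again even when the block raises.
def with_lock(lock_filename)
  File.open(lock_filename, 'w') {}
  yield
ensure
  File.unlink(lock_filename) if File.exist?(lock_filename)
end

# Small demonstration in a temporary directory
Dir.mktmpdir do |dir|
  lock = File.join(dir, 'job.txt.lock')
  puts lock_state(lock)                      # before the job starts
  with_lock(lock) { puts lock_state(lock) }  # while the job runs
  puts lock_state(lock)                      # after the job has finished
end
```

Checking for the lock and then creating it are two separate steps, so two processes started at the same moment could both pass the check. Opening the file with `File::CREAT | File::EXCL` would be one way to narrow that window.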
data/lib/once-only/check.rb
CHANGED
@@ -31,10 +31,36 @@ module OnceOnly
       list.map { |name| ( Dir.glob(glob).index(name) ? nil : name ) }.compact
     end
 
-    #
-    def Check::
+    # Return a hash of files with their hash type, hash value and check time
+    def Check::precalculated_checksums(files)
+      precalc = {}
+      files.each do | fn |
+        dir = File.dirname(fn)
+        raise "Precalculated hash file should have .md5 extension!" if fn !~ /\.md5$/
+        t = File.mtime(fn)
+        File.open(fn).each { |s|
+          a = s.split
+          checkfn = File.expand_path(a[1],dir)
+          precalc[checkfn] = { type: 'MD5', hash: a[0], time: t }
+        }
+      end
+      precalc
+    end
+
+    # Calculate the checksums for each file in the list and return a list
+    # of array - each row containing the Hash type (MD5), the value and the (relative)
+    # file path.
+    def Check::calc_file_checksums list, precalc
       list.map { |fn|
-
+        # First see if fn is in the precalculated list
+        fqn = File.expand_path(fn)
+        if precalc[fqn] and File.mtime(fqn) < precalc[fqn][:time]
+          $stderr.print "Precalculated ",fn,"\n"
+          rec = precalc[fqn]
+          [rec[:type],rec[:hash],fqn]
+        else
+          ['MD5'] + `/usr/bin/md5sum #{fqn}`.split
+        end
       }
     end
 
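The two methods in this hunk can be tried outside the gem. The sketch below follows the same steps (parse the `.md5` file, key the entries by absolute path, and trust an entry only while the data file is older than the hash file) with two deliberate differences that are assumptions, not the gem's behaviour: it computes missing checksums in-process with `Digest::MD5` instead of shelling out to `/usr/bin/md5sum`, and it splits each line into two fields only, so file names with spaces are kept intact. `file_checksum` is a hypothetical name.

```ruby
require 'digest'
require 'tmpdir'

# Parse md5sum-style files into { absolute path => { type:, hash:, time: } },
# following Check::precalculated_checksums in the diff above.
def precalculated_checksums(files)
  precalc = {}
  files.each do |fn|
    raise "Precalculated hash file should have .md5 extension!" if fn !~ /\.md5$/
    dir = File.dirname(fn)
    t = File.mtime(fn)
    File.foreach(fn) do |line|
      hash, name = line.split(' ', 2) # two fields: digest, rest of the line
      next if name.nil?               # skip blank or malformed lines
      checkfn = File.expand_path(name.strip, dir)
      precalc[checkfn] = { type: 'MD5', hash: hash, time: t }
    end
  end
  precalc
end

# Hypothetical helper: return [type, hash, path], using the precalculated
# value only when the data file is older than the hash file.
def file_checksum(fn, precalc)
  fqn = File.expand_path(fn)
  rec = precalc[fqn]
  if rec && File.mtime(fqn) < rec[:time]
    [rec[:type], rec[:hash], fqn]
  else
    ['MD5', Digest::MD5.file(fqn).hexdigest, fqn]
  end
end

# Small demonstration in a temporary directory
Dir.mktmpdir do |dir|
  data = File.join(dir, 'data.txt')
  File.write(data, "hello\n")
  past = Time.now - 100
  File.utime(past, past, data) # make the data file older than the hash file
  md5 = File.join(dir, 'hash.md5')
  File.write(md5, "#{Digest::MD5.file(data).hexdigest}  data.txt\n")
  p file_checksum(data, precalculated_checksums([md5]))
end
```

Comparing modification times is a cheap freshness check rather than a proof: a data file that is rewritten with an older time stamp would still be matched against the stale hash.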
metadata
CHANGED
@@ -1,7 +1,7 @@
 --- !ruby/object:Gem::Specification
 name: once-only
 version: !ruby/object:Gem::Version
-  version: 0.2.1
+  version: 0.2.2
 prerelease:
 platform: ruby
 authors:
@@ -9,7 +9,7 @@ authors:
 autorequire:
 bindir: bin
 cert_chain: []
-date: 2013-
+date: 2013-11-02 00:00:00.000000000 Z
 dependencies:
 - !ruby/object:Gem::Dependency
   name: rspec
@@ -118,7 +118,7 @@ required_ruby_version: !ruby/object:Gem::Requirement
       version: '0'
       segments:
       - 0
-      hash: -
+      hash: -2270452427765269751
 required_rubygems_version: !ruby/object:Gem::Requirement
   none: false
   requirements: