once-only 0.2.1 → 0.2.2
- data/README.md +59 -15
- data/VERSION +1 -1
- data/bin/once-only +34 -5
- data/lib/once-only/check.rb +29 -3
- metadata +3 -3
data/README.md
CHANGED
@@ -2,14 +2,46 @@
 
 [](http://travis-ci.org/pjotrp/once-only)
 
-Relax with PBS!
+Relax with PBS!
+
+No worries about running jobs concurrently from the command line (also
+on multi-core). Once-only is inspired by the Lisp once-only function,
+which wraps another function and calculates a result only once, based
+on the same inputs. Simply prepend your command with once-only:
+
+When running
+
+```bash
+once-only -d cluster00073 --pbs --in output.best.dnd ~/opt/paml/bin/codeml ~/paml7-8.ctl
+```
+
+This is what you want to see when the same job was executed before
+
+```bash
+**STATUS** Job 00073codemla4817 already completed!
+```
+
+This is what you see when a job is running
+
+```bash
+**STATUS** Job 00073codemla4817 is locked!
+```
+
+With PBS, this is what you want to see when a job is already in the queue
+
+```bash
+**STATUS** Job 00073codemla4817 already in queue!
+```
+
+Features
 
 * Computations only happen once
-* A completed job does not get submitted again to PBS
+* A completed job does not get submitted again (to PBS)
 * A job already in the queue does not get submitted again to PBS
 * A completed job in the PBS queue does not run again
+* A running job is locked
 * Guarantee independently executed jobs
-* Do not worry about submitting serial jobs
+* Do not worry about submitting serial jobs multiple times
 
 and coming
 
@@ -19,12 +51,18 @@ and coming
 Once-only makes a program or script run only *once*, provided the inputs don't
 change (in a functional style!). This is very useful when running a range of
 jobs on a compute cluster or GRID. It may even be useful in the context of
-webservices.
-
-
-
-
-
+webservices.
+
+Once-only makes it relaxing to run many jobs on compute clusters! A
+mistake, interruption, or even a parameter tweak does not mean
+everything has to be run again. When running jobs serially you can
+just batch submit them after getting the first results. Any missed
+jobs can be run again later. This way you can get better utilisation
+of your cores or a cluster. You can even use it as a poor man's PBS on
+your multi-core machine, or over NFS by firing up scripts
+concurrently.
+
+Examples:
 
 Instead of running a tool or script directly, such as
 
@@ -89,12 +127,6 @@ md5sum on the one-only has file, for example
 grep MD5 bio-table-ce4ceee0d2ee08ef235662c35b8238ad47fed030.txt |awk 'BEGIN { FS = "[ \t\n]+" }{ print $2,"",$3 }'|md5sum -c
 ```
 
-Once-only is inspired by the Lisp once-only function, which wraps another
-function and calculates a result only once, based on the same inputs. It is
-also inspired by the NixOS software deployment system, which guarantees
-packages are uniquely deployed, based on the source code inputs and the
-configuration at compile time.
-
 ## Installation
 
 Note: once-only is written in Ruby, but you don't need to understand
@@ -234,6 +266,18 @@ Note that files that come with a path will be stripped of their path
 before execution. When files are very large you may want to consider
 the --scratch option.
 
+### Precalculated hashes
+
+The --precalc option allows for using precalculated hash values. The
+extension says what hash to use. Example:
+
+```sh
+once-only --precalc hash.md5 /bin/cat ~/.bashrc
+```
+
+Once-only will pick up the values from 'hash.md5' and use those after
+making sure the time stamp of the hash file is more recent than that
+of the input file.
+
 ### Use the scratch disk with --scratch (nyi)
 
 watch this page
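The README's central idea, running a command only once for as long as its inputs stay the same, can be sketched in a few lines of Ruby. This is an illustration only and not the gem's code: `marker_name` and `run_once` are hypothetical helpers, and folding the command line and the file digests into a single SHA1 is an assumption made to keep the example short.

```ruby
require 'digest'
require 'tmpdir'

# Hypothetical helper (not part of once-only): derive a marker file name
# from the command line and the contents of its input files.
def marker_name(command, input_files)
  digest = Digest::SHA1.new
  digest << command
  input_files.sort.each { |fn| digest << Digest::MD5.file(fn).hexdigest }
  "once-only-#{digest.hexdigest}.txt"
end

# Hypothetical helper: yield only when no marker exists for these inputs,
# then record the marker so the same job is skipped next time.
def run_once(dir, command, input_files)
  marker = File.join(dir, marker_name(command, input_files))
  return :skipped if File.exist?(marker)
  yield
  File.write(marker, command + "\n")
  :ran
end

Dir.mktmpdir do |dir|
  input = File.join(dir, 'input.txt')
  File.write(input, "data\n")

  puts run_once(dir, 'wc input.txt', [input]) { puts 'working...' }
  puts run_once(dir, 'wc input.txt', [input]) { puts 'working...' }

  # Changing an input changes the marker name, so the job runs again.
  File.write(input, "changed\n")
  puts run_once(dir, 'wc input.txt', [input]) { puts 'working...' }
end
```

Because the marker name depends on the input contents, a parameter or data change naturally produces a new job, while an unchanged job is skipped.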
data/VERSION
CHANGED
@@ -1 +1 @@
-0.2.
+0.2.2
data/bin/once-only
CHANGED
@@ -19,10 +19,12 @@ Usage:
   --skip-regex regex   skip making checksums of filenames that match the regex (multiple allowed)
   --skip-glob glob     skip making checksums of filenames that match the glob (multiple allowed)
   --include|--in file  include input filename for making the checksums (file should exist)
+  --precalc file       use precalculated Hash values (extension .md5)
   -v                   increase verbosity
   -q                   run quietly
   --debug              give debug information
   --dry-run            do not execute command
+  --ignore-lock        ignore locked files (they expire normally after 5 hours)
   --force              force execute command
 
 Examples:
@@ -67,7 +69,7 @@ def exit_error errval = 1, msg = nil
 end
 
 def parse_args(args)
-  options = { :skip => [], :skip_regex => [], :skip_glob => [], :include => [] }
+  options = { :precalc => [], :skip => [], :skip_regex => [], :skip_glob => [], :include => [] }
 
   consume = lambda { |args|
     if not args[0]
@@ -112,6 +114,10 @@ def parse_args(args)
     when '--copy'
       options[:copy] = true
       consume.call(args[1..-1])
+    when '--precalc'
+      p args
+      options[:precalc] << args[1]
+      consume.call(args[2..-1])
     when '-h', '--help'
       print USAGE
       exit 1
@@ -127,6 +133,9 @@ def parse_args(args)
     when '--dry-run'
       options[:dry_run] = true
       consume.call(args[1..-1])
+    when '--ignore-lock'
+      options[:ignore_lock] = true
+      consume.call(args[1..-1])
     when '--force'
       options[:force] = true
       consume.call(args[1..-1])
@@ -158,6 +167,10 @@ once_only_args = OnceOnly::Check.drop_pbs_option(once_only_args)
 once_only_args = OnceOnly::Check.drop_dir_option(once_only_args)
 once_only_command = once_only_args.join(' ')
 
+# --- Fetch the pre-calculated checksums
+precalc = OnceOnly::Check.precalculated_checksums(options[:precalc])
+
+# --- Calculate the checksums for the items in the list
 command = args.join(' ')
 command_sorted = args.sort.join(' ')
 command_sha1 = OnceOnly::Check::calc_checksum(command_sorted)
@@ -173,6 +186,7 @@ base_dir = Dir.pwd
 executable = args[0]
 args = args[1..-1] if options[:skip_exe]
 
+# Handle the file list
 file_list = OnceOnly::Check::get_file_list(args)
 options[:skip_regex].each { |regex|
   file_list = OnceOnly::Check::filter_file_list(file_list,regex)
@@ -186,16 +200,29 @@ OnceOnly::Check::check_files_exist(options[:include])
 file_list += options[:include]
 file_list = file_list.uniq
 
-checksums = OnceOnly::Check::calc_file_checksums(file_list)
+checksums = OnceOnly::Check::calc_file_checksums(file_list,precalc)
 checksums.push ['SHA1',command_sha1,command_sorted] if not options[:skip_cli]
 
 # ---- Create filenames
 once_only_filename = OnceOnly::Check::make_once_filename(checksums,File.basename(executable))
 $stderr.print "Check file name ",once_only_filename,"\n" if options[:verbose]
 error_filename = once_only_filename + '.err'
-tag_filename = once_only_filename + '.run'
 $stderr.print "**STATUS** Job file exists ",once_only_filename,"!\n" if options[:debug] and File.exist?(once_only_filename)
 
+# ---- The 'run' file is used to prepare for a job
+tag_filename = once_only_filename + '.run'
+
+# ---- The 'lock' file is used when the job is running
+lock_filename = once_only_filename + '.lock'
+if File.exist?(lock_filename) and not options[:force] and not options[:ignore_lock]
+  $stderr.print "**STATUS** Job is locked with #{lock_filename} '#{original_commands}'!\n" if not options[:quiet]
+  if File.mtime(lock_filename) < Time.now - 18000
+    $stderr.print "**STATUS ** Lock is stale, retrying now\n"
+  else
+    exit 0
+  end
+end
+
 # ---- Create job name
 dirname = File.basename(Dir.pwd).rjust(8,"-") # make sure it is long enough
 
@@ -207,7 +234,7 @@ if options[:copy]
   copy_dir = base_dir + '/' + File.basename(once_only_filename,".txt")
 end
 
-if options[:force] or not File.exist?(once_only_filename)
+if options[:force] or not File.exist?(once_only_filename)
   $stderr.print "Running #{command}\n" if not options[:quiet]
   OnceOnly::Check::write_file(tag_filename,checksums)
   if options[:pbs]
@@ -240,6 +267,7 @@ if options[:force] or not File.exist?(once_only_filename)
   else
     # --- Run on command line
     if !options[:dry_run]
+      File.open(lock_filename, "w") {}
       success =
         if options[:copy]
           exit_error(1,"Directory #{copy_dir} already exists!") if File.directory?(copy_dir)
@@ -274,6 +302,7 @@ if options[:force] or not File.exist?(once_only_filename)
           system(command)
         end
       Dir.chdir(base_dir) if options[:copy]
+      File.unlink(lock_filename)
       if not success
         OnceOnly::Check::write_file(error_filename,checksums)
         File.unlink(tag_filename) if File.exist?(tag_filename)
@@ -283,7 +312,7 @@ if options[:force] or not File.exist?(once_only_filename)
         File.unlink(error_filename) if File.exist?(error_filename)
         OnceOnly::Check::write_file(once_only_filename,checksums)
         File.unlink(tag_filename) if File.exist?(tag_filename)
-
+      end
     end
   end
 else
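The lock handling added in this file (create a `.lock` file before running, treat a lock older than 18000 seconds as stale, remove it afterwards) can be sketched in isolation as follows. This is a minimal sketch and not the script's code: `lock_state` and `with_lock` are hypothetical names, and the exclusive-create flag and the `ensure` cleanup are assumptions that go slightly beyond what the diff does.

```ruby
require 'tmpdir'

STALE_AFTER = 5 * 60 * 60 # seconds; the same 18000 used in the diff

# Hypothetical helper: classify a lock file as :free, :locked or :stale.
def lock_state(lock_filename, now = Time.now)
  return :free unless File.exist?(lock_filename)
  File.mtime(lock_filename) < now - STALE_AFTER ? :stale : :locked
end

# Hypothetical helper: run the block while holding the lock.
# Returns false without running when a fresh lock is already present.
def with_lock(lock_filename)
  File.unlink(lock_filename) if lock_state(lock_filename) == :stale
  begin
    # File::EXCL makes creation fail if another process got there first.
    File.open(lock_filename, File::WRONLY | File::CREAT | File::EXCL) {}
  rescue Errno::EEXIST
    return false
  end
  begin
    yield
  ensure
    # Remove the lock even when the job raises.
    File.unlink(lock_filename) if File.exist?(lock_filename)
  end
  true
end

Dir.mktmpdir do |dir|
  lock = File.join(dir, 'job.txt.lock')
  with_lock(lock) { puts "state while running: #{lock_state(lock)}" }
  puts "state afterwards: #{lock_state(lock)}"
end
```

Releasing the lock in an `ensure` clause means an exception raised by the job does not leave a lock behind; a lock left by a killed process is still covered by the stale check.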
data/lib/once-only/check.rb
CHANGED
@@ -31,10 +31,36 @@ module OnceOnly
       list.map { |name| ( Dir.glob(glob).index(name) ? nil : name ) }.compact
     end
 
-    #
-    def Check::
+    # Return a hash of files with their hash type, hash value and check time
+    def Check::precalculated_checksums(files)
+      precalc = {}
+      files.each do | fn |
+        dir = File.dirname(fn)
+        raise "Precalculated hash file should have .md5 extension!" if fn !~ /\.md5$/
+        t = File.mtime(fn)
+        File.open(fn).each { |s|
+          a = s.split
+          checkfn = File.expand_path(a[1],dir)
+          precalc[checkfn] = { type: 'MD5', hash: a[0], time: t }
+        }
+      end
+      precalc
+    end
+
+    # Calculate the checksums for each file in the list and return a list
+    # of array - each row containing the Hash type (MD5), the value and the (relative)
+    # file path.
+    def Check::calc_file_checksums list, precalc
       list.map { |fn|
-
+        # First see if fn is in the precalculated list
+        fqn = File.expand_path(fn)
+        if precalc[fqn] and File.mtime(fqn) < precalc[fqn][:time]
+          $stderr.print "Precalculated ",fn,"\n"
+          rec = precalc[fqn]
+          [rec[:type],rec[:hash],fqn]
+        else
+          ['MD5'] + `/usr/bin/md5sum #{fqn}`.split
+        end
       }
     end
 
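The precalculated-checksum lookup above can be tried out on its own with the sketch below. It is a standalone approximation under stated assumptions: `read_md5_file` and `checksum_for` are hypothetical names, the hash file is assumed to be in `md5sum` output format, and Ruby's `Digest::MD5` stands in for the call to `/usr/bin/md5sum` so the example has no external dependency.

```ruby
require 'digest'
require 'tmpdir'

# Hypothetical helper: parse an md5sum-style file ("<hash>  <name>" per line)
# into a Hash keyed by absolute path, remembering the hash file's mtime.
def read_md5_file(md5_fn)
  dir   = File.dirname(md5_fn)
  stamp = File.mtime(md5_fn)
  table = {}
  File.foreach(md5_fn) do |line|
    hash, name = line.strip.split(/\s+/, 2)
    next if name.nil?
    name = name.sub(/\A\*/, '') # md5sum prefixes binary-mode names with '*'
    table[File.expand_path(name, dir)] = { type: 'MD5', hash: hash, time: stamp }
  end
  table
end

# Hypothetical helper: reuse the stored hash only when the hash file is
# newer than the data file; otherwise recompute the digest.
def checksum_for(fn, precalc)
  fqn = File.expand_path(fn)
  rec = precalc[fqn]
  if rec && File.mtime(fqn) < rec[:time]
    [rec[:type], rec[:hash], fqn]
  else
    ['MD5', Digest::MD5.file(fqn).hexdigest, fqn]
  end
end

Dir.mktmpdir do |dir|
  data = File.join(dir, 'data.txt')
  File.write(data, "hello\n")
  past = Time.now - 60
  File.utime(past, past, data) # make the data file older than the hash file

  md5_fn = File.join(dir, 'hash.md5')
  File.write(md5_fn, "#{Digest::MD5.file(data).hexdigest}  data.txt\n")

  p checksum_for(data, read_md5_file(md5_fn))
end
```

Comparing modification times is a cheap guard against stale hashes: if the data file was touched after the hash file was written, the stored value is ignored and the digest is recomputed.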
metadata
CHANGED
@@ -1,7 +1,7 @@
 --- !ruby/object:Gem::Specification
 name: once-only
 version: !ruby/object:Gem::Version
-  version: 0.2.
+  version: 0.2.2
 prerelease:
 platform: ruby
 authors:
@@ -9,7 +9,7 @@ authors:
 autorequire:
 bindir: bin
 cert_chain: []
-date: 2013-
+date: 2013-11-02 00:00:00.000000000 Z
 dependencies:
 - !ruby/object:Gem::Dependency
   name: rspec
@@ -118,7 +118,7 @@ required_ruby_version: !ruby/object:Gem::Requirement
       version: '0'
       segments:
       - 0
-      hash: -
+      hash: -2270452427765269751
 required_rubygems_version: !ruby/object:Gem::Requirement
   none: false
   requirements: