bio-gemma-wrapper 0.99.1 → 0.99.5
- checksums.yaml +4 -4
- data/README.md +26 -10
- data/VERSION +1 -1
- data/bin/gemma-wrapper +113 -34
- data/gemma-wrapper.gemspec +2 -1
- data/lib/lock.rb +95 -0
- metadata +6 -6
checksums.yaml
CHANGED
@@ -1,7 +1,7 @@
|
|
1
1
|
---
|
2
2
|
SHA256:
|
3
|
-
metadata.gz:
|
4
|
-
data.tar.gz:
|
3
|
+
metadata.gz: 5ec477c7560ae55b6d7c8a74b5e46cd95586a87f02ac698c540b2d6cc40c4392
|
4
|
+
data.tar.gz: bf79dc493baa99efd2e20a298d702ece9973c3fd377773cc80f0867c0a132ae5
|
5
5
|
SHA512:
|
6
|
-
metadata.gz:
|
7
|
-
data.tar.gz:
|
6
|
+
metadata.gz: 0ca0c04ef86c22d332c10f21b66ef24aab7fd3dbfd1f2db67f78309ba8a5af47a9263bf94aaf2afeda021cdace3ed042af6f3a5ebf6628714a78ad18b494284f
|
7
|
+
data.tar.gz: b3692f74bb4437f70d2671fe48e1ec9f7c44c64b4553c93b511cc648e157e6a9f5b0f66acce2f07721a74a7ce20a1993c7eb2d2a74eb362c786cd7ee42eeab16
|
data/README.md
CHANGED
@@ -8,11 +8,12 @@ Nat. Genet., 2016)](cfw.gif)
|
|
8
8
|
## Introduction
|
9
9
|
|
10
10
|
Gemma-wrapper allows running GEMMA with LOCO, GEMMA with caching,
|
11
|
-
GEMMA in parallel (now the default), and GEMMA on
|
12
|
-
is used to run GEMMA as part of the
|
13
|
-
environment.
|
11
|
+
GEMMA in parallel (now the default with LOCO), and GEMMA on
|
12
|
+
PBS. Gemma-wrapper is used to run GEMMA as part of the
|
13
|
+
https://genenetwork.org/ environment.
|
14
14
|
|
15
|
-
Note that gemma-wrapper is projected to be integrated
|
15
|
+
Note that a version of gemma-wrapper is projected to be integrated
|
16
|
+
into gemma itself.
|
16
17
|
|
17
18
|
GEMMA is a software toolkit for fast application of linear mixed
|
18
19
|
models (LMMs) and related models to genome-wide association studies
|
@@ -29,6 +30,14 @@ does a pass-through of all standard GEMMA invocation switches. On
|
|
29
30
|
return gemma-wrapper can return a JSON object (--json) which is
|
30
31
|
useful for web-services.
|
31
32
|
|
33
|
+
## Performance
|
34
|
+
|
35
|
+
LOCO runs in parallel by default which is at least a 5x performance
|
36
|
+
improvement on a machine with enough cores. GEMMA without LOCO,
|
37
|
+
however, does not run in parallel by default. Performance
|
38
|
+
improvements with the parallel implementation for LOCO and non-LOCO
|
39
|
+
can be viewed [here](./test/performance/releases.gmi).
|
40
|
+
|
32
41
|
## Installation
|
33
42
|
|
34
43
|
Prerequisites are
|
@@ -53,15 +62,19 @@ and it will render something like
|
|
53
62
|
Usage: gemma-wrapper [options] -- [gemma-options]
|
54
63
|
--permutate n Permutate # times by shuffling phenotypes
|
55
64
|
--permute-phenotypes filen Phenotypes to be shuffled in permutations
|
56
|
-
--loco
|
65
|
+
--loco Run full leave-one-chromosome-out (LOCO)
|
66
|
+
--chromosomes [1,2,3] Run specific chromosomes
|
57
67
|
--input filen JSON input variables (used for LOCO)
|
58
68
|
--cache-dir path Use a cache directory
|
59
69
|
--json Create output file in JSON format
|
60
|
-
--force Force computation
|
61
|
-
--
|
70
|
+
--force Force computation (override cache)
|
71
|
+
--parallel Run jobs in parallel
|
72
|
+
--no-parallel Do not run jobs in parallel
|
73
|
+
--slurm[=opts] Use slurm PBS for submitting jobs
|
62
74
|
--q, --quiet Run quietly
|
63
75
|
-v, --verbose Run verbosely
|
64
|
-
|
76
|
+
-d, --debug Show debug messages and keep intermediate output
|
77
|
+
--dry-run Show commands, but don't execute
|
65
78
|
-- Anything after gets passed to GEMMA
|
66
79
|
|
67
80
|
-h, --help display this help and exit
|
@@ -99,6 +112,7 @@ the data files are found):
|
|
99
112
|
gemma-wrapper -- \
|
100
113
|
-g test/data/input/BXD_geno.txt.gz \
|
101
114
|
-p test/data/input/BXD_pheno.txt \
|
115
|
+
-a test/data/input/BXD_snps.txt \
|
102
116
|
-gk \
|
103
117
|
-debug
|
104
118
|
|
@@ -116,6 +130,7 @@ You can also get JSON output on STDOUT by providing the --json switch
|
|
116
130
|
gemma-wrapper --json -- \
|
117
131
|
-g test/data/input/BXD_geno.txt.gz \
|
118
132
|
-p test/data/input/BXD_pheno.txt \
|
133
|
+
-a test/data/input/BXD_snps.txt \
|
119
134
|
-gk \
|
120
135
|
-debug > K.json
|
121
136
|
|
@@ -133,6 +148,7 @@ default. If you want something else provide a --cache-dir, e.g.
|
|
133
148
|
gemma-wrapper --cache-dir ~/.gemma-cache -- \
|
134
149
|
-g test/data/input/BXD_geno.txt.gz \
|
135
150
|
-p test/data/input/BXD_pheno.txt \
|
151
|
+
-a test/data/input/BXD_snps.txt \
|
136
152
|
-gk \
|
137
153
|
-debug
|
138
154
|
|
@@ -143,7 +159,7 @@ will store K in ~/.gemma-cache.
|
|
143
159
|
Run the LMM using the K's captured earlier in K.json using the --input
|
144
160
|
switch
|
145
161
|
|
146
|
-
gemma-wrapper --json --
|
162
|
+
gemma-wrapper --json --input K.json -- \
|
147
163
|
-g test/data/input/BXD_geno.txt.gz \
|
148
164
|
-p test/data/input/BXD_pheno.txt \
|
149
165
|
-c test/data/input/BXD_covariates2.txt \
|
@@ -163,7 +179,7 @@ https://github.com/genetics-statistics/GEMMA/issues/46). To loop all
|
|
163
179
|
chromosomes first create all K's with
|
164
180
|
|
165
181
|
gemma-wrapper --json \
|
166
|
-
--loco
|
182
|
+
--loco -- \
|
167
183
|
-g test/data/input/BXD_geno.txt.gz \
|
168
184
|
-p test/data/input/BXD_pheno.txt \
|
169
185
|
-a test/data/input/BXD_snps.txt \
|
data/VERSION
CHANGED
@@ -1 +1 @@
|
|
1
|
-
0.99.
|
1
|
+
0.99.5
|
data/bin/gemma-wrapper
CHANGED
@@ -14,12 +14,12 @@ GEMMA wrapper example:
|
|
14
14
|
gemma-wrapper -- \\
|
15
15
|
-g test/data/input/BXD_geno.txt.gz \\
|
16
16
|
-p test/data/input/BXD_pheno.txt \\
|
17
|
+
-a test/data/input/BXD_snps.txt \
|
17
18
|
-gk
|
18
19
|
|
19
20
|
LOCO K computation with caching and JSON output
|
20
21
|
|
21
|
-
gemma-wrapper --json \\
|
22
|
-
--loco 1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,X -- \\
|
22
|
+
gemma-wrapper --json --loco -- \\
|
23
23
|
-g test/data/input/BXD_geno.txt.gz \\
|
24
24
|
-p test/data/input/BXD_pheno.txt \\
|
25
25
|
-a test/data/input/BXD_snps.txt \\
|
@@ -41,7 +41,7 @@ Gemma gets used from the path. You can override by setting
|
|
41
41
|
"
|
42
42
|
# These are used for testing compatibility with the gemma tool
|
43
43
|
GEMMA_V_MAJOR = 98
|
44
|
-
GEMMA_V_MINOR =
|
44
|
+
GEMMA_V_MINOR = 4
|
45
45
|
|
46
46
|
basepath = File.dirname(File.dirname(__FILE__))
|
47
47
|
$: << File.join(basepath,'lib')
|
@@ -71,12 +71,15 @@ require 'optparse'
|
|
71
71
|
require 'tempfile'
|
72
72
|
require 'tmpdir'
|
73
73
|
|
74
|
+
require 'lock'
|
75
|
+
|
74
76
|
split_at = ARGV.index('--')
|
77
|
+
|
75
78
|
if split_at
|
76
79
|
gemma_args = ARGV[split_at+1..-1]
|
77
80
|
end
|
78
81
|
|
79
|
-
options = { show_help: false, source: 'https://github.com/genetics-statistics/gemma-wrapper', version: version+' (Pjotr Prins)', date: Time.now.to_s, gemma_command: gemma_command, cache_dir: Dir.tmpdir(), quiet: false, parallel:
|
82
|
+
options = { show_help: false, source: 'https://github.com/genetics-statistics/gemma-wrapper', version: version+' (Pjotr Prins)', date: Time.now.to_s, gemma_command: gemma_command, cache_dir: Dir.tmpdir(), quiet: false, permute_phenotypes: false, parallel: nil }
|
80
83
|
|
81
84
|
opts = OptionParser.new do |o|
|
82
85
|
o.banner = "\nUsage: #{File.basename($0)} [options] -- [gemma-options]"
|
@@ -91,8 +94,12 @@ opts = OptionParser.new do |o|
|
|
91
94
|
raise "Phenotype input file #{phenotypes} does not exist" if !File.exist?(phenotypes)
|
92
95
|
end
|
93
96
|
|
94
|
-
o.on('--loco
|
95
|
-
options[:loco] =
|
97
|
+
o.on('--loco', 'Run full leave-one-chromosome-out (LOCO)') do |b|
|
98
|
+
options[:loco] = b
|
99
|
+
end
|
100
|
+
|
101
|
+
o.on('--chromosomes [1,2,3]',Array,'Run specific chromosomes') do |lst|
|
102
|
+
options[:chromosomes] = lst
|
96
103
|
end
|
97
104
|
|
98
105
|
o.on('--input filen',String, 'JSON input variables (used for LOCO)') do |filen|
|
@@ -112,6 +119,10 @@ opts = OptionParser.new do |o|
|
|
112
119
|
options[:force] = true
|
113
120
|
end
|
114
121
|
|
122
|
+
o.on("--parallel", "Run jobs in parallel") do |b|
|
123
|
+
options[:parallel] = true
|
124
|
+
end
|
125
|
+
|
115
126
|
o.on("--no-parallel", "Do not run jobs in parallel") do |b|
|
116
127
|
options[:parallel] = false
|
117
128
|
end
|
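Note on the option handling in this hunk: together with the `parallel: nil` default, the separate `--parallel` and `--no-parallel` switches give a three-state setting (unset, forced on, forced off). The sketch below illustrates that pattern in isolation. `parse_wrapper_options` is a hypothetical helper, not part of gemma-wrapper, and only three of the wrapper's switches are reproduced. The final line mirrors the `options[:parallel] == nil` check that appears later in this diff.

```ruby
require 'optparse'

# Hypothetical helper for illustration only; gemma-wrapper itself parses
# its options at the top level of bin/gemma-wrapper.
def parse_wrapper_options(argv)
  options = { loco: false, parallel: nil } # nil means "the user did not choose"
  parser = OptionParser.new do |o|
    o.on('--loco', 'Run full leave-one-chromosome-out (LOCO)') { options[:loco] = true }
    o.on('--parallel', 'Run jobs in parallel') { options[:parallel] = true }
    o.on('--no-parallel', 'Do not run jobs in parallel') { options[:parallel] = false }
  end
  parser.parse(argv)
  # LOCO defaults to parallel only when the user expressed no preference.
  options[:parallel] = true if options[:parallel].nil? && options[:loco]
  options
end
```

With this arrangement an explicit `--no-parallel` always wins over the LOCO default, and a non-LOCO run stays sequential unless `--parallel` is given.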
@@ -185,11 +196,21 @@ warning = lambda do |*msg|
|
|
185
196
|
record[:warnings].push *msg.join("")
|
186
197
|
OUTPUT.print "WARNING: ",*msg,"\n"
|
187
198
|
end
|
199
|
+
|
188
200
|
info = lambda do |*msg|
|
189
201
|
record[:debug].push *msg.join("") if options[:json] and options[:debug]
|
190
202
|
OUTPUT.print *msg,"\n" if !options[:quiet]
|
191
203
|
end
|
192
204
|
|
205
|
+
# Fetch chromosomes
|
206
|
+
def get_chromosomes annofn
|
207
|
+
h = {}
|
208
|
+
File.open(annofn,"r").each_line do | line |
|
209
|
+
chr = line.split(/\s+/)[2]
|
210
|
+
h[chr] = true
|
211
|
+
end
|
212
|
+
h.map { |k,v| k }
|
213
|
+
end
|
193
214
|
# ---- Start banner
|
194
215
|
|
195
216
|
GEMMA_K_VERSION=version
|
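Note on `get_chromosomes` in this hunk: it collects the distinct values of the third whitespace-separated column of the SNP annotation file. The following is a minimal sketch of the same idea, assuming a hypothetical helper name and made-up annotation rows. Unlike the code in the diff, it uses `File.foreach` so the file handle is closed automatically, and it skips lines that have fewer than three columns.

```ruby
require 'tempfile'

# Hypothetical variant of get_chromosomes, for illustration only.
def chromosomes_from_annotation(annofn)
  seen = {}
  File.foreach(annofn) do |line|
    chr = line.split(/\s+/)[2]  # third column holds the chromosome
    seen[chr] = true if chr     # assumption: ignore short or empty lines
  end
  seen.keys                     # Ruby hashes keep insertion order
end

# Made-up rows in the "SNP id, position, chromosome" layout.
SAMPLE_ANNOTATION = <<~ANNO
  rs1, 3010274, 1
  rs2, 3492195, 1
  rs3, 4480500, 2
  rs4, 5002345, X
ANNO

# Writes the sample to a temporary file and extracts the chromosome names.
def demo_chromosomes
  Tempfile.create('anno') do |f|
    f.write(SAMPLE_ANNOTATION)
    f.flush
    chromosomes_from_annotation(f.path)
  end
end
```

Because the result follows the order of first appearance in the file, the LOCO loop visits chromosomes in annotation order rather than in sorted order.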
@@ -197,14 +218,14 @@ GEMMA_K_BANNER = "gemma-wrapper #{version} (Ruby #{RUBY_VERSION}) by Pjotr Prins
|
|
197
218
|
info.call GEMMA_K_BANNER
|
198
219
|
|
199
220
|
# Check gemma version
|
200
|
-
GEMMA_COMMAND=options[:gemma_command]
|
201
|
-
info.call "NOTE: gemma-wrapper is soon to be replaced by gemma2/lib"
|
202
|
-
|
203
221
|
begin
|
204
|
-
|
222
|
+
gemma_command2 = options[:gemma_command]
|
223
|
+
info.call "NOTE: gemma-wrapper is soon to be replaced by gemma2/lib"
|
224
|
+
|
225
|
+
GEMMA_INFO = `#{gemma_command2}`
|
205
226
|
rescue Errno::ENOENT
|
206
|
-
|
207
|
-
error.call "<#{
|
227
|
+
gemma_command2 = "gemma"
|
228
|
+
error.call "<#{gemma_command2}> command not found"
|
208
229
|
end
|
209
230
|
|
210
231
|
gemma_version_header = GEMMA_INFO.split("\n").grep(/GEMMA|Version/)[0].strip
|
@@ -230,15 +251,21 @@ if RUBY_VERSION =~ /^1/
|
|
230
251
|
warning "runs on Ruby 2.x only\n"
|
231
252
|
end
|
232
253
|
|
254
|
+
# ---- LOCO defaults to parallel
|
255
|
+
if options[:parallel] == nil
|
256
|
+
options[:parallel] = true if options[:loco]
|
257
|
+
end
|
258
|
+
|
233
259
|
debug.call(options) # some debug output
|
234
260
|
debug.call(record)
|
235
261
|
|
236
262
|
DO_COMPUTE_KINSHIP = gemma_args.include?("-gk")
|
237
263
|
DO_COMPUTE_GWA = !DO_COMPUTE_KINSHIP
|
238
264
|
|
239
|
-
# ---- Set up parallel
|
240
265
|
if options[:parallel]
|
241
266
|
begin
|
267
|
+
skip_cite = `echo "will cite" |parallel --citation`
|
268
|
+
debug.call(skip_cite)
|
242
269
|
PARALLEL_INFO = `parallel --help`
|
243
270
|
rescue Errno::ENOENT
|
244
271
|
error.call "<parallel> command not found"
|
@@ -246,6 +273,11 @@ if options[:parallel]
|
|
246
273
|
parallel_cmds = []
|
247
274
|
end
|
248
275
|
|
276
|
+
# ---- Fetch chromosomes from SNP annotation file
|
277
|
+
anno_idx = gemma_args.index '-a'
|
278
|
+
raise "Expected GEMMA -a genotype file switch" if anno_idx == nil
|
279
|
+
CHROMOSOMES = get_chromosomes(gemma_args[anno_idx+1])
|
280
|
+
|
249
281
|
# ---- Compute HASH on inputs
|
250
282
|
hashme = []
|
251
283
|
geno_idx = gemma_args.index '-g'
|
@@ -256,7 +288,6 @@ if DO_COMPUTE_GWA and options[:permute_phenotypes]
|
|
256
288
|
raise "Did not expect GEMMA -p phenotype whith permutations (only use --permutate-phenotypes)" if pheno_idx
|
257
289
|
end
|
258
290
|
|
259
|
-
|
260
291
|
execute = lambda { |cmd|
|
261
292
|
info.call("Executing: #{cmd}")
|
262
293
|
err = 0
|
@@ -276,14 +307,6 @@ execute = lambda { |cmd|
|
|
276
307
|
err
|
277
308
|
}
|
278
309
|
|
279
|
-
hashme =
|
280
|
-
if DO_COMPUTE_KINSHIP and pheno_idx != nil
|
281
|
-
# Remove the phenotype file from the hash for GRM computation
|
282
|
-
gemma_args[0..pheno_idx-1] + gemma_args[pheno_idx+2..-1]
|
283
|
-
else
|
284
|
-
gemma_args
|
285
|
-
end
|
286
|
-
|
287
310
|
compute_hash = lambda do | phenofn = nil |
|
288
311
|
# Compute a HASH on the inputs
|
289
312
|
debug.call "Hashing on ",hashme,"\n"
|
@@ -302,31 +325,51 @@ compute_hash = lambda do | phenofn = nil |
|
|
302
325
|
hashes << item
|
303
326
|
end
|
304
327
|
end
|
328
|
+
debug.call(hashes)
|
305
329
|
Digest::SHA1.hexdigest hashes.join(' ')
|
306
330
|
end
|
307
331
|
|
308
332
|
HASH = compute_hash.call()
|
309
333
|
options[:hash] = HASH
|
310
334
|
|
335
|
+
at_exit do
|
336
|
+
Lock.release(HASH)
|
337
|
+
end
|
338
|
+
|
339
|
+
Lock.create(HASH) # this will wait for a lock to expire
|
340
|
+
|
341
|
+
joblog = options[:cache_dir]+"/"+HASH+"-parallel.log"
|
342
|
+
|
311
343
|
# Create cache dir
|
312
344
|
FileUtils::mkdir_p options[:cache_dir]
|
313
345
|
|
346
|
+
Dir.mktmpdir do |tmpdir| # tmpdir for GEMMA output
|
347
|
+
|
314
348
|
error.call "Do not use the GEMMA -o switch!" if gemma_args.include? '-o'
|
315
349
|
error.call "Do not use the GEMMA -outdir switch!" if gemma_args.include? '-outdir'
|
350
|
+
GEMMA_ARGS_HASH = gemma_args.dup # do not include outdir
|
316
351
|
gemma_args << '-outdir'
|
317
|
-
gemma_args <<
|
352
|
+
gemma_args << tmpdir
|
318
353
|
GEMMA_ARGS = gemma_args
|
319
354
|
|
355
|
+
hashme =
|
356
|
+
if DO_COMPUTE_KINSHIP and pheno_idx != nil
|
357
|
+
# Remove the phenotype file from the hash for GRM computation
|
358
|
+
GEMMA_ARGS_HASH[0..pheno_idx-1] + GEMMA_ARGS_HASH[pheno_idx+2..-1]
|
359
|
+
else
|
360
|
+
GEMMA_ARGS_HASH
|
361
|
+
end
|
362
|
+
|
320
363
|
debug.call "Options: ",options,"\n" if !options[:quiet]
|
321
364
|
|
322
365
|
invoke_gemma = lambda do |extra_args, cache_hit = false, chr = "full", permutation = 1|
|
323
|
-
cmd = "#{
|
366
|
+
cmd = "#{gemma_command2} #{GEMMA_ARGS.join(' ')} #{extra_args.join(' ')}"
|
324
367
|
record[:gemma_command] = cmd
|
325
368
|
return if cache_hit
|
326
369
|
if options[:slurm]
|
327
370
|
info.call cmd
|
328
371
|
hashi = HASH
|
329
|
-
prefix =
|
372
|
+
prefix = tmpdir+'/'+hashi
|
330
373
|
scriptfn = prefix+".#{chr}.#{permutation}-pbs.sh"
|
331
374
|
script = "#!/bin/bash
|
332
375
|
#SBATCH --job-name=gemma-#{scriptfn}
|
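Note on the `hashme` change in this hunk: for kinship (`-gk`) runs the phenotype switch and its value are dropped before hashing, so the cached K does not depend on which phenotype file was supplied. The sketch below shows only that argument-filtering step. The helper names are hypothetical, and it hashes the argument strings alone, whereas the wrapper's `compute_hash` body is not fully visible in this diff and may do more.

```ruby
require 'digest/sha1'

# Hypothetical helper: return a copy of args without `switch` and its value.
def args_without_switch(args, switch)
  idx = args.index(switch)
  return args.dup if idx.nil?
  # The exclusive range also handles a switch in first position (idx == 0).
  args[0...idx] + args[(idx + 2)..-1].to_a
end

# Hypothetical cache key for a kinship run, built from the arguments only.
def kinship_key(gemma_args)
  Digest::SHA1.hexdigest(args_without_switch(gemma_args, '-p').join(' '))
end
```

Two invocations that differ only in the `-p` file then map to the same key, which is the property the diff's comment ("Remove the phenotype file from the hash for GRM computation") describes.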
@@ -371,6 +414,7 @@ srun #{cmd}
|
|
371
414
|
end
|
372
415
|
end
|
373
416
|
|
417
|
+
# Takes the hash value and checks whether the (output) file exists
|
374
418
|
# returns datafn, logfn, cache_hit
|
375
419
|
cache = lambda do | chr, ext, h=HASH, permutation=0 |
|
376
420
|
inject = (chr==nil ? "" : ".#{chr}" )+ext
|
@@ -428,11 +472,17 @@ gwas = lambda do | chr, kfn, pfn, permutation=0 |
|
|
428
472
|
end
|
429
473
|
|
430
474
|
LOCO = options[:loco]
|
475
|
+
if LOCO
|
476
|
+
if options[:chromosomes]
|
477
|
+
CHROMOSOMES = options[:chromosomes]
|
478
|
+
end
|
479
|
+
end
|
480
|
+
|
431
481
|
if DO_COMPUTE_KINSHIP
|
432
482
|
# compute K
|
433
|
-
info.call
|
434
|
-
if LOCO
|
435
|
-
|
483
|
+
info.call CHROMOSOMES
|
484
|
+
if LOCO
|
485
|
+
CHROMOSOMES.each do |chr|
|
436
486
|
info.call "LOCO for ",chr
|
437
487
|
kinship.call(chr)
|
438
488
|
end
|
@@ -441,13 +491,24 @@ if DO_COMPUTE_KINSHIP
|
|
441
491
|
end
|
442
492
|
else
|
443
493
|
# DO_COMPUTE_GWA
|
444
|
-
|
494
|
+
begin
|
495
|
+
json_in = JSON.parse(File.read(options[:input]))
|
496
|
+
rescue TypeError
|
497
|
+
raise "Missing JSON input file?"
|
498
|
+
end
|
445
499
|
raise "JSON problem, file #{options[:input]} is not -gk derived" if json_in["type"] != "K"
|
446
500
|
|
447
501
|
pfn = options[:permute_phenotypes] # can be nil
|
448
|
-
|
449
|
-
|
450
|
-
|
502
|
+
if LOCO
|
503
|
+
k_files = json_in["files"].map { |rec| [rec[0],rec[2]] }
|
504
|
+
k_files.each do | chr, kfn | # call a GWA for each chromosome
|
505
|
+
gwas.call(chr,kfn,pfn)
|
506
|
+
end
|
507
|
+
else
|
508
|
+
kfn = json_in["files"][0][2]
|
509
|
+
CHROMOSOMES.each do | chr |
|
510
|
+
gwas.call(chr,kfn,pfn)
|
511
|
+
end
|
451
512
|
end
|
452
513
|
# Permute
|
453
514
|
if options[:permutate]
|
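Note on the `--input` handling in this hunk: the GWA step reads the JSON written by an earlier `-gk` run, checks that its `type` is `"K"`, and takes the chromosome from position 0 and the kinship file from position 2 of each `files` record. A minimal sketch of that lookup follows. The sample document is invented for illustration: only the `type` and `files` keys and the two positions come from the diff, and the file names and the meaning of the middle element are assumptions.

```ruby
require 'json'

# Invented sample; real K.json files come from `gemma-wrapper --json ... -gk`.
SAMPLE_K_JSON = <<~EOS
  {
    "type": "K",
    "files": [
      ["1", "example.1.log.txt", "example.1.cXX.txt"],
      ["2", "example.2.log.txt", "example.2.cXX.txt"]
    ]
  }
EOS

# Hypothetical helper: return [chromosome, kinship file] pairs.
def k_files_from(json_text)
  json_in = JSON.parse(json_text)
  raise ArgumentError, "JSON input is not -gk derived" if json_in["type"] != "K"
  json_in["files"].map { |rec| [rec[0], rec[2]] }
end
```

In the LOCO branch each pair drives one `gwas.call`; in the non-LOCO branch only the first record's kinship file is used for every chromosome.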
@@ -502,18 +563,36 @@ if options[:parallel]
|
|
502
563
|
cmd = parallel_cmds.join("\\n")
|
503
564
|
|
504
565
|
cmd = "echo -e \"#{cmd}\""
|
505
|
-
err = execute.call(cmd+"|parallel") # all jobs in parallel
|
566
|
+
err = execute.call(cmd+"|parallel --joblog #{joblog}") # first try optimistically to run all jobs in parallel
|
506
567
|
if err != 0
|
507
568
|
[16,8,4,1].each do |jobs|
|
508
569
|
info.call("Failed to complete parallel run -- retrying with smaller RAM footprint!")
|
509
|
-
err = execute.call(cmd+"|parallel -j #{jobs}")
|
570
|
+
err = execute.call(cmd+"|parallel -j #{jobs} --resume --joblog #{joblog}")
|
510
571
|
break if err == 0
|
511
572
|
end
|
512
573
|
if err != 0
|
513
574
|
info.call("Run failed!")
|
575
|
+
# Remove remaining files
|
576
|
+
FileUtils.rm_rf("#{tmpdir}/*", secure: true)
|
577
|
+
FileUtils.mv joblog, joblog+".bak", verbose: false, force: true
|
514
578
|
exit err
|
515
579
|
end
|
516
580
|
end
|
517
581
|
info.call("Run successful!")
|
582
|
+
FileUtils.mv joblog, joblog+".bak", verbose: false, force: true
|
518
583
|
end
|
519
584
|
json_out.call
|
585
|
+
|
586
|
+
# copy all output files to the cache_dir. If a file exists only emit a warning
|
587
|
+
Dir.glob("*.txt", base: tmpdir) do | fn |
|
588
|
+
source = tmpdir + "/" + fn
|
589
|
+
dest = options[:cache_dir] + "/" + fn
|
590
|
+
if not File.exist?(dest) or options[:force]
|
591
|
+
info.call "Move #{source} to #{dest}"
|
592
|
+
FileUtils.mv source, dest, verbose: false
|
593
|
+
else
|
594
|
+
warning.call "File #{dest} already exists. Not overwriting"
|
595
|
+
end
|
596
|
+
end
|
597
|
+
|
598
|
+
end # tmpdir
|
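Note on the retry logic in the final hunk of bin/gemma-wrapper: the first attempt runs all jobs through GNU parallel with a job log, and on failure the same command list is retried with 16, 8, 4 and then 1 job, using `--resume` so that jobs already recorded in the log are not repeated. The sketch below isolates that control flow. `run_parallel_with_backoff` is a hypothetical name, and `runner` stands in for the wrapper's `execute` lambda so the logic can be exercised without invoking a shell.

```ruby
# Hypothetical helper for illustration. `runner` must accept a shell command
# string and return its exit status (0 on success).
def run_parallel_with_backoff(cmd, joblog, runner, job_counts = [16, 8, 4, 1])
  # Optimistic first attempt: let parallel choose the number of jobs.
  err = runner.call("#{cmd}|parallel --joblog #{joblog}")
  return err if err == 0
  # Retry with fewer simultaneous jobs to reduce the RAM footprint.
  job_counts.each do |jobs|
    err = runner.call("#{cmd}|parallel -j #{jobs} --resume --joblog #{joblog}")
    break if err == 0
  end
  err
end
```

Passing the runner in as an argument is a choice made for this sketch only; the wrapper calls `execute` directly and also moves the job log to `.bak` once the run has finished.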
data/gemma-wrapper.gemspec
CHANGED
@@ -2,10 +2,11 @@ Gem::Specification.new do |s|
|
|
2
2
|
s.name = 'bio-gemma-wrapper'
|
3
3
|
s.version = File.read('VERSION')
|
4
4
|
s.summary = "GEMMA with LOCO and permutations"
|
5
|
-
s.description = "GEMMA wrapper adds LOCO and permutation support. Also caches K between runs with LOCO support"
|
5
|
+
s.description = "GEMMA wrapper adds LOCO and permutation support. Also runs in parallel and caches K between runs with LOCO support"
|
6
6
|
s.authors = ["Pjotr Prins"]
|
7
7
|
s.email = 'pjotr.public01@thebird.nl'
|
8
8
|
s.files = ["bin/gemma-wrapper",
|
9
|
+
"lib/lock.rb",
|
9
10
|
"Gemfile",
|
10
11
|
"LICENSE.txt",
|
11
12
|
"README.md",
|
data/lib/lock.rb
ADDED
@@ -0,0 +1,95 @@
|
|
1
|
+
# Locking module for gemma (wrapper)
|
2
|
+
#
|
3
|
+
|
4
|
+
=begin
|
5
|
+
|
6
|
+
The logic is as follows:
|
7
|
+
|
8
|
+
1. a program creates a named lock file (based on a hash of its inputs) with its PID
|
9
|
+
2. on exit it destroys the file
|
10
|
+
3. a new program checks for the lock file
|
11
|
+
4. if it exists and the PID is still in the ps table - wait
|
12
|
+
5. when the pid disappears or the lock file - continue
|
13
|
+
6. a timeout will return an error in 3 minutes
|
14
|
+
|
15
|
+
Note that there is a theoretical chance the lock file existed, but disappeared. I think I have it covered by ignoring the unlink errors. Also the use of /proc/PID is Linux specific.
|
16
|
+
|
17
|
+
=end
|
18
|
+
|
19
|
+
|
20
|
+
require 'timeout'
|
21
|
+
|
22
|
+
module Lock
|
23
|
+
|
24
|
+
def self.local name
|
25
|
+
ENV['HOME']+"/."+name.gsub("/","-")+".lck"
|
26
|
+
end
|
27
|
+
|
28
|
+
def self.lock_pid name
|
29
|
+
lockfn = local(name)
|
30
|
+
if File.exist?(lockfn)
|
31
|
+
File.read(lockfn).to_i
|
32
|
+
else
|
33
|
+
0
|
34
|
+
end
|
35
|
+
end
|
36
|
+
|
37
|
+
def self.locked? name
|
38
|
+
lockfn = local(name)
|
39
|
+
pid = lock_pid(name)
|
40
|
+
if File.exist?("/proc/#{pid}")
|
41
|
+
true
|
42
|
+
else
|
43
|
+
# the program went away - remove any 'stale' lock
|
44
|
+
begin
|
45
|
+
File.unlink(lockfn)
|
46
|
+
rescue Errno::ENOENT
|
47
|
+
# ignore error when the lock file went missing
|
48
|
+
end
|
49
|
+
false # --> no longer locked
|
50
|
+
end
|
51
|
+
end
|
52
|
+
|
53
|
+
def Lock::create name
|
54
|
+
wait_for(name)
|
55
|
+
lockfn = local(name)
|
56
|
+
if File.exist?(lockfn)
|
57
|
+
$stderr.print "\nERROR: Can not steal #{lockfn}"
|
58
|
+
exit 1
|
59
|
+
end
|
60
|
+
File.open(lockfn, File::RDWR|File::CREAT, 0644) do |f|
|
61
|
+
f.flock(File::LOCK_EX)
|
62
|
+
f.print(Process.pid)
|
63
|
+
end
|
64
|
+
end
|
65
|
+
|
66
|
+
def Lock::wait_for name
|
67
|
+
lockfn = local(name)
|
68
|
+
begin
|
69
|
+
status = Timeout::timeout(180) { # 3 minutes
|
70
|
+
while locked?(name)
|
71
|
+
$stderr.print("\nWaiting for lock #{lockfn}...")
|
72
|
+
sleep 2
|
73
|
+
end
|
74
|
+
}
|
75
|
+
rescue Timeout::Error
|
76
|
+
$stderr.print "\nERROR: Timed out, but I can not steal #{lockfn}"
|
77
|
+
exit 1
|
78
|
+
end
|
79
|
+
# yah! lock is released
|
80
|
+
end
|
81
|
+
|
82
|
+
def Lock::release name
|
83
|
+
lockfn = local(name)
|
84
|
+
if Process.pid == lock_pid(name)
|
85
|
+
begin
|
86
|
+
File.unlink(lockfn) # PID expired
|
87
|
+
rescue Errno::ENOENT
|
88
|
+
# ignore error when the lock file went missing
|
89
|
+
end
|
90
|
+
else
|
91
|
+
$stderr.print "\nERROR: can not release #{lockfn} because it is not owned by me"
|
92
|
+
end
|
93
|
+
end
|
94
|
+
|
95
|
+
end
|
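Note on lib/lock.rb: the module implements a PID-based lock file with stale-lock removal. The sketch below illustrates the same idea with two deliberate substitutions, both assumptions of this note rather than the gem's behaviour: the lock file is created with `File::CREAT | File::EXCL` so that the existence check and the creation happen in one step, and process liveness is tested with `Process.kill(0, pid)` instead of looking at `/proc/PID`. `PidLockSketch` is a hypothetical module, it takes the lock directory as an argument instead of using `ENV['HOME']`, and it omits the waiting loop with its 3-minute timeout.

```ruby
require 'tmpdir'

# Illustrative module; this is not the gem's Lock module.
module PidLockSketch
  # Same naming rule as lock.rb, rooted in a caller-supplied directory.
  def self.path(dir, name)
    File.join(dir, "." + name.gsub("/", "-") + ".lck")
  end

  # PID stored in the lock file, or 0 when the file is missing or empty.
  def self.owner_pid(fn)
    File.read(fn).to_i
  rescue Errno::ENOENT
    0
  end

  # Signal 0 tests whether a process exists without delivering a signal.
  def self.alive?(pid)
    return false if pid <= 0
    Process.kill(0, pid)
    true
  rescue Errno::ESRCH
    false
  rescue Errno::EPERM
    true # the process exists but belongs to another user
  end

  # Returns true when the lock was taken, false when a live process holds it.
  def self.acquire(dir, name)
    fn = path(dir, name)
    2.times do
      begin
        # CREAT|EXCL raises EEXIST if the file is already there.
        File.open(fn, File::WRONLY | File::CREAT | File::EXCL, 0o644) do |f|
          f.print(Process.pid)
        end
        return true
      rescue Errno::EEXIST
        return false if alive?(owner_pid(fn))
        begin
          File.unlink(fn) # stale lock left behind by a dead process
        rescue Errno::ENOENT
          # another process removed it first
        end
      end
    end
    false
  end

  # Removes the lock only when the current process owns it.
  def self.release(dir, name)
    fn = path(dir, name)
    return false unless owner_pid(fn) == Process.pid
    File.unlink(fn)
    true
  rescue Errno::ENOENT
    false
  end
end
```

As in lock.rb, an empty lock file reads as PID 0 and is treated as stale, so a small window remains between creating the file and writing the PID into it.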
metadata
CHANGED
@@ -1,17 +1,17 @@
|
|
1
1
|
--- !ruby/object:Gem::Specification
|
2
2
|
name: bio-gemma-wrapper
|
3
3
|
version: !ruby/object:Gem::Version
|
4
|
-
version: 0.99.
|
4
|
+
version: 0.99.5
|
5
5
|
platform: ruby
|
6
6
|
authors:
|
7
7
|
- Pjotr Prins
|
8
8
|
autorequire:
|
9
9
|
bindir: bin
|
10
10
|
cert_chain: []
|
11
|
-
date: 2021-
|
11
|
+
date: 2021-11-26 00:00:00.000000000 Z
|
12
12
|
dependencies: []
|
13
|
-
description: GEMMA wrapper adds LOCO and permutation support. Also
|
14
|
-
runs with LOCO support
|
13
|
+
description: GEMMA wrapper adds LOCO and permutation support. Also runs in parallel
|
14
|
+
and caches K between runs with LOCO support
|
15
15
|
email: pjotr.public01@thebird.nl
|
16
16
|
executables:
|
17
17
|
- gemma-wrapper
|
@@ -24,6 +24,7 @@ files:
|
|
24
24
|
- VERSION
|
25
25
|
- bin/gemma-wrapper
|
26
26
|
- gemma-wrapper.gemspec
|
27
|
+
- lib/lock.rb
|
27
28
|
homepage: https://github.com/genetics-statistics/gemma-wrapper
|
28
29
|
licenses:
|
29
30
|
- GPL3
|
@@ -43,8 +44,7 @@ required_rubygems_version: !ruby/object:Gem::Requirement
|
|
43
44
|
- !ruby/object:Gem::Version
|
44
45
|
version: '0'
|
45
46
|
requirements: []
|
46
|
-
|
47
|
-
rubygems_version: 2.7.6.2
|
47
|
+
rubygems_version: 3.1.4
|
48
48
|
signing_key:
|
49
49
|
specification_version: 4
|
50
50
|
summary: GEMMA with LOCO and permutations
|