bio-gemma-wrapper 0.99.1 → 0.99.5
- checksums.yaml +4 -4
- data/README.md +26 -10
- data/VERSION +1 -1
- data/bin/gemma-wrapper +113 -34
- data/gemma-wrapper.gemspec +2 -1
- data/lib/lock.rb +95 -0
- metadata +6 -6
checksums.yaml
CHANGED
@@ -1,7 +1,7 @@
 ---
 SHA256:
-  metadata.gz:
-  data.tar.gz:
+  metadata.gz: 5ec477c7560ae55b6d7c8a74b5e46cd95586a87f02ac698c540b2d6cc40c4392
+  data.tar.gz: bf79dc493baa99efd2e20a298d702ece9973c3fd377773cc80f0867c0a132ae5
 SHA512:
-  metadata.gz:
-  data.tar.gz:
+  metadata.gz: 0ca0c04ef86c22d332c10f21b66ef24aab7fd3dbfd1f2db67f78309ba8a5af47a9263bf94aaf2afeda021cdace3ed042af6f3a5ebf6628714a78ad18b494284f
+  data.tar.gz: b3692f74bb4437f70d2671fe48e1ec9f7c44c64b4553c93b511cc648e157e6a9f5b0f66acce2f07721a74a7ce20a1993c7eb2d2a74eb362c786cd7ee42eeab16
data/README.md
CHANGED
@@ -8,11 +8,12 @@ Nat. Genet., 2016)](cfw.gif)
 ## Introduction
 
 Gemma-wrapper allows running GEMMA with LOCO, GEMMA with caching,
-GEMMA in parallel (now the default), and GEMMA on
-is used to run GEMMA as part of the
-environment.
+GEMMA in parallel (now the default with LOCO), and GEMMA on
+PBS. Gemma-wrapper is used to run GEMMA as part of the
+https://genenetwork.org/ environment.
 
-Note that gemma-wrapper is projected to be integrated
+Note that a version of gemma-wrapper is projected to be integrated
+into gemma itself.
 
 GEMMA is a software toolkit for fast application of linear mixed
 models (LMMs) and related models to genome-wide association studies
@@ -29,6 +30,14 @@ does a pass-through of all standard GEMMA invocation switches. On
 return gemma-wrapper can return a JSON object (--json) which is
 useful for web-services.
 
+## Performance
+
+LOCO runs in parallel by default which is at least a 5x performance
+improvement on a machine with enough cores. GEMMA without LOCO,
+however, does not run in parallel by default. Performance
+improvements with the parallel implementation for LOCO and non-LOCO
+can be viewed [here](./test/performance/releases.gmi).
+
 ## Installation
 
 Prerequisites are
@@ -53,15 +62,19 @@ and it will render something like
 Usage: gemma-wrapper [options] -- [gemma-options]
 --permutate n Permutate # times by shuffling phenotypes
 --permute-phenotypes filen Phenotypes to be shuffled in permutations
---loco
+--loco Run full leave-one-chromosome-out (LOCO)
+--chromosomes [1,2,3] Run specific chromosomes
 --input filen JSON input variables (used for LOCO)
 --cache-dir path Use a cache directory
 --json Create output file in JSON format
---force Force computation
---
+--force Force computation (override cache)
+--parallel Run jobs in parallel
+--no-parallel Do not run jobs in parallel
+--slurm[=opts] Use slurm PBS for submitting jobs
 --q, --quiet Run quietly
 -v, --verbose Run verbosely
-
+-d, --debug Show debug messages and keep intermediate output
+--dry-run Show commands, but don't execute
 -- Anything after gets passed to GEMMA
 
 -h, --help display this help and exit
@@ -99,6 +112,7 @@ the data files are found):
 gemma-wrapper -- \
 -g test/data/input/BXD_geno.txt.gz \
 -p test/data/input/BXD_pheno.txt \
+-a test/data/input/BXD_snps.txt \
 -gk \
 -debug
 
@@ -116,6 +130,7 @@ You can also get JSON output on STDOUT by providing the --json switch
 gemma-wrapper --json -- \
 -g test/data/input/BXD_geno.txt.gz \
 -p test/data/input/BXD_pheno.txt \
+-a test/data/input/BXD_snps.txt \
 -gk \
 -debug > K.json
 
@@ -133,6 +148,7 @@ default. If you want something else provide a --cache-dir, e.g.
 gemma-wrapper --cache-dir ~/.gemma-cache -- \
 -g test/data/input/BXD_geno.txt.gz \
 -p test/data/input/BXD_pheno.txt \
+-a test/data/input/BXD_snps.txt \
 -gk \
 -debug
 
@@ -143,7 +159,7 @@ will store K in ~/.gemma-cache.
 Run the LMM using the K's captured earlier in K.json using the --input
 switch
 
-gemma-wrapper --json --
+gemma-wrapper --json --input K.json -- \
 -g test/data/input/BXD_geno.txt.gz \
 -p test/data/input/BXD_pheno.txt \
 -c test/data/input/BXD_covariates2.txt \
@@ -163,7 +179,7 @@ https://github.com/genetics-statistics/GEMMA/issues/46). To loop all
 chromosomes first create all K's with
 
 gemma-wrapper --json \
---loco
+--loco -- \
 -g test/data/input/BXD_geno.txt.gz \
 -p test/data/input/BXD_pheno.txt \
 -a test/data/input/BXD_snps.txt \
data/VERSION
CHANGED
@@ -1 +1 @@
-0.99.
+0.99.5
data/bin/gemma-wrapper
CHANGED
@@ -14,12 +14,12 @@ GEMMA wrapper example:
 gemma-wrapper -- \\
 -g test/data/input/BXD_geno.txt.gz \\
 -p test/data/input/BXD_pheno.txt \\
+-a test/data/input/BXD_snps.txt \
 -gk
 
 LOCO K computation with caching and JSON output
 
-gemma-wrapper --json \\
---loco 1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,X -- \\
+gemma-wrapper --json --loco -- \\
 -g test/data/input/BXD_geno.txt.gz \\
 -p test/data/input/BXD_pheno.txt \\
 -a test/data/input/BXD_snps.txt \\
@@ -41,7 +41,7 @@ Gemma gets used from the path. You can override by setting
 "
 # These are used for testing compatibility with the gemma tool
 GEMMA_V_MAJOR = 98
-GEMMA_V_MINOR =
+GEMMA_V_MINOR = 4
 
 basepath = File.dirname(File.dirname(__FILE__))
 $: << File.join(basepath,'lib')
@@ -71,12 +71,15 @@ require 'optparse'
 require 'tempfile'
 require 'tmpdir'
 
+require 'lock'
+
 split_at = ARGV.index('--')
+
 if split_at
   gemma_args = ARGV[split_at+1..-1]
 end
 
-options = { show_help: false, source: 'https://github.com/genetics-statistics/gemma-wrapper', version: version+' (Pjotr Prins)', date: Time.now.to_s, gemma_command: gemma_command, cache_dir: Dir.tmpdir(), quiet: false, parallel:
+options = { show_help: false, source: 'https://github.com/genetics-statistics/gemma-wrapper', version: version+' (Pjotr Prins)', date: Time.now.to_s, gemma_command: gemma_command, cache_dir: Dir.tmpdir(), quiet: false, permute_phenotypes: false, parallel: nil }
 
 opts = OptionParser.new do |o|
   o.banner = "\nUsage: #{File.basename($0)} [options] -- [gemma-options]"
@@ -91,8 +94,12 @@ opts = OptionParser.new do |o|
     raise "Phenotype input file #{phenotypes} does not exist" if !File.exist?(phenotypes)
   end
 
-  o.on('--loco
-    options[:loco] =
+  o.on('--loco', 'Run full leave-one-chromosome-out (LOCO)') do |b|
+    options[:loco] = b
+  end
+
+  o.on('--chromosomes [1,2,3]',Array,'Run specific chromosomes') do |lst|
+    options[:chromosomes] = lst
   end
 
   o.on('--input filen',String, 'JSON input variables (used for LOCO)') do |filen|
@@ -112,6 +119,10 @@ opts = OptionParser.new do |o|
     options[:force] = true
   end
 
+  o.on("--parallel", "Run jobs in parallel") do |b|
+    options[:parallel] = true
+  end
+
   o.on("--no-parallel", "Do not run jobs in parallel") do |b|
     options[:parallel] = false
   end
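The switch handling in this release can be tried in isolation. What follows is a minimal sketch under stated assumptions, not the wrapper's own code: `parse_wrapper_options` is a hypothetical helper, and only `--loco`, `--chromosomes`, `--parallel` and `--no-parallel` are reproduced. It illustrates why `parallel` starts out as `nil`: that value means "not specified", so the LOCO default only applies when the user gave neither switch.

```ruby
require 'optparse'

# Hypothetical helper (not part of gemma-wrapper): parses a subset of the
# wrapper switches and applies the "LOCO defaults to parallel" rule.
def parse_wrapper_options(argv)
  options = { loco: false, chromosomes: nil, parallel: nil }
  OptionParser.new do |o|
    o.on('--loco', 'Run full leave-one-chromosome-out (LOCO)') { |b| options[:loco] = b }
    # The Array acceptor splits a comma-separated value into strings
    o.on('--chromosomes [1,2,3]', Array, 'Run specific chromosomes') { |lst| options[:chromosomes] = lst }
    o.on('--parallel', 'Run jobs in parallel') { options[:parallel] = true }
    o.on('--no-parallel', 'Do not run jobs in parallel') { options[:parallel] = false }
  end.parse!(argv.dup)
  # nil means the user gave neither --parallel nor --no-parallel
  options[:parallel] = true if options[:parallel].nil? && options[:loco]
  options
end

p parse_wrapper_options(%w[--loco --chromosomes=1,2,X])
p parse_wrapper_options(%w[--loco --no-parallel])
```

Note that the chromosome names arrive as strings, so `X` needs no special handling.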
@@ -185,11 +196,21 @@ warning = lambda do |*msg|
   record[:warnings].push *msg.join("")
   OUTPUT.print "WARNING: ",*msg,"\n"
 end
+
 info = lambda do |*msg|
   record[:debug].push *msg.join("") if options[:json] and options[:debug]
   OUTPUT.print *msg,"\n" if !options[:quiet]
 end
 
+# Fetch chromosomes
+def get_chromosomes annofn
+  h = {}
+  File.open(annofn,"r").each_line do | line |
+    chr = line.split(/\s+/)[2]
+    h[chr] = true
+  end
+  h.map { |k,v| k }
+end
 # ---- Start banner
 
 GEMMA_K_VERSION=version
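The new `get_chromosomes` collects the distinct values found in the third whitespace-separated column of the annotation file. A compact sketch of the same idea, assuming a tab-separated file with marker, position and chromosome columns: `chromosomes_in` is a hypothetical name, the sample rows are made up, and unlike the original this variant skips lines with fewer than three fields.

```ruby
require 'tempfile'

# Hypothetical variant of get_chromosomes: distinct third-column values,
# in order of first appearance. Short lines are dropped by compact.
def chromosomes_in(annofn)
  File.foreach(annofn).map { |line| line.split(/\s+/)[2] }.compact.uniq
end

Tempfile.create('snps') do |f|
  # Made-up annotation rows: marker, position, chromosome
  f.write("rs1\t100\t1\nrs2\t200\t1\nrs3\t300\t2\nrs4\t400\tX\n")
  f.flush
  p chromosomes_in(f.path)
end
```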
@@ -197,14 +218,14 @@ GEMMA_K_BANNER = "gemma-wrapper #{version} (Ruby #{RUBY_VERSION}) by Pjotr Prins
 info.call GEMMA_K_BANNER
 
 # Check gemma version
-GEMMA_COMMAND=options[:gemma_command]
-info.call "NOTE: gemma-wrapper is soon to be replaced by gemma2/lib"
-
 begin
-
+  gemma_command2 = options[:gemma_command]
+  info.call "NOTE: gemma-wrapper is soon to be replaced by gemma2/lib"
+
+  GEMMA_INFO = `#{gemma_command2}`
 rescue Errno::ENOENT
-
-  error.call "<#{
+  gemma_command2 = "gemma"
+  error.call "<#{gemma_command2}> command not found"
 end
 
 gemma_version_header = GEMMA_INFO.split("\n").grep(/GEMMA|Version/)[0].strip
@@ -230,15 +251,21 @@ if RUBY_VERSION =~ /^1/
   warning "runs on Ruby 2.x only\n"
 end
 
+# ---- LOCO defaults to parallel
+if options[:parallel] == nil
+  options[:parallel] = true if options[:loco]
+end
+
 debug.call(options) # some debug output
 debug.call(record)
 
 DO_COMPUTE_KINSHIP = gemma_args.include?("-gk")
 DO_COMPUTE_GWA = !DO_COMPUTE_KINSHIP
 
-# ---- Set up parallel
 if options[:parallel]
   begin
+    skip_cite = `echo "will cite" |parallel --citation`
+    debug.call(skip_cite)
     PARALLEL_INFO = `parallel --help`
   rescue Errno::ENOENT
     error.call "<parallel> command not found"
@@ -246,6 +273,11 @@ if options[:parallel]
   parallel_cmds = []
 end
 
+# ---- Fetch chromosomes from SNP annotation file
+anno_idx = gemma_args.index '-a'
+raise "Expected GEMMA -a genotype file switch" if anno_idx == nil
+CHROMOSOMES = get_chromosomes(gemma_args[anno_idx+1])
+
 # ---- Compute HASH on inputs
 hashme = []
 geno_idx = gemma_args.index '-g'
@@ -256,7 +288,6 @@ if DO_COMPUTE_GWA and options[:permute_phenotypes]
   raise "Did not expect GEMMA -p phenotype whith permutations (only use --permutate-phenotypes)" if pheno_idx
 end
 
-
 execute = lambda { |cmd|
   info.call("Executing: #{cmd}")
   err = 0
@@ -276,14 +307,6 @@ execute = lambda { |cmd|
   err
 }
 
-hashme =
-  if DO_COMPUTE_KINSHIP and pheno_idx != nil
-    # Remove the phenotype file from the hash for GRM computation
-    gemma_args[0..pheno_idx-1] + gemma_args[pheno_idx+2..-1]
-  else
-    gemma_args
-  end
-
 compute_hash = lambda do | phenofn = nil |
   # Compute a HASH on the inputs
   debug.call "Hashing on ",hashme,"\n"
@@ -302,31 +325,51 @@ compute_hash = lambda do | phenofn = nil |
       hashes << item
     end
   end
+  debug.call(hashes)
   Digest::SHA1.hexdigest hashes.join(' ')
 end
 
 HASH = compute_hash.call()
 options[:hash] = HASH
 
+at_exit do
+  Lock.release(HASH)
+end
+
+Lock.create(HASH) # this will wait for a lock to expire
+
+joblog = options[:cache_dir]+"/"+HASH+"-parallel.log"
+
 # Create cache dir
 FileUtils::mkdir_p options[:cache_dir]
 
+Dir.mktmpdir do |tmpdir| # tmpdir for GEMMA output
+
 error.call "Do not use the GEMMA -o switch!" if gemma_args.include? '-o'
 error.call "Do not use the GEMMA -outdir switch!" if gemma_args.include? '-outdir'
+GEMMA_ARGS_HASH = gemma_args.dup # do not include outdir
 gemma_args << '-outdir'
-gemma_args <<
+gemma_args << tmpdir
 GEMMA_ARGS = gemma_args
 
+hashme =
+  if DO_COMPUTE_KINSHIP and pheno_idx != nil
+    # Remove the phenotype file from the hash for GRM computation
+    GEMMA_ARGS_HASH[0..pheno_idx-1] + GEMMA_ARGS_HASH[pheno_idx+2..-1]
+  else
+    GEMMA_ARGS_HASH
+  end
+
 debug.call "Options: ",options,"\n" if !options[:quiet]
 
 invoke_gemma = lambda do |extra_args, cache_hit = false, chr = "full", permutation = 1|
-  cmd = "#{
+  cmd = "#{gemma_command2} #{GEMMA_ARGS.join(' ')} #{extra_args.join(' ')}"
   record[:gemma_command] = cmd
   return if cache_hit
   if options[:slurm]
     info.call cmd
     hashi = HASH
-    prefix =
+    prefix = tmpdir+'/'+hashi
     scriptfn = prefix+".#{chr}.#{permutation}-pbs.sh"
     script = "#!/bin/bash
 #SBATCH --job-name=gemma-#{scriptfn}
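The `hashme` block moved in this hunk leaves the `-p` switch and its file name out of the cache key for kinship runs, so that runs differing only in phenotype can share a cached K. A small sketch of that slicing, assuming a plain digest over the argument strings: `without_switch` and `args_digest` are hypothetical helpers, and the wrapper's own `compute_hash` may hash more than the strings themselves. The exclusive range `0...idx` is a choice made for this sketch; it also behaves when the switch is the first argument, where `0..idx-1` would become `0..-1` and select the whole list.

```ruby
require 'digest/sha1'

# Hypothetical helper: drop a switch and the value that follows it.
def without_switch(args, switch)
  idx = args.index(switch)
  return args.dup if idx.nil?
  args[0...idx] + args[(idx + 2)..-1].to_a
end

# Hypothetical helper: digest over the joined argument strings only.
def args_digest(args)
  Digest::SHA1.hexdigest(args.join(' '))
end

a = %w[-g geno.txt.gz -p pheno1.txt -a snps.txt -gk]
b = %w[-g geno.txt.gz -p pheno2.txt -a snps.txt -gk]
# Same key once -p is removed, different keys otherwise
puts args_digest(without_switch(a, '-p')) == args_digest(without_switch(b, '-p'))
puts args_digest(a) == args_digest(b)
```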
@@ -371,6 +414,7 @@ srun #{cmd}
   end
 end
 
+# Takes the hash value and checks whether the (output) file exists
 # returns datafn, logfn, cache_hit
 cache = lambda do | chr, ext, h=HASH, permutation=0 |
   inject = (chr==nil ? "" : ".#{chr}" )+ext
@@ -428,11 +472,17 @@ gwas = lambda do | chr, kfn, pfn, permutation=0 |
 end
 
 LOCO = options[:loco]
+if LOCO
+  if options[:chromosomes]
+    CHROMOSOMES = options[:chromosomes]
+  end
+end
+
 if DO_COMPUTE_KINSHIP
   # compute K
-  info.call
-  if LOCO
-
+  info.call CHROMOSOMES
+  if LOCO
+    CHROMOSOMES.each do |chr|
       info.call "LOCO for ",chr
       kinship.call(chr)
     end
@@ -441,13 +491,24 @@ if DO_COMPUTE_KINSHIP
   end
 else
   # DO_COMPUTE_GWA
-
+  begin
+    json_in = JSON.parse(File.read(options[:input]))
+  rescue TypeError
+    raise "Missing JSON input file?"
+  end
   raise "JSON problem, file #{options[:input]} is not -gk derived" if json_in["type"] != "K"
 
   pfn = options[:permute_phenotypes] # can be nil
-
-
-
+  if LOCO
+    k_files = json_in["files"].map { |rec| [rec[0],rec[2]] }
+    k_files.each do | chr, kfn | # call a GWA for each chromosome
+      gwas.call(chr,kfn,pfn)
+    end
+  else
+    kfn = json_in["files"][0][2]
+    CHROMOSOMES.each do | chr |
+      gwas.call(chr,kfn,pfn)
+    end
   end
   # Permute
   if options[:permutate]
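The GWA branch reads the JSON written by an earlier `-gk` run and pairs each chromosome with its kinship file. A minimal sketch of that step, assuming only what the hunk shows: the document has a `"type"` of `"K"` and a `"files"` list whose records carry the chromosome at index 0 and the kinship file name at index 2. `k_files_from` is a hypothetical helper, and the middle field and file names in the sample are placeholders.

```ruby
require 'json'
require 'tempfile'

# Hypothetical helper: returns [chromosome, kinship_file] pairs.
def k_files_from(input_fn)
  begin
    json_in = JSON.parse(File.read(input_fn))
  rescue TypeError
    # File.read(nil) raises TypeError when --input was not given
    raise "Missing JSON input file?"
  end
  raise "JSON problem, file #{input_fn} is not -gk derived" if json_in["type"] != "K"
  json_in["files"].map { |rec| [rec[0], rec[2]] }
end

Tempfile.create(['K', '.json']) do |f|
  # Assumed layout; "log1"/"log2" and the file names are made up
  sample = { "type" => "K",
             "files" => [["1", "log1", "K.1.txt"], ["2", "log2", "K.2.txt"]] }
  f.write(JSON.generate(sample))
  f.flush
  p k_files_from(f.path)
end
```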
@@ -502,18 +563,36 @@ if options[:parallel]
   cmd = parallel_cmds.join("\\n")
 
   cmd = "echo -e \"#{cmd}\""
-  err = execute.call(cmd+"|parallel") # all jobs in parallel
+  err = execute.call(cmd+"|parallel --joblog #{joblog}") # first try optimistically to run all jobs in parallel
   if err != 0
     [16,8,4,1].each do |jobs|
       info.call("Failed to complete parallel run -- retrying with smaller RAM footprint!")
-      err = execute.call(cmd+"|parallel -j #{jobs}")
+      err = execute.call(cmd+"|parallel -j #{jobs} --resume --joblog #{joblog}")
       break if err == 0
     end
     if err != 0
       info.call("Run failed!")
+      # Remove remaining files
+      FileUtils.rm_rf("#{tmpdir}/*", secure: true)
+      FileUtils.mv joblog, joblog+".bak", verbose: false, force: true
       exit err
     end
   end
   info.call("Run successful!")
+  FileUtils.mv joblog, joblog+".bak", verbose: false, force: true
 end
 json_out.call
+
+# copy all output files to the cache_dir. If a file exists only emit a warning
+Dir.glob("*.txt", base: tmpdir) do | fn |
+  source = tmpdir + "/" + fn
+  dest = options[:cache_dir] + "/" + fn
+  if not File.exist?(dest) or options[:force]
+    info.call "Move #{source} to #{dest}"
+    FileUtils.mv source, dest, verbose: false
+  else
+    warning.call "File #{dest} already exists. Not overwriting"
+  end
+end
+
+end # tmpdir
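The retry logic in this hunk first runs every job at once and, on failure, retries with 16, 8, 4 and finally 1 concurrent job, passing `--resume --joblog` so that GNU parallel can skip jobs already recorded in the log. The control flow can be sketched without invoking `parallel` at all. `parallel_attempts` and `run_with_fallback` are hypothetical helpers, the job log path is a placeholder, and the runner block stands in for the wrapper's `execute` lambda.

```ruby
# Hypothetical helper: the shell commands to try, in order.
def parallel_attempts(cmd, joblog, fallbacks = [16, 8, 4, 1])
  first = "#{cmd}|parallel --joblog #{joblog}"
  retries = fallbacks.map do |jobs|
    "#{cmd}|parallel -j #{jobs} --resume --joblog #{joblog}"
  end
  [first] + retries
end

# Runs the attempts in order until the runner block returns status 0;
# returns the last status seen.
def run_with_fallback(attempts)
  err = 1
  attempts.each do |attempt|
    err = yield(attempt)
    break if err == 0
  end
  err
end

attempts = parallel_attempts('echo -e "job1\njob2"', '/tmp/example-parallel.log')
# Stand-in runner that pretends the first two attempts fail
calls = 0
status = run_with_fallback(attempts) { |_cmd| calls += 1; calls < 3 ? 1 : 0 }
puts "finished with status #{status} after #{calls} attempts"
```

Keeping the command construction separate from execution makes the fallback order easy to check in a test.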
data/gemma-wrapper.gemspec
CHANGED
@@ -2,10 +2,11 @@ Gem::Specification.new do |s|
   s.name = 'bio-gemma-wrapper'
   s.version = File.read('VERSION')
   s.summary = "GEMMA with LOCO and permutations"
-  s.description = "GEMMA wrapper adds LOCO and permutation support. Also caches K between runs with LOCO support"
+  s.description = "GEMMA wrapper adds LOCO and permutation support. Also runs in parallel and caches K between runs with LOCO support"
   s.authors = ["Pjotr Prins"]
   s.email = 'pjotr.public01@thebird.nl'
   s.files = ["bin/gemma-wrapper",
+             "lib/lock.rb",
              "Gemfile",
              "LICENSE.txt",
              "README.md",
data/lib/lock.rb
ADDED
@@ -0,0 +1,95 @@
+# Locking module for gemma (wrapper)
+#
+
+=begin
+
+The logic is as follows:
+
+1. a program creates a named lock file (based on a hash of its inputs) with its PID
+2. on exit it destroys the file
+3. a new program checks for the lock file
+4. if it exists and the PID is still in the ps table - wait
+5. when the pid disappears or the lock file - continue
+6. a timeout will return an error in 3 minutes
+
+Note that there is a theoretical chance the lock file existed, but disappeared. I think I have it covered by ignoring the unlink errors. Also the use of /proc/PID is Linux specific.
+
+=end
+
+
+require 'timeout'
+
+module Lock
+
+  def self.local name
+    ENV['HOME']+"/."+name.gsub("/","-")+".lck"
+  end
+
+  def self.lock_pid name
+    lockfn = local(name)
+    if File.exist?(lockfn)
+      File.read(lockfn).to_i
+    else
+      0
+    end
+  end
+
+  def self.locked? name
+    lockfn = local(name)
+    pid = lock_pid(name)
+    if File.exist?("/proc/#{pid}")
+      true
+    else
+      # the program went away - remove any 'stale' lock
+      begin
+        File.unlink(lockfn)
+      rescue Errno::ENOENT
+        # ignore error when the lock file went missing
+      end
+      false # --> no longer locked
+    end
+  end
+
+  def Lock::create name
+    wait_for(name)
+    lockfn = local(name)
+    if File.exist?(lockfn)
+      $stderr.print "\nERROR: Can not steal #{lockfn}"
+      exit 1
+    end
+    File.open(lockfn, File::RDWR|File::CREAT, 0644) do |f|
+      f.flock(File::LOCK_EX)
+      f.print(Process.pid)
+    end
+  end
+
+  def Lock::wait_for name
+    lockfn = local(name)
+    begin
+      status = Timeout::timeout(180) { # 3 minutes
+        while locked?(name)
+          $stderr.print("\nWaiting for lock #{lockfn}...")
+          sleep 2
+        end
+      }
+    rescue Timeout::Error
+      $stderr.print "\nERROR: Timed out, but I can not steal #{lockfn}"
+      exit 1
+    end
+    # yah! lock is released
+  end
+
+  def Lock::release name
+    lockfn = local(name)
+    if Process.pid == lock_pid(name)
+      begin
+        File.unlink(lockfn) # PID expired
+      rescue Errno::ENOENT
+        # ignore error when the lock file went missing
+      end
+    else
+      $stderr.print "\nERROR: can not release #{lockfn} because it is not owned by me"
+    end
+  end
+
+end
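The PID lock described in the header comment of `lib/lock.rb` can be sketched in a more portable form. This is an illustration under assumptions, not the module shipped in the gem: `PidLock` is a hypothetical name, liveness is checked with `Process.kill(0, pid)` instead of `/proc/PID`, the file is created with `File::EXCL` so that creation itself fails if the lock already exists, and the waiting loop with its 3 minute timeout is left out.

```ruby
require 'tmpdir'

# Hypothetical PID lock file sketch (not the gem's Lock module).
module PidLock
  def self.path(dir, name)
    File.join(dir, ".#{name.gsub('/', '-')}.lck")
  end

  # True when a process with this PID currently exists.
  def self.alive?(pid)
    return false if pid <= 0
    Process.kill(0, pid) # signal 0 only checks for existence
    true
  rescue Errno::ESRCH
    false
  rescue Errno::EPERM
    true # process exists but belongs to another user
  end

  # PID stored in the lock file, or 0 when there is no lock file.
  def self.owner(dir, name)
    File.read(path(dir, name)).to_i
  rescue Errno::ENOENT
    0
  end

  # True while the lock names a live process; a stale file is removed.
  def self.locked?(dir, name)
    return true if alive?(owner(dir, name))
    begin
      File.unlink(path(dir, name))
    rescue Errno::ENOENT
      # the lock file was already gone
    end
    false
  end

  # Creates the lock file atomically; returns false if it already exists.
  def self.acquire(dir, name)
    flags = File::WRONLY | File::CREAT | File::EXCL
    File.open(path(dir, name), flags, 0o644) { |f| f.print(Process.pid) }
    true
  rescue Errno::EEXIST
    false
  end

  # Removes the lock only when this process owns it.
  def self.release(dir, name)
    return false unless owner(dir, name) == Process.pid
    File.unlink(path(dir, name))
    true
  rescue Errno::ENOENT
    false
  end
end

Dir.mktmpdir do |dir|
  puts PidLock.acquire(dir, 'abc123') # first caller gets the lock
  puts PidLock.locked?(dir, 'abc123') # held by this live process
  puts PidLock.acquire(dir, 'abc123') # second attempt is refused
  PidLock.release(dir, 'abc123')
  puts PidLock.locked?(dir, 'abc123') # free again
end
```

Like the original, this only coordinates processes on one machine that share the lock directory.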
metadata
CHANGED
@@ -1,17 +1,17 @@
 --- !ruby/object:Gem::Specification
 name: bio-gemma-wrapper
 version: !ruby/object:Gem::Version
-  version: 0.99.
+  version: 0.99.5
 platform: ruby
 authors:
 - Pjotr Prins
 autorequire:
 bindir: bin
 cert_chain: []
-date: 2021-
+date: 2021-11-26 00:00:00.000000000 Z
 dependencies: []
-description: GEMMA wrapper adds LOCO and permutation support. Also
-  runs with LOCO support
+description: GEMMA wrapper adds LOCO and permutation support. Also runs in parallel
+  and caches K between runs with LOCO support
 email: pjotr.public01@thebird.nl
 executables:
 - gemma-wrapper
@@ -24,6 +24,7 @@ files:
 - VERSION
 - bin/gemma-wrapper
 - gemma-wrapper.gemspec
+- lib/lock.rb
 homepage: https://github.com/genetics-statistics/gemma-wrapper
 licenses:
 - GPL3
@@ -43,8 +44,7 @@ required_rubygems_version: !ruby/object:Gem::Requirement
     - !ruby/object:Gem::Version
       version: '0'
 requirements: []
-
-rubygems_version: 2.7.6.2
+rubygems_version: 3.1.4
 signing_key:
 specification_version: 4
 summary: GEMMA with LOCO and permutations