bio-grid 0.3.0 → 0.3.1
- data/README.md +2 -2
- data/VERSION +1 -1
- data/bio-grid.gemspec +2 -2
- data/lib/bio/grid.rb +6 -4
- metadata +3 -3
data/README.md
CHANGED
@@ -25,9 +25,9 @@ What is happening here is the following:
 
 * the ```-i``` options specifies the input files or, as in this case, the location where to find input files based on a typical wildcard expression. You can actually specify as many input files/locations as you need using a comma separated list.
 * the ```-n``` specify the job name
-* the ```-c``` is the command line to be executed on the cluster / grid system. What BioGrid does is to fill in the ```<input1>```,```<input2>``` and ```<output>``` placeholders with the corresponding parameters passed on the command line. This is done for each input file (or each group of input files)
+* the ```-c``` is the command line to be executed on the cluster / grid system. What BioGrid does is to fill in the ```<input1>```,```<input2>``` and ```<output>``` placeholders with the corresponding parameters passed on the command line. This is done for each input file (or each group of input files), taking care of generating a unique output name for each job submitted.
 * the ```-o``` set the location where output files for each job will be saved. Only provide the folder where you want to save the output file(s), BioGrid will take care of generating a unique file name for the output, if needed. Check the [Output management](https://github.com/fstrozzi/bioruby-grid#output-management) for more details.
-* the ```-s``` is a key parameter to specify the granularity of the jobs, setting the number of input files (or group of files, when more than one input placeholder is present in the command line) to be used for each job. So, going back to the FastQ example, if
+* the ```-s``` is a key parameter to specify the granularity of the jobs, setting the number of input files (or group of files, when more than one input placeholder is present in the command line) to be used for each job. So, going back to the FastQ example, if ```-s 1``` is specified, each job will be run with exactly one FastQ R1 file and one FastQ R2 file (corresponding to the ```<input1>``` and ```<input2>``` placeholders). This gives you a great power in deciding how to split the entire input dataset across multiple computing nodes to carry on the analysis.
 * the ```-p``` parameter indicates how many processes we want to use for each job. This number needs to match with the actual number of threads / processes that our command or tool will use for the analysis.
 
 All of this is just turned into a submission script that will look like this:
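Note on the README text in this hunk: the following is not the submission script the README refers to, and it is not taken from the gem. It is only a minimal sketch of the placeholder substitution described for the ```-c``` option. The helper name `fill_placeholders`, the `mapper` command, the file names, and the choice to join several files with a comma are all assumptions made for illustration.

```ruby
# Hypothetical helper (not part of bio-grid): fills <input1>, <input2>, ...
# and <output> in a command template for a single job.
def fill_placeholders(template, inputs, output)
  command = template.dup
  inputs.each_with_index do |files, index|
    # Assumption: files belonging to the same placeholder are comma-joined.
    command = command.gsub("<input#{index + 1}>", files.join(","))
  end
  command.gsub("<output>", output)
end

# Illustrative command line; "mapper" is a made-up tool name.
template = "mapper -1 <input1> -2 <input2> -o <output>"
puts fill_placeholders(template, [["a_R1.fastq"], ["a_R2.fastq"]], "out/job_1.sam")
# prints: mapper -1 a_R1.fastq -2 a_R2.fastq -o out/job_1.sam
```

With ```-s 1``` each job would receive one file per placeholder, as in the call above; a larger value would pass several files per placeholder to the same job.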
data/VERSION
CHANGED
@@ -1 +1 @@
-0.3.
+0.3.1
data/bio-grid.gemspec
CHANGED
@@ -5,11 +5,11 @@
 
 Gem::Specification.new do |s|
   s.name = "bio-grid"
-  s.version = "0.3.
+  s.version = "0.3.1"
 
   s.required_rubygems_version = Gem::Requirement.new(">= 0") if s.respond_to? :required_rubygems_version=
   s.authors = ["Francesco Strozzi"]
-  s.date = "2012-09-
+  s.date = "2012-09-25"
   s.description = "A BioGem to submit jobs on a queue system"
   s.email = "francesco.strozzi@gmail.com"
   s.executables = ["bio-grid"]
data/lib/bio/grid.rb
CHANGED
@@ -10,7 +10,7 @@ module Bio
 end
 
 def self.run(options)
-  options[:number] =
+  options[:number] = "all" unless options[:number]
   grid = self.new options[:input], options[:number]
   options[:uuid] = grid.uuid
   groups = grid.prepare_input_groups
@@ -41,13 +41,15 @@ module Bio
 end
 end
 
-def prepare_input_groups
+def prepare_input_groups
   groups = Hash.new {|h,k| h[k] = [] }
   self.input.each_with_index do |location,index|
+    list = Dir.glob(location).sort
+    raise ArgumentError,"Input file or folder #{location} do not exist!" if list.empty?
     if self.number == "all"
-      groups["input#{index+1}"] = [
+      groups["input#{index+1}"] = [list]
     else
-
+      list.each_slice(self.number.to_i) {|subgroup| groups["input#{index+1}"] << subgroup}
     end
   end
   groups
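Note on the hunk above: the grouping behaviour added in `prepare_input_groups` can be tried in isolation. The sketch below is not the gem's code; `group_inputs` is a hypothetical standalone helper that takes an already expanded file list instead of calling `Dir.glob`, and the file names are made up.

```ruby
# Hypothetical helper mirroring the grouping logic shown in the diff:
# "all" keeps every file in one group, otherwise the sorted list is
# split into slices of `number` files each.
def group_inputs(files, number)
  sorted = files.sort
  return [sorted] if number == "all"
  sorted.each_slice(number.to_i).to_a
end

files = %w[s3.fastq s1.fastq s2.fastq]
p group_inputs(files, "all")  # => [["s1.fastq", "s2.fastq", "s3.fastq"]]
p group_inputs(files, 2)      # => [["s1.fastq", "s2.fastq"], ["s3.fastq"]]
```

The last slice may be shorter than the requested size, so the final job can receive fewer input files than the others.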
metadata
CHANGED
@@ -1,7 +1,7 @@
 --- !ruby/object:Gem::Specification
 name: bio-grid
 version: !ruby/object:Gem::Version
-  version: 0.3.
+  version: 0.3.1
 prerelease:
 platform: ruby
 authors:
@@ -9,7 +9,7 @@ authors:
 autorequire:
 bindir: bin
 cert_chain: []
-date: 2012-09-
+date: 2012-09-25 00:00:00.000000000 Z
 dependencies:
 - !ruby/object:Gem::Dependency
   name: uuid
@@ -146,7 +146,7 @@ required_ruby_version: !ruby/object:Gem::Requirement
       version: '0'
       segments:
       - 0
-      hash:
+      hash: -4026493835135905673
 required_rubygems_version: !ruby/object:Gem::Requirement
   none: false
   requirements: