bio-grid 0.3.0 → 0.3.1

Files changed (5)
  1. data/README.md +2 -2
  2. data/VERSION +1 -1
  3. data/bio-grid.gemspec +2 -2
  4. data/lib/bio/grid.rb +6 -4
  5. metadata +3 -3
data/README.md CHANGED
@@ -25,9 +25,9 @@ What is happening here is the following:
 
  * the ```-i``` options specifies the input files or, as in this case, the location where to find input files based on a typical wildcard expression. You can actually specify as many input files/locations as you need using a comma separated list.
  * the ```-n``` specify the job name
- * the ```-c``` is the command line to be executed on the cluster / grid system. What BioGrid does is to fill in the ```<input1>```,```<input2>``` and ```<output>``` placeholders with the corresponding parameters passed on the command line. This is done for each input file (or each group of input files) and BioGrid will check if the ```<output>``` placeholder has an extension (like .sam, .out etc.) and will generate a unique output file name for each job.
+ * the ```-c``` is the command line to be executed on the cluster / grid system. What BioGrid does is to fill in the ```<input1>```,```<input2>``` and ```<output>``` placeholders with the corresponding parameters passed on the command line. This is done for each input file (or each group of input files), taking care of generating a unique output name for each job submitted.
  * the ```-o``` set the location where output files for each job will be saved. Only provide the folder where you want to save the output file(s), BioGrid will take care of generating a unique file name for the output, if needed. Check the [Output management](https://github.com/fstrozzi/bioruby-grid#output-management) for more details.
- * the ```-s``` is a key parameter to specify the granularity of the jobs, setting the number of input files (or group of files, when more than one input placeholder is present in the command line) to be used for each job. So, going back to the FastQ example, if -s 1 is specified, each job will be run with exactly one FastQ R1 file and one FastQ R2 file. This gives you a great power in deciding how to split the entire dataset analysis across multiple computing nodes.
+ * the ```-s``` is a key parameter to specify the granularity of the jobs, setting the number of input files (or group of files, when more than one input placeholder is present in the command line) to be used for each job. So, going back to the FastQ example, if ```-s 1``` is specified, each job will be run with exactly one FastQ R1 file and one FastQ R2 file (corresponding to the ```<input1>``` and ```<input2>``` placeholders). This gives you a great power in deciding how to split the entire input dataset across multiple computing nodes to carry on the analysis.
  * the ```-p``` parameter indicates how many processes we want to use for each job. This number needs to match with the actual number of threads / processes that our command or tool will use for the analysis.
 
  All of this is just turned into a submission script that will look like this:
data/VERSION CHANGED
@@ -1 +1 @@
- 0.3.0
+ 0.3.1
data/bio-grid.gemspec CHANGED
@@ -5,11 +5,11 @@
 
  Gem::Specification.new do |s|
  s.name = "bio-grid"
- s.version = "0.3.0"
+ s.version = "0.3.1"
 
  s.required_rubygems_version = Gem::Requirement.new(">= 0") if s.respond_to? :required_rubygems_version=
  s.authors = ["Francesco Strozzi"]
- s.date = "2012-09-24"
+ s.date = "2012-09-25"
  s.description = "A BioGem to submit jobs on a queue system"
  s.email = "francesco.strozzi@gmail.com"
  s.executables = ["bio-grid"]
data/lib/bio/grid.rb CHANGED
@@ -10,7 +10,7 @@ module Bio
  end
 
  def self.run(options)
- options[:number] = 1 unless options[:number]
+ options[:number] = "all" unless options[:number]
  grid = self.new options[:input], options[:number]
  options[:uuid] = grid.uuid
  groups = grid.prepare_input_groups
@@ -41,13 +41,15 @@ module Bio
  end
  end
 
- def prepare_input_groups
+ def prepare_input_groups
  groups = Hash.new {|h,k| h[k] = [] }
  self.input.each_with_index do |location,index|
+ list = Dir.glob(location).sort
+ raise ArgumentError,"Input file or folder #{location} do not exist!" if list.empty?
  if self.number == "all"
- groups["input#{index+1}"] = [Dir.glob(location).sort]
+ groups["input#{index+1}"] = [list]
  else
- Dir.glob(location).sort.each_slice(self.number.to_i) {|subgroup| groups["input#{index+1}"] << subgroup}
+ list.each_slice(self.number.to_i) {|subgroup| groups["input#{index+1}"] << subgroup}
  end
  end
  groups
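Note on the hunk above: the behaviour it introduces can be tried outside the gem with a minimal sketch. The top-level `prepare_input_groups(locations, number)` helper below is a hypothetical standalone rewrite of the patched method, not the gem's actual API (in the gem it is an instance method reading `self.input` and `self.number`). It assumes only what the diff shows: each location is globbed once, an empty match raises `ArgumentError`, and a missing `-s` value now defaults to `"all"` instead of `1`. The file names are made up for the demo.

```ruby
require "tmpdir"
require "fileutils"

# Hypothetical standalone version of the patched grouping logic.
# locations: array of glob patterns; number: "all" or files per job.
def prepare_input_groups(locations, number = "all")
  groups = Hash.new { |h, k| h[k] = [] }
  locations.each_with_index do |location, index|
    # Glob once and reuse the sorted list, as in 0.3.1.
    list = Dir.glob(location).sort
    # New in 0.3.1: fail early when nothing matches (message copied from the diff).
    raise ArgumentError, "Input file or folder #{location} do not exist!" if list.empty?
    key = "input#{index + 1}"
    if number == "all"
      groups[key] = [list]            # a single group holding every file
    else
      list.each_slice(number.to_i) { |subgroup| groups[key] << subgroup }
    end
  end
  groups
end

Dir.mktmpdir do |dir|
  # Made-up input files for the demo.
  %w[a_R1.fastq b_R1.fastq c_R1.fastq].each { |f| FileUtils.touch(File.join(dir, f)) }
  pattern = File.join(dir, "*_R1.fastq")

  puts prepare_input_groups([pattern])["input1"].size                    # → 1
  puts prepare_input_groups([pattern], 2)["input1"].map(&:size).inspect  # → [2, 1]

  begin
    prepare_input_groups([File.join(dir, "*.bam")])
  rescue ArgumentError => e
    puts "rejected: #{e.message}"
  end
end
```

With the old default of `1`, the same three files would have produced three groups of one file each; under this sketch the `"all"` default yields one group containing all three.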
metadata CHANGED
@@ -1,7 +1,7 @@
  --- !ruby/object:Gem::Specification
  name: bio-grid
  version: !ruby/object:Gem::Version
- version: 0.3.0
+ version: 0.3.1
  prerelease:
  platform: ruby
  authors:
@@ -9,7 +9,7 @@ authors:
  autorequire:
  bindir: bin
  cert_chain: []
- date: 2012-09-24 00:00:00.000000000 Z
+ date: 2012-09-25 00:00:00.000000000 Z
  dependencies:
  - !ruby/object:Gem::Dependency
  name: uuid
@@ -146,7 +146,7 @@ required_ruby_version: !ruby/object:Gem::Requirement
  version: '0'
  segments:
  - 0
- hash: 2907219030788310971
+ hash: -4026493835135905673
  required_rubygems_version: !ruby/object:Gem::Requirement
  none: false
  requirements: