scbi_distributed_blast 0.0.3
- data/History.txt +12 -0
- data/Manifest.txt +14 -0
- data/PostInstall.txt +7 -0
- data/README.rdoc +209 -0
- data/Rakefile +26 -0
- data/bin/scbi_distributed_blast +157 -0
- data/lib/scbi_distributed_blast/scbi_dblast_manager.rb +62 -0
- data/lib/scbi_distributed_blast/scbi_dblast_worker.rb +64 -0
- data/lib/scbi_distributed_blast.rb +11 -0
- data/script/console +10 -0
- data/script/destroy +14 -0
- data/script/generate +14 -0
- data/test/test_helper.rb +3 -0
- data/test/test_scbi_distributed_blast.rb +11 -0
- metadata +115 -0
data/History.txt
ADDED
data/Manifest.txt
ADDED
@@ -0,0 +1,14 @@
+bin/scbi_distributed_blast
+History.txt
+lib/scbi_distributed_blast/scbi_dblast_manager.rb
+lib/scbi_distributed_blast/scbi_dblast_worker.rb
+lib/scbi_distributed_blast.rb
+Manifest.txt
+PostInstall.txt
+Rakefile
+README.rdoc
+script/console
+script/destroy
+script/generate
+test/test_helper.rb
+test/test_scbi_distributed_blast.rb
data/PostInstall.txt
ADDED
data/README.rdoc
ADDED
@@ -0,0 +1,209 @@
+= scbi_distributed_blast
+
+* http://www.scbi.uma.es/downloads
+
+== DESCRIPTION:
+
+scbi_distributed_blast is a simple distribution mechanism for blast+ built on top of scbi_mapreduce. With scbi_distributed_blast you can run distributed blasts on a cluster or on a set of machines in your network. It uses the version of blast+ that you have installed.
+
+== FEATURES:
+
+* Automatically distributes blast+ jobs across multiple computers.
+* Sequences are sent in chunks.
+* scbi_distributed_blast uses scbi_mapreduce and thus is able to exploit all the benefits of a cluster environment. It also works on multi-core machines and big shared-memory servers.
+
+== SYNOPSIS:
+
+Once installed, scbi_distributed_blast is very easy to use. To launch it locally on your own computer using 8 cores, run:
+
+  $> scbi_distributed_blast -s 10.0.0 -w 8 'full_blast_cmd'
+
+Where full_blast_cmd is the blast+ command that you would write to run your desired blast search. E.g.:
+
+  $> scbi_distributed_blast -s 10.0.0 -w 8 'blastn -task blastn-short -db my_db.fasta -query ~/seqs/sample.fasta -outfmt 6 -out output_file'
+
+Sequences are sent in chunks of 100, but you can change this value with the -g parameter:
+
+  $> scbi_distributed_blast -s 10.0.0 -w 8 -g 200 'blastn -task blastn-short -db my_db.fasta -query ~/seqs/sample.fasta -outfmt 6 -out output_file'
+
+To get additional help:
+
+  $> scbi_distributed_blast -h
+
+=== CLUSTERED EXECUTION:
+
+To take full advantage of a clustered installation, you can launch scbi_distributed_blast in distributed mode. You only need to provide a list of machine names (or IPs) where workers will be launched (be sure you have followed the clustered installation instructions).
+
+Set up a workers file like this:
+
+  machine1
+  machine1
+  machine2
+  machine2
+  machine2
+
+And launch scbi_distributed_blast this way:
+
+  $> scbi_distributed_blast -w workers_file -s 10.0.0 'blastn -task blastn-short'
+
+This will launch 2 workers on machine1 and 3 workers on machine2, using the network interface whose IP starts with 10.0.0 to communicate.
+
+
+== REQUIREMENTS:
+
+* Ruby: 1.9.2 recommended.
+* BLAST+ 2.2.24 or greater (prior versions have bugs that produce bad results)
+
+== REQUIREMENTS INSTALL:
+
+You can skip this section if you already have ruby and blast+ installed.
+
+=== Installing Blast
+
+* Download the latest version of Blast+ from ftp://ftp.ncbi.nlm.nih.gov/blast/executables/release/LATEST/
+* You can also use a precompiled version if you like
+* To install from source, decompress the downloaded file, cd to the decompressed folder, and issue the following commands:
+
+  ./configure
+  make
+  sudo make install
+
+
+=== Installing Ruby 1.9
+
+* You can use RVM to install ruby:
+
+Download the latest certificates (you may not need them):
+
+  $ curl -O http://curl.haxx.se/ca/cacert.pem
+  $ export CURL_CA_BUNDLE=`pwd`/cacert.pem # add this to your .bashrc or
+  # equivalent
+
+Install RVM:
+
+  $ bash < <(curl -k https://rvm.beginrescueend.com/install/rvm)
+
+Set up the environment:
+
+  $ echo '[[ -s "$HOME/.rvm/scripts/rvm" ]] && . "$HOME/.rvm/scripts/rvm" # Load RVM function' >> ~/.bash_profile
+
+Install ruby 1.9.2 (this can take a while):
+
+  $ rvm install 1.9.2
+
+Set it as the default:
+
+  $ rvm use 1.9.2 --default
+
+== INSTALL:
+
+=== Install scbi_distributed_blast
+
+scbi_distributed_blast is very easy to install. It is distributed as a ruby gem:
+
+  gem install scbi_distributed_blast
+
+This will install scbi_distributed_blast and all the required gems.
+
+
+== CLUSTERED INSTALLATION:
+
+To install scbi_distributed_blast on a cluster, you need to have the software available on all machines, either by installing it in a shared location or by installing it on each cluster node. Once installed, you need to create an init file where your environment is correctly set up (paths, BLASTDB, etc.). E.g.:
+
+  export PATH=/apps/blast+/bin:$PATH
+  export BLASTDB=/var/DB/formatted
+  export SCBI_DISTRIBUTED_BLAST_INIT=path_to_init_file
+
+
+And initialize the SCBI_DISTRIBUTED_BLAST_INIT environment variable on your main node (from where scbi_distributed_blast will be initially launched):
+
+  source path_to_init_file
+
+If you use a queue system like PBS Pro or Moab/Slurm, be sure to initialize the variables inside each submission script.
+
+<b>NOTE</b>: all nodes on the cluster should use ssh keys to allow scbi_mapreduce to launch workers without asking for a password.
+
+== SAMPLE INIT FILES FOR CLUSTERED INSTALLATION:
+
+=== Init file
+
+  $> cat ~/scbi_distributed_blast_init_env
+
+  export BLASTDB=/BLAST_DATABASES/
+  export SCBI_DISTRIBUTED_BLAST_INIT=~/scbi_distributed_blast_init_env
+
+
+=== PBS Submission script
+
+  $> cat sample_work.sh
+
+  # 40 distributed workers and 1 GB memory per worker:
+  #PBS -l select=40:ncpus=1:mpiprocs=1:mem=1gb
+  # request 10 hours of walltime:
+  #PBS -l walltime=10:00:00
+  # cd to working directory (from where job was submitted)
+  cd $PBS_O_WORKDIR
+
+  # create workers file with assigned node names
+
+  cat ${PBS_NODEFILE} > workers
+
+  # init scbi_distributed_blast
+  source path_to_init_file
+
+  time scbi_distributed_blast -s 10.0.0 -w workers 'blastn -task blastn-short -db my_db.fasta -query ~/seqs/sample.fasta -outfmt 6 -out output_file'
+
+
+Once this submission script is created, you only need to launch it with:
+
+  qsub sample_work.sh
+
+=== MOAB/SLURM submission script
+
+  $> cat sample_work_moab.sh
+
+  #!/bin/bash
+  # @ job_name = STN
+  # @ initialdir = .
+  # @ output = STN_%j.out
+  # @ error = STN_%j.err
+  # @ total_tasks = 40
+  # @ wall_clock_limit = 10:00:00
+
+  # save the list of workers
+  sl_get_machine_list > workers
+
+  # init scbi_distributed_blast
+  source path_to_init_file
+
+  time scbi_distributed_blast -s 10.0.0 -w workers 'blastn -task blastn-short -db my_db.fasta -query ~/seqs/sample.fasta -outfmt 6 -out output_file'
+
+Then you only need to submit your job with mnsubmit:
+
+  mnsubmit sample_work_moab.sh
+
+
+== LICENSE:
+
+(The MIT License)
+
+Copyright (c) 2011 Dario Guerrero
+
+Permission is hereby granted, free of charge, to any person obtaining
+a copy of this software and associated documentation files (the
+'Software'), to deal in the Software without restriction, including
+without limitation the rights to use, copy, modify, merge, publish,
+distribute, sublicense, and/or sell copies of the Software, and to
+permit persons to whom the Software is furnished to do so, subject to
+the following conditions:
+
+The above copyright notice and this permission notice shall be
+included in all copies or substantial portions of the Software.
+
+THE SOFTWARE IS PROVIDED 'AS IS', WITHOUT WARRANTY OF ANY KIND,
+EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
+MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.
+IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY
+CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT,
+TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE
+SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
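The README's workers file lists one host name per worker, so a host that appears three times receives three workers. A minimal sketch of that counting, assuming a hypothetical helper `workers_per_host` that is not part of the gem:

```ruby
# Hypothetical helper (not part of scbi_distributed_blast): counts how many
# workers each host would receive, given the text of a workers file.
def workers_per_host(text)
  counts = Hash.new(0)
  text.each_line do |line|
    host = line.strip
    next if host.empty? # ignore blank lines
    counts[host] += 1
  end
  counts
end

# Same layout as the workers file shown in the README.
sample = "machine1\nmachine1\nmachine2\nmachine2\nmachine2\n"
workers_per_host(sample).each do |host, n|
  puts "#{host}: #{n} worker(s)"
end
```

Running it on the README's sample file reports 2 workers for machine1 and 3 for machine2, which matches the behaviour the README describes.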
data/Rakefile
ADDED
@@ -0,0 +1,26 @@
+require 'rubygems'
+gem 'hoe', '>= 2.1.0'
+require 'hoe'
+require 'fileutils'
+require './lib/scbi_distributed_blast'
+
+Hoe.plugin :newgem
+# Hoe.plugin :website
+# Hoe.plugin :cucumberfeatures
+
+# Generate all the Rake tasks
+# Run 'rake -T' to see list of generated tasks (from gem root directory)
+$hoe = Hoe.spec 'scbi_distributed_blast' do
+  self.developer 'Dario Guerrero', 'dariogf@gmail.com'
+  self.post_install_message = 'PostInstall.txt' # TODO remove if post-install message not required
+  self.rubyforge_name = self.name # TODO this is default value
+  self.extra_deps = [['scbi_mapreduce','>= 0.0.33'], ['scbi_blast','>= 0.0.32'],['scbi_fasta','>= 0.1.7']]
+
+end
+
+require 'newgem/tasks'
+Dir['tasks/**/*.rake'].each { |t| load t }
+
+# TODO - want other tests/tasks run by default? Add them to the list
+# remove_task :default
+# task :default => [:spec, :features]
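The `extra_deps` entries in the Rakefile are ordinary RubyGems version constraints. As a rough illustration of what `>= 0.0.33` accepts, the sketch below checks a few version strings with `Gem::Requirement`; the helper name `satisfies?` and the candidate versions are made up for the example.

```ruby
require 'rubygems'

# Constraints copied from extra_deps in the Rakefile.
DEPS = {
  'scbi_mapreduce' => '>= 0.0.33',
  'scbi_blast'     => '>= 0.0.32',
  'scbi_fasta'     => '>= 0.1.7'
}

# Hypothetical helper: true when the version string meets the constraint.
def satisfies?(constraint, version)
  Gem::Requirement.new(constraint).satisfied_by?(Gem::Version.new(version))
end

puts satisfies?(DEPS['scbi_mapreduce'], '0.0.33') # exactly the minimum
puts satisfies?(DEPS['scbi_mapreduce'], '0.0.32') # one patch level too old
puts satisfies?(DEPS['scbi_fasta'], '0.2.0')      # newer minor version
```

Only lower bounds are declared, so any later release of these gems would also be accepted at install time.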
data/bin/scbi_distributed_blast
ADDED
@@ -0,0 +1,157 @@
+#!/usr/bin/env ruby
+
+$: << File.join(File.dirname(File.dirname(__FILE__)),'lib')
+
+# load required libraries
+require 'scbi_mapreduce'
+require 'scbi_distributed_blast'
+require 'logger'
+require 'optparse'
+
+if ENV['SCBI_DISTRIBUTED_BLAST_INIT'] && File.exist?(ENV['SCBI_DISTRIBUTED_BLAST_INIT'])
+  $INIT_FILE=File.expand_path(ENV['SCBI_DISTRIBUTED_BLAST_INIT'])
+elsif File.exist?(File.expand_path(File.join('~','scbi_distributed_blast_init_env')))
+  $INIT_FILE=File.expand_path(File.join('~','scbi_distributed_blast_init_env'))
+else
+  $INIT_FILE=File.join(ROOT_PATH,'scbi_distributed_blast_init_env')
+end
+
+
+options = {}
+
+optparse = OptionParser.new do |opts|
+
+  # Set a banner, displayed at the top
+  # of the help screen.
+  opts.banner = "Usage: #{$0} [options] blast_command"
+
+  # Define the options, and what they do
+
+  options[:server_ip] = '0.0.0.0'
+  opts.on( '-s', '--server IP', 'Server ip. You can use a partial ip to select the appropriate interface' ) do |server_ip|
+    options[:server_ip] = server_ip
+  end
+
+  options[:port] = 0 # any free port
+  opts.on( '-p', '--port PORT', 'Server port. If set to 0, an arbitrary free port will be used') do |port|
+    options[:port] = port.to_i
+  end
+
+  # set number of workers. You can also provide an array with worker names.
+  # Those worker names can be read from a file produced by the existing
+  # queue system, if any.
+  options[:workers] = 2
+  opts.on( '-w', '--workers COUNT', 'Number of workers, or file containing machine names to launch workers with ssh' ) do |workers|
+    if File.exist?(workers)
+      # use workers file
+      options[:workers] = File.read(workers).split("\n").map{|w| w.chomp}
+    else
+      begin
+        options[:workers] = Integer(workers)
+      rescue
+        STDERR.puts "ERROR:Invalid workers parameter #{workers}"
+        exit
+      end
+    end
+  end
+
+  options[:chunk_size] = 100
+  opts.on( '-g', '--group_size chunk_size', 'Group sequences in chunks of size <chunk_size>' ) do |cs|
+    options[:chunk_size] = cs.to_i
+  end
+
+  options[:log_file] = STDOUT
+  opts.on( '-l', '--log_file file', 'Define a log file. STDOUT by default' ) do |cs|
+    options[:log_file] = cs
+  end
+
+
+  # This displays the help screen, all programs are
+  # assumed to have this option.
+  opts.on_tail( '-h', '--help', 'Display this screen' ) do
+    puts opts
+    # show_additional_help
+    exit -1
+  end
+end
+
+# parse options and remove from ARGV
+optparse.parse!
+
+blast_cmd = ARGV.join(' ')
+
+$LOG = Logger.new(options[:log_file])
+$LOG.datetime_format = "%Y-%m-%d %H:%M:%S"
+
+
+$LOG.info "Original Blast+ CMD: #{blast_cmd}"
+
+input_file = nil
+output_file = nil
+
+if blast_cmd.upcase.index('-QUERY')
+  params=blast_cmd.split(' -')
+  params.reverse_each do |param|
+    if param.upcase.index('QUERY ')==0
+      $LOG.debug "found .#{param.strip}."
+
+      input_file=param.slice(5,param.length).strip
+      input_file=File.expand_path(input_file)
+      params.delete(param)
+      # break
+    end
+
+    if param.upcase.index('OUT ')==0
+      $LOG.debug "found .#{param.strip}."
+
+      output_file=param.slice(3,param.length).strip
+      output_file=File.expand_path(output_file)
+      params.delete(param)
+      # break
+    end
+
+
+  end
+
+  blast_cmd=params.join(' -')
+end
+
+# puts "BLASTCMD: #{blast_cmd}"
+# puts "Input file: #{input_file}"
+# puts "Output file: #{output_file}"
+
+if !input_file.nil? and File.exist?(File.expand_path(input_file))
+
+  $LOG.info "Query input file: #{input_file}"
+else
+  $LOG.error "Input file missing from blast command (-query parameter) or not found"
+  exit -1
+end
+
+# we need the path to the worker file in order to launch it when necessary
+# custom_worker_file = File.join(File.dirname(__FILE__),'scbi_dblast_worker.rb')
+custom_worker_file=File.join(File.dirname(File.dirname(__FILE__)),'lib','scbi_distributed_blast','scbi_dblast_worker')
+
+# initialize the work manager. Here you can pass parameters like file names
+ScbiDblastManager.init_work_manager(input_file, blast_cmd, output_file)
+
+# launch processor server
+mgr = ScbiMapreduce::Manager.new(options[:server_ip], options[:port], options[:workers], ScbiDblastManager, custom_worker_file, options[:log_file],$INIT_FILE)
+
+# you can set additional properties
+# =================================
+
+# if you want basic checkpointing. Some performance drop should be expected
+# mgr.checkpointing=true
+
+# if you want to keep the order of input data. Some performance drop should be expected
+# mgr.keep_order=true
+
+# you can set the size of packets of data sent to workers
+mgr.chunk_size=options[:chunk_size]
+
+# start processing
+mgr.start_server
+
+# this line is reached when all data has been processed
+$LOG.info "Program finished"
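The launcher script pulls the `-query` and `-out` arguments out of the blast+ command so that the manager can read the input and write the output itself, and passes the rest of the command to the workers. The sketch below shows the same idea with the standard `Shellwords` library instead of splitting on `' -'`. The helper `split_blast_cmd` is hypothetical and, unlike the script, it does not expand paths.

```ruby
require 'shellwords'

# Hypothetical helper (not part of the gem): separates the query file and the
# output file from the remaining blast+ command line.
def split_blast_cmd(cmd)
  tokens = Shellwords.split(cmd)
  query = nil
  out = nil
  rest = []
  i = 0
  while i < tokens.length
    case tokens[i].downcase
    when '-query'
      query = tokens[i + 1] # value follows the flag
      i += 2
    when '-out'
      out = tokens[i + 1]
      i += 2
    else
      rest << tokens[i]     # e.g. -outfmt is kept, it is not -out
      i += 1
    end
  end
  [query, out, Shellwords.join(rest)]
end

cmd = 'blastn -task blastn-short -db my_db.fasta -query seqs/sample.fasta -outfmt 6 -out output_file'
query, out, rest = split_blast_cmd(cmd)
puts "query: #{query}"
puts "out:   #{out}"
puts "rest:  #{rest}"
```

Matching whole tokens keeps `-outfmt` from being confused with `-out`. Tokenising with `Shellwords` also lets quoted file names containing spaces survive, which a plain split on `' -'` would not guarantee.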
data/lib/scbi_distributed_blast/scbi_dblast_manager.rb
ADDED
@@ -0,0 +1,62 @@
+require 'json'
+
+require 'scbi_fasta'
+
+# ScbiDblastManager class is used to implement the methods
+# to send and receive the data to or from workers
+class ScbiDblastManager < ScbiMapreduce::WorkManager
+
+  # init_work_manager is executed at the start, prior to any processing.
+  # You can use init_work_manager to initialize global variables, open files, etc...
+  # Note that an instance of ScbiDblastManager will be created for each
+  # worker connection, and thus, all global variables here should be
+  # class variables (starting with @@)
+  def self.init_work_manager(input_file, blast_cmd, output_file)
+    @@blast_cmd=blast_cmd
+    if output_file.nil?
+      @@output_file=STDOUT
+    else
+      @@output_file=File.open(output_file,'w')
+    end
+
+    @@fqr = FastaQualFile.new(input_file)
+
+  end
+
+  # end_work_manager is executed at the end, when all the processing is done.
+  # You can use it to close files opened in init_work_manager
+  def self.end_work_manager
+    @@fqr.close
+    @@output_file.close if @@output_file!=STDOUT
+  end
+
+  # worker_initial_config is used to send initial parameters to workers.
+  # The method is executed once for each worker
+  def worker_initial_config
+    {:blast_cmd=>@@blast_cmd}
+  end
+
+  # next_work method is called every time a worker needs new work
+  # Here you can read data from disk
+  # This method must return the work data or nil if no more data is available
+  def next_work
+
+    n,f = @@fqr.next_seq
+
+    if n.nil?
+      return nil
+    else
+      return [n,f]
+    end
+
+  end
+
+
+  # work_received is executed each time a worker has finished a job.
+  # Here you can write results down to disk, perform some aggregated statistics, etc...
+  def work_received(results)
+    @@output_file.puts results
+    # write_data_to_disk(results)
+  end
+
+end
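The manager hands out one `[name, sequence]` pair per `next_work` call, and the README states that sequences reach the workers in chunks (100 by default, changed with `-g`). The sketch below imitates that flow with a tiny FASTA parser and `each_slice`. Both are stand-ins: `fasta_records` is hypothetical and is not the scbi_fasta `FastaQualFile` API, and the real grouping happens inside scbi_mapreduce, whose internals are not shown in this diff.

```ruby
# Hypothetical stand-in for a FASTA reader: returns an array of
# [name, sequence] pairs parsed from FASTA-formatted text.
def fasta_records(text)
  records = []
  name = nil
  seq = []
  text.each_line do |line|
    line = line.strip
    next if line.empty?
    if line.start_with?('>')
      records << [name, seq.join] if name # close the previous record
      name = line[1..-1]
      seq = []
    else
      seq << line                         # sequences may span several lines
    end
  end
  records << [name, seq.join] if name     # close the last record
  records
end

fasta = ">s1\nACGT\nAC\n>s2\nGGCC\n>s3\nTTAA\n"
chunk_size = 2 # plays the role of the -g option
chunks = fasta_records(fasta).each_slice(chunk_size).to_a
chunks.each_with_index do |chunk, i|
  puts "chunk #{i}: #{chunk.map { |name, _| name }.join(', ')}"
end
```

With three sequences and a chunk size of 2, the first chunk carries s1 and s2 and the last one carries only s3, so the final chunk of a run may be smaller than the configured size.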
data/lib/scbi_distributed_blast/scbi_dblast_worker.rb
ADDED
@@ -0,0 +1,64 @@
+# ScbiDblastWorker defines the behaviour of workers.
+# Here is where the real processing takes place
+
+require 'scbi_blast'
+
+class ScbiDblastWorker < ScbiMapreduce::Worker
+
+  # starting_worker method is called one time at initialization
+  # and allows you to initialize your variables
+  def starting_worker
+
+    # You can use worker logs at any time in this way:
+    # $WORKER_LOG.info "Starting a worker"
+
+  end
+
+
+  # receive_initial_config is called only once just after
+  # the first connection, when initial parameters are
+  # received from manager
+  def receive_initial_config(parameters)
+
+    # Reads the parameters
+
+    # You can use worker logs at any time in this way:
+    # $WORKER_LOG.info "Params received"
+
+    # save received parameters, if any
+    @params = parameters
+
+  end
+
+
+  # process_object method is called for each received object.
+  # Be aware that objs is always an array, and you must iterate
+  # over it if you need to process it independently
+  #
+  # The value returned here will be received by the work_received
+  # method at your worker_manager subclass.
+  def process_object(objs)
+    chunk=[]
+    # iterate over all objects received
+    objs.each do |n,f|
+      chunk<< ">"+n
+      chunk<< f
+
+      # optional conversion to lowercase (disabled)
+      # f.downcase!
+    end
+
+
+
+    # puts "Doing blast to #{@params[:blast_cmd]}"
+    blast=BatchBlast.do_blast_cmd(chunk.join("\n"),@params[:blast_cmd])
+
+    # return blast results back to manager
+    return blast
+  end
+
+  # called once, when the worker is about to be closed
+  def closing_worker
+
+  end
+end
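In `process_object` the worker rebuilds FASTA text from the `[name, sequence]` pairs of a chunk before passing it to blast+. A minimal sketch of that step alone, using a hypothetical helper `pairs_to_fasta` and leaving out the `BatchBlast` call, whose behaviour is not documented in this diff:

```ruby
# Hypothetical helper: turns [name, sequence] pairs into FASTA text,
# following the same steps as the loop in process_object.
def pairs_to_fasta(pairs)
  lines = []
  pairs.each do |name, seq|
    lines << ">#{name}" # header line
    lines << seq        # sequence line
  end
  lines.join("\n")
end

chunk = [['s1', 'ACGT'], ['s2', 'GGCC']]
puts pairs_to_fasta(chunk)
```

Because the whole chunk is joined into one string, each worker can run a single blast+ invocation per chunk rather than one per sequence, which is presumably why a larger `-g` value reduces per-job overhead.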
data/lib/scbi_distributed_blast.rb
ADDED
@@ -0,0 +1,11 @@
+$:.unshift(File.dirname(__FILE__)) unless
+  $:.include?(File.dirname(__FILE__)) || $:.include?(File.expand_path(File.dirname(__FILE__)))
+
+require 'scbi_mapreduce'
+require 'scbi_distributed_blast/scbi_dblast_manager.rb'
+
+ROOT_PATH=File.join(File.dirname(__FILE__),'scbi_distributed_blast')
+
+module ScbiDistributedBlast
+  VERSION = '0.0.3'
+end
data/script/console
ADDED
@@ -0,0 +1,10 @@
+#!/usr/bin/env ruby
+# File: script/console
+irb = RUBY_PLATFORM =~ /(?:mswin|mingw)/ ? 'irb.bat' : 'irb'
+
+libs = " -r irb/completion"
+# Perhaps use a console_lib to store any extra methods I may want available in the console
+# libs << " -r #{File.dirname(__FILE__) + '/../lib/console_lib/console_logger.rb'}"
+libs << " -r #{File.dirname(__FILE__) + '/../lib/scbi_distributed_blast.rb'}"
+puts "Loading scbi_distributed_blast gem"
+exec "#{irb} #{libs} --simple-prompt"
data/script/destroy
ADDED
@@ -0,0 +1,14 @@
+#!/usr/bin/env ruby
+APP_ROOT = File.expand_path(File.join(File.dirname(__FILE__), '..'))
+
+begin
+  require 'rubigen'
+rescue LoadError
+  require 'rubygems'
+  require 'rubigen'
+end
+require 'rubigen/scripts/destroy'
+
+ARGV.shift if ['--help', '-h'].include?(ARGV[0])
+RubiGen::Base.use_component_sources! [:rubygems, :newgem, :newgem_theme, :test_unit]
+RubiGen::Scripts::Destroy.new.run(ARGV)
data/script/generate
ADDED
@@ -0,0 +1,14 @@
+#!/usr/bin/env ruby
+APP_ROOT = File.expand_path(File.join(File.dirname(__FILE__), '..'))
+
+begin
+  require 'rubigen'
+rescue LoadError
+  require 'rubygems'
+  require 'rubigen'
+end
+require 'rubigen/scripts/generate'
+
+ARGV.shift if ['--help', '-h'].include?(ARGV[0])
+RubiGen::Base.use_component_sources! [:rubygems, :newgem, :newgem_theme, :test_unit]
+RubiGen::Scripts::Generate.new.run(ARGV)
data/test/test_helper.rb
ADDED
metadata
ADDED
@@ -0,0 +1,115 @@
+--- !ruby/object:Gem::Specification
+name: scbi_distributed_blast
+version: !ruby/object:Gem::Version
+  prerelease:
+  version: 0.0.3
+platform: ruby
+authors:
+- Dario Guerrero
+autorequire:
+bindir: bin
+cert_chain: []
+
+date: 2011-07-08 00:00:00 Z
+dependencies:
+- !ruby/object:Gem::Dependency
+  name: scbi_mapreduce
+  prerelease: false
+  requirement: &id001 !ruby/object:Gem::Requirement
+    none: false
+    requirements:
+    - - ">="
+      - !ruby/object:Gem::Version
+        version: 0.0.33
+  type: :runtime
+  version_requirements: *id001
+- !ruby/object:Gem::Dependency
+  name: scbi_blast
+  prerelease: false
+  requirement: &id002 !ruby/object:Gem::Requirement
+    none: false
+    requirements:
+    - - ">="
+      - !ruby/object:Gem::Version
+        version: 0.0.32
+  type: :runtime
+  version_requirements: *id002
+- !ruby/object:Gem::Dependency
+  name: scbi_fasta
+  prerelease: false
+  requirement: &id003 !ruby/object:Gem::Requirement
+    none: false
+    requirements:
+    - - ">="
+      - !ruby/object:Gem::Version
+        version: 0.1.7
+  type: :runtime
+  version_requirements: *id003
+- !ruby/object:Gem::Dependency
+  name: hoe
+  prerelease: false
+  requirement: &id004 !ruby/object:Gem::Requirement
+    none: false
+    requirements:
+    - - ">="
+      - !ruby/object:Gem::Version
+        version: 2.8.0
+  type: :development
+  version_requirements: *id004
+description: scbi_distributed_blast is a simple distribution mechanism for blast+ made on top of scbi_mapreduce. With scbi_distributed_blast you can perform distributed blasts using a cluster or a set of machines of your network. It uses the version of blast+ that you have installed.
+email:
+- dariogf@gmail.com
+executables:
+- scbi_distributed_blast
+extensions: []
+
+extra_rdoc_files:
+- History.txt
+- Manifest.txt
+- PostInstall.txt
+files:
+- bin/scbi_distributed_blast
+- History.txt
+- lib/scbi_distributed_blast/scbi_dblast_manager.rb
+- lib/scbi_distributed_blast/scbi_dblast_worker.rb
+- lib/scbi_distributed_blast.rb
+- Manifest.txt
+- PostInstall.txt
+- Rakefile
+- README.rdoc
+- script/console
+- script/destroy
+- script/generate
+- test/test_helper.rb
+- test/test_scbi_distributed_blast.rb
+homepage: http://www.scbi.uma.es/downloads
+licenses: []
+
+post_install_message: PostInstall.txt
+rdoc_options:
+- --main
+- README.rdoc
+require_paths:
+- lib
+required_ruby_version: !ruby/object:Gem::Requirement
+  none: false
+  requirements:
+  - - ">="
+    - !ruby/object:Gem::Version
+      version: "0"
+required_rubygems_version: !ruby/object:Gem::Requirement
+  none: false
+  requirements:
+  - - ">="
+    - !ruby/object:Gem::Version
+      version: "0"
+requirements: []
+
+rubyforge_project: scbi_distributed_blast
+rubygems_version: 1.7.2
+signing_key:
+specification_version: 3
+summary: scbi_distributed_blast is a simple distribution mechanism for blast+ made on top of scbi_mapreduce
+test_files:
+- test/test_helper.rb
+- test/test_scbi_distributed_blast.rb