transfuse 0.4.3 → 0.4.5

checksums.yaml CHANGED
@@ -1,7 +1,7 @@
  ---
  SHA1:
-   metadata.gz: ec08ab7f1e09b7ab11ed3aa80f48650c4cebd13c
-   data.tar.gz: f56e079bb4c91c8d1636472409d5f2dccec54a0c
+   metadata.gz: 94b55c79a713ddce1aaba5c302e3ce0d7a3dbd4a
+   data.tar.gz: 4472cc10a3a5a5e29e11bc22ad3d17b07fb7c332
  SHA512:
-   metadata.gz: 0bc631582f019658ece91918c0b2278c24217cc8bdfe0ccd721000c0d75892c59a3d70c78d13d3f32666ea696aa2886ff3a05b8fae4ce158c7debb16daacac06
-   data.tar.gz: dcfd447ef6bedba5d5a0ede16421c01c6fc230ad81c9cd4e1215593ec47f59cad25cd10e364beaeacce0b6d33ce235af22523a3644b9b20f86bca366512c1e82
+   metadata.gz: 8b61add12cd4c12db7ae2a65739c7895ea31c82b188a717f17e860a2009036a28962d53e275e67e244ef3f85e21a291ca15384df14ec5c1c51606a6893c2e25c
+   data.tar.gz: 1443f82858f58c9371f2223fb91fefa802cf809aa6802bbb84c014781641304e8c1a02d9dc6e289291e8b52569bce71438931472b18dfc009d9e9472fcc130c4
data/README.md CHANGED
@@ -1,40 +1,55 @@
  ## Transfuse

- **Transfuse is currently in development and is not yet ready for use**
-
  Transfuse intelligently merges your multiple de novo transcriptome assemblies. Run multiple assemblies with different de novo assemblers, or different settings in the same assembler and have them combined into a single high quality transcriptome.

  Transfuse takes in the reads you used to do the assembly and a list of fasta files and produces a single output fasta file.

  ### Installation and Running

- To install Transfuse, clone this repo:
+ To install Transfuse you can get it from rubygems.org
+
+ `gem install transfuse`
+
+ or you can clone this repo:

  `git clone https://github.com/cboursnell/transfuse.git`

- Then build and install the ruby gem
+ then build and install the ruby gem

  `gem build *spec; gem install *gem`

+ Transfuse also requires `vsearch` to be installed which can be downloaded from:
+
+ `https://github.com/torognes/vsearch`
+
  ### Usage

  Transfuse is run on the command line. The options are:

  ```
-   -a, --assembly=<s>      assembly files in FASTA format, comma-separated
-   -l, --left=<s>          left reads file in FASTQ format
-   -r, --right=<s>         right reads file in FASTQ format
-   -o, --output=<s>        write merged assembly to file
-   -t, --threads=<i>       number of threads (default: 1)
-   -v, --verbose           be verbose
-   -e, --version           Print version and exit
-   -h, --help              Show this message
+   -a, --assemblies=<s>    assembly files in FASTA format, comma-separated
+   -l, --left=<s>          left reads file in FASTQ format
+   -r, --right=<s>         right reads file in FASTQ format
+   -o, --output=<s>        write merged assembly to file
+   -t, --threads=<i>       number of threads (default: 1)
+   -i, --id=<f>            sequence identity to cluster at (default: 1.0)
+   -v, --verbose           be verbose
+   -e, --version           Print version and exit
+   -h, --help              Show this message
  ```

  An example command:

  `transfuse --assemblies soap-k31.fa,soap-k41.fa,soap-k51.fa --left reads_1.fq --right reads_2.fq --output soap-merged.fa --threads 12`

+ ### Contributing
+
+ [![Join the chat at https://gitter.im/cboursnell/transfuse](https://badges.gitter.im/Join%20Chat.svg)](https://gitter.im/cboursnell/transfuse?utm_source=badge&utm_medium=badge&utm_campaign=pr-badge&utm_content=badge)
+
+ Transfuse is currently in development.
+
+ If you want to suggest, and maybe implement, a new feature, please suggest it on the tracker first.
+
  ### License

  This is academic software - please cite us if you use it in your work.
data/bin/transfuse CHANGED
@@ -23,26 +23,41 @@ opts = Trollop::options do

  EOS
    opt :assemblies, "assembly files in FASTA format, comma-separated",
-       :type => String, :required => true
+       :type => String
    opt :left, "left reads file in FASTQ format",
        :type => String
    opt :right, "right reads file in FASTQ format",
        :type => String
-   opt :scores, "transrate contig score output files, comma-separated. Ignored if reads are provided",
-       :type => String
-   opt :output, "write merged assembly to file",
-       :type => String, :required => :true
+   opt :output, "write merged assembly to file", :type => String
    opt :threads, "number of threads", :type => :int, :default => 1
    opt :id, "sequence identity to cluster at", :type => :float, :default => 1.0
+   # opt :install, "install dependencies"
    opt :verbose, "be verbose"
  end

  transfuse = Transfuse::Transfuse.new opts.threads, opts.verbose

- assembly_files = transfuse.check_files opts.assemblies
- score_files = transfuse.check_files opts.score if opts.score
- left = transfuse.check_files opts.left if opts.left
- right = transfuse.check_files opts.right if opts.right
+ # if opts.install
+ #   transfuse.install_dependencies
+ # else
+ #   missing = transfuse.check_dependencies
+ #   unless missing.empty?
+ #     list = missing.collect {|i| "#{i.name}:#{i.version}"}.join("\n - ")
+ #     msg = "Not installed: \n - #{list}"
+ #     abort msg
+ #   end
+ # end
+
+ assembly_files = transfuse.check_files(opts.assemblies, "assemblies")
+ left = transfuse.check_files(opts.left, "left")
+ right = transfuse.check_files(opts.right, "right")
+ if opts.output
+   if File.exist?(opts.output)
+     abort "Output #{opts.output} already exists"
+   end
+ else
+   abort "Please specify an output with the --output option"
+ end

  if opts.scores
    # load the scores from the comma separated list of files
@@ -12,87 +12,81 @@ module Transfuse
        @verbose = verbose
      end

+     def count_exons list
+       exon_counts = {}
+       list.each_with_index do |hash, index|
+         seq = hash[:seq]
+         exon_count=0
+         gap_count=0
+         prev=""
+         seq.each_char do |c|
+           if c=="-"
+             base = "-"
+           else
+             base = "*"
+           end
+
+           if base!=prev
+             if c=="-"
+               gap_count+=1
+             else
+               exon_count+=1
+             end
+           end
+
+           if c=="-"
+             prev = "-"
+           else
+             prev = "*"
+           end
+         end
+         exon_counts[exon_count] ||= []
+         exon_counts[exon_count] << index
+       end
+       return exon_counts
+     end
+
      def run msa, scores, output
-       return 1 if File.exist?(output)
+       preoutput = "#{File.basename(output, File.extname(output))}_cons.fa"
+       return preoutput if File.exist?(preoutput)
        print "writing consensus " if @verbose
        # msa is a hash
        # key = cluster id
        # value = list
        # list of sequences in cluster aligned with gaps
-       preoutput = "#{File.basename(output, File.extname(output))}_cons.fa"
        count = 0
        File.open("#{output}.data", "w") do |out2|
          File.open(preoutput, "w") do |out|
-           msa.each do |id, list|
+           msa.each do |id, seq_list|
              count+=1
              print "." if count%5_000==0 and @verbose
              exons={}
-             cons = []
-             length = list[0][:seq].length
-             list.each_with_index do |hash, index|
-               seq = hash[:seq]
-               name = hash[:name]
-               out2.write "#{id}\t#{scores[name][:score]}\t#{name}\n"
-               prev = ""
-               gap = 0
-               exon = 0
-               seq.each_char do |c|
-                 if c=="-"
-                   base="-"
-                 else
-                   base="*"
-                 end
-                 if base!=prev
-                   if c=="-"
-                     gap+=1
-                   else
-                     exon+=1
-                   end
-                 end
-                 if c=="-"
-                   prev = "-"
-                 else
-                   prev = "*"
-                 end
-               end
-               exons[index] = exon
-             end
+             cons = {}
+             length = seq_list[0][:seq].length
+             exons = count_exons(seq_list)

-             consensus = ""
-             0.upto(length-1) do |i|
-               base="N"
-               counts = {}
-               list.each_with_index do |hash, index|
-                 seq = hash[:seq]
-                 if seq[i] != "-" and seq[i] != "N"
-                   counts[seq[i]]||=0
-                   counts[seq[i]] += 1
-                   if exons[index]==1
-                     base = seq[i]
+             exons.each do |count, list|
+               0.upto(length-1) do |pos|
+                 base = "N"
+                 list.each do |index|
+                   b = seq_list[index][:seq][pos]
+                   if b != "-" and b != "N"
+                     base = b
                    end
                  end
-               end
-               if counts.size>0
-                 base = counts.sort.last.first
-               end
-               consensus << base
-             end
-
-             if consensus.count("N") < consensus.length.to_f*0.5
-               cons << consensus
-             end
-
-             list.each_with_index do |hash, index|
-               if exons[index] > 1
-                 cons << hash[:seq].delete("-")
+                 if base != "N"
+                   cons[count]||=""
+                   cons[count]<<base
+                 end
                end
              end

-             cons.each_with_index do |s,index|
+             cons.each_with_index do |s, index|
                out.write ">contig#{id}.#{index+1}\n"
-               out.write "#{s}\n"
+               out.write "#{s[1]}\n"
              end

+
            end # msa.each
          end # file
        end # file open
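The hunk above replaces the per-column majority vote with a grouping step: sequences in a cluster are grouped by how many ungapped runs ("exons") they contain, and one consensus is written per group. The following is a minimal standalone sketch of that idea, not the gem's code. The helper names `ungapped_runs`, `group_by_runs` and `consensus_for` and the toy alignment are hypothetical, and gaps are assumed to be `-` as in the diff.

```ruby
# Hypothetical helper (not part of transfuse): number of runs of
# non-gap characters in an aligned sequence.
def ungapped_runs(seq)
  seq.scan(/[^-]+/).length
end

# Group sequence indices by run count, mirroring the hash shape
# that count_exons returns in the diff: { run_count => [indices] }.
def group_by_runs(list)
  groups = {}
  list.each_with_index do |hash, index|
    (groups[ungapped_runs(hash[:seq])] ||= []) << index
  end
  groups
end

# Build one consensus for a group: at each column the last non-gap,
# non-N base seen in the group wins; columns with no base are skipped.
def consensus_for(list, indices)
  cons = String.new
  0.upto(list[0][:seq].length - 1) do |pos|
    base = "N"
    indices.each do |i|
      b = list[i][:seq][pos]
      base = b if b != "-" && b != "N"
    end
    cons << base unless base == "N"
  end
  cons
end

# Toy alignment, invented for illustration only.
msa = [
  { name: "a", seq: "ACGT--ACGT" },
  { name: "b", seq: "ACGTTTACGT" },
  { name: "c", seq: "--GTTTAC--" }
]

groups = group_by_runs(msa)
groups.each do |runs, indices|
  puts "#{runs} run(s): #{consensus_for(msa, indices)}"
end
```

In this toy input, sequence `a` has two ungapped runs and forms its own group, while `b` and `c` each have one run and share a consensus. Note that within a group the last sequence with a base at a column decides that column, so the result depends on sequence order.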
@@ -18,8 +18,29 @@ module Transfuse
        @verbose = verbose
      end

-     def check_files string
+     def check_dependencies
+       # Check dependencies if they are relevant to the command issued,
+       # and handle any commands to install missing ones
+       gem_dir = Gem.loaded_specs['transfuse'].full_gem_path
+       gem_deps = File.join(gem_dir, 'deps', 'deps.yaml')
+
+       return Bindeps.missing gem_deps
+
+     end # check_dependencies
+
+     def install_dependencies
+       # Install any missing binary dependencies listed in deps/deps.yaml
+       gem_dir = Gem.loaded_specs['transfuse'].full_gem_path
+       gem_deps = File.join(gem_dir, 'deps', 'deps.yaml')
+
+       Bindeps.require gem_deps
+
+     end # install_dependencies
+
+     def check_files string, option
        # puts "check file string: #{string}" if @verbose
+       abort "Please specify --#{option} option" if string.nil?
        list = []
        string.split(",").each do |file|
          file = File.expand_path(file)
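The new `check_files` signature takes the option name so that a missing argument can be reported by name. Below is a minimal sketch of that behaviour, assuming the method also verifies that each file exists (the hunk is cut off before that point, so the existence check is an assumption). `check_files_sketch` is a hypothetical name, and it raises `ArgumentError` rather than calling `abort` so that it can be exercised in a test.

```ruby
require "tempfile"

# Sketch only, not the gem's method: validate a comma-separated list of paths.
# Raises instead of aborting so callers and tests can rescue the failure.
def check_files_sketch(string, option)
  raise ArgumentError, "Please specify --#{option} option" if string.nil?
  string.split(",").map do |file|
    path = File.expand_path(file)
    # Assumed behaviour: the original hunk does not show this check.
    raise ArgumentError, "#{path} not found" unless File.exist?(path)
    path
  end
end

# Usage with a throwaway file standing in for an assembly.
Tempfile.create("assembly") do |f|
  files = check_files_sketch(f.path, "assemblies")
  puts "checked #{files.length} file(s)"
end
```

Passing `nil` (the value Trollop gives for an option that was not supplied) produces the same "Please specify --left option" style message as the diff.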
@@ -205,8 +226,8 @@ module Transfuse

        end
        File.open("summary.txt","w") do |out|
-         out.write "fasta\tscore\toptimal\n"
-         out.write "#{fasta}\t#{transrater.assembly_score}\t#{transrater.assembly_optimal_score("prefix")}\n"
+         out.write "fasta\tscore\toptimal\tcutoff\n"
+         out.write "#{fasta}\t#{transrater.assembly_score}\t#{transrater.assembly_optimal_score("prefix").join("\t")}\n"
        end
      end
    end
@@ -8,7 +8,7 @@ module Transfuse
    module VERSION
      MAJOR = 0
      MINOR = 4
-     PATCH = 3
+     PATCH = 5
      BUILD = nil

      STRING = [MAJOR, MINOR, PATCH, BUILD].compact.join('.')
data/test/test_cluster.rb CHANGED
@@ -8,7 +8,7 @@ class TestCluster < Test::Unit::TestCase
    context 'cluster' do

      setup do
-       @cluster = Transfuse::Cluster.new 4
+       @cluster = Transfuse::Cluster.new 4, true, 1.0
      end

      teardown do
@@ -18,14 +18,11 @@ class TestCluster < Test::Unit::TestCase
        assert @cluster
      end

-     should 'generate cd-hit command' do
-       cmd = @cluster.generate_cdhit_command "assembly1.fasta", "output.fa"
-     end
-
      should 'generate vsearch command' do
-       output = @cluster.generate_vsearch_command "assembly1.fasta", "output.txt"
-       a = "vsearch --cluster_fast assembly1.fasta --id 1.00 "
-       a << "--strand both --uc output.txt --threads 4"
+       output = @cluster.generate_vsearch_command "assembly1.fasta", "output.txt", "output.msa"
+       a = "vsearch --cluster_fast assembly1.fasta --id 1.0 "
+       a << "--iddef 0 --qmask none --strand both --uc output.txt "
+       a << "--msaout output.msa --threads 4"
        b = output.split(" ")
        b[0] = File.basename(b[0])
        output = b.join(" ")
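The updated test pins down the exact `vsearch` command line the cluster step is expected to produce. As a sketch of how such a string could be assembled, the hypothetical function below reconstructs it from the expected value in the test alone. The real `generate_vsearch_command` is not shown in this diff, so the parameter names and defaults here are assumptions.

```ruby
# Hypothetical reconstruction based only on the string asserted in the test;
# the gem's own method may locate the vsearch binary and take its settings differently.
def vsearch_command_sketch(fasta, uc_out, msa_out, id: 1.0, threads: 4, bin: "vsearch")
  cmd = "#{bin} --cluster_fast #{fasta} --id #{id} "
  cmd << "--iddef 0 --qmask none --strand both --uc #{uc_out} "
  cmd << "--msaout #{msa_out} --threads #{threads}"
  cmd
end

puts vsearch_command_sketch("assembly1.fasta", "output.txt", "output.msa")
```

Compared with the 0.4.3 expectation, the new command adds `--iddef 0`, `--qmask none` and `--msaout`, and prints the identity as `1.0` rather than `1.00`.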
@@ -18,7 +18,7 @@ class TestTransfuse < Test::Unit::TestCase
        list = []
        list << File.join(File.dirname(__FILE__), 'data', 'assembly1.fasta')
        list << File.join(File.dirname(__FILE__), 'data', 'assembly2.fasta')
-       files = @fuser.check_files list.join(",")
+       files = @fuser.check_files(list.join(","), "option")
        assert_equal 2, files.length, "length"
      end

@@ -41,7 +41,7 @@ class TestTransfuse < Test::Unit::TestCase
        tmpdir = Dir.mktmpdir
        Dir.chdir(tmpdir) do
          file = File.join(File.dirname(__FILE__), 'data', 'assembly1.fasta')
-         hash = @fuser.cluster file
+         hash = @fuser.cluster(file, 1.0)
          assert_equal 250, hash.size, "output size"
        end
        # end
data/transfuse.gemspec CHANGED
@@ -15,15 +15,16 @@ Gem::Specification.new do |gem|
    gem.homepage = 'https://github.com/cboursnell/transfuse'
    gem.license = 'MIT'

-   gem.add_dependency 'trollop', '~> 2.0'
-   gem.add_dependency 'bio', '~> 1.4', '>= 1.4.3'
+   gem.add_dependency 'trollop', '~> 2.1', '>= 2.1.2'
+   gem.add_dependency 'bio', '~> 1.5', '>= 1.5.0'
    gem.add_dependency 'fixwhich', '~> 1.0', '>= 1.0.2'
-   gem.add_dependency 'bindeps', '~> 1.0', '>= 1.0.1'
+   gem.add_dependency 'bindeps', '~> 1.2', '>= 1.2.0'
    gem.add_dependency 'transrate', '~> 1.0', '>= 1.0.1'
+   gem.add_dependency 'bundler', '~> 1.10', '>= 1.10.6'

-   gem.add_development_dependency 'rake', '~> 10.3', '>= 10.3.2'
+   gem.add_development_dependency 'rake', '~> 10.4', '>= 10.4.2'
    gem.add_development_dependency 'turn', '~> 0.9', '>= 0.9.7'
-   gem.add_development_dependency 'simplecov', '~> 0.8', '>= 0.8.2'
+   gem.add_development_dependency 'simplecov', '~> 0.10', '>= 0.10.0'
    gem.add_development_dependency 'shoulda-context', '~> 1.2', '>= 1.2.1'
-   gem.add_development_dependency 'coveralls', '~> 0.7'
+   gem.add_development_dependency 'coveralls', '~> 0.8', '>= 0.8.2'
  end
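Most of the gemspec changes pair a pessimistic constraint (`~>`) with a minimum patch level (`>=`). To see what such a pair admits, the sketch below evaluates the new `trollop` requirement with RubyGems' own `Gem::Requirement` and `Gem::Version` classes. The candidate version numbers are arbitrary examples chosen for illustration, not actual trollop releases.

```ruby
require "rubygems"

# The new trollop constraint from the gemspec: at least 2.1.2, below 3.0.
requirement = Gem::Requirement.new("~> 2.1", ">= 2.1.2")

# Example candidates only; these are not claimed to be real releases.
%w[2.0.0 2.1.1 2.1.2 2.9.0 3.0.0].each do |candidate|
  ok = requirement.satisfied_by?(Gem::Version.new(candidate))
  puts "#{candidate}: #{ok ? 'allowed' : 'rejected'}"
end
```

Here `~> 2.1` alone would admit anything from 2.1 up to, but not including, 3.0. The extra `>= 2.1.2` raises the floor within that range.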
metadata CHANGED
@@ -1,7 +1,7 @@
  --- !ruby/object:Gem::Specification
  name: transfuse
  version: !ruby/object:Gem::Version
-   version: 0.4.3
+   version: 0.4.5
  platform: ruby
  authors:
  - Richard Smith-Unna
@@ -17,34 +17,40 @@ dependencies:
      requirements:
      - - "~>"
        - !ruby/object:Gem::Version
-         version: '2.0'
+         version: '2.1'
+     - - ">="
+       - !ruby/object:Gem::Version
+         version: 2.1.2
    type: :runtime
    prerelease: false
    version_requirements: !ruby/object:Gem::Requirement
      requirements:
      - - "~>"
        - !ruby/object:Gem::Version
-         version: '2.0'
+         version: '2.1'
+     - - ">="
+       - !ruby/object:Gem::Version
+         version: 2.1.2
  - !ruby/object:Gem::Dependency
    name: bio
    requirement: !ruby/object:Gem::Requirement
      requirements:
      - - "~>"
        - !ruby/object:Gem::Version
-         version: '1.4'
+         version: '1.5'
      - - ">="
        - !ruby/object:Gem::Version
-         version: 1.4.3
+         version: 1.5.0
    type: :runtime
    prerelease: false
    version_requirements: !ruby/object:Gem::Requirement
      requirements:
      - - "~>"
        - !ruby/object:Gem::Version
-         version: '1.4'
+         version: '1.5'
      - - ">="
        - !ruby/object:Gem::Version
-         version: 1.4.3
+         version: 1.5.0
  - !ruby/object:Gem::Dependency
    name: fixwhich
    requirement: !ruby/object:Gem::Requirement
@@ -71,20 +77,20 @@ dependencies:
      requirements:
      - - "~>"
        - !ruby/object:Gem::Version
-         version: '1.0'
+         version: '1.2'
      - - ">="
        - !ruby/object:Gem::Version
-         version: 1.0.1
+         version: 1.2.0
    type: :runtime
    prerelease: false
    version_requirements: !ruby/object:Gem::Requirement
      requirements:
      - - "~>"
        - !ruby/object:Gem::Version
-         version: '1.0'
+         version: '1.2'
      - - ">="
        - !ruby/object:Gem::Version
-         version: 1.0.1
+         version: 1.2.0
  - !ruby/object:Gem::Dependency
    name: transrate
    requirement: !ruby/object:Gem::Requirement
@@ -105,26 +111,46 @@ dependencies:
      - - ">="
        - !ruby/object:Gem::Version
          version: 1.0.1
+ - !ruby/object:Gem::Dependency
+   name: bundler
+   requirement: !ruby/object:Gem::Requirement
+     requirements:
+     - - "~>"
+       - !ruby/object:Gem::Version
+         version: '1.10'
+     - - ">="
+       - !ruby/object:Gem::Version
+         version: 1.10.6
+   type: :runtime
+   prerelease: false
+   version_requirements: !ruby/object:Gem::Requirement
+     requirements:
+     - - "~>"
+       - !ruby/object:Gem::Version
+         version: '1.10'
+     - - ">="
+       - !ruby/object:Gem::Version
+         version: 1.10.6
  - !ruby/object:Gem::Dependency
    name: rake
    requirement: !ruby/object:Gem::Requirement
      requirements:
      - - "~>"
        - !ruby/object:Gem::Version
-         version: '10.3'
+         version: '10.4'
      - - ">="
        - !ruby/object:Gem::Version
-         version: 10.3.2
+         version: 10.4.2
    type: :development
    prerelease: false
    version_requirements: !ruby/object:Gem::Requirement
      requirements:
      - - "~>"
        - !ruby/object:Gem::Version
-         version: '10.3'
+         version: '10.4'
      - - ">="
        - !ruby/object:Gem::Version
-         version: 10.3.2
+         version: 10.4.2
  - !ruby/object:Gem::Dependency
    name: turn
    requirement: !ruby/object:Gem::Requirement
@@ -151,20 +177,20 @@ dependencies:
      requirements:
      - - "~>"
        - !ruby/object:Gem::Version
-         version: '0.8'
+         version: '0.10'
      - - ">="
        - !ruby/object:Gem::Version
-         version: 0.8.2
+         version: 0.10.0
    type: :development
    prerelease: false
    version_requirements: !ruby/object:Gem::Requirement
      requirements:
      - - "~>"
        - !ruby/object:Gem::Version
-         version: '0.8'
+         version: '0.10'
      - - ">="
        - !ruby/object:Gem::Version
-         version: 0.8.2
+         version: 0.10.0
  - !ruby/object:Gem::Dependency
    name: shoulda-context
    requirement: !ruby/object:Gem::Requirement
@@ -191,14 +217,20 @@ dependencies:
      requirements:
      - - "~>"
        - !ruby/object:Gem::Version
-         version: '0.7'
+         version: '0.8'
+     - - ">="
+       - !ruby/object:Gem::Version
+         version: 0.8.2
    type: :development
    prerelease: false
    version_requirements: !ruby/object:Gem::Requirement
      requirements:
      - - "~>"
        - !ruby/object:Gem::Version
-         version: '0.7'
+         version: '0.8'
+     - - ">="
+       - !ruby/object:Gem::Version
+         version: 0.8.2
  description: See summary
  email:
  - rds45@cam.ac.uk