transfuse 0.4.3 → 0.4.5

checksums.yaml CHANGED
@@ -1,7 +1,7 @@
  ---
  SHA1:
- metadata.gz: ec08ab7f1e09b7ab11ed3aa80f48650c4cebd13c
- data.tar.gz: f56e079bb4c91c8d1636472409d5f2dccec54a0c
+ metadata.gz: 94b55c79a713ddce1aaba5c302e3ce0d7a3dbd4a
+ data.tar.gz: 4472cc10a3a5a5e29e11bc22ad3d17b07fb7c332
  SHA512:
- metadata.gz: 0bc631582f019658ece91918c0b2278c24217cc8bdfe0ccd721000c0d75892c59a3d70c78d13d3f32666ea696aa2886ff3a05b8fae4ce158c7debb16daacac06
- data.tar.gz: dcfd447ef6bedba5d5a0ede16421c01c6fc230ad81c9cd4e1215593ec47f59cad25cd10e364beaeacce0b6d33ce235af22523a3644b9b20f86bca366512c1e82
+ metadata.gz: 8b61add12cd4c12db7ae2a65739c7895ea31c82b188a717f17e860a2009036a28962d53e275e67e244ef3f85e21a291ca15384df14ec5c1c51606a6893c2e25c
+ data.tar.gz: 1443f82858f58c9371f2223fb91fefa802cf809aa6802bbb84c014781641304e8c1a02d9dc6e289291e8b52569bce71438931472b18dfc009d9e9472fcc130c4
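Note: the values in checksums.yaml are hex digests of the two archives packed inside the .gem file. A minimal sketch of recomputing digests in the same form, assuming the archive bytes have already been read into memory; `archive_digests` is a hypothetical helper, not a RubyGems API:

```ruby
require 'digest'

# Hypothetical helper: returns SHA1 and SHA512 hex digests of a byte string,
# keyed the same way as the sections of checksums.yaml.
def archive_digests(bytes)
  {
    'SHA1'   => Digest::SHA1.hexdigest(bytes),
    'SHA512' => Digest::SHA512.hexdigest(bytes)
  }
end

# Demo on a short literal; for a real check, pass File.binread(path) instead.
digests = archive_digests('abc')
puts digests['SHA1'].length   # SHA1 hex digests are 40 characters long
puts digests['SHA512'].length # SHA512 hex digests are 128 characters long
```

Comparing the result for each extracted archive against the `+` lines above would confirm that a downloaded 0.4.5 gem matches this diff.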
data/README.md CHANGED
@@ -1,40 +1,55 @@
  ## Transfuse

- **Transfuse is currently in development and is not yet ready for use**
-
  Transfuse intelligently merges your multiple de novo transcriptome assemblies. Run multiple assemblies with different de novo assemblers, or different settings in the same assembler and have them combined into a single high quality transcriptome.

  Transfuse takes in the reads you used to do the assembly and a list of fasta files and produces a single output fasta file.

  ### Installation and Running

- To install Transfuse, clone this repo:
+ To install Transfuse you can get it from rubygems.com
+
+ `gem install transfuse`
+
+ or you can clone this repo:

  `git clone https://github.com/cboursnell/transfuse.git`

- Then build and install the ruby gem
+ then build and install the ruby gem

  `gem build *spec; gem install *gem`

+ Transfuse also requires `vsearch` to be installed which can be downloaded from:
+
+ `https://github.com/torognes/vsearch`
+
  ### Usage

  Transfuse is run on the command line. The options are:

  ```
- -a, --assembly=<s> assembly files in FASTA format, comma-separated
- -l, --left=<s> left reads file in FASTQ format
- -r, --right=<s> right reads file in FASTQ format
- -o, --output=<s> write merged assembly to file
- -t, --threads=<i> number of threads (default: 1)
- -v, --verbose be verbose
- -e, --version Print version and exit
- -h, --help Show this message
+ -a, --assemblies=<s> assembly files in FASTA format, comma-separated
+ -l, --left=<s> left reads file in FASTQ format
+ -r, --right=<s> right reads file in FASTQ format
+ -o, --output=<s> write merged assembly to file
+ -t, --threads=<i> number of threads (default: 1)
+ -i, --id=<f> sequence identity to cluster at (default: 1.0)
+ -v, --verbose be verbose
+ -e, --version Print version and exit
+ -h, --help Show this message
  ```

  An example command:

  `transfuse --assembly soap-k31.fa,soap-k41.fa,soap-k51.fa --left reads_1.fq --right reads_2.fq --output soap-merged.fa --threads 12`

+ ### Contributing
+
+ [![Join the chat at https://gitter.im/cboursnell/transfuse](https://badges.gitter.im/Join%20Chat.svg)](https://gitter.im/cboursnell/transfuse?utm_source=badge&utm_medium=badge&utm_campaign=pr-badge&utm_content=badge)
+
+ Tranfuse is currently in development.
+
+ If you want to suggest, and maybe implement, a new feature, please suggest it on the tracker first.
+
  ### License

  This is adademic software - please cite us if you use it in your work.
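Note: the updated option list in the README can be mirrored with a small option parser. The sketch below uses Ruby's standard OptionParser rather than Trollop, which the real script uses; `parse_args` and the sample file names are illustrative assumptions only:

```ruby
require 'optparse'

# Sketch only: reproduces the documented flags and defaults with stdlib
# OptionParser. This is not the gem's own parser.
def parse_args(argv)
  opts = { threads: 1, id: 1.0, verbose: false }
  parser = OptionParser.new do |o|
    o.on('-a', '--assemblies=FILES', String) { |v| opts[:assemblies] = v }
    o.on('-l', '--left=FILE', String)        { |v| opts[:left] = v }
    o.on('-r', '--right=FILE', String)       { |v| opts[:right] = v }
    o.on('-o', '--output=FILE', String)      { |v| opts[:output] = v }
    o.on('-t', '--threads=N', Integer)       { |v| opts[:threads] = v }
    o.on('-i', '--id=F', Float)              { |v| opts[:id] = v }
    o.on('-v', '--verbose')                  { opts[:verbose] = true }
  end
  parser.parse(argv) # returns leftover arguments; argv itself is not modified
  opts
end

sample = %w[--assemblies a.fa,b.fa --left r1.fq --right r2.fq --output merged.fa --id 0.98]
p parse_args(sample)
```

The `--id` value is the only option new to the documented list in this release; when it is omitted the default of 1.0 applies.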
data/bin/transfuse CHANGED
@@ -23,26 +23,41 @@ opts = Trollop::options do

  EOS
  opt :assemblies, "assembly files in FASTA format, comma-separated",
- :type => String, :required => true
+ :type => String
  opt :left, "left reads file in FASTQ format",
  :type => String
  opt :right, "right reads file in FASTQ format",
  :type => String
- opt :scores, "transrate contig score output files, comma-separated. Ignored if reads are provided",
- :type => String
- opt :output, "write merged assembly to file",
- :type => String, :required => :true
+ opt :output, "write merged assembly to file", :type => String
  opt :threads, "number of threads", :type => :int, :default => 1
  opt :id, "sequence identity to cluster at", :type => :float, :default => 1.0
+ # opt :install, "install dependencies"
  opt :verbose, "be verbose"
  end

  transfuse = Transfuse::Transfuse.new opts.threads, opts.verbose

- assembly_files = transfuse.check_files opts.assemblies
- score_files = transfuse.check_files opts.score if opts.score
- left = transfuse.check_files opts.left if opts.left
- right = transfuse.check_files opts.right if opts.right
+ # if opts.install
+ # transfuse.install_dependencies
+ # else
+ # missing = transfuse.check_dependencies
+ # unless missing.empty?
+ # list = missing.collect {|i| "#{i.name}:#{i.version}"}.join("\n - ")
+ # msg = "Not installed: \n - #{list}"
+ # abort msg
+ # end
+ # end
+
+ assembly_files = transfuse.check_files(opts.assemblies, "assemblies")
+ left = transfuse.check_files(opts.left, "left")
+ right = transfuse.check_files(opts.right, "right")
+ if opts.output
+ if File.exist?(opts.output)
+ abort "Output #{opts.output} already exists"
+ end
+ else
+ abort "Please specify an output with the --output option"
+ end

  if opts.scores
  # load the scores from the comma separated list of files
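Note: this hunk replaces Trollop's `:required` flags with explicit checks, so a missing or pre-existing output now aborts with a specific message. A minimal sketch of the same two checks as a function that returns the message instead of calling `abort`, which makes it easy to exercise; `output_error` is a hypothetical name:

```ruby
require 'tmpdir'

# Hypothetical helper mirroring the new output checks in bin/transfuse.
# Returns an error message, or nil when the output path is usable.
def output_error(path)
  return "Please specify an output with the --output option" if path.nil?
  return "Output #{path} already exists" if File.exist?(path)
  nil
end

Dir.mktmpdir do |dir|
  fresh = File.join(dir, 'merged.fa')
  p output_error(nil)    # missing option
  p output_error(fresh)  # path does not exist yet
  File.write(fresh, ">contig1\nACGT\n")
  p output_error(fresh)  # path now exists
end
```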
@@ -12,87 +12,81 @@ module Transfuse
  @verbose = verbose
  end

+ def count_exons list
+ exon_counts = {}
+ list.each_with_index do |hash, index|
+ seq = hash[:seq]
+ exon_count=0
+ gap_count=0
+ prev=""
+ seq.each_char do |c|
+ if c=="-"
+ base = "-"
+ else
+ base = "*"
+ end
+
+ if base!=prev
+ if c=="-"
+ gap_count+=1
+ else
+ exon_count+=1
+ end
+ end
+
+ if c=="-"
+ prev = "-"
+ else
+ prev = "*"
+ end
+ end
+ exon_counts[exon_count] ||= []
+ exon_counts[exon_count] << index
+ end
+ return exon_counts
+ end
+
  def run msa, scores, output
- return 1 if File.exist?(output)
+ preoutput = "#{File.basename(output, File.extname(output))}_cons.fa"
+ return preoutput if File.exist?(preoutput)
  print "writing consensus " if @verbose
  # msa is a hash
  # key = cluster id
  # value = list
  # list of sequences in cluster aligned with gaps
- preoutput = "#{File.basename(output, File.extname(output))}_cons.fa"
  count = 0
  File.open("#{output}.data", "w") do |out2|
  File.open(preoutput, "w") do |out|
- msa.each do |id, list|
+ msa.each do |id, seq_list|
  count+=1
  print "." if count%5_000==0 and @verbose
  exons={}
- cons = []
- length = list[0][:seq].length
- list.each_with_index do |hash, index|
- seq = hash[:seq]
- name = hash[:name]
- out2.write "#{id}\t#{scores[name][:score]}\t#{name}\n"
- prev = ""
- gap = 0
- exon = 0
- seq.each_char do |c|
- if c=="-"
- base="-"
- else
- base="*"
- end
- if base!=prev
- if c=="-"
- gap+=1
- else
- exon+=1
- end
- end
- if c=="-"
- prev = "-"
- else
- prev = "*"
- end
- end
- exons[index] = exon
- end
+ cons = {}
+ length = seq_list[0][:seq].length
+ exons = count_exons(seq_list)

- consensus = ""
- 0.upto(length-1) do |i|
- base="N"
- counts = {}
- list.each_with_index do |hash, index|
- seq = hash[:seq]
- if seq[i] != "-" and seq[i] != "N"
- counts[seq[i]]||=0
- counts[seq[i]] += 1
- if exons[index]==1
- base = seq[i]
+ exons.each do |count, list|
+ 0.upto(length-1) do |pos|
+ base = "N"
+ list.each do |index|
+ b = seq_list[index][:seq][pos]
+ if b != "-" and b != "N"
+ base = b
  end
  end
- end
- if counts.size>0
- base = counts.sort.last.first
- end
- consensus << base
- end
-
- if consensus.count("N") < consensus.length.to_f*0.5
- cons << consensus
- end
-
- list.each_with_index do |hash, index|
- if exons[index] > 1
- cons << hash[:seq].delete("-")
+ if base != "N"
+ cons[count]||=""
+ cons[count]<<base
+ end
  end
  end

- cons.each_with_index do |s,index|
+ cons.each_with_index do |s, index|
  out.write ">contig#{id}.#{index+1}\n"
- out.write "#{s}\n"
+ out.write "#{s[1]}\n"
  end

+
  end # msa.each
  end # file
  end # file open
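Note: the new consensus logic groups aligned sequences by their number of non-gap runs and builds one consensus string per group. Because `cons` is now a Hash, `each_with_index` yields `[key, value]` pairs, which is why the write changed to `s[1]`. The sketch below is a standalone reading of the added lines, not the gem's code: the helper names and the sample alignment are made up, and the regular expression is a shortcut for the character loop in `count_exons`.

```ruby
# Number of maximal non-gap runs in one aligned sequence.
def exon_runs(aligned_seq)
  aligned_seq.scan(/[^-]+/).size
end

# Group sequence indices by run count, like count_exons in the diff.
def group_by_exon_count(seq_list)
  groups = {}
  seq_list.each_with_index do |entry, index|
    (groups[exon_runs(entry[:seq])] ||= []) << index
  end
  groups
end

# Per group: for each alignment column keep the last base that is neither a
# gap nor "N"; columns where no group member has such a base are skipped.
def group_consensus(seq_list)
  length = seq_list[0][:seq].length
  cons = {}
  group_by_exon_count(seq_list).each do |count, indices|
    0.upto(length - 1) do |pos|
      base = "N"
      indices.each do |index|
        b = seq_list[index][:seq][pos]
        base = b if b != "-" && b != "N"
      end
      next if base == "N"
      cons[count] ||= String.new
      cons[count] << base
    end
  end
  cons
end

# Made-up alignment: one two-run sequence and two single-run sequences.
alignment = [
  { seq: "ACGT--ACGT" },
  { seq: "ACGTTTACGT" },
  { seq: "AG--------" }
]
group_consensus(alignment).each_with_index do |pair, index|
  puts ">contig7.#{index + 1}" # 7 stands in for a cluster id
  puts pair[1]                 # pair is [run_count, consensus_string]
end
```

In this reading no base counts are kept: where group members disagree in a column, the member with the highest index decides the base.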
@@ -18,8 +18,29 @@ module Transfuse
  @verbose = verbose
  end

- def check_files string
+ def check_dependencies
+ # Check dependencies if they are relevant to the command issued,
+ # and handle any commands to install missing ones
+ gem_dir = Gem.loaded_specs['transfuse'].full_gem_path
+ gem_deps = File.join(gem_dir, 'deps', 'deps.yaml')
+
+ return Bindeps.missing gem_deps
+
+ end # check_dependencies
+
+ def install_dependencies
+ # Check dependencies if they are relevant to the command issued,
+ # and handle any commands to install missing ones
+ gem_dir = Gem.loaded_specs['transfuse'].full_gem_path
+ gem_deps = File.join(gem_dir, 'deps', 'deps.yaml')
+
+ Bindeps.require gem_deps
+
+ end # check_dependencies
+
+ def check_files string, option
  # puts "check file string: #{string}" if @verbose
+ abort "Please specify --#{option} option" if string.nil?
  list = []
  string.split(",").each do |file|
  file = File.expand_path(file)
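Note: `check_files` now takes the option name so that a missing argument produces a targeted message. A minimal sketch of that guard plus the comma split and path expansion visible in this hunk; `expand_file_list` is a hypothetical stand-in that raises instead of calling `abort`, and it omits whatever the rest of the real method does, since the diff does not show it:

```ruby
# Hypothetical stand-in for check_files. Only the nil guard, the comma split
# and the path expansion from the hunk are mirrored here.
def expand_file_list(string, option)
  raise ArgumentError, "Please specify --#{option} option" if string.nil?
  string.split(",").map { |file| File.expand_path(file) }
end

p expand_file_list("assembly1.fasta,assembly2.fasta", "assemblies")

begin
  expand_file_list(nil, "left")
rescue ArgumentError => e
  puts e.message
end
```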
@@ -205,8 +226,8 @@ module Transfuse

  end
  File.open("summary.txt","w") do |out|
- out.write "fasta\tscore\toptimal\n"
- out.write "#{fasta}\t#{transrater.assembly_score}\t#{transrater.assembly_optimal_score("prefix")}\n"
+ out.write "fasta\tscore\toptimal\tcutoff\n"
+ out.write "#{fasta}\t#{transrater.assembly_score}\t#{transrater.assembly_optimal_score("prefix").join("\t")}\n"
  end
  end
  end
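Note: the added `join("\t")` and the extra `cutoff` header suggest that `assembly_optimal_score` returns two values. The sketch below assumes a two-element array of optimal score and cutoff; that return shape is an inference from this hunk, and the numbers are invented for illustration:

```ruby
# Assumption: optimal_and_cutoff is [optimal_score, cutoff], inferred from
# the join("\t") call and the new "cutoff" column in the diff.
def summary_lines(fasta, score, optimal_and_cutoff)
  header = "fasta\tscore\toptimal\tcutoff\n"
  row = "#{fasta}\t#{score}\t#{optimal_and_cutoff.join("\t")}\n"
  header + row
end

# Invented values, only to show the four-column layout.
print summary_lines("merged.fa", 0.31, [0.42, 0.12])
```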
@@ -8,7 +8,7 @@ module Transfuse
  module VERSION
  MAJOR = 0
  MINOR = 4
- PATCH = 3
+ PATCH = 5
  BUILD = nil

  STRING = [MAJOR, MINOR, PATCH, BUILD].compact.join('.')
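Note: the version string is assembled from the four constants, with `compact` dropping the `nil` build component. A self-contained sketch of the same expression, wrapped in a differently named module so it does not clash with the gem:

```ruby
# Same expression as in the hunk; VersionSketch is an illustrative name.
module VersionSketch
  MAJOR = 0
  MINOR = 4
  PATCH = 5
  BUILD = nil

  # compact removes the nil BUILD before the parts are joined with dots.
  STRING = [MAJOR, MINOR, PATCH, BUILD].compact.join('.')
end

puts VersionSketch::STRING
```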
data/test/test_cluster.rb CHANGED
@@ -8,7 +8,7 @@ class TestCluster < Test::Unit::TestCase
  context 'cluster' do

  setup do
- @cluster = Transfuse::Cluster.new 4
+ @cluster = Transfuse::Cluster.new 4, true, 1.0
  end

  teardown do
@@ -18,14 +18,11 @@ class TestCluster < Test::Unit::TestCase
  assert @cluster
  end

- should 'generate cd-hit command' do
- cmd = @cluster.generate_cdhit_command "assembly1.fasta", "output.fa"
- end
-
  should 'generate vsearch command' do
- output = @cluster.generate_vsearch_command "assembly1.fasta", "output.txt"
- a = "vsearch --cluster_fast assembly1.fasta --id 1.00 "
- a << "--strand both --uc output.txt --threads 4"
+ output = @cluster.generate_vsearch_command "assembly1.fasta", "output.txt", "output.msa"
+ a = "vsearch --cluster_fast assembly1.fasta --id 1.0 "
+ a << "--iddef 0 --qmask none --strand both --uc output.txt "
+ a << "--msaout output.msa --threads 4"
  b = output.split(" ")
  b[0] = File.basename(b[0])
  output = b.join(" ")
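Note: the updated test pins down the command string that the cluster step is expected to produce, including the new `--iddef`, `--qmask` and `--msaout` flags. The sketch below rebuilds that string from the test's expectation only; `vsearch_command` is a hypothetical function, the binary path is a placeholder, and the real `generate_vsearch_command` is not shown in this diff:

```ruby
# Hypothetical reconstruction based on the expected string in the test.
def vsearch_command(vsearch, assembly, uc_out, msa_out, id:, threads:)
  cmd = "#{vsearch} --cluster_fast #{assembly} --id #{id} "
  cmd << "--iddef 0 --qmask none --strand both --uc #{uc_out} "
  cmd << "--msaout #{msa_out} --threads #{threads}"
  cmd
end

# The test strips the directory from the binary path before comparing,
# so the same normalisation is applied here.
def normalise(command)
  parts = command.split(" ")
  parts[0] = File.basename(parts[0])
  parts.join(" ")
end

cmd = vsearch_command("/opt/tools/vsearch", "assembly1.fasta",
                      "output.txt", "output.msa", id: 1.0, threads: 4)
puts normalise(cmd)
```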
@@ -18,7 +18,7 @@ class TestTransfuse < Test::Unit::TestCase
  list = []
  list << File.join(File.dirname(__FILE__), 'data', 'assembly1.fasta')
  list << File.join(File.dirname(__FILE__), 'data', 'assembly2.fasta')
- files = @fuser.check_files list.join(",")
+ files = @fuser.check_files(list.join(","), "option")
  assert_equal 2, files.length, "length"
  end

@@ -41,7 +41,7 @@ class TestTransfuse < Test::Unit::TestCase
  tmpdir = Dir.mktmpdir
  Dir.chdir(tmpdir) do
  file = File.join(File.dirname(__FILE__), 'data', 'assembly1.fasta')
- hash = @fuser.cluster file
+ hash = @fuser.cluster(file, 1.0)
  assert_equal 250, hash.size, "output size"
  end
  # end
data/transfuse.gemspec CHANGED
@@ -15,15 +15,16 @@ Gem::Specification.new do |gem|
  gem.homepage = 'https://github.com/cboursnell/transfuse'
  gem.license = 'MIT'

- gem.add_dependency 'trollop', '~> 2.0'
- gem.add_dependency 'bio', '~> 1.4', '>= 1.4.3'
+ gem.add_dependency 'trollop', '~> 2.1', '>= 2.1.2'
+ gem.add_dependency 'bio', '~> 1.5', '>= 1.5.0'
  gem.add_dependency 'fixwhich', '~> 1.0', '>= 1.0.2'
- gem.add_dependency 'bindeps', '~> 1.0', '>= 1.0.1'
+ gem.add_dependency 'bindeps', '~> 1.2', '>= 1.2.0'
  gem.add_dependency 'transrate', '~> 1.0', '>= 1.0.1'
+ gem.add_dependency 'bundler', '~> 1.10', '>= 1.10.6'

- gem.add_development_dependency 'rake', '~> 10.3', '>= 10.3.2'
+ gem.add_development_dependency 'rake', '~> 10.4', '>= 10.4.2'
  gem.add_development_dependency 'turn', '~> 0.9', '>= 0.9.7'
- gem.add_development_dependency 'simplecov', '~> 0.8', '>= 0.8.2'
+ gem.add_development_dependency 'simplecov', '~> 0.10', '>= 0.10.0'
  gem.add_development_dependency 'shoulda-context', '~> 1.2', '>= 1.2.1'
- gem.add_development_dependency 'coveralls', '~> 0.7'
+ gem.add_development_dependency 'coveralls', '~> 0.8', '>= 0.8.2'
  end
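Note: each tightened dependency combines a pessimistic constraint (`~>`) with a minimum version. A short sketch using RubyGems' own `Gem::Requirement` to show which versions the new trollop and bundler constraints accept; the candidate version numbers are arbitrary examples:

```ruby
require 'rubygems'

# Constraints copied from the "+" lines of the gemspec hunk.
trollop = Gem::Requirement.new('~> 2.1', '>= 2.1.2')
bundler = Gem::Requirement.new('~> 1.10', '>= 1.10.6')

# Arbitrary candidate versions, chosen to fall on both sides of each bound.
%w[2.0.9 2.1.1 2.1.2 2.9.0 3.0.0].each do |v|
  puts "trollop #{v}: #{trollop.satisfied_by?(Gem::Version.new(v))}"
end
%w[1.10.5 1.10.6 1.17.3 2.0.0].each do |v|
  puts "bundler #{v}: #{bundler.satisfied_by?(Gem::Version.new(v))}"
end
```

Since `~> 1.10` excludes 2.0 and later, the new runtime dependency on bundler cannot be satisfied by a Bundler 2.x release.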
metadata CHANGED
@@ -1,7 +1,7 @@
  --- !ruby/object:Gem::Specification
  name: transfuse
  version: !ruby/object:Gem::Version
- version: 0.4.3
+ version: 0.4.5
  platform: ruby
  authors:
  - Richard Smith-Unna
@@ -17,34 +17,40 @@ dependencies:
  requirements:
  - - "~>"
  - !ruby/object:Gem::Version
- version: '2.0'
+ version: '2.1'
+ - - ">="
+ - !ruby/object:Gem::Version
+ version: 2.1.2
  type: :runtime
  prerelease: false
  version_requirements: !ruby/object:Gem::Requirement
  requirements:
  - - "~>"
  - !ruby/object:Gem::Version
- version: '2.0'
+ version: '2.1'
+ - - ">="
+ - !ruby/object:Gem::Version
+ version: 2.1.2
  - !ruby/object:Gem::Dependency
  name: bio
  requirement: !ruby/object:Gem::Requirement
  requirements:
  - - "~>"
  - !ruby/object:Gem::Version
- version: '1.4'
+ version: '1.5'
  - - ">="
  - !ruby/object:Gem::Version
- version: 1.4.3
+ version: 1.5.0
  type: :runtime
  prerelease: false
  version_requirements: !ruby/object:Gem::Requirement
  requirements:
  - - "~>"
  - !ruby/object:Gem::Version
- version: '1.4'
+ version: '1.5'
  - - ">="
  - !ruby/object:Gem::Version
- version: 1.4.3
+ version: 1.5.0
  - !ruby/object:Gem::Dependency
  name: fixwhich
  requirement: !ruby/object:Gem::Requirement
@@ -71,20 +77,20 @@ dependencies:
  requirements:
  - - "~>"
  - !ruby/object:Gem::Version
- version: '1.0'
+ version: '1.2'
  - - ">="
  - !ruby/object:Gem::Version
- version: 1.0.1
+ version: 1.2.0
  type: :runtime
  prerelease: false
  version_requirements: !ruby/object:Gem::Requirement
  requirements:
  - - "~>"
  - !ruby/object:Gem::Version
- version: '1.0'
+ version: '1.2'
  - - ">="
  - !ruby/object:Gem::Version
- version: 1.0.1
+ version: 1.2.0
  - !ruby/object:Gem::Dependency
  name: transrate
  requirement: !ruby/object:Gem::Requirement
@@ -105,26 +111,46 @@ dependencies:
  - - ">="
  - !ruby/object:Gem::Version
  version: 1.0.1
+ - !ruby/object:Gem::Dependency
+ name: bundler
+ requirement: !ruby/object:Gem::Requirement
+ requirements:
+ - - "~>"
+ - !ruby/object:Gem::Version
+ version: '1.10'
+ - - ">="
+ - !ruby/object:Gem::Version
+ version: 1.10.6
+ type: :runtime
+ prerelease: false
+ version_requirements: !ruby/object:Gem::Requirement
+ requirements:
+ - - "~>"
+ - !ruby/object:Gem::Version
+ version: '1.10'
+ - - ">="
+ - !ruby/object:Gem::Version
+ version: 1.10.6
  - !ruby/object:Gem::Dependency
  name: rake
  requirement: !ruby/object:Gem::Requirement
  requirements:
  - - "~>"
  - !ruby/object:Gem::Version
- version: '10.3'
+ version: '10.4'
  - - ">="
  - !ruby/object:Gem::Version
- version: 10.3.2
+ version: 10.4.2
  type: :development
  prerelease: false
  version_requirements: !ruby/object:Gem::Requirement
  requirements:
  - - "~>"
  - !ruby/object:Gem::Version
- version: '10.3'
+ version: '10.4'
  - - ">="
  - !ruby/object:Gem::Version
- version: 10.3.2
+ version: 10.4.2
  - !ruby/object:Gem::Dependency
  name: turn
  requirement: !ruby/object:Gem::Requirement
@@ -151,20 +177,20 @@ dependencies:
  requirements:
  - - "~>"
  - !ruby/object:Gem::Version
- version: '0.8'
+ version: '0.10'
  - - ">="
  - !ruby/object:Gem::Version
- version: 0.8.2
+ version: 0.10.0
  type: :development
  prerelease: false
  version_requirements: !ruby/object:Gem::Requirement
  requirements:
  - - "~>"
  - !ruby/object:Gem::Version
- version: '0.8'
+ version: '0.10'
  - - ">="
  - !ruby/object:Gem::Version
- version: 0.8.2
+ version: 0.10.0
  - !ruby/object:Gem::Dependency
  name: shoulda-context
  requirement: !ruby/object:Gem::Requirement
@@ -191,14 +217,20 @@ dependencies:
  requirements:
  - - "~>"
  - !ruby/object:Gem::Version
- version: '0.7'
+ version: '0.8'
+ - - ">="
+ - !ruby/object:Gem::Version
+ version: 0.8.2
  type: :development
  prerelease: false
  version_requirements: !ruby/object:Gem::Requirement
  requirements:
  - - "~>"
  - !ruby/object:Gem::Version
- version: '0.7'
+ version: '0.8'
+ - - ">="
+ - !ruby/object:Gem::Version
+ version: 0.8.2
  description: See summary
  email:
  - rds45@cam.ac.uk