bio-alignment 0.0.7 → 0.0.8
- data/README.md +72 -15
- data/TODO +2 -0
- data/VERSION +1 -1
- data/bin/bio-alignment +120 -42
- data/features/phylogeny/tree-feature.rb +5 -0
- data/features/phylogeny/tree.feature +1 -0
- data/features/support/env.rb +19 -0
- data/lib/bio-alignment.rb +5 -0
- data/lib/bio-alignment/alignment.rb +13 -5
- data/lib/bio-alignment/bioruby.rb +2 -2
- data/lib/bio-alignment/codonsequence.rb +18 -3
- data/lib/bio-alignment/coerce.rb +45 -0
- data/lib/bio-alignment/columns.rb +14 -4
- data/lib/bio-alignment/edit/del_non_informative_sequences.rb +6 -2
- data/lib/bio-alignment/edit/edit_rows.rb +15 -9
- data/lib/bio-alignment/edit/mask_islands.rb +41 -40
- data/lib/bio-alignment/elements.rb +10 -1
- data/lib/bio-alignment/format/fasta.rb +17 -0
- data/lib/bio-alignment/format/phylip.rb +25 -0
- data/lib/bio-alignment/format/text.rb +18 -0
- data/lib/bio-alignment/rows.rb +1 -0
- data/lib/bio-alignment/sequence.rb +5 -0
- data/lib/bio-alignment/tree.rb +16 -5
- data/spec/bio-alignment_spec.rb +15 -0
- metadata +67 -20
data/README.md
CHANGED
@@ -19,6 +19,8 @@ Features are:
|
|
19
19
|
* Support for BioRuby trees and node distance calculation
|
20
20
|
* bio-alignment interacts well with BioRuby structures,
|
21
21
|
including sequence objects and alignment/tree parsers
|
22
|
+
* Support for textual and HTML output of MSA (planned)
|
23
|
+
* Support for Clayton's MAF parser (planned)
|
22
24
|
|
23
25
|
When possible, BioRuby functionality is merged in. For example, by
|
24
26
|
supporting Bio::Sequence objects, standard BioRuby alignment
|
@@ -34,7 +36,39 @@ Bio::BioAlignment
|
|
34
36
|
document](https://github.com/pjotrp/bioruby-alignment/blob/master/doc/bio-alignment-design.md)
|
35
37
|
for Ruby.
|
36
38
|
|
37
|
-
##
|
39
|
+
## Command line
|
40
|
+
|
41
|
+
bio-alignment comes with a command line interface (CLI), which can apply a number
|
42
|
+
of editing functions on an alignment, and generate textual and HTML
|
43
|
+
output. Note that the CLI does not cover the full library. The CLI can be useful
|
44
|
+
for non-Rubyists, pipeline setups, and simply as examples
|
45
|
+
|
46
|
+
Remove bridges (columns with mostly gaps) from an alignment
|
47
|
+
|
48
|
+
bio-alignment aa-alignment.fa --type aminoacid --edit bridges
|
49
|
+
|
50
|
+
Mask islands (short misaligned 'floating' parts in a sequence)
|
51
|
+
|
52
|
+
coming soon...
|
53
|
+
|
54
|
+
Mask serial mutations
|
55
|
+
|
56
|
+
coming soon...
|
57
|
+
|
58
|
+
Remove all sequences consisting of mostly gaps (30% informative) and output to FASTA
|
59
|
+
|
60
|
+
bio-alignment codon-alignment.fa --type codon --edit info --out fasta
|
61
|
+
|
62
|
+
or output codon style
|
63
|
+
|
64
|
+
bio-alignment codon-alignment.fa --type codon --edit info --style codon
|
65
|
+
|
66
|
+
Remove all sequences containing gaps from an alignment (why would you
|
67
|
+
want to do that?)
|
68
|
+
|
69
|
+
bio-alignment codon-alignment.fa --type codon --edit info --perc 100 --out fasta
|
70
|
+
|
71
|
+
## Section for developers
|
38
72
|
|
39
73
|
### Codon alignment example
|
40
74
|
|
@@ -50,7 +84,7 @@ aligmment (note codon gaps are represented by '---')
|
|
50
84
|
aln = Alignment.new
|
51
85
|
fasta = FastaReader.new('codon-alignment.fa')
|
52
86
|
fasta.each do | rec |
|
53
|
-
aln
|
87
|
+
aln << CodonSequence.new(rec.id, rec.seq)
|
54
88
|
end
|
55
89
|
# write a matching amino acid alignment
|
56
90
|
fasta = FastaWriter.new('aa-aln.fa')
|
@@ -106,18 +140,35 @@ BioAlignment supports adding BioRuby's Bio::Sequence objects:
|
|
106
140
|
include Bio::BioAlignment
|
107
141
|
|
108
142
|
aln = Alignment.new
|
109
|
-
aln
|
110
|
-
aln
|
143
|
+
aln << Bio::Sequence::NA.new("atgcatgcaaaa")
|
144
|
+
aln << Bio::Sequence::NA.new("atg---tcaaaa")
|
145
|
+
```
|
146
|
+
|
147
|
+
or use BioRuby's flat file reader
|
148
|
+
|
149
|
+
```ruby
|
150
|
+
aln = Alignment.new
|
151
|
+
Bio::FlatFile.auto(filename).each_entry do |entry|
|
152
|
+
aln << entry
|
153
|
+
end
|
111
154
|
```
|
112
155
|
|
113
|
-
and we can transform BioAlignment into BioRuby's
|
114
|
-
use BioRuby functions
|
156
|
+
and, the other way, we can transform BioAlignment into BioRuby's
|
157
|
+
Bio::Alignment and use BioRuby functions
|
115
158
|
|
116
159
|
```ruby
|
117
160
|
bioruby_aln = aln.to_bioruby_alignment
|
118
161
|
bioruby_aln.consensus_iupac
|
119
162
|
```
|
120
163
|
|
164
|
+
Note that native BioRuby objects may not always work. In the first
|
165
|
+
case, using Bio::Sequence::NA, no ID is passed in, so each sequence is
|
166
|
+
labeled 'id?'. In the second case BioRuby's FlatFile returns a
|
167
|
+
FastaFormat object, this time with ID, but FastaFormat does not
|
168
|
+
support indexing. In general, it is recommended to stay with the
|
169
|
+
bio-alignment Sequence classes (or roll your own, as long as they are
|
170
|
+
Enumerable).
|
171
|
+
|
121
172
|
### Pal2nal
|
122
173
|
|
123
174
|
A protein (amino acid) to nucleotide alignment would first load
|
@@ -132,7 +183,7 @@ the sequences
|
|
132
183
|
aln2 = Alignment.new
|
133
184
|
fasta2 = FastaReader.new('nt.fa')
|
134
185
|
fasta2.each do | rec |
|
135
|
-
aln2
|
186
|
+
aln2 << Sequence.new(rec.id, rec.seq)
|
136
187
|
end
|
137
188
|
```
|
138
189
|
|
@@ -174,15 +225,17 @@ BioAlignment has support for attaching a phylogenetic tree to an
|
|
174
225
|
alignment, and traversing the tree using an intuitive interface
|
175
226
|
|
176
227
|
```ruby
|
177
|
-
|
178
|
-
tree = aln.attach_tree(
|
179
|
-
# now do stuff with the tree, which has improved bio-
|
228
|
+
newick_tree = Bio::Newick.new(string).tree # use BioRuby's tree parser
|
229
|
+
tree = aln.attach_tree(newick_tree) # attach the tree
|
230
|
+
# now do stuff with the tree, which has improved bio-alignment support
|
180
231
|
root = tree.root
|
181
232
|
children = root.children
|
182
233
|
children.map { |n| n.name }.sort.should == ["","seq7"]
|
183
234
|
seq7 = children.last
|
184
235
|
seq4 = tree.find("seq4")
|
185
236
|
seq4.distance(seq7).should == 19.387756600000003
|
237
|
+
# find the sequence in the alignment belonging to the node
|
238
|
+
print seq4.sequence
|
186
239
|
print tree.output_newick # BioRuby Newick output
|
187
240
|
```
|
188
241
|
|
@@ -201,13 +254,14 @@ programming. Primitives are provided which take out much of the
|
|
201
254
|
plumbing, such as maintaining row/column/element state, and allow
|
202
255
|
copy-on-edit (so no conflicts arise in concurrent code). For example,
|
203
256
|
to walk an alignment by row, and update the row state, you can mark
|
204
|
-
all rows
|
257
|
+
all rows (sequences) which contain many gaps for deletion
|
205
258
|
|
206
259
|
```ruby
|
207
260
|
include MarkRows
|
208
261
|
mark_rows { |rowstate,row| # for every row/sequence
|
209
262
|
num = row.count { |e| e.gap? }
|
210
263
|
if (num.to_f/row.length) > 0.5
|
264
|
+
# this row in the alignment consists mostly of gaps
|
211
265
|
rowstate.delete! # mark row for deletion
|
212
266
|
end
|
213
267
|
rowstate # returns the updated row state
|
@@ -225,9 +279,9 @@ The general idea is that there are many potential ways of selecting
|
|
225
279
|
rows, and changing some state. The 'mark_rows' function/iterator takes
|
226
280
|
care of the plumbing. All the programmer needs to do is to set the
|
227
281
|
criterion, in this case a gap percentage, and tell the library what
|
228
|
-
state has to change. In this example we only access one row
|
229
|
-
can also access the other rows. You won't be surprised that
|
230
|
-
columns looks much the same
|
282
|
+
state has to change. In this example we only access one row at a time,
|
283
|
+
but you can also access the other rows. You won't be surprised that
|
284
|
+
marking columns looks much the same
|
231
285
|
|
232
286
|
```ruby
|
233
287
|
include MarkColumns
|
@@ -262,7 +316,7 @@ and, here we remove every marked element by turning it into a gap
|
|
262
316
|
the old with the new.
|
263
317
|
|
264
318
|
It is important to note that, instead of directly editing alignments
|
265
|
-
in place,
|
319
|
+
in place, bio-alignment always makes it a two step process. First items
|
266
320
|
are masked/marked through the state of the rows/columns/elements, next
|
267
321
|
the alignment is rewritten using this state. The advantage of using an
|
268
322
|
intermediate state is that the state can be queried for creating (for
|
@@ -286,6 +340,9 @@ An edit feature is added at runtime(!) Example:
|
|
286
340
|
|
287
341
|
where aln2 is a copy of aln with bridging columns deleted.
|
288
342
|
|
343
|
+
More examples can be found in the features/edit directory of the
|
344
|
+
source.
|
345
|
+
|
289
346
|
### See also
|
290
347
|
|
291
348
|
For more on the design of bio-alignment, read the
|
data/TODO
ADDED
data/VERSION
CHANGED
@@ -1 +1 @@
-0.0.
+0.0.8
data/bin/bio-alignment
CHANGED
@@ -1,12 +1,22 @@
|
|
1
1
|
#!/usr/bin/env ruby
|
2
2
|
#
|
3
3
|
# BioRuby bio-alignment Plugin
|
4
|
-
# Version 0.0.0
|
5
4
|
# Author:: Pjotr Prins
|
6
5
|
# Copyright:: 2012
|
7
6
|
# License:: The Ruby License
|
8
7
|
|
9
|
-
|
8
|
+
rootpath = File.dirname(File.dirname(__FILE__))
|
9
|
+
$: << File.join(rootpath,'lib')
|
10
|
+
|
11
|
+
_VERSION = File.new(File.join(rootpath,'VERSION')).read.chomp
|
12
|
+
|
13
|
+
$stderr.print "bio-alignment "+_VERSION+" Copyright (C) 2012 Pjotr Prins <pjotr.prins@thebird.nl>\n\n"
|
14
|
+
|
15
|
+
USAGE =<<EOU
|
16
|
+
|
17
|
+
bio-alignment transforms alignments
|
18
|
+
|
19
|
+
EOU
|
10
20
|
|
11
21
|
if ARGV.size == 0
|
12
22
|
print USAGE
|
@@ -14,51 +24,63 @@ end
|
|
14
24
|
|
15
25
|
require 'bio-alignment'
|
16
26
|
require 'optparse'
|
27
|
+
include Bio::BioAlignment
|
17
28
|
|
18
|
-
|
19
|
-
# require 'bio-logger'
|
20
|
-
# Bio::Log::CLI.logger('stderr')
|
21
|
-
# Bio::Log::CLI.trace('info')
|
29
|
+
log = Bio::Log::LoggerPlus.new 'bio-alignment'
|
22
30
|
|
23
|
-
|
31
|
+
Bio::Log::CLI.logger('stderr')
|
32
|
+
Bio::Log::CLI.trace('info')
|
33
|
+
|
34
|
+
options = {show_help: false}
|
35
|
+
options[:show_help] = true if ARGV.size == 0
|
24
36
|
opts = OptionParser.new do |o|
|
25
|
-
o.banner = "Usage: #{File.basename($0)} [options]
|
37
|
+
o.banner = "Usage: #{File.basename($0)} [options] filename\n\n"
|
26
38
|
|
27
|
-
o.on('--
|
28
|
-
|
29
|
-
options[:example_parameter] = 'this is a parameter'
|
39
|
+
o.on('--type codon|nucleotide|aminoacid', [:codon,:nucleotide,:aminoacid], 'Type of sequence data (default auto)') do |type|
|
40
|
+
options[:type] = type.to_sym
|
30
41
|
end
|
31
|
-
|
32
|
-
o.
|
33
|
-
|
34
|
-
# TODO: your logic here, below an example
|
35
|
-
self[:example_switch] = true
|
42
|
+
|
43
|
+
o.on('--edit bridges|islands|info', [:bridges,:islands,:info], 'Apply edit function') do |edit|
|
44
|
+
options[:edit] = edit.to_sym
|
36
45
|
end
|
37
46
|
|
38
|
-
|
39
|
-
|
40
|
-
|
41
|
-
|
42
|
-
|
43
|
-
|
44
|
-
|
45
|
-
|
46
|
-
|
47
|
-
|
48
|
-
|
49
|
-
|
50
|
-
|
51
|
-
|
52
|
-
|
53
|
-
|
54
|
-
|
55
|
-
|
56
|
-
|
57
|
-
|
58
|
-
|
47
|
+
o.on('--perc value', Integer, 'Percentage') do |v|
|
48
|
+
options[:perc] = v
|
49
|
+
end
|
50
|
+
|
51
|
+
o.on('--out fasta', [:fasta], 'Output format') do |format|
|
52
|
+
options[:out] = format.to_sym
|
53
|
+
end
|
54
|
+
|
55
|
+
o.on('--style codon', [:codon], 'Output style') do |style|
|
56
|
+
options[:style] = style.to_sym
|
57
|
+
end
|
58
|
+
|
59
|
+
o.separator ""
|
60
|
+
|
61
|
+
o.on("--logger filename",String,"Log to file (default stderr)") do | name |
|
62
|
+
Bio::Log::CLI.logger(name)
|
63
|
+
end
|
64
|
+
|
65
|
+
o.on("--trace options",String,"Set log level (default INFO, see bio-logger)") do | s |
|
66
|
+
Bio::Log::CLI.trace(s)
|
67
|
+
end
|
68
|
+
|
69
|
+
o.on("-q", "--quiet", "Run quietly") do |q|
|
70
|
+
Bio::Log::CLI.trace('error')
|
71
|
+
end
|
72
|
+
|
73
|
+
o.on("-v", "--verbose", "Run verbosely") do |v|
|
74
|
+
Bio::Log::CLI.trace('info')
|
75
|
+
end
|
76
|
+
|
77
|
+
o.on("--debug", "Show debug messages") do |v|
|
78
|
+
Bio::Log::CLI.trace('debug')
|
79
|
+
end
|
59
80
|
|
60
81
|
o.separator ""
|
61
|
-
|
82
|
+
|
83
|
+
o.on_tail('-h', '--help', 'Display this help and exit') do
|
62
84
|
options[:show_help] = true
|
63
85
|
end
|
64
86
|
end
|
@@ -66,11 +88,67 @@ end
|
|
66
88
|
begin
|
67
89
|
opts.parse!(ARGV)
|
68
90
|
|
69
|
-
|
70
|
-
|
91
|
+
if options[:show_help]
|
92
|
+
print opts
|
93
|
+
print USAGE
|
94
|
+
end
|
71
95
|
|
72
|
-
# TODO: your code here
|
73
|
-
# use options for your logic
|
74
96
|
rescue OptionParser::InvalidOption => e
|
75
97
|
options[:invalid_argument] = e.message
|
76
98
|
end
|
99
|
+
|
100
|
+
Bio::Log::CLI.configure('bio-alignment')
|
101
|
+
logger = Bio::Log::LoggerPlus['bio-alignment']
|
102
|
+
logger.info [options, ARGV]
|
103
|
+
|
104
|
+
ARGV.each do |fn|
|
105
|
+
aln = Alignment.new
|
106
|
+
Bio::FlatFile.auto(fn).each_entry do |entry|
|
107
|
+
case options[:type]
|
108
|
+
when :codon
|
109
|
+
aln << CodonSequence.new(entry.entry_id,entry.seq)
|
110
|
+
when :nucleotide
|
111
|
+
aln << Sequence.new(entry.entry_id,entry.seq)
|
112
|
+
when :aminoacid
|
113
|
+
aln << Sequence.new(entry.entry_id,entry.seq)
|
114
|
+
else
|
115
|
+
# auto uses BioRuby sequence type
|
116
|
+
logger.warn "Using native type, if you encounter a problem, set the --type explicitly"
|
117
|
+
aln << entry
|
118
|
+
end
|
119
|
+
end
|
120
|
+
case options[:edit]
|
121
|
+
when :bridges
|
122
|
+
logger.info "Apply delete bridges"
|
123
|
+
require 'bio-alignment/edit/del_bridges'
|
124
|
+
aln.extend(DelBridges)
|
125
|
+
aln2 = aln.del_bridges
|
126
|
+
aln = aln2
|
127
|
+
when :islands
|
128
|
+
logger.info "Apply mask islands filter"
|
129
|
+
require 'bio-alignment/edit/mask_islands'
|
130
|
+
aln.extend(MaskIslands)
|
131
|
+
marked_aln = aln.mark_islands
|
132
|
+
aln2 = marked_aln.update_each_element { |e| (e.state.masked? ? Element.new("X"):e)}
|
133
|
+
aln = aln2
|
134
|
+
when :info
|
135
|
+
logger.info "Apply sequence information filter"
|
136
|
+
require 'bio-alignment/edit/del_non_informative_sequences'
|
137
|
+
aln.extend(DelNonInformativeSequences)
|
138
|
+
aln.each { |seq| seq.extend(State) }
|
139
|
+
aln2 = aln.del_non_informative_sequences(options[:perc])
|
140
|
+
aln = aln2
|
141
|
+
else
|
142
|
+
# do nothing
|
143
|
+
end
|
144
|
+
case options[:out]
|
145
|
+
when :fasta
|
146
|
+
aln.each do | seq |
|
147
|
+
print FastaOutput::to_fasta(seq)
|
148
|
+
end
|
149
|
+
else
|
150
|
+
aln.each do | seq |
|
151
|
+
print TextOutput::to_text(seq,options[:style])
|
152
|
+
end
|
153
|
+
end
|
154
|
+
end
|
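The option handling in the hunk above relies on OptionParser's enumerated argument lists and type conversion. The following is a minimal sketch of that pattern using only Ruby's standard `optparse`; the `parse_cli` helper name is hypothetical and not part of bio-alignment.

```ruby
require 'optparse'

# Hypothetical helper, for illustration only: parses a bio-alignment-like
# argument list and returns the options hash plus the remaining file names.
def parse_cli(argv)
  options = {}
  parser = OptionParser.new do |o|
    # An Array of symbols restricts the accepted values; the block
    # receives the matching symbol.
    o.on('--type TYPE', [:codon, :nucleotide, :aminoacid], 'Type of sequence data') do |type|
      options[:type] = type
    end
    # Integer makes OptionParser validate and convert the value.
    o.on('--perc VALUE', Integer, 'Percentage') do |v|
      options[:perc] = v
    end
  end
  rest = parser.parse(argv) # non-destructive; returns the non-option arguments
  [options, rest]
end

options, files = parse_cli(%w[--type codon --perc 100 aln.fa])
p options
p files
```

As far as the standard library goes, a value outside the list raises `OptionParser::InvalidArgument`, which is a different class from the `OptionParser::InvalidOption` rescued in the script; rescuing `OptionParser::ParseError` would cover both.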
@@ -78,6 +78,11 @@ Then /^find that "([^"]*)" is on the same branch as "([^"]*)"$/ do |arg1, arg2|
|
|
78
78
|
seq.nearest.map{|n|n.to_s}.sort.join(',').should == arg2
|
79
79
|
end
|
80
80
|
|
81
|
+
Then /^find that the alignment sequence matching tree node "(.*?)" is "(.*?)"$/ do |arg1, arg2|
|
82
|
+
tree = @aln.attach_tree(@tree)
|
83
|
+
node = tree.find(arg1)
|
84
|
+
node.sequence.to_s.should == arg2
|
85
|
+
end
|
81
86
|
|
82
87
|
Then /^draw the MSA with the tree$/ do | string |
|
83
88
|
# textual drawing, like tabtree, or http://blade.nagaokaut.ac.jp/cgi-bin/scat.rb/ruby/ruby-talk/149701
|
@@ -28,6 +28,7 @@ Feature: Tree support for alignments
|
|
28
28
|
And find that the nearest sequence to "seq1" is "seq2,seq3"
|
29
29
|
And find that "seq1" is on the same branch as "seq2,seq3"
|
30
30
|
And find that "seq4" is on the same branch as "seq1,seq2,seq3,seq5,seq8"
|
31
|
+
And find that the alignment sequence matching tree node "seq4" is "----PKLFSRPTIIFSGCSTACSGK--SEPVCGFRSFMLSDV"
|
31
32
|
And draw the MSA with the tree
|
32
33
|
"""
|
33
34
|
,--9.69----------------------------------------- seq7 ----------PTIIFSGCSKACSGK-----VCGIFHAVRSFM
|
@@ -0,0 +1,19 @@
+require 'bundler'
+begin
+  Bundler.setup(:default, :development)
+rescue Bundler::BundlerError => e
+  $stderr.puts e.message
+  $stderr.puts "Run `bundle install` to install missing gems"
+  exit e.status_code
+end
+
+$LOAD_PATH.unshift(File.dirname(__FILE__) + '/../../lib')
+require 'bio-alignment'
+
+require 'rspec/expectations'
+
+log = Bio::Log::LoggerPlus.new 'bio-alignment'
+
+Bio::Log::CLI.logger('stderr')
+Bio::Log::CLI.trace('info')
+
data/lib/bio-alignment.rb
CHANGED
@@ -2,9 +2,14 @@
|
|
2
2
|
# bioruby directory tree.
|
3
3
|
#
|
4
4
|
|
5
|
+
require 'bio-logger'
|
6
|
+
require 'bio-alignment/coerce'
|
5
7
|
require 'bio-alignment/state'
|
6
8
|
require 'bio-alignment/elements'
|
7
9
|
require 'bio-alignment/sequence'
|
8
10
|
require 'bio-alignment/codonsequence'
|
9
11
|
require 'bio-alignment/tree'
|
10
12
|
require 'bio-alignment/alignment'
|
13
|
+
require 'bio-alignment/format/text'
|
14
|
+
require 'bio-alignment/format/fasta'
|
15
|
+
require 'bio-alignment/format/phylip'
|
@@ -13,6 +13,7 @@ module Bio
|
|
13
13
|
include Pal2Nal
|
14
14
|
include Rows
|
15
15
|
include Columns
|
16
|
+
include Coerce
|
16
17
|
|
17
18
|
attr_accessor :sequences
|
18
19
|
attr_reader :tree
|
@@ -44,7 +45,7 @@ module Bio
|
|
44
45
|
|
45
46
|
# return an array of sequence ids
|
46
47
|
def ids
|
47
|
-
rows.map { |r| r
|
48
|
+
rows.map { |r| Coerce::fetch_id(r) }
|
48
49
|
end
|
49
50
|
|
50
51
|
def size
|
@@ -56,6 +57,10 @@ module Bio
|
|
56
57
|
rows[index]
|
57
58
|
end
|
58
59
|
|
60
|
+
def << seq
|
61
|
+
@sequences << seq
|
62
|
+
end
|
63
|
+
|
59
64
|
def each
|
60
65
|
rows.each { | seq | yield seq }
|
61
66
|
self
|
@@ -68,21 +73,23 @@ module Bio
|
|
68
73
|
|
69
74
|
def find name
|
70
75
|
each do | seq |
|
71
|
-
return seq if seq
|
76
|
+
return seq if Coerce::fetch_id(seq) == name
|
72
77
|
end
|
73
78
|
raise "ERROR: Sequence not found by its name, looking for <#{name}>"
|
74
79
|
end
|
75
80
|
|
76
|
-
#
|
81
|
+
# copy alignment and allow updating elements. Returns alignment.
|
77
82
|
def update_each_element
|
78
83
|
aln = self.clone
|
79
84
|
aln.each { |seq| seq.each_with_index { |e,i| seq.seq[i] = yield e }}
|
85
|
+
aln
|
80
86
|
end
|
81
87
|
|
82
88
|
def to_s
|
83
89
|
res = ""
|
84
90
|
res += "\t" + columns_to_s + "\n" if @columns
|
85
|
-
|
91
|
+
# fetch each sequence in turn
|
92
|
+
res += map{ |seq| Coerce::fetch_id(seq).to_s + "\t" + Coerce::fetch_seq_string(seq) }.join("\n")
|
86
93
|
res
|
87
94
|
end
|
88
95
|
|
@@ -106,7 +113,7 @@ module Bio
|
|
106
113
|
# the tree traverser
|
107
114
|
def attach_tree tree
|
108
115
|
extend Tree
|
109
|
-
@tree = Tree::init(tree)
|
116
|
+
@tree = Tree::init(tree,self)
|
110
117
|
@tree
|
111
118
|
end
|
112
119
|
|
@@ -122,6 +129,7 @@ module Bio
|
|
122
129
|
new_aln.attach_tree(new_tree.clone)
|
123
130
|
new_aln
|
124
131
|
end
|
132
|
+
|
125
133
|
end
|
126
134
|
end
|
127
135
|
end
|
@@ -4,7 +4,7 @@ module Bio
|
|
4
4
|
class NA
|
5
5
|
include Enumerable
|
6
6
|
def each
|
7
|
-
to_s.
|
7
|
+
to_s.each_char do | c |
|
8
8
|
yield c
|
9
9
|
end
|
10
10
|
end
|
@@ -12,7 +12,7 @@ module Bio
|
|
12
12
|
class AA
|
13
13
|
include Enumerable
|
14
14
|
def each
|
15
|
-
to_s.
|
15
|
+
to_s.each_char do | c |
|
16
16
|
yield c
|
17
17
|
end
|
18
18
|
end
|
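The change above makes the reopened `NA` and `AA` classes iterate one character at a time. A minimal sketch of the same idea follows; `NucleotideString` is a hypothetical stand-in class so that the example runs without BioRuby.

```ruby
# Hypothetical stand-in class; not part of BioRuby or bio-alignment.
class NucleotideString < String
  include Enumerable

  # Yield one character at a time, as in the patched #each above
  def each
    to_s.each_char do |c|
      yield c
    end
  end
end

seq = NucleotideString.new('atg---tca')
puts seq.select { |c| c == '-' }.length # number of gap characters
puts seq.map(&:upcase).join             # Enumerable methods now work per character
```

Once `each` yields characters, the Enumerable methods (`select`, `map`, and so on) become available, which is what the alignment code needs from a row.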
@@ -7,13 +7,16 @@ module Bio
|
|
7
7
|
# Codon element for the matrix, used by CodonSequence.
|
8
8
|
class Codon
|
9
9
|
GAP = '---'
|
10
|
-
UNDEFINED = '
|
10
|
+
UNDEFINED = 'XXX'
|
11
11
|
|
12
12
|
attr_reader :codon_table
|
13
|
+
include State
|
13
14
|
|
14
15
|
def initialize codon, codon_table = 1
|
15
16
|
@codon = codon
|
16
17
|
@codon_table = codon_table
|
18
|
+
@codon.freeze
|
19
|
+
@codon_table.freeze
|
17
20
|
end
|
18
21
|
|
19
22
|
def gap?
|
@@ -32,6 +35,10 @@ module Bio
|
|
32
35
|
@codon
|
33
36
|
end
|
34
37
|
|
38
|
+
def == other
|
39
|
+
@codon == other.to_s
|
40
|
+
end
|
41
|
+
|
35
42
|
# lazily convert to Amino acid (once only)
|
36
43
|
def to_aa
|
37
44
|
aa = translate
|
@@ -52,7 +59,7 @@ module Bio
|
|
52
59
|
# lazy translation of codon to amino acid
|
53
60
|
def translate
|
54
61
|
@aa ||= Bio::CodonTable[@codon_table][@codon]
|
55
|
-
@aa
|
62
|
+
@aa.freeze
|
56
63
|
end
|
57
64
|
end
|
58
65
|
|
@@ -72,6 +79,9 @@ module Bio
|
|
72
79
|
seq.scan(/\S\S\S/).each do | codon |
|
73
80
|
@seq << Codon.new(codon, @codon_table)
|
74
81
|
end
|
82
|
+
@id.freeze
|
83
|
+
@codon_table.freeze
|
84
|
+
# @seq is not immutable, as we can add new codons to the list
|
75
85
|
end
|
76
86
|
|
77
87
|
def [] index
|
@@ -86,6 +96,7 @@ module Bio
|
|
86
96
|
@seq.each { | codon | yield codon }
|
87
97
|
end
|
88
98
|
|
99
|
+
# Output codon style
|
89
100
|
def to_s
|
90
101
|
@seq.map { |codon| codon.to_s }.join(' ')
|
91
102
|
end
|
@@ -107,7 +118,11 @@ module Bio
|
|
107
118
|
def << codon
|
108
119
|
@seq << codon
|
109
120
|
end
|
110
|
-
|
121
|
+
|
122
|
+
# Return Sequence (string) as an Elements object
|
123
|
+
def to_elements
|
124
|
+
self
|
125
|
+
end
|
111
126
|
end
|
112
127
|
|
113
128
|
end
|
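The codon handling touched above can be illustrated with a reduced, self-contained `Codon` class. This is a sketch, not the library class: the body of `gap?` is not visible in the diff and is assumed here to compare against `GAP`, and translation is left out.

```ruby
# Reduced illustration of the Codon element; assumptions are noted inline.
class Codon
  GAP = '---'

  def initialize(codon)
    @codon = codon.freeze # frozen, as in the hunk above
  end

  # Assumed implementation: a codon is a gap when it equals '---'
  def gap?
    @codon == GAP
  end

  def to_s
    @codon
  end

  # Compare by string value, so a Codon also equals a plain String
  def ==(other)
    @codon == other.to_s
  end
end

# Split a nucleotide string into triplets, as CodonSequence does with scan
codons = 'atgcat---aaa'.scan(/\S\S\S/).map { |c| Codon.new(c) }
puts codons.map(&:to_s).join(' ') # atg cat --- aaa
puts codons.count(&:gap?)         # 1
puts codons.first == 'atg'        # true
```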
@@ -0,0 +1,45 @@
|
|
1
|
+
module Bio
|
2
|
+
module BioAlignment
|
3
|
+
module Coerce
|
4
|
+
# Make BioRuby's entry_id compatible with id
|
5
|
+
def Coerce::fetch_id seq
|
6
|
+
if seq.respond_to?(:id)
|
7
|
+
seq.id
|
8
|
+
elsif seq.respond_to?(:entry_id)
|
9
|
+
seq.entry_id
|
10
|
+
else
|
11
|
+
"id?"
|
12
|
+
end
|
13
|
+
end
|
14
|
+
|
15
|
+
# Coerce BioRuby's sequence objects to return the sequence itself
|
16
|
+
def Coerce::fetch_seq seq
|
17
|
+
if seq.respond_to?(:seq)
|
18
|
+
seq.seq
|
19
|
+
else
|
20
|
+
seq
|
21
|
+
end
|
22
|
+
end
|
23
|
+
|
24
|
+
# Coerce sequence objects into a string
|
25
|
+
def Coerce::fetch_seq_string seq
|
26
|
+
s = fetch_seq(seq)
|
27
|
+
if s.respond_to?(:join)
|
28
|
+
s.join
|
29
|
+
else
|
30
|
+
s.to_s
|
31
|
+
end
|
32
|
+
end
|
33
|
+
|
34
|
+
# Coerce sequence objects into elements
|
35
|
+
def Coerce::to_elements seq
|
36
|
+
if seq.respond_to?(:to_elements)
|
37
|
+
seq.to_elements
|
38
|
+
else
|
39
|
+
Elements.new(fetch_id(seq),fetch_seq(seq))
|
40
|
+
end
|
41
|
+
end
|
42
|
+
end
|
43
|
+
end
|
44
|
+
end
|
45
|
+
|
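The new Coerce module above is plain duck typing: ask the object what it responds to and fall back to a placeholder. A minimal sketch of the id lookup, using hypothetical `Struct` stand-ins for the sequence objects so it runs without bio-alignment or BioRuby:

```ruby
# Hypothetical stand-ins for the kinds of objects an alignment may hold
NativeSeq = Struct.new(:id, :seq)       # bio-alignment style: has #id
FlatEntry = Struct.new(:entry_id, :seq) # BioRuby FlatFile style: has #entry_id

# Same idea as Coerce::fetch_id, reduced to a plain method
def fetch_id(seq)
  if seq.respond_to?(:id)
    seq.id
  elsif seq.respond_to?(:entry_id)
    seq.entry_id
  else
    'id?' # fallback label when no identifier is available
  end
end

puts fetch_id(NativeSeq.new('seq1', 'atg---')) # seq1
puts fetch_id(FlatEntry.new('seq2', 'atgcat')) # seq2
puts fetch_id('atgcat')                        # id?
```

The last call shows where the 'id?' label mentioned in the README comes from: an object that offers neither `id` nor `entry_id`.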
@@ -37,7 +37,7 @@ module Bio
|
|
37
37
|
end
|
38
38
|
|
39
39
|
def columns_to_s
|
40
|
-
columns.map { |c| (c.state ? c.state.to_s : '?') }.join
|
40
|
+
columns.map { |c| (c.respond_to?(:state) ? c.state.to_s : '?') }.join
|
41
41
|
end
|
42
42
|
|
43
43
|
def clone_columns!
|
@@ -50,21 +50,31 @@ module Bio
|
|
50
50
|
end
|
51
51
|
end
|
52
52
|
|
53
|
-
# Support the notion of columns in an alignment. A column
|
54
|
-
#
|
53
|
+
# Support the notion of columns in an alignment. A column is simply an
|
54
|
+
# integer index into the alignment, stored in @col. A column can have state
|
55
|
+
# by attaching state objects.
|
55
56
|
class Column
|
56
57
|
include State
|
57
58
|
include Enumerable
|
58
59
|
|
59
60
|
def initialize aln, col
|
60
61
|
@aln = aln
|
61
|
-
@col = col
|
62
|
+
@col = col # column index number
|
63
|
+
@col.freeze
|
64
|
+
@aln
|
62
65
|
end
|
63
66
|
|
64
67
|
def [] index
|
65
68
|
@aln[index][@col]
|
66
69
|
end
|
67
70
|
|
71
|
+
# update all elements in the column
|
72
|
+
# def update! new_column
|
73
|
+
# each_with_index do |e,i|
|
74
|
+
# @aln[i][@col] = new_column[i]
|
75
|
+
# end
|
76
|
+
# end
|
77
|
+
|
68
78
|
# iterator fetches a column on demand, yielding column elements
|
69
79
|
def each
|
70
80
|
@aln.each do | seq |
|
@@ -5,11 +5,15 @@ module Bio
|
|
5
5
|
|
6
6
|
module DelNonInformativeSequences
|
7
7
|
include MarkRows
|
8
|
-
|
8
|
+
|
9
|
+
# Count the informative elements in a sequence. If the count
|
10
|
+
# is less than +percentage+ mark the sequence for deletion.
|
11
|
+
#
|
9
12
|
# Return a new alignment with rows marked for deletion, i.e. mark rows
|
10
13
|
# that mostly contain undefined elements and gaps (threshold
|
11
14
|
# +percentage+). The alignment returned is a cloned copy
|
12
15
|
def mark_non_informative_sequences percentage = 30
|
16
|
+
percentage=30 if not percentage # for CLI
|
13
17
|
mark_rows { |state,row|
|
14
18
|
num = row.count { |e| e.gap? or e.undefined? }
|
15
19
|
if (num.to_f/row.length) > 1.0-percentage/100.0
|
@@ -20,7 +24,7 @@ module Bio
|
|
20
24
|
end
|
21
25
|
|
22
26
|
def del_non_informative_sequences percentage=30
|
23
|
-
mark_non_informative_sequences.rows_where { |row| !row.state.deleted? }
|
27
|
+
mark_non_informative_sequences(percentage).rows_where { |row| !row.state.deleted? }
|
24
28
|
end
|
25
29
|
end
|
26
30
|
end
|
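The threshold test in `mark_non_informative_sequences` is easy to misread, so here is the arithmetic worked through on plain character arrays. The `non_informative?` helper is hypothetical, and treating '-' as a gap and 'X' as undefined is an assumption made for this sketch.

```ruby
# True when a row would be marked for deletion, i.e. when the fraction of
# gap/undefined positions exceeds 1 - percentage/100.
def non_informative?(row, percentage = 30)
  num = row.count { |e| e == '-' || e == 'X' }
  (num.to_f / row.length) > 1.0 - percentage / 100.0
end

mostly_gaps = '--------AC'.chars # 80% gaps, 20% informative
one_gap     = 'ACGT-ACGTA'.chars # 10% gaps, 90% informative
no_gaps     = 'ACGTACGTAC'.chars

puts non_informative?(mostly_gaps)  # 0.8 > 0.7, true
puts non_informative?(one_gap)      # 0.1 > 0.7, false
puts non_informative?(one_gap, 100) # 0.1 > 0.0, true
puts non_informative?(no_gaps, 100) # 0.0 > 0.0, false
```

With `--perc 100` the threshold drops to zero, so any sequence containing a gap is removed, which matches the last CLI example in the README.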
@@ -5,7 +5,7 @@ module Bio
|
|
5
5
|
# state, and returning a newly cloned alignment
|
6
6
|
module MarkRows
|
7
7
|
|
8
|
-
# Mark each seq
|
8
|
+
# Mark each seq and return alignment
|
9
9
|
def mark_rows &block
|
10
10
|
aln = markrows_clone
|
11
11
|
aln.rows.each do | row |
|
@@ -16,11 +16,15 @@ module Bio
|
|
16
16
|
|
17
17
|
# allow the marking of elements in a copied alignment, making sure
|
18
18
|
# each element is a proper Element object that can contain state.
|
19
|
+
#
|
19
20
|
# A Sequence alignment will be turned into an Elements alignment.
|
21
|
+
#
|
22
|
+
# Returns the new alignment
|
20
23
|
def mark_row_elements &block
|
21
24
|
aln = markrows_clone
|
22
25
|
aln.rows.each_with_index do | row,rownum |
|
23
|
-
new_seq = block.call(row
|
26
|
+
new_seq = block.call(Coerce::to_elements(row),rownum)
|
27
|
+
# p [rownum,new_seq,row]
|
24
28
|
aln.rows[rownum] = new_seq
|
25
29
|
end
|
26
30
|
aln
|
@@ -32,13 +36,15 @@ module Bio
|
|
32
36
|
aln = self.clone
|
33
37
|
# clone row state, or add a state object
|
34
38
|
aln.rows.each do | row |
|
35
|
-
|
36
|
-
|
37
|
-
row.state
|
38
|
-
|
39
|
-
|
40
|
-
|
41
|
-
|
39
|
+
if row.respond_to?(:state)
|
40
|
+
new_state =
|
41
|
+
if row.state
|
42
|
+
row.state.clone
|
43
|
+
else
|
44
|
+
RowState.new
|
45
|
+
end
|
46
|
+
row.state = new_state
|
47
|
+
end
|
42
48
|
end
|
43
49
|
aln
|
44
50
|
end
|
@@ -13,25 +13,33 @@ module Bio
|
|
13
13
|
end
|
14
14
|
end
|
15
15
|
|
16
|
-
# Drop all 'islands' in a sequence with low consensus,
|
17
|
-
# larger than 'min_gap_size' (default 3) on both sides, and
|
18
|
-
# than 'max_island_size' (default 30). An island larger than
|
19
|
-
# is arguably no longer an island, and low consensus
|
20
|
-
# loops - it is up to the alignment procedure to get
|
21
|
-
# allow for micro deletions inside an alignment (1 or
|
22
|
-
# The island consensus is calculated by column. If more
|
23
|
-
# island shows consensus, the island is retained.
|
24
|
-
# element is defined as the number of matches in the
|
16
|
+
# Drop all 'islands' in a sequence with low consensus, i.e. islands that
|
17
|
+
# show a gap larger than 'min_gap_size' (default 3) on both sides, and
|
18
|
+
# are shorter than 'max_island_size' (default 30). An island larger than
|
19
|
+
# 30 elements is arguably no longer an island, and low consensus
|
20
|
+
# stretches may be loops - it is up to the alignment procedure to get
|
21
|
+
# that right. We also allow for micro deletions inside an alignment (1 or
|
22
|
+
# 2 elements). The island consensus is calculated by column. If more
|
23
|
+
# than 50% of the island shows consensus, the island is retained.
|
24
|
+
# Consensus for each element is defined as the number of matches in the
|
25
|
+
# column (default 1).
|
25
26
|
def mark_islands
|
26
|
-
|
27
|
-
|
27
|
+
logger = Bio::Log::LoggerPlus['bio-alignment']
|
28
|
+
count_marked_islands = 0
|
29
|
+
count_marked_elements = 0
|
30
|
+
|
31
|
+
# Traverse each row in the alignment
|
32
|
+
marked_aln = mark_row_elements { |row,rownum|
|
33
|
+
# for each element create a state object, and find unique elements (i.e. consensus) across a column
|
28
34
|
row.each_with_index do |e,colnum|
|
29
35
|
e.state = IslandElementState.new
|
30
36
|
column = columns[colnum]
|
31
37
|
e.state.unique = (column.count{|e2| !e2.gap? and e2 == e } == 1)
|
32
38
|
# p [e,e.state,e.state.unique]
|
33
39
|
end
|
34
|
-
#
|
40
|
+
# at this stage all elements of the row have been set to unique,
|
41
|
+
# which are unique. Now group elements into islands (split on gap)
|
42
|
+
# and mask
|
35
43
|
gap = []
|
36
44
|
island = []
|
37
45
|
in_island = true
|
@@ -52,7 +60,9 @@ module Bio
|
|
52
60
|
gap << e
|
53
61
|
if gap.length > 2
|
54
62
|
in_island = false
|
55
|
-
mark_island(island)
|
63
|
+
ci, ce = mark_island(island)
|
64
|
+
count_marked_islands += ci
|
65
|
+
count_marked_elements += ce
|
56
66
|
# print_island(island)
|
57
67
|
island = []
|
58
68
|
end
|
@@ -60,41 +70,29 @@ module Bio
|
|
60
70
|
end
|
61
71
|
end
|
62
72
|
if in_island
|
63
|
-
mark_island(island)
|
73
|
+
ci, ce = mark_island(island)
|
74
|
+
count_marked_islands += ci
|
75
|
+
count_marked_elements += ce
|
64
76
|
# print_island(island) if island.length > 0
|
65
77
|
end
|
66
|
-
# row
|
67
|
-
# e.state = ElementState.new
|
68
|
-
# column = columns[colnum]
|
69
|
-
# e.state.mask! if column.count{|e2| !e2.gap? and e2 == e } == 1
|
70
|
-
# # print e,',',e.state,';'
|
71
|
-
# end
|
72
|
-
# now make sure there are at least 5 in a row, otherwise
|
73
|
-
# start unmasking. First group all elements
|
74
|
-
# group = []
|
75
|
-
# row.each_with_index do |e,colnum|
|
76
|
-
# next if e.gap?
|
77
|
-
# if e.state.masked?
|
78
|
-
# group << e
|
79
|
-
# else
|
80
|
-
# if group.length <= min_serial
|
81
|
-
# # the group is too small
|
82
|
-
# group.each do | e2 |
|
83
|
-
# e2.state.unmask!
|
84
|
-
# end
|
85
|
-
# end
|
86
|
-
# group = []
|
87
|
-
# end
|
88
|
-
# end
|
89
|
-
row # return changed sequence
|
78
|
+
row # always return the row to mark_row_elements
|
90
79
|
}
|
80
|
+
logger.info("#{count_marked_islands} islands marked (#{count_marked_elements} elements)")
|
81
|
+
return marked_aln
|
91
82
|
end
|
92
83
|
|
93
84
|
private
|
94
|
-
|
85
|
+
|
86
|
+
# Check a list of elements that form an island. First count the number
|
87
|
+
# of elements marked as being unique. If the island is more than 50%
|
88
|
+
# unique (i.e. less than 50% consensus with the rest of the alignment)
|
89
|
+
# all island elements are marked for masking. Returns the number of
|
90
|
+
# islands and elements marked as a tuple
|
95
91
|
def mark_island island
|
96
|
-
return if island.length < 2
|
92
|
+
return 0,0 if island.length < 2
|
97
93
|
unique = 0
|
94
|
+
count_islands = 0
|
95
|
+
count_elements = 0
|
98
96
|
island.each do |e|
|
99
97
|
unique += 1 if e.state.unique == true
|
100
98
|
end
|
@@ -104,7 +102,10 @@ module Bio
|
|
104
102
|
island.each do |e|
|
105
103
|
e.state.mask!
|
106
104
|
end
|
105
|
+
count_islands += 1
|
106
|
+
count_elements += island.size
|
107
107
|
end
|
108
|
+
return count_islands, count_elements
|
108
109
|
end
|
109
110
|
|
110
111
|
def print_island island
|
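The island logic above is spread over several hunks, so a compact sketch may help. This is a simplified illustration and not the library code: rows are plain character arrays with '-' as the gap, the per-column uniqueness flags are passed in rather than computed, the 50% rule is taken from the comments, and the `max_island_size` limit is left out. The `mask_islands` name and signature are hypothetical.

```ruby
# row:    array of single-character strings, '-' is a gap
# unique: array of booleans, true when the element matches nothing else in its column
# Returns an array of booleans (true = element would be masked).
def mask_islands(row, unique, min_gap_size = 3)
  masked = Array.new(row.length, false)
  island = []   # indices of the elements in the current island
  gap_run = 0   # length of the current run of gaps

  # Close the current island: mask it when more than half is unique
  flush = lambda do
    if island.length >= 2
      n_unique = island.count { |i| unique[i] }
      island.each { |i| masked[i] = true } if n_unique > island.length / 2.0
    end
    island = []
  end

  row.each_with_index do |e, i|
    if e == '-'
      gap_run += 1
      flush.call if gap_run == min_gap_size # gap is long enough: island ends
    else
      gap_run = 0 # shorter gaps (micro deletions) do not split an island
      island << i
    end
  end
  flush.call # an island may run to the end of the row
  masked
end

row    = 'ACGT---WWWW---ACGT'.chars
unique = row.each_index.map { |i| (7..10).cover?(i) }
masked = mask_islands(row, unique)
puts row.zip(masked).map { |e, m| m ? 'X' : e }.join # ACGT---XXXX---ACGT
```

The final line mirrors what the CLI does after `mark_islands`: masked elements are replaced by "X".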
@@ -10,6 +10,7 @@ module Bio
|
|
10
10
|
|
11
11
|
def initialize c
|
12
12
|
@c = c
|
13
|
+
@c.freeze
|
13
14
|
end
|
14
15
|
def gap?
|
15
16
|
@c == GAP
|
@@ -23,6 +24,13 @@ module Bio
|
|
23
24
|
def == other
|
24
25
|
to_s == other.to_s
|
25
26
|
end
|
27
|
+
def clone
|
28
|
+
e = self.dup
|
29
|
+
if e.state != nil
|
30
|
+
e.state = e.state.clone
|
31
|
+
end
|
32
|
+
e
|
33
|
+
end
|
26
34
|
end
|
27
35
|
|
28
36
|
# Elements is a container for Element sequences.
|
@@ -34,6 +42,7 @@ module Bio
|
|
34
42
|
attr_reader :id, :seq
|
35
43
|
def initialize id, seq
|
36
44
|
@id = id
|
45
|
+
@id.freeze
|
37
46
|
@seq = []
|
38
47
|
if seq.kind_of?(Elements)
|
39
48
|
@seq = seq.clone
|
@@ -75,7 +84,7 @@ module Bio
|
|
75
84
|
def clone
|
76
85
|
copy = Elements.new(@id,"")
|
77
86
|
@seq.each do |e|
|
78
|
-
copy << e.
|
87
|
+
copy << e.clone
|
79
88
|
end
|
80
89
|
copy
|
81
90
|
end
|
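The `Element#clone` override above matters because a plain `dup` copies only the reference to the state object. A minimal sketch of the difference, using hypothetical reduced classes (the real `Element` and its state classes carry more behaviour):

```ruby
# Hypothetical minimal classes, for illustration only
class ElementState
  attr_accessor :masked

  def initialize
    @masked = false
  end
end

class Element
  attr_accessor :state

  def initialize(c)
    @c = c.freeze
  end

  # Copy the element and give the copy its own state object
  def clone
    e = dup
    e.state = e.state.clone unless e.state.nil?
    e
  end
end

original = Element.new('A')
original.state = ElementState.new

shallow = original.dup   # shares the state object with the original
deep    = original.clone # owns a copy of the state

deep.state.masked = true
p original.state.masked  # false: the deep copy is independent
shallow.state.masked = true
p original.state.masked  # true: the shallow copy shares state
```

This is what lets a cloned alignment be marked without touching the state of the alignment it was copied from.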
@@ -0,0 +1,25 @@
+module Bio
+  module BioAlignment
+    module PhylipOutput
+      # Calculate header info from alignment and return as string
+      def PhylipOutput::header alignment
+        "#{alignment.size} #{alignment[0].length}\n"
+      end
+
+      # Output sequence PAML style and return as a multi-line string
+      def PhylipOutput::to_paml seq, size=60
+        buf = seq.id+"\n"
+        coding = if seq.kind_of?(CodonSequence)
+                   seq.to_nt
+                 else
+                   seq.to_s
+                 end
+        coding.scan(/.{1,#{size}}/).each do | section |
+          buf += section + "\n"
+        end
+        buf
+      end
+    end
+  end
+end
+
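`PhylipOutput::to_paml` wraps the sequence with `String#scan` and a bounded regular expression. The sketch below isolates that wrapping step on a plain string. `paml_block` is a hypothetical helper name, and the identifier and sequence are passed as strings rather than as the gem's sequence objects.

```ruby
# Emit the identifier on its own line, then the sequence in lines of at most
# `size` characters, following the approach used by PhylipOutput::to_paml.
def paml_block(id, seq, size = 60)
  buf = id + "\n"
  seq.scan(/.{1,#{size}}/).each do |section|
    buf += section + "\n"
  end
  buf
end

print paml_block("seq1", "atgcatgcaaaa", 5)
# prints:
# seq1
# atgca
# tgcaa
# aa
```

Because the pattern is `.{1,size}`, the final chunk may be shorter than `size`, and an empty sequence yields only the identifier line.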
@@ -0,0 +1,18 @@
+module Bio
+  module BioAlignment
+    module TextOutput
+
+      def TextOutput::to_text seq, style
+        res = ""
+        res += Coerce::fetch_id(seq).to_s + "\t"
+        res += if seq.kind_of?(CodonSequence) and style == :codon
+                 seq.to_s
+               else
+                 Coerce::fetch_seq_string(seq)
+               end
+        res+"\n"
+      end
+
+    end
+  end
+end
data/lib/bio-alignment/rows.rb
CHANGED
@@ -11,6 +11,7 @@ module Bio
       attr_reader :id, :seq
       def initialize id, seq
         @id = id
+        @id.freeze
         @seq = seq
       end

@@ -18,6 +19,10 @@ module Bio
         @seq[index]
       end

+      # def []= index, value --- we should not implement this for reasons of purity
+      #   @seq[index] = value
+      # end
+
       def length
         @seq.length
       end
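These two hunks freeze the identifier and deliberately leave out an index setter. The sketch below shows what that gives a caller. `ReadOnlySequence` is a hypothetical name for illustration, not the gem's class.

```ruby
# Hypothetical sequence wrapper with a frozen id and no index setter.
class ReadOnlySequence
  attr_reader :id, :seq

  def initialize(id, seq)
    @id = id
    @id.freeze      # note: this freezes the very object the caller passed in
    @seq = seq
  end

  def [](index)
    @seq[index]
  end
  # no []= on purpose: elements are not replaced in place
end

s = ReadOnlySequence.new(String.new("seq1"), "atg---")
begin
  s.id << "_x"
rescue RuntimeError => e   # FrozenError subclasses RuntimeError on newer Rubies
  puts "id is frozen (#{e.class})"
end
puts s.respond_to?(:[]=)
```

One consequence of calling `freeze` on the argument itself is that the caller's string becomes immutable too. Freezing a `dup` of it would avoid that side effect, at the cost of an extra copy.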
data/lib/bio-alignment/tree.rb
CHANGED
@@ -11,13 +11,13 @@ module Bio
       class Node
       end

-      # Make all nodes in the Bio::Tree aware of the tree object
-      #
-      def Tree::init tree
+      # Make all nodes in the Bio::Tree aware of the tree object, and the alignment, so
+      # we get a more intuitive API
+      def Tree::init tree, alignment
         if tree.kind_of?(Bio::Tree)
           # walk all nodes and infect the tree info
           tree.each_node do | node |
-            node.inject_tree(tree)
+            node.inject_tree(tree, alignment)
           end
           # tree.root.set_tree(tree)
         else
@@ -38,8 +38,10 @@ module Bio
   class Tree
     class Node
       # Add tree information to this node, so it can be queried
-      def inject_tree tree
+      def inject_tree tree, alignment
         @tree = tree
+        @tree.freeze
+        @alignment = alignment
         self
       end

@@ -102,6 +104,15 @@ module Bio
         end
         cs
       end
+
+      # Return the alignment attached to the tree
+      def alignment
+        @alignment
+      end
+
+      def sequence
+        @alignment.find(name)
+      end
     end # End of injecting Node functionality

     def find name
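With the alignment injected into every node, a node can look up its own sequence by name. The sketch below shows the idea in plain Ruby. `DemoAlignment` and `DemoNode` are hypothetical stand-ins; the gem reopens `Bio::Tree::Node` rather than defining a new class, and its `find` may behave differently from the Hash lookup assumed here.

```ruby
# Hypothetical alignment: maps sequence ids to sequence strings.
class DemoAlignment
  def initialize(rows)
    @rows = rows
  end

  def find(name)
    @rows.fetch(name)
  end
end

# Hypothetical tree node carrying references to its tree and alignment.
class DemoNode
  attr_reader :name, :alignment

  def initialize(name)
    @name = name
  end

  def inject_tree(tree, alignment)
    @tree = tree
    @alignment = alignment
    self
  end

  # Look up this node's sequence in the attached alignment by node name
  def sequence
    @alignment.find(name)
  end
end

aln  = DemoAlignment.new("seq1" => "atgcat", "seq2" => "atg---")
node = DemoNode.new("seq2").inject_tree(:tree_placeholder, aln)
puts node.sequence   # prints atg---
```

Since `inject_tree` returns `self`, construction and injection can be chained as shown.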
data/spec/bio-alignment_spec.rb
CHANGED
@@ -108,3 +108,18 @@ describe "BioAlignment::DelBridges for codons" do
     aln3 = aln2.columns_where { |col| !col.state.deleted? }
     aln3.columns.size.should == 399
   end
+
+# require 'bio' # BioRuby
+require 'bio-alignment/bioruby' # make Bio::Sequence enumerable
+
+describe "BioAlignment::BioRuby interface" do
+  include Bio::BioAlignment
+
+  aln = Alignment.new
+  aln << Bio::Sequence::NA.new("atgcatgcaaaa")
+  aln << Bio::Sequence::NA.new("atg---tcaaaa")
+  aln[0].should == "atgcatgcaaaa"
+  aln[1].should == "atg---tcaaaa"
+  Coerce::fetch_seq_string(aln[0]).should == "atgcatgcaaaa"
+  test = Coerce::fetch_id(aln[0]) # JRuby may have a name collision with object.id
+end
metadata
CHANGED
@@ -1,7 +1,7 @@
 --- !ruby/object:Gem::Specification
 name: bio-alignment
 version: !ruby/object:Gem::Version
-  version: 0.0.
+  version: 0.0.8
 prerelease:
 platform: ruby
 authors:
@@ -9,11 +9,11 @@ authors:
 autorequire:
 bindir: bin
 cert_chain: []
-date:
+date: 2014-05-15 00:00:00.000000000 Z
 dependencies:
 - !ruby/object:Gem::Dependency
   name: bio-logger
-  requirement:
+  requirement: !ruby/object:Gem::Requirement
     none: false
     requirements:
     - - ! '>='
@@ -21,10 +21,15 @@ dependencies:
         version: '0'
   type: :runtime
   prerelease: false
-  version_requirements:
+  version_requirements: !ruby/object:Gem::Requirement
+    none: false
+    requirements:
+    - - ! '>='
+      - !ruby/object:Gem::Version
+        version: '0'
 - !ruby/object:Gem::Dependency
   name: bio
-  requirement:
+  requirement: !ruby/object:Gem::Requirement
     none: false
     requirements:
     - - ! '>='
@@ -32,10 +37,15 @@ dependencies:
         version: 1.4.2
   type: :runtime
   prerelease: false
-  version_requirements:
+  version_requirements: !ruby/object:Gem::Requirement
+    none: false
+    requirements:
+    - - ! '>='
+      - !ruby/object:Gem::Version
+        version: 1.4.2
 - !ruby/object:Gem::Dependency
   name: rake
-  requirement:
+  requirement: !ruby/object:Gem::Requirement
     none: false
     requirements:
     - - ! '>='
@@ -43,10 +53,15 @@ dependencies:
         version: '0'
   type: :development
   prerelease: false
-  version_requirements:
+  version_requirements: !ruby/object:Gem::Requirement
+    none: false
+    requirements:
+    - - ! '>='
+      - !ruby/object:Gem::Version
+        version: '0'
 - !ruby/object:Gem::Dependency
   name: bio-bigbio
-  requirement:
+  requirement: !ruby/object:Gem::Requirement
     none: false
     requirements:
     - - ! '>'
@@ -54,10 +69,15 @@ dependencies:
         version: 0.1.3
   type: :development
   prerelease: false
-  version_requirements:
+  version_requirements: !ruby/object:Gem::Requirement
+    none: false
+    requirements:
+    - - ! '>'
+      - !ruby/object:Gem::Version
+        version: 0.1.3
 - !ruby/object:Gem::Dependency
   name: cucumber
-  requirement:
+  requirement: !ruby/object:Gem::Requirement
     none: false
     requirements:
     - - ! '>='
@@ -65,10 +85,15 @@ dependencies:
         version: '0'
   type: :development
   prerelease: false
-  version_requirements:
+  version_requirements: !ruby/object:Gem::Requirement
+    none: false
+    requirements:
+    - - ! '>='
+      - !ruby/object:Gem::Version
+        version: '0'
 - !ruby/object:Gem::Dependency
   name: rspec
-  requirement:
+  requirement: !ruby/object:Gem::Requirement
     none: false
     requirements:
     - - ~>
@@ -76,10 +101,15 @@ dependencies:
         version: 2.10.0
   type: :development
   prerelease: false
-  version_requirements:
+  version_requirements: !ruby/object:Gem::Requirement
+    none: false
+    requirements:
+    - - ~>
+      - !ruby/object:Gem::Version
+        version: 2.10.0
 - !ruby/object:Gem::Dependency
   name: bundler
-  requirement:
+  requirement: !ruby/object:Gem::Requirement
     none: false
     requirements:
     - - ! '>='
@@ -87,10 +117,15 @@ dependencies:
         version: 1.0.21
   type: :development
   prerelease: false
-  version_requirements:
+  version_requirements: !ruby/object:Gem::Requirement
+    none: false
+    requirements:
+    - - ! '>='
+      - !ruby/object:Gem::Version
+        version: 1.0.21
 - !ruby/object:Gem::Dependency
   name: jeweler
-  requirement:
+  requirement: !ruby/object:Gem::Requirement
     none: false
     requirements:
     - - ! '>='
@@ -98,7 +133,12 @@ dependencies:
         version: '0'
   type: :development
   prerelease: false
-  version_requirements:
+  version_requirements: !ruby/object:Gem::Requirement
+    none: false
+    requirements:
+    - - ! '>='
+      - !ruby/object:Gem::Version
+        version: '0'
 description: Alignment handler for multiple sequence alignments (MSA)
 email: pjotr.public01@thebird.nl
 executables:
@@ -107,6 +147,7 @@ extensions: []
 extra_rdoc_files:
 - LICENSE.txt
 - README.md
+- TODO
 files:
 - .document
 - .rspec
@@ -115,6 +156,7 @@ files:
 - LICENSE.txt
 - README.md
 - Rakefile
+- TODO
 - VERSION
 - bin/bio-alignment
 - doc/bio-alignment-design.md
@@ -144,10 +186,12 @@ files:
 - features/phylogeny/tree.feature
 - features/rows-feature.rb
 - features/rows.feature
+- features/support/env.rb
 - lib/bio-alignment.rb
 - lib/bio-alignment/alignment.rb
 - lib/bio-alignment/bioruby.rb
 - lib/bio-alignment/codonsequence.rb
+- lib/bio-alignment/coerce.rb
 - lib/bio-alignment/columns.rb
 - lib/bio-alignment/edit/del_bridges.rb
 - lib/bio-alignment/edit/del_non_informative_sequences.rb
@@ -158,6 +202,9 @@ files:
 - lib/bio-alignment/edit/mask_serial_mutations.rb
 - lib/bio-alignment/edit/tree_splitter.rb
 - lib/bio-alignment/elements.rb
+- lib/bio-alignment/format/fasta.rb
+- lib/bio-alignment/format/phylip.rb
+- lib/bio-alignment/format/text.rb
 - lib/bio-alignment/pal2nal.rb
 - lib/bio-alignment/rows.rb
 - lib/bio-alignment/sequence.rb
@@ -186,7 +233,7 @@ required_ruby_version: !ruby/object:Gem::Requirement
       version: '0'
   segments:
   - 0
-  hash:
+  hash: 3021753307968946034
 required_rubygems_version: !ruby/object:Gem::Requirement
   none: false
   requirements:
@@ -195,7 +242,7 @@ required_rubygems_version: !ruby/object:Gem::Requirement
       version: '0'
 requirements: []
 rubyforge_project:
-rubygems_version: 1.8.
+rubygems_version: 1.8.23
 signing_key:
 specification_version: 3
 summary: Support for multiple sequence alignments (MSA)