bio-alignment 0.0.6 → 0.0.7
- data/Gemfile +1 -1
- data/README.md +65 -16
- data/VERSION +1 -1
- data/doc/bio-alignment-design.md +83 -72
- data/features/phylogeny/split-tree-feature.rb +31 -0
- data/features/phylogeny/split-tree.feature +66 -0
- data/features/{tree-feature.rb → phylogeny/tree-feature.rb} +32 -3
- data/features/{tree.feature → phylogeny/tree.feature} +8 -1
- data/lib/bio-alignment/alignment.rb +27 -1
- data/lib/bio-alignment/edit/tree_splitter.rb +58 -0
- data/lib/bio-alignment/tree.rb +97 -7
- metadata +26 -23
data/Gemfile
CHANGED
data/README.md
CHANGED
@@ -1,22 +1,39 @@
|
|
1
1
|
# bio-alignment
|
2
2
|
|
3
|
-
|
3
|
+
Matrix style alignment handler for multiple sequence alignments (MSA).
|
4
4
|
|
5
|
-
|
6
|
-
sequence object. Support for any nucleotide, amino acid and codon
|
7
|
-
sequences that are lists. Any list with payload can be used (e.g.
|
8
|
-
nucleotide quality score, codon annotation). The only requirement is
|
9
|
-
that the list is iterable and can be indexed.
|
5
|
+
[![Build Status](https://secure.travis-ci.org/pjotrp/bioruby-alignment.png)](http://travis-ci.org/pjotrp/bioruby-alignment)
|
10
6
|
|
11
|
-
This
|
7
|
+
This alignment handler makes no assumptions about the underlying
|
8
|
+
sequence object. It supports any nucleotide, amino acid and codon
|
9
|
+
sequences that are lists. Any list with payload or state can be used
|
10
|
+
(e.g. nucleotide quality score, codon annotation). The only
|
11
|
+
requirement is that the list is Enumerable and can be indexed, i.e.
|
12
|
+
it includes Ruby's Enumerable and has the [] method.
|
13
|
+
|
14
|
+
Features are:
|
15
|
+
|
16
|
+
* Matrix notation for alignment object
|
17
|
+
* Functional style alignment access and editing
|
18
|
+
* Support for BioRuby Sequences
|
19
|
+
* Support for BioRuby trees and node distance calculation
|
20
|
+
* bio-alignment interacts well with BioRuby structures,
|
21
|
+
including sequence objects and alignment/tree parsers
|
22
|
+
|
23
|
+
When possible, BioRuby functionality is merged in. For example, by
|
24
|
+
supporting Bio::Sequence objects, standard BioRuby alignment
|
25
|
+
functions, sequence readers and writers can be used. By supporting the
|
26
|
+
BioRuby Tree object, standard BioRuby tree parsers and writers can be
|
27
|
+
used. bio-alignment takes alignment handling with phylogenetic tree
|
28
|
+
support to a new level.
|
29
|
+
|
30
|
+
bio-alignment is based on Pjotr's experience designing the BioScala
|
12
31
|
Alignment handler and BioRuby's PAML support. Read the
|
13
32
|
Bio::BioAlignment
|
14
33
|
[design
|
15
34
|
document](https://github.com/pjotrp/bioruby-alignment/blob/master/doc/bio-alignment-design.md)
|
16
35
|
for Ruby.
|
17
36
|
|
18
|
-
Note: this software is under active development.
|
19
|
-
|
20
37
|
## Developers
|
21
38
|
|
22
39
|
### Codon alignment example
|
@@ -40,6 +57,8 @@ alignment (note codon gaps are represented by '---')
|
|
40
57
|
aln.rows.each do | row |
|
41
58
|
fasta.write(row.id, row.to_aa.to_s)
|
42
59
|
end
|
60
|
+
# get first codon element of the fourth sequence
|
61
|
+
p aln[3][0]
|
43
62
|
```
|
44
63
|
|
45
64
|
Now add some state - you can define your own row state
|
@@ -151,8 +170,28 @@ resulting in the codon alignment.
|
|
151
170
|
|
152
171
|
### Phylogeny
|
153
172
|
|
154
|
-
BioAlignment has support for attaching a
|
155
|
-
alignment, and traversing the tree
|
173
|
+
BioAlignment has support for attaching a phylogenetic tree to an
|
174
|
+
alignment, and traversing the tree using an intuitive interface
|
175
|
+
|
176
|
+
```ruby
|
177
|
+
sole_tree = Bio::Newick.new(string).tree # use BioRuby's tree parser
|
178
|
+
tree = aln.attach_tree(sole_tree) # attach the tree
|
179
|
+
# now do stuff with the tree, which has improved bio-align support
|
180
|
+
root = tree.root
|
181
|
+
children = root.children
|
182
|
+
children.map { |n| n.name }.sort.should == ["","seq7"]
|
183
|
+
seq7 = children.last
|
184
|
+
seq4 = tree.find("seq4")
|
185
|
+
seq4.distance(seq7).should == 19.387756600000003
|
186
|
+
print tree.output_newick # BioRuby Newick output
|
187
|
+
```
|
188
|
+
|
189
|
+
There are methods for finding sibling nodes, splitting the alignment
|
190
|
+
based on the tree, and locating sequences on the same branch. More
|
191
|
+
examples can be found in the tests and features. The underlying
|
192
|
+
implementation of Bio::Tree is that of BioRuby. We have added an OOP
|
193
|
+
layer for traversing the tree by injecting methods into the BioRuby
|
194
|
+
object itself.
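To illustrate the sibling and same-branch lookups mentioned above, here is a minimal sketch on a stand-in node class. `SketchNode` is hypothetical and is not the BioRuby `Bio::Tree::Node`; only the `siblings` and `nearest` definitions mirror what this release adds to tree.rb.

```ruby
# Minimal stand-in for a tree node; NOT the BioRuby Bio::Tree::Node class
class SketchNode
  attr_reader :name, :children
  attr_accessor :parent

  def initialize(name = nil, children = [])
    @name = name
    @children = children
    children.each { |c| c.parent = self }
  end

  def leaf?
    children.empty?
  end

  # All leaves below this node (the node itself when it is a leaf)
  def leaves
    leaf? ? [self] : children.flat_map(&:leaves)
  end

  # Direct siblings: the parent's children without self
  def siblings
    parent.children - [self]
  end

  # Leaves attached to the parent node, without self
  def nearest
    parent.leaves - [self]
  end
end

# Build ((seq3,seq2),seq1), a small part of the tree used in the features
seq1  = SketchNode.new("seq1")
seq2  = SketchNode.new("seq2")
seq3  = SketchNode.new("seq3")
inner = SketchNode.new(nil, [seq3, seq2])
top   = SketchNode.new(nil, [inner, seq1])

p seq2.nearest.map(&:name)       # leaves sharing seq2's parent
p seq1.nearest.map(&:name).sort  # leaves on the same branch as seq1
```

In this sketch `nearest` walks up one level only, which is why seq1 reports both seq2 and seq3 while seq2 reports just seq3.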
|
156
195
|
|
157
196
|
### Alignment marking/masking/editing
|
158
197
|
|
@@ -249,18 +288,28 @@ where aln2 is a copy of aln with bridging columns deleted.
|
|
249
288
|
|
250
289
|
### See also
|
251
290
|
|
252
|
-
|
291
|
+
For more on the design of bio-alignment, read the
|
292
|
+
Bio::BioAlignment
|
293
|
+
[design
|
294
|
+
document](https://github.com/pjotrp/bioruby-alignment/blob/master/doc/bio-alignment-design.md).
|
295
|
+
|
296
|
+
The API documentation can be found
|
297
|
+
[online](http://rubygems.org/gems/bio-alignment). For examples see the files in
|
253
298
|
[./spec/*.rb](https://github.com/pjotrp/bioruby-alignment/tree/master/spec) and
|
254
299
|
[./features/*](https://github.com/pjotrp/bioruby-alignment/tree/master/features).
|
255
300
|
|
256
301
|
## Cite
|
257
302
|
|
258
|
-
If you use this software, please cite
|
303
|
+
If you use this software, please cite one of
|
304
|
+
|
305
|
+
* [BioRuby: bioinformatics software for the Ruby programming language](http://dx.doi.org/10.1093/bioinformatics/btq475)
|
306
|
+
* [Biogem: an effective tool-based approach for scaling up open source software development in bioinformatics](http://dx.doi.org/10.1093/bioinformatics/bts080)
|
307
|
+
|
308
|
+
## Biogems.info
|
309
|
+
|
310
|
+
This Biogem is published at [#bio-alignment](http://biogems.info/index.html)
|
259
311
|
|
260
312
|
## Copyright
|
261
313
|
|
262
314
|
Copyright (c) 2012 Pjotr Prins. See LICENSE.txt for further details.
|
263
315
|
|
264
|
-
## Biogems.info
|
265
|
-
|
266
|
-
This exciting Ruby Biogem is published on http://biogems.info/
|
data/VERSION
CHANGED
@@ -1 +1 @@
|
|
1
|
-
0.0.
|
1
|
+
0.0.7
|
data/doc/bio-alignment-design.md
CHANGED
@@ -1,58 +1,59 @@
|
|
1
1
|
# Bio-alignment design
|
2
2
|
|
3
|
-
''A well designed library should be simple and elegant to use...''
|
3
|
+
''A well designed library should be *simple* and elegant to use...''
|
4
4
|
|
5
5
|
## Introduction
|
6
6
|
|
7
|
-
Biological multi-sequence alignments (MSA) are
|
8
|
-
|
9
|
-
|
10
|
-
|
11
|
-
|
12
|
-
|
13
|
-
|
14
|
-
|
15
|
-
|
16
|
-
|
17
|
-
|
18
|
-
say we have a nucleotide sequence with pay load
|
7
|
+
Biological multi-sequence alignments (MSA) are matrices of nucleotide or amino
|
8
|
+
acid sequences with gaps. Despite this rather simple premise, most software
|
9
|
+
fails to make it simple to access these structures. Also most implementations fail
|
10
|
+
to support a 'pay load' of items in the matrix (this is because underlying
|
11
|
+
sequences are String based). The result is that a developer has to track
|
12
|
+
information in multiple places. For example, tracking a base pair quality score
|
13
|
+
requires a second matrix of information. This makes code complex and therefore
|
14
|
+
error prone. With the bio-alignment library, elements of the matrix can carry
|
15
|
+
information, so called 'state'. When the alignment gets edited, i.e. the
|
16
|
+
element gets moved or deleted, the information gets moved or deleted along. For
|
17
|
+
example, say we have a nucleotide sequence with quality pay load
|
19
18
|
|
20
19
|
A G T A
|
21
20
|
| | | |
|
22
21
|
5 9 * 1
|
23
22
|
|
24
|
-
most library implementations will have two strings "AGTA" and "59*1".
|
25
|
-
|
26
|
-
"
|
27
|
-
|
28
|
-
|
23
|
+
most library implementations will have two strings "AGTA" and "59*1". Removing
|
24
|
+
the third nucleotide would mean removing it twice, into first "AGA", and second
|
25
|
+
"591". With bio-alignment this is one action because we have one object for
|
26
|
+
each element that contains both values, e.g. the payload of 'T' is '*'. Moving
|
27
|
+
'T' automatically moves '*'. Simple really.
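The "AGTA" / "59*1" example can be sketched in a few lines of plain Ruby. The `Element` struct here is a hypothetical stand-in for illustration, not the element class that bio-alignment ships.

```ruby
# Hypothetical element type: one object holds the nucleotide and its payload
Element = Struct.new(:base, :quality)

bases     = "AGTA"
qualities = "59*1"
row = bases.chars.zip(qualities.chars).map { |b, q| Element.new(b, q) }

# Removing the third element is a single action; its payload goes with it
row.delete_at(2)

remaining_bases     = row.map(&:base).join
remaining_qualities = row.map(&:quality).join
puts remaining_bases      # AGA
puts remaining_qualities  # 591
```

With two separate strings the same edit would need two deletions that must be kept in step by hand.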
|
29
28
|
|
30
|
-
In addition the bio-alignment library deals with codons and
|
31
|
-
Rather than track multiple matrices, the codon is viewed as
|
32
|
-
and the translated codon as the pay load. Again, when an alignment
|
33
|
-
reordered the code only has to do it in one place.
|
29
|
+
In addition to carrying state, the bio-alignment library deals with codons and
|
30
|
+
codon translation. Rather than track multiple matrices, the codon is viewed as
|
31
|
+
an element, and the translated codon as the pay load. Again, when an alignment
|
32
|
+
gets reordered the code only has to do it in one place.
|
34
33
|
|
35
|
-
Likewise, an alignment column can have a pay load (e.g. quality score
|
36
|
-
|
37
|
-
|
38
|
-
matrix element, column, or row 'attributes'.
|
34
|
+
Likewise, an alignment column can have a pay load (e.g. quality score in a pile
|
35
|
+
up), and an alignment row can have a pay load (e.g. the sequence name). The
|
36
|
+
concept of pay load, normally referred to as 'state', is handled through
|
37
|
+
generic matrix element, column, or row 'attributes'.
|
39
38
|
|
40
|
-
Many of these ideas came from my work on the [BioScala
|
39
|
+
Many of these ideas came from my earlier work on the [BioScala
|
41
40
|
project](https://github.com/pjotrp/bioscala/blob/master/doc/design.txt),
|
42
41
|
The BioScala library has the additional advantage of having type
|
43
|
-
safety throughout
|
42
|
+
safety throughout, but lacks many of the features I have added to the
|
43
|
+
Ruby version.
|
44
44
|
|
45
45
|
## Row or Sequence
|
46
46
|
|
47
|
-
Any sequence for an alignment is simply a list of objects. The
|
48
|
-
|
49
|
-
|
50
|
-
is a good example.
|
47
|
+
Any sequence for an alignment is simply a list of objects. The requirement for
|
48
|
+
any such list is that it should be enumerable and can be indexed. In Ruby
|
49
|
+
terms, the list has to include Enumerable and provide 'each' and '[]' methods.
|
50
|
+
The CodonSequence list, included in this library, is a good example.
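As a rough illustration of that requirement, a home-grown sequence type only needs to include Enumerable and provide `each` and `[]`. The `QualitySequence` class below is an assumption made for this sketch and is not part of the library.

```ruby
# Hypothetical sequence type, not part of bio-alignment
class QualitySequence
  include Enumerable
  attr_reader :id

  def initialize(id, bases, scores)
    @id = id
    @list = bases.chars.zip(scores)   # [[base, score], ...]
  end

  # Enumerable builds map, select, etc. on top of each
  def each(&block)
    @list.each(&block)
  end

  # Index into the list, as the alignment expects
  def [](index)
    @list[index]
  end
end

qseq  = QualitySequence.new("seq1", "AGTA", [5, 9, 0, 1])
first = qseq[0]
high  = qseq.select { |_base, score| score >= 5 }.map { |base, _| base }.join
p first
puts high
```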
|
51
51
|
|
52
52
|
In addition, elements in the list should respond to certain properties (see
|
53
53
|
below).
|
54
54
|
|
55
55
|
```ruby
|
56
|
+
# create a list of codons
|
56
57
|
codons = CodonSequence.new(rec.id,rec.seq)
|
57
58
|
print codons.id
|
58
59
|
# get first codon
|
@@ -70,7 +71,8 @@ acid with
|
|
70
71
|
print codons.seq[0].to_aa
|
71
72
|
```
|
72
73
|
|
73
|
-
in fact, because Sequence is index-able we can write
|
74
|
+
in fact, because bio-alignment demands Sequence is index-able we can write
|
75
|
+
directly
|
74
76
|
|
75
77
|
```ruby
|
76
78
|
print codons[0].to_aa # 'M'
|
@@ -85,14 +87,17 @@ do a fancy
|
|
85
87
|
aaseq = codons.map { | codon | codon.to_aa }.join("")
|
86
88
|
```
|
87
89
|
|
90
|
+
this is getting interesting... Codons, which are three letter nucleotide base
|
91
|
+
pairs, actually act as basic lists, and can be converted to amino acids.
|
92
|
+
|
88
93
|
## Element
|
89
94
|
|
90
95
|
Elements in the list should respond to a gap? method, for an alignment
|
91
96
|
gap, and the undefined? method for a position that is either an
|
92
97
|
element or a gap. Also it should respond to the to_s method.
|
93
98
|
|
94
|
-
|
95
|
-
|
99
|
+
It is important to note that an element can contain *any* pay load, or state. Ruby
|
100
|
+
objects are 'open'. You can even add state at runtime.
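A minimal sketch of such an element follows. `SketchElement` is hypothetical; only the `gap?`, `undefined?` and `to_s` method names come from the text above, and treating 'X' as the undefined marker is an assumption made for this example.

```ruby
# Hypothetical element class; only gap?, undefined? and to_s come from the text
class SketchElement
  GAP       = "-"
  UNDEFINED = "X"   # assumption: 'X' marks an undefined position

  def initialize(char)
    @char = char
  end

  def gap?
    @char == GAP
  end

  def undefined?
    @char == UNDEFINED
  end

  def to_s
    @char
  end
end

elements = "A-GX".chars.map { |c| SketchElement.new(c) }

# Ruby objects are open: add state to one element at run time
marked = elements[2]
marked.instance_variable_set(:@quality, 9)
marked.define_singleton_method(:quality) { @quality }

puts elements.map(&:to_s).join
puts marked.quality
```

Only the one element that was touched gains the `quality` reader; the other elements are unchanged.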
|
96
101
|
|
97
102
|
## Elements and CodonSequence
|
98
103
|
|
@@ -102,24 +107,25 @@ carry state.
|
|
102
107
|
|
103
108
|
The third list type we normally use in an Alignment, next to Sequence and
|
104
109
|
Elements, is the CodonSequence (remember, you can easily roll your own Sequence
|
105
|
-
type).
|
110
|
+
type, just make them Enumerable and indexed).
|
106
111
|
|
107
112
|
## Column
|
108
113
|
|
109
|
-
The column list tracks the columns of the alignment.
|
110
|
-
|
111
|
-
no elements,
|
114
|
+
The column list tracks the columns of the alignment. Again, the requirement is
|
115
|
+
that the list should be Enumerable and indexed. By default, the Column contains
|
116
|
+
no elements; it is filled only when the alignment is transposed. Matrix elements are found
|
117
|
+
by indexing on the sequences (rows).
|
112
118
|
|
113
|
-
One of the
|
119
|
+
One of the features of this library is that the Column access logic is
|
114
120
|
split out into a separate module, which accesses the data in a lazy fashion.
|
115
121
|
Also column state is stored as an 'any object'. I.e. a column can contain
|
116
|
-
any state.
|
122
|
+
any type of state.
|
117
123
|
|
118
|
-
## Matrix
|
124
|
+
## Matrix (MSA)
|
119
125
|
|
120
|
-
The
|
121
|
-
consisting of Elements. Accessing the matrix is by
|
122
|
-
by Element
|
126
|
+
The matrix (multi sequence alignment or MSA) consists of a Column list, and
|
127
|
+
multiple Sequences, in turn consisting of Elements. Accessing the matrix is by
|
128
|
+
Sequence, followed by Element, leading to a matrix style notation
|
123
129
|
|
124
130
|
```ruby
|
125
131
|
require 'bio-alignment'
|
@@ -130,31 +136,34 @@ by Element.
|
|
130
136
|
fasta.each do | rec |
|
131
137
|
aln.sequences << rec
|
132
138
|
end
|
139
|
+
# get first codon element of the fourth sequence
|
140
|
+
codon = aln[3][0]
|
133
141
|
```
|
134
142
|
|
135
143
|
note that MSA understands rec, as long as rec.id and rec.seq exist, and strings
|
136
|
-
(rec.seq is a String). Alternatively we can convert to a Codon sequence by
|
144
|
+
(rec.seq is a String). Alternatively we can first convert to a Codon sequence by
|
137
145
|
|
138
146
|
```ruby
|
139
147
|
fasta.each do | rec |
|
140
148
|
aln.sequences << CodonSequence.new(rec.id,rec.seq)
|
141
149
|
end
|
150
|
+
# get first codon element of the fourth sequence
|
151
|
+
codon = aln[3][0]
|
142
152
|
```
|
143
153
|
|
144
154
|
The Matrix can be accessed in transposed fashion, but accessing the normal
|
145
|
-
matrix and transposed matrix at the same time is not supported.
|
146
|
-
designed to be transaction safe -
|
147
|
-
|
155
|
+
matrix and transposed matrix at the same time is not supported. Note that
|
156
|
+
Matrix editing is not designed to be transaction safe - better to copy the
|
157
|
+
Matrix when editing.
|
148
158
|
|
149
159
|
## Adding functionality
|
150
160
|
|
151
|
-
To ensure that the basic BioAlignment implementation does not get
|
152
|
-
|
153
|
-
modules can be added at run time(!) One advantage is that there is
|
154
|
-
|
155
|
-
|
156
|
-
|
157
|
-
named del_bridges:
|
161
|
+
To ensure that the basic BioAlignment implementation does not get polluted
|
162
|
+
with heaps of methods, extra functionality is added by using modules. These
|
163
|
+
modules can be added at run time(!) One advantage is that there is less name
|
164
|
+
space pollution, the other is that different implementations can be plugged in
|
165
|
+
- using the same interface. For example, here we are going to use an alignment
|
166
|
+
editor named DelBridges, which has a method named del_bridges:
|
158
167
|
|
159
168
|
```ruby
|
160
169
|
require 'bio-alignment/edit/del_bridges'
|
@@ -164,18 +173,19 @@ named del_bridges:
|
|
164
173
|
aln2 = aln.del_bridges
|
165
174
|
```
|
166
175
|
|
167
|
-
in other words, the functionality in DelBridges gets attached to the
|
168
|
-
|
169
|
-
|
170
|
-
|
171
|
-
|
172
|
-
|
176
|
+
in other words, the functionality in DelBridges gets attached to the aln
|
177
|
+
instance at run time, without affecting any other instantiated object(!) Also,
|
178
|
+
when not requiring 'bio-alignment/edit/del_bridges', the functionality is never
|
179
|
+
visible, and never added to the runtime environment. This type of runtime
|
180
|
+
plugin is something you can only do in a dynamic language, such as Ruby. Ruby
|
181
|
+
makes it rather convenient.
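The per-instance plugin mechanism described here is plain `Object#extend`. The sketch below uses a hypothetical `DropGapOnlyColumns` module on an Array of strings as a stand-in for an alignment; it is not the DelBridges editor and does not reproduce its behaviour.

```ruby
# Hypothetical editor module; the real DelBridges lives in bio-alignment
module DropGapOnlyColumns
  # Return a copy of the rows without the columns that consist of gaps only
  def drop_gap_only_columns
    keep = (0...first.size).reject { |i| all? { |row| row[i] == "-" } }
    map { |row| keep.map { |i| row[i] }.join }
  end
end

aln   = ["A-GT", "A-G-", "C-GT"]
other = ["A-GT"]

aln.extend(DropGapOnlyColumns)     # only this instance gains the method
aln2 = aln.drop_gap_only_columns   # functional style: aln itself is unchanged

p aln2
p other.respond_to?(:drop_gap_only_columns)
```

Because the method is attached to a single object, another editor module could offer a method with the same name on a different instance without the two colliding.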
|
173
182
|
|
174
|
-
|
175
|
-
deletion state,
|
183
|
+
You may have created your own style of sequence objects in an alignment. To register a
|
184
|
+
prefab deletion state, extend the sequence with the RowState module:
|
176
185
|
|
177
186
|
```ruby
|
178
187
|
require 'bio-alignment/state'
|
188
|
+
# Use the standard BioRuby sequence object
|
179
189
|
bioseq = Bio::Sequence::NA.new("AGCT")
|
180
190
|
bioseq.extend(State) # add state
|
181
191
|
bioseq.state = RowState.new # set state
|
@@ -183,10 +193,10 @@ deletion state, simply extend the sequence with the RowState module:
|
|
183
193
|
> false
|
184
194
|
```
|
185
195
|
|
186
|
-
That is impressive - the BioRuby Sequence has no deletion state facility
|
187
|
-
just added that, and it can even be used in BioAlignment editors
|
188
|
-
such a state object. See also the scenario "Give deletion state
|
189
|
-
Bio::Sequence object" in the bioruby.feature.
|
196
|
+
That is impressive - the BioRuby Sequence has no deletion state facility by
|
197
|
+
itself. We just added that, and it can even be used in BioAlignment editors
|
198
|
+
which require such a state object. See also the scenario "Give deletion state
|
199
|
+
to a Bio::Sequence object" in the bioruby.feature.
|
190
200
|
|
191
201
|
Note: if we wanted only to allow one plugin per instance at a time, we can
|
192
202
|
create a generic interface with a method of the same name for every
|
@@ -195,9 +205,10 @@ multiple plugins (by default).
|
|
195
205
|
|
196
206
|
## Adding Phylogenetic support
|
197
207
|
|
198
|
-
|
199
|
-
we extend BioAlignment with
|
200
|
-
with the add_tree method.
|
208
|
+
An MSA often comes with a phylogenetic tree. Similar to runtime adding of the
|
209
|
+
delete state module, now we extend BioAlignment with
|
210
|
+
BioAlignment::AlignmentTree. A tree is plugged in with the add_tree method. See
|
211
|
+
the README and features directory for more examples.
|
201
212
|
|
202
213
|
## Methods returning alignments and concurrency
|
203
214
|
|
@@ -216,7 +227,7 @@ in functional style, such as
|
|
216
227
|
```
|
217
228
|
|
218
229
|
where aln2 is a copy (of aln) with columns removed that were marked for
|
219
|
-
deletion. In other words,
|
230
|
+
deletion. In other words, applied ''Functional programming in Ruby.'' If
|
220
231
|
functions can be easily 'piped', and code can be easily copy and pasted into
|
221
232
|
different algorithms, it is likely that the module is written in a functional
|
222
233
|
style.
|
@@ -0,0 +1,31 @@
|
|
1
|
+
require 'bio-alignment/edit/tree_splitter.rb'
|
2
|
+
|
3
|
+
When /^I split the tree$/ do |string|
|
4
|
+
tree = @aln.attach_tree(@tree)
|
5
|
+
@aln.extend TreeSplitter
|
6
|
+
(aln1,aln2) = @aln.split_on_distance
|
7
|
+
aln2.size.should == 5
|
8
|
+
@split1 = aln1
|
9
|
+
@split2 = aln2
|
10
|
+
end
|
11
|
+
|
12
|
+
Then /^I should have found sub\-trees "([^"]*)" and "([^"]*)"$/ do |arg1, arg2|
|
13
|
+
@split2.ids.sort.join(",").should == arg2
|
14
|
+
@split1.ids.sort.join(",").should == arg1
|
15
|
+
end
|
16
|
+
|
17
|
+
When /^I split the tree with a target of (\d+)$/ do |arg1|
|
18
|
+
tree = @aln.attach_tree(@tree)
|
19
|
+
@aln.extend TreeSplitter
|
20
|
+
@split1,@split2 = @aln.split_on_distance(arg1.to_i)
|
21
|
+
end
|
22
|
+
|
23
|
+
Then /^I should have found low\-homology sub\-tree "([^"]*)"$/ do |arg1|
|
24
|
+
@split1.ids.sort.join(",").should == arg1
|
25
|
+
end
|
26
|
+
|
27
|
+
Then /^I should have found high\-homology sub\-tree "([^"]*)"$/ do |arg1|
|
28
|
+
@split2.ids.sort.join(",").should == arg1
|
29
|
+
end
|
30
|
+
|
31
|
+
|
@@ -0,0 +1,66 @@
|
|
1
|
+
@split
|
2
|
+
Feature: Splitting alignments into equal sized branches using phylogenetic tree info
|
3
|
+
|
4
|
+
Sometimes we want to split a large alignment into sub-sets. When an
|
5
|
+
alignment is accompanied by a phylogenetic tree, we can greedily split the
|
6
|
+
tree. With a rooted tree, we start from the root, and walk the tree, taking
|
7
|
+
the shortest edge at every node (a tie may favour splitting). If the tree can
|
8
|
+
be split, so that both sides are similar sized, the job is done (if you want
|
9
|
+
more splits, just repeat the exercise). Essentially one subset shows
|
10
|
+
relatively high homology, the other relatively low homology. This is a crude
|
11
|
+
method, but has the advantage of being quick to calculate and reproducible.
|
12
|
+
If there is no root, we start from the point next to the longest edge.
|
13
|
+
|
14
|
+
We add one 'target_size' parameter to allow for leaving more sequences in the
|
15
|
+
high homology subset. 'target_size' sets the allowed size of the
|
16
|
+
high-homology alignment. For example, setting it to 10 in a 15 sequence
|
17
|
+
alignment, will stop the splitting at 5 sequences, leaving (approx.) 10
|
18
|
+
sequences in the high homology group. Likewise, setting it to 5 will continue
|
19
|
+
splitting until that number is reached.
|
20
|
+
|
21
|
+
In the example below the tree will be split into a branch with similar sequences,
|
22
|
+
and a branch with sequences that are somewhat removed.
|
23
|
+
|
24
|
+
Scenario: Split a tree
|
25
|
+
Given I have a multiple sequence alignment (MSA)
|
26
|
+
"""
|
27
|
+
seq1 ----SNSFSRPTIIFSGCSTACSGK--SELVCGFRSFMLSDV
|
28
|
+
seq2 SSIISNSFSRPTIIFSGCSTACSGK--SEQVCGFR---LSDV
|
29
|
+
seq3 SSIISNSFSRPTIIFSGCSTACSGKLTSEQVCGFR---LSDV
|
30
|
+
seq4 ----PKLFSRPTIIFSGCSTACSGK--SEPVCGFRSFMLSDV
|
31
|
+
seq5 ----------PTIIFSGCSKACSGKGLSELVCGFRSFMLSDV
|
32
|
+
seq6 ----------PTIIFSGCSKACSGK-----FRSFRSFMLSAV
|
33
|
+
seq7 ----------PTIIFSGCSKACSGK-----VCGIFHAVRSFM
|
34
|
+
seq8 ----------PTIIFSGCSKACSGK--SELVCGFRSFMLSAV
|
35
|
+
"""
|
36
|
+
And I have a phylogenetic tree in Newick format
|
37
|
+
"""
|
38
|
+
((seq6:5.3571434,(seq4:4.04762,((seq8:1.1904755,seq5:1.1904755):1.7857151,((seq3:0.0,seq2:0.0):1.1904755,seq1:1.1904755):1.7857151):1.0714293):1.3095236):4.336735,seq7:9.693878);
|
39
|
+
"""
|
40
|
+
When I split the tree
|
41
|
+
"""
|
42
|
+
,--9.69----------------------------------------- seq7 ----------PTIIFSGCSKACSGK-----VCGIFHAVRSFM
|
43
|
+
| ,--1.19----- seq1 ----SNSFSRPTIIFSGCSTACSGK--SELVCGFRSFMLSDV
|
44
|
+
| ,--1.79--| ,-- seq2 SSIISNSFSRPTIIFSGCSTACSGK--SEQVCGFR---LSDV
|
45
|
+
| ,--1.07--| `--1.19--+-- seq3 SSIISNSFSRPTIIFSGCSTACSGKLTSEQVCGFR---LSDV
|
46
|
+
| | `--1.79--+--1.19----- seq5 ----------PTIIFSGCSKACSGKGLSELVCGFRSFMLSDV
|
47
|
+
| ,--1.31--| `--1.19----- seq8 ----------PTIIFSGCSKACSGK--SELVCGFRSFMLSAV
|
48
|
+
`--4.34--| `--4.05----------------------- seq4 ----PKLFSRPTIIFSGCSTACSGK--SEPVCGFRSFMLSDV
|
49
|
+
`--5.36-------------------------------- seq6 ----------PTIIFSGCSKACSGK-----FRSFRSFMLSAV
|
50
|
+
"""
|
51
|
+
Then I should have found sub-trees "seq4,seq6,seq7" and "seq1,seq2,seq3,seq5,seq8"
|
52
|
+
When I split the tree with a target of 2
|
53
|
+
Then I should have found high-homology sub-tree "seq5,seq8"
|
54
|
+
When I split the tree with a target of 3
|
55
|
+
Then I should have found high-homology sub-tree "seq1,seq2,seq3"
|
56
|
+
When I split the tree with a target of 4
|
57
|
+
Then I should have found high-homology sub-tree "seq1,seq2,seq3"
|
58
|
+
When I split the tree with a target of 5
|
59
|
+
Then I should have found high-homology sub-tree "seq1,seq2,seq3,seq5,seq8"
|
60
|
+
When I split the tree with a target of 6
|
61
|
+
Then I should have found high-homology sub-tree "seq1,seq2,seq3,seq4,seq5,seq8"
|
62
|
+
When I split the tree with a target of 7
|
63
|
+
Then I should have found low-homology sub-tree "seq7"
|
64
|
+
When I split the tree with a target of 6
|
65
|
+
Then I should have found low-homology sub-tree "seq6,seq7"
|
66
|
+
|
@@ -27,24 +27,29 @@ Then /^I should be able to traverse the tree$/ do
|
|
27
27
|
root = @aln.root # get the root of the tree
|
28
28
|
root.leaf?.should == false
|
29
29
|
children = root.children
|
30
|
+
# root has one direct leaf
|
30
31
|
children.map { |n| n.name }.sort.should == ["","seq7"]
|
31
32
|
seq7 = children.last
|
32
33
|
seq7.name.should == 'seq7'
|
33
34
|
seq7.leaf?.should == true
|
34
35
|
seq7.parent.should == root
|
36
|
+
# find leaf seq4
|
35
37
|
seq4 = tree.find("seq4")
|
36
38
|
seq4.leaf?.should == true
|
37
|
-
|
39
|
+
# total distance to seq7 9.69+4.34+1.31+4.05 ~ 19.38
|
40
|
+
seq4.distance(seq7).should == 19.387756600000003 # BioRuby does this!
|
38
41
|
end
|
39
42
|
|
40
43
|
Then /^fetch elements from the MSA from each end node in the tree$/ do
|
41
44
|
# walk the tree
|
42
45
|
tree = @aln.attach_tree(@tree)
|
43
46
|
ids = []
|
47
|
+
# Walk the ordered tree and fetch the sequence from the alignment
|
44
48
|
column20 = tree.map { | leaf |
|
45
49
|
ids << leaf.name
|
50
|
+
# we have the ID, now find the alignment
|
46
51
|
seq = @aln.find(leaf.name)
|
47
|
-
#
|
52
|
+
# Return the 20th element (index 19), just for show
|
48
53
|
seq[19]
|
49
54
|
}
|
50
55
|
ids.should == ["seq6", "seq4", "seq8", "seq5", "seq3", "seq2", "seq1", "seq7"]
|
@@ -52,11 +57,35 @@ Then /^fetch elements from the MSA from each end node in the tree$/ do
|
|
52
57
|
end
|
53
58
|
|
54
59
|
Then /^calculate the phylogenetic distance between each element$/ do
|
55
|
-
|
60
|
+
# we did this earlier with
|
61
|
+
tree = @aln.attach_tree(@tree)
|
62
|
+
seq7 = tree.find("seq7")
|
63
|
+
seq4 = tree.find("seq4")
|
64
|
+
# total distance to seq7 9.69+4.34+1.31+4.05 ~ 19.38
|
65
|
+
seq4.distance(seq7).should == 19.387756600000003 # BioRuby does this!
|
66
|
+
end
|
67
|
+
|
68
|
+
Then /^find that the nearest sequence to "([^"]*)" is "([^"]*)"$/ do |arg1, arg2|
|
69
|
+
tree = @aln.attach_tree(@tree)
|
70
|
+
seq = tree.find(arg1)
|
71
|
+
seq.nearest.map{|n|n.to_s}.sort.join(',').should == arg2
|
56
72
|
end
|
57
73
|
|
74
|
+
Then /^find that "([^"]*)" is on the same branch as "([^"]*)"$/ do |arg1, arg2|
|
75
|
+
# really the same as the above
|
76
|
+
tree = @aln.attach_tree(@tree)
|
77
|
+
seq = tree.find(arg1)
|
78
|
+
seq.nearest.map{|n|n.to_s}.sort.join(',').should == arg2
|
79
|
+
end
|
80
|
+
|
81
|
+
|
58
82
|
Then /^draw the MSA with the tree$/ do | string |
|
59
83
|
# textual drawing, like tabtree, or http://blade.nagaokaut.ac.jp/cgi-bin/scat.rb/ruby/ruby-talk/149701
|
84
|
+
# or Biopython's http://biopython.org/DIST/docs/api/Bio.Phylo._utils-pysrc.html#draw_ascii
|
85
|
+
# hg clone https://bitbucket.org/keesey/namesonnodes-sa
|
86
|
+
#
|
87
|
+
# http://cegg.unige.ch/newick_utils
|
88
|
+
# http://code.google.com/p/a3lbmonkeybrain-as3/source/browse/trunk/src/a3lbmonkeybrain/calculia/collections/graphs/exporters/TextCladogramExporter.as?spec=svn26&r=26
|
60
89
|
print string
|
61
90
|
pending # express the regexp above with the code you wish you had
|
62
91
|
end
|
@@ -1,6 +1,8 @@
|
|
1
1
|
@tree
|
2
2
|
Feature: Tree support for alignments
|
3
|
-
Alignments are often accompanied by phylogenetic trees.
|
3
|
+
Alignments are often accompanied by phylogenetic trees. When we
|
4
|
+
have an alignment with its tree, we want to traverse the tree
|
5
|
+
and calculate distances.
|
4
6
|
|
5
7
|
Scenario: Get ordered elements from a tree
|
6
8
|
Given I have a multiple sequence alignment (MSA)
|
@@ -21,6 +23,11 @@ Feature: Tree support for alignments
|
|
21
23
|
Then I should be able to traverse the tree
|
22
24
|
And fetch elements from the MSA from each end node in the tree
|
23
25
|
And calculate the phylogenetic distance between each element
|
26
|
+
And find that the nearest sequence to "seq2" is "seq3"
|
27
|
+
And find that the nearest sequence to "seq5" is "seq8"
|
28
|
+
And find that the nearest sequence to "seq1" is "seq2,seq3"
|
29
|
+
And find that "seq1" is on the same branch as "seq2,seq3"
|
30
|
+
And find that "seq4" is on the same branch as "seq1,seq2,seq3,seq5,seq8"
|
24
31
|
And draw the MSA with the tree
|
25
32
|
"""
|
26
33
|
,--9.69----------------------------------------- seq7 ----------PTIIFSGCSKACSGK-----VCGIFHAVRSFM
|
@@ -15,6 +15,7 @@ module Bio
|
|
15
15
|
include Columns
|
16
16
|
|
17
17
|
attr_accessor :sequences
|
18
|
+
attr_reader :tree
|
18
19
|
|
19
20
|
# Create alignment. seqs can be a list of sequences. If these
|
20
21
|
# are String types, they get converted to the library Sequence
|
@@ -41,6 +42,16 @@ module Bio
|
|
41
42
|
|
42
43
|
alias rows sequences
|
43
44
|
|
45
|
+
# return an array of sequence ids
|
46
|
+
def ids
|
47
|
+
rows.map { |r| r.id }
|
48
|
+
end
|
49
|
+
|
50
|
+
def size
|
51
|
+
rows.size
|
52
|
+
end
|
53
|
+
|
54
|
+
# Return a sequence by index
|
44
55
|
def [] index
|
45
56
|
rows[index]
|
46
57
|
end
|
@@ -59,7 +70,7 @@ module Bio
|
|
59
70
|
each do | seq |
|
60
71
|
return seq if seq.id == name
|
61
72
|
end
|
62
|
-
raise "ERROR: Sequence not found by its name
|
73
|
+
raise "ERROR: Sequence not found by its name, looking for <#{name}>"
|
63
74
|
end
|
64
75
|
|
65
76
|
# copy alignment and allow updating elements
|
@@ -85,6 +96,8 @@ module Bio
|
|
85
96
|
aln.sequences << seq.clone
|
86
97
|
end
|
87
98
|
aln.clone_columns! if @columns
|
99
|
+
# clone the tree
|
100
|
+
@tree = @tree.clone if @tree
|
88
101
|
aln
|
89
102
|
end
|
90
103
|
|
@@ -96,6 +109,19 @@ module Bio
|
|
96
109
|
@tree = Tree::init(tree)
|
97
110
|
@tree
|
98
111
|
end
|
112
|
+
|
113
|
+
# Reduce an alignment, based on the new tree
|
114
|
+
def tree_reduce new_tree
|
115
|
+
names = new_tree.map { | node | node.name }.compact
|
116
|
+
# p names
|
117
|
+
nrows = []
|
118
|
+
names.each do | name |
|
119
|
+
nrows << find(name).clone
|
120
|
+
end
|
121
|
+
new_aln = Alignment.new(nrows)
|
122
|
+
new_aln.attach_tree(new_tree.clone)
|
123
|
+
new_aln
|
124
|
+
end
|
99
125
|
end
|
100
126
|
end
|
101
127
|
end
|
@@ -0,0 +1,58 @@
|
|
1
|
+
module Bio
|
2
|
+
module BioAlignment
|
3
|
+
|
4
|
+
# Split an alignment based on its phylogeny
|
5
|
+
module TreeSplitter
|
6
|
+
|
7
|
+
# Split an alignment using a phylogeny tree. One half contains sequences
|
8
|
+
# that are relatively homologous, the other half contains the rest. This
|
9
|
+
# is described in the split-tree.feature in the features directory.
|
10
|
+
#
|
11
|
+
# The target_size parameter gives the size of the homologous sequence
|
12
|
+
# set. If target_size is nil, the set will be split in half.
|
13
|
+
#
|
14
|
+
# Returns two alignments with their matching trees attached
|
15
|
+
def split_on_distance target_size = nil
|
16
|
+
target_size = size/2+1 if not target_size
|
17
|
+
|
18
|
+
aln1 = clone
|
19
|
+
# Start from the root of the tree (FIXME: what if there is no root?)
|
20
|
+
prev_root = nil
|
21
|
+
new_root = aln1.tree.root
|
22
|
+
while new_root
|
23
|
+
# find the nearest child (shortest edge)
|
24
|
+
near_children = new_root.nearest_children
|
25
|
+
# We possibly have multiple matches, so we are going to split on the
|
26
|
+
# number of leafs, or we leave it like it is, if the split will be
|
27
|
+
# too far from the target
|
28
|
+
prev_root = new_root
|
29
|
+
new_root = near_children.first
|
30
|
+
near_children.each do |c|
|
31
|
+
next if c == new_root
|
32
|
+
# find the nearest match
|
33
|
+
if (c.leaves.size-target_size).abs < (new_root.leaves.size-target_size).abs
|
34
|
+
new_root = c
|
35
|
+
end
|
36
|
+
end
|
37
|
+
# Break out of the loop when we hit the target
|
38
|
+
break if new_root.leaves.size <= target_size
|
39
|
+
end
|
40
|
+
# Now see whether the last step actually was an improvement, otherwise
|
41
|
+
# we take one node up
|
42
|
+
# p [(prev_root.leaves.size-target_size).abs,(new_root.leaves.size-target_size).abs]
|
43
|
+
new_root = prev_root if (prev_root.leaves.size-target_size).abs < (new_root.leaves.size-target_size).abs
|
44
|
+
branch = aln1.tree.clone_subtree(new_root)
|
45
|
+
reduced_tree = aln1.tree.clone_tree_without_branch(new_root)
|
46
|
+
# p branch.map { |n| n.name }.compact
|
47
|
+
# p reduced_tree.map { |n| n.name }.compact
|
48
|
+
|
49
|
+
# Now reduce the alignments themselves to match the trees
|
50
|
+
aln1 = tree_reduce(reduced_tree)
|
51
|
+
aln2 = tree_reduce(branch)
|
52
|
+
return aln1,aln2
|
53
|
+
end
|
54
|
+
|
55
|
+
end
|
56
|
+
end
|
57
|
+
end
|
58
|
+
|
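The node selection inside split_on_distance above can be read as a greedy descent: follow the shortest edges down from the root, prefer the child whose leaf count is closest to target_size when several edges tie, stop once the subtree is small enough, then step back up one node if the previous candidate was closer. A minimal standalone sketch of that idea follows. It assumes a hypothetical SketchNode struct and a hypothetical pick_split_node helper; neither is part of bio-alignment or BioRuby, and edge lengths are stored on the child node purely for illustration.

```ruby
# Hypothetical stand-in for a tree node (NOT the Bio::Tree::Node API).
# edge is the length of the edge leading to this node from its parent.
SketchNode = Struct.new(:name, :edge, :children) do
  def leaf?
    children.empty?
  end

  # All leaves below this node (a leaf returns itself)
  def leaves
    leaf? ? [self] : children.flat_map(&:leaves)
  end

  # Children that share the shortest incoming edge
  def nearest_children
    min = children.map(&:edge).min
    children.select { |c| c.edge == min }
  end
end

# Descend along shortest edges until the subtree holds at most
# target_size leaves, then keep whichever of the last two candidates
# is closer to the target.
def pick_split_node(root, target_size)
  prev = nil
  node = root
  while node && !node.leaf?
    prev = node
    # On ties in edge length, take the child nearest the target size
    # (min_by returns the first element on equal scores)
    node = node.nearest_children.min_by { |c| (c.leaves.size - target_size).abs }
    break if node.leaves.size <= target_size
  end
  if prev && (prev.leaves.size - target_size).abs < (node.leaves.size - target_size).abs
    prev
  else
    node
  end
end

# Illustrative tree: root -> A (0.5), X (0.1); X -> B (0.2), Y (0.1);
# Y -> C (0.1), D (0.1)
SKETCH_Y = SketchNode.new("Y", 0.1, [SketchNode.new("C", 0.1, []),
                                     SketchNode.new("D", 0.1, [])])
SKETCH_X = SketchNode.new("X", 0.1, [SketchNode.new("B", 0.2, []), SKETCH_Y])
SKETCH_ROOT = SketchNode.new("root", nil, [SketchNode.new("A", 0.5, []), SKETCH_X])

puts pick_split_node(SKETCH_ROOT, 2).name   # => Y
puts pick_split_node(SKETCH_ROOT, 3).name   # => X
```

Unlike the loop in the diff, the sketch stops explicitly at a leaf, which is one possible way to avoid walking past the bottom of the tree.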
data/lib/bio-alignment/tree.rb
CHANGED
@@ -37,27 +37,72 @@ module Bio
|
|
37
37
|
# Here we add to BioRuby's Bio::Tree classes
|
38
38
|
class Tree
|
39
39
|
class Node
|
40
|
+
# Add tree information to this node, so it can be queried
|
40
41
|
def inject_tree tree
|
41
42
|
@tree = tree
|
43
|
+
self
|
42
44
|
end
|
43
45
|
|
46
|
+
# Is this Node a leaf?
|
44
47
|
def leaf?
|
45
48
|
children.size == 0
|
46
49
|
end
|
47
50
|
|
51
|
+
# Get the children of this Node
|
48
52
|
def children
|
49
53
|
@tree.children(self)
|
50
54
|
end
|
51
55
|
|
56
|
+
def descendents
|
57
|
+
@tree.descendents(self)
|
58
|
+
end
|
59
|
+
|
60
|
+
# Get the parent of this Node
|
52
61
|
def parent
|
53
62
|
@tree.parent(self)
|
54
63
|
end
|
64
|
+
|
65
|
+
# Get the direct sibling nodes (i.e. parent.children without self)
|
66
|
+
def siblings
|
67
|
+
parent.children - [self]
|
68
|
+
end
|
69
|
+
|
70
|
+
# Return the leaves of this node
|
71
|
+
def leaves
|
72
|
+
@tree.leaves(self)
|
73
|
+
end
|
74
|
+
|
75
|
+
# Find the nearest and dearest, i.e. the leaves attached to the parent
|
76
|
+
# node
|
77
|
+
def nearest
|
78
|
+
@tree.leaves(parent) - [self]
|
79
|
+
end
|
55
80
|
|
56
|
-
# Get the distance to another node
|
81
|
+
# Get the distance to another node
|
57
82
|
def distance other
|
58
83
|
@tree.distance(self,other)
|
59
84
|
end
|
60
|
-
|
85
|
+
|
86
|
+
# Get the child node with the shortest edge - note that if there is more
|
87
|
+
# than one, the first will be picked
|
88
|
+
def nearest_child
|
89
|
+
c = nil
|
90
|
+
children.each do |n|
|
91
|
+
c=n if not c or distance(n)<distance(c)
|
92
|
+
end
|
93
|
+
c
|
94
|
+
end
|
95
|
+
|
96
|
+
# Get the child nodes with the shortest edge - returns an Array
|
97
|
+
def nearest_children
|
98
|
+
min_distance = distance(nearest_child)
|
99
|
+
cs = []
|
100
|
+
children.each do |n|
|
101
|
+
cs << n if distance(n) == min_distance
|
102
|
+
end
|
103
|
+
cs
|
104
|
+
end
|
105
|
+
end # End of injecting Node functionality
|
61
106
|
|
62
107
|
def find name
|
63
108
|
get_node_by_name(name)
|
@@ -65,12 +110,57 @@ module Bio
|
|
65
110
|
|
66
111
|
# Walk the ordered tree leaves, calling into the block, and return an array
|
67
112
|
def map
|
68
|
-
|
69
|
-
|
70
|
-
|
71
|
-
|
113
|
+
leaves.map { | leaf | yield leaf }
|
114
|
+
end
|
115
|
+
|
116
|
+
# Create a new tree from start_node and its descendants (nodes are shared, not copied)
|
117
|
+
def clone_subtree start_node
|
118
|
+
new_tree = self.class.new
|
119
|
+
list = [start_node] + start_node.descendents
|
120
|
+
list.each do |x|
|
121
|
+
new_tree.add_node(x)
|
122
|
+
end
|
123
|
+
each_edge do |node1, node2, edge|
|
124
|
+
if new_tree.include?(node1) and new_tree.include?(node2)
|
125
|
+
new_tree.add_edge(node1, node2, edge)
|
126
|
+
end
|
127
|
+
end
|
128
|
+
new_tree
|
129
|
+
end
|
130
|
+
|
131
|
+
# Clone a tree without the branch starting at node
|
132
|
+
def clone_tree_without_branch node
|
133
|
+
new_tree = self.class.new
|
134
|
+
original = [root] + root.descendents
|
135
|
+
# p "Original",original
|
136
|
+
skip = [node] + node.descendents
|
137
|
+
# p "Skip",skip
|
138
|
+
# p "Retain",root.descendents - skip
|
139
|
+
nodes.each do |x|
|
140
|
+
if not skip.include?(x)
|
141
|
+
new_tree.add_node(x)
|
142
|
+
else
|
143
|
+
end
|
144
|
+
end
|
145
|
+
each_edge do |node1, node2, edge|
|
146
|
+
if new_tree.include?(node1) and new_tree.include?(node2)
|
147
|
+
new_tree.add_edge(node1, node2, edge)
|
148
|
+
end
|
149
|
+
end
|
150
|
+
new_tree
|
151
|
+
end
|
152
|
+
|
153
|
+
def clone
|
154
|
+
new_tree = self.class.new
|
155
|
+
nodes.each do |x|
|
156
|
+
new_tree.add_node(x)
|
157
|
+
end
|
158
|
+
self.each_edge do |node1, node2, edge|
|
159
|
+
if new_tree.include?(node1) and new_tree.include?(node2) then
|
160
|
+
new_tree.add_edge(node1, node2, edge)
|
161
|
+
end
|
72
162
|
end
|
73
|
-
|
163
|
+
new_tree
|
74
164
|
end
|
75
165
|
|
76
166
|
end
|
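Both clone_subtree and clone_tree_without_branch above follow the same pattern: choose the set of nodes to retain, then copy only those edges whose two endpoints are both in that set. The sketch below shows the pattern on a plain Hash adjacency list rather than on Bio::Tree. The sketch_* helper names and the SKETCH_TREE data are hypothetical and exist only for this illustration.

```ruby
# Hypothetical adjacency list: parent name => array of child names
SKETCH_TREE = {
  "root" => ["A", "X"],
  "X"    => ["B", "Y"],
  "Y"    => ["C", "D"]
}.freeze

# All nodes below the given node, children first, then their descendants
def sketch_descendants(tree, node)
  kids = tree.fetch(node, [])
  kids + kids.flat_map { |k| sketch_descendants(tree, k) }
end

# Keep only the edges whose endpoints are both in the retained set,
# comparable to the include? checks before add_edge in the diff
def sketch_filter_edges(tree, keep)
  tree.each_with_object({}) do |(parent, kids), out|
    next unless keep.include?(parent)
    kept = kids.select { |k| keep.include?(k) }
    out[parent] = kept unless kept.empty?
  end
end

# The branch starting at start, as its own tree
def sketch_subtree(tree, start)
  sketch_filter_edges(tree, [start] + sketch_descendants(tree, start))
end

# The tree that remains once the branch starting at node is removed
def sketch_without_branch(tree, root, node)
  skip = [node] + sketch_descendants(tree, node)
  all  = [root] + sketch_descendants(tree, root)
  sketch_filter_edges(tree, all - skip)
end

p sketch_subtree(SKETCH_TREE, "X")
p sketch_without_branch(SKETCH_TREE, "root", "X")
```

Applied to the branch at X, the first call keeps the X and Y edges, and the second keeps only the edge from root to A. The two results together cover every leaf of the original tree exactly once, which is the property split_on_distance relies on when it reduces the alignment to each tree.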
metadata
CHANGED
@@ -1,7 +1,7 @@
|
|
1
1
|
--- !ruby/object:Gem::Specification
|
2
2
|
name: bio-alignment
|
3
3
|
version: !ruby/object:Gem::Version
|
4
|
-
version: 0.0.
|
4
|
+
version: 0.0.7
|
5
5
|
prerelease:
|
6
6
|
platform: ruby
|
7
7
|
authors:
|
@@ -9,11 +9,11 @@ authors:
|
|
9
9
|
autorequire:
|
10
10
|
bindir: bin
|
11
11
|
cert_chain: []
|
12
|
-
date: 2012-
|
12
|
+
date: 2012-06-25 00:00:00.000000000Z
|
13
13
|
dependencies:
|
14
14
|
- !ruby/object:Gem::Dependency
|
15
15
|
name: bio-logger
|
16
|
-
requirement: &
|
16
|
+
requirement: &83191660 !ruby/object:Gem::Requirement
|
17
17
|
none: false
|
18
18
|
requirements:
|
19
19
|
- - ! '>='
|
@@ -21,10 +21,10 @@ dependencies:
|
|
21
21
|
version: '0'
|
22
22
|
type: :runtime
|
23
23
|
prerelease: false
|
24
|
-
version_requirements: *
|
24
|
+
version_requirements: *83191660
|
25
25
|
- !ruby/object:Gem::Dependency
|
26
26
|
name: bio
|
27
|
-
requirement: &
|
27
|
+
requirement: &83191360 !ruby/object:Gem::Requirement
|
28
28
|
none: false
|
29
29
|
requirements:
|
30
30
|
- - ! '>='
|
@@ -32,10 +32,10 @@ dependencies:
|
|
32
32
|
version: 1.4.2
|
33
33
|
type: :runtime
|
34
34
|
prerelease: false
|
35
|
-
version_requirements: *
|
35
|
+
version_requirements: *83191360
|
36
36
|
- !ruby/object:Gem::Dependency
|
37
37
|
name: rake
|
38
|
-
requirement: &
|
38
|
+
requirement: &83190960 !ruby/object:Gem::Requirement
|
39
39
|
none: false
|
40
40
|
requirements:
|
41
41
|
- - ! '>='
|
@@ -43,10 +43,10 @@ dependencies:
|
|
43
43
|
version: '0'
|
44
44
|
type: :development
|
45
45
|
prerelease: false
|
46
|
-
version_requirements: *
|
46
|
+
version_requirements: *83190960
|
47
47
|
- !ruby/object:Gem::Dependency
|
48
48
|
name: bio-bigbio
|
49
|
-
requirement: &
|
49
|
+
requirement: &83190640 !ruby/object:Gem::Requirement
|
50
50
|
none: false
|
51
51
|
requirements:
|
52
52
|
- - ! '>'
|
@@ -54,10 +54,10 @@ dependencies:
|
|
54
54
|
version: 0.1.3
|
55
55
|
type: :development
|
56
56
|
prerelease: false
|
57
|
-
version_requirements: *
|
57
|
+
version_requirements: *83190640
|
58
58
|
- !ruby/object:Gem::Dependency
|
59
59
|
name: cucumber
|
60
|
-
requirement: &
|
60
|
+
requirement: &83190190 !ruby/object:Gem::Requirement
|
61
61
|
none: false
|
62
62
|
requirements:
|
63
63
|
- - ! '>='
|
@@ -65,21 +65,21 @@ dependencies:
|
|
65
65
|
version: '0'
|
66
66
|
type: :development
|
67
67
|
prerelease: false
|
68
|
-
version_requirements: *
|
68
|
+
version_requirements: *83190190
|
69
69
|
- !ruby/object:Gem::Dependency
|
70
70
|
name: rspec
|
71
|
-
requirement: &
|
71
|
+
requirement: &83095350 !ruby/object:Gem::Requirement
|
72
72
|
none: false
|
73
73
|
requirements:
|
74
74
|
- - ~>
|
75
75
|
- !ruby/object:Gem::Version
|
76
|
-
version: 2.
|
76
|
+
version: 2.10.0
|
77
77
|
type: :development
|
78
78
|
prerelease: false
|
79
|
-
version_requirements: *
|
79
|
+
version_requirements: *83095350
|
80
80
|
- !ruby/object:Gem::Dependency
|
81
81
|
name: bundler
|
82
|
-
requirement: &
|
82
|
+
requirement: &83094950 !ruby/object:Gem::Requirement
|
83
83
|
none: false
|
84
84
|
requirements:
|
85
85
|
- - ! '>='
|
@@ -87,10 +87,10 @@ dependencies:
|
|
87
87
|
version: 1.0.21
|
88
88
|
type: :development
|
89
89
|
prerelease: false
|
90
|
-
version_requirements: *
|
90
|
+
version_requirements: *83094950
|
91
91
|
- !ruby/object:Gem::Dependency
|
92
92
|
name: jeweler
|
93
|
-
requirement: &
|
93
|
+
requirement: &83094400 !ruby/object:Gem::Requirement
|
94
94
|
none: false
|
95
95
|
requirements:
|
96
96
|
- - ! '>='
|
@@ -98,7 +98,7 @@ dependencies:
|
|
98
98
|
version: '0'
|
99
99
|
type: :development
|
100
100
|
prerelease: false
|
101
|
-
version_requirements: *
|
101
|
+
version_requirements: *83094400
|
102
102
|
description: Alignment handler for multiple sequence alignments (MSA)
|
103
103
|
email: pjotr.public01@thebird.nl
|
104
104
|
executables:
|
@@ -138,10 +138,12 @@ files:
|
|
138
138
|
- features/edit/mask_serial_mutations.feature
|
139
139
|
- features/pal2nal-feature.rb
|
140
140
|
- features/pal2nal.feature
|
141
|
+
- features/phylogeny/split-tree-feature.rb
|
142
|
+
- features/phylogeny/split-tree.feature
|
143
|
+
- features/phylogeny/tree-feature.rb
|
144
|
+
- features/phylogeny/tree.feature
|
141
145
|
- features/rows-feature.rb
|
142
146
|
- features/rows.feature
|
143
|
-
- features/tree-feature.rb
|
144
|
-
- features/tree.feature
|
145
147
|
- lib/bio-alignment.rb
|
146
148
|
- lib/bio-alignment/alignment.rb
|
147
149
|
- lib/bio-alignment/bioruby.rb
|
@@ -154,6 +156,7 @@ files:
|
|
154
156
|
- lib/bio-alignment/edit/edit_rows.rb
|
155
157
|
- lib/bio-alignment/edit/mask_islands.rb
|
156
158
|
- lib/bio-alignment/edit/mask_serial_mutations.rb
|
159
|
+
- lib/bio-alignment/edit/tree_splitter.rb
|
157
160
|
- lib/bio-alignment/elements.rb
|
158
161
|
- lib/bio-alignment/pal2nal.rb
|
159
162
|
- lib/bio-alignment/rows.rb
|
@@ -183,7 +186,7 @@ required_ruby_version: !ruby/object:Gem::Requirement
|
|
183
186
|
version: '0'
|
184
187
|
segments:
|
185
188
|
- 0
|
186
|
-
hash:
|
189
|
+
hash: 900281341
|
187
190
|
required_rubygems_version: !ruby/object:Gem::Requirement
|
188
191
|
none: false
|
189
192
|
requirements:
|
@@ -192,7 +195,7 @@ required_rubygems_version: !ruby/object:Gem::Requirement
|
|
192
195
|
version: '0'
|
193
196
|
requirements: []
|
194
197
|
rubyforge_project:
|
195
|
-
rubygems_version: 1.8.
|
198
|
+
rubygems_version: 1.8.6
|
196
199
|
signing_key:
|
197
200
|
specification_version: 3
|
198
201
|
summary: Support for multiple sequence alignments (MSA)
|