bio-alignment 0.0.6 → 0.0.7
- data/Gemfile +1 -1
- data/README.md +65 -16
- data/VERSION +1 -1
- data/doc/bio-alignment-design.md +83 -72
- data/features/phylogeny/split-tree-feature.rb +31 -0
- data/features/phylogeny/split-tree.feature +66 -0
- data/features/{tree-feature.rb → phylogeny/tree-feature.rb} +32 -3
- data/features/{tree.feature → phylogeny/tree.feature} +8 -1
- data/lib/bio-alignment/alignment.rb +27 -1
- data/lib/bio-alignment/edit/tree_splitter.rb +58 -0
- data/lib/bio-alignment/tree.rb +97 -7
- metadata +26 -23
data/Gemfile
CHANGED
data/README.md
CHANGED
@@ -1,22 +1,39 @@
 # bio-alignment
 
-
+Matrix style alignment handler for multiple sequence alignments (MSA).
 
-
-sequence object. Support for any nucleotide, amino acid and codon
-sequences that are lists. Any list with payload can be used (e.g.
-nucleotide quality score, codon annotation). The only requirement is
-that the list is iterable and can be indexed.
+[](http://travis-ci.org/pjotrp/bioruby-alignment)
 
-This
+This alignment handler makes no assumptions about the underlying
+sequence object. It supports any nucleotide, amino acid and codon
+sequences that are lists. Any list with payload or state, can be used
+(e.g. nucleotide quality score, codon annotation). The only
+requirement is that the list is Enumerable and can be indexed, i.e.
+inherit Ruby Enumerable and have the [] method.
+
+Features are:
+
+* Matrix notation for alignment object
+* Functional style alignment access and editing
+* Support for BioRuby Sequences
+* Support for BioRuby trees and node distance calculation
+* bio-alignment interacts well with BioRuby structures,
+  including sequence objects and alignment/tree parsers
+
+When possible, BioRuby functionality is merged in. For example, by
+supporting Bio::Sequence objects, standard BioRuby alignment
+functions, sequence readers and writers can be used. By supporting the
+BioRuby Tree object, standard BioRuby tree parsers and writers can be
+used. bio-alignment takes alignment handling with phylogenetic tree
+support to a new level.
+
+bio-alignment is based on Pjotr's experience designing the BioScala
 Alignment handler and BioRuby's PAML support. Read the
 Bio::BioAlignment
 [design
 document](https://github.com/pjotrp/bioruby-alignment/blob/master/doc/bio-alignment-design.md)
 for Ruby.
 
-Note: this software is under active development.
-
 ## Developers
 
 ### Codon alignment example
@@ -40,6 +57,8 @@ aligmment (note codon gaps are represented by '---')
 aln.rows.each do | row |
   fasta.write(row.id, row.to_aa.to_s)
 end
+# get first codon element of the fourth sequence
+p aln[3][0]
 ```
 
 Now add some state - you can define your own row state
@@ -151,8 +170,28 @@ resulting in the codon alignment.
 
 ### Phylogeny
 
-BioAlignment has support for attaching a
-alignment, and traversing the tree
+BioAlignment has support for attaching a phylogenetic tree to an
+alignment, and traversing the tree using an intuitive interface
+
+```ruby
+sole_tree = Bio::Newick.new(string).tree # use BioRuby's tree parser
+tree = aln.attach_tree(sole_tree) # attach the tree
+# now do stuff with the tree, which has improved bio-align support
+root = tree.root
+children = root.children
+children.map { |n| n.name }.sort.should == ["","seq7"]
+seq7 = children.last
+seq4 = tree.find("seq4")
+seq4.distance(seq7).should == 19.387756600000003
+print tree.output_newick # BioRuby Newick output
+```
+
+There are methods for finding sibling nodes, splitting the alignment
+based on the tree, and locating sequences on the same branch. More
+examples can be found in the tests and features. The underlying
+implementation of Bio::Tree is that of BioRuby. We have added an OOP
+layer for traversing the tree by injecting methods into the BioRuby
+object itself.
 
 ### Alignment marking/masking/editing
 
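The Phylogeny example in the README hunk above asserts a distance of 19.387756600000003 between seq4 and seq7. As a rough illustration of where that number comes from, the sketch below rebuilds the Newick tree from the split-tree feature with a hypothetical `SketchNode` class and a hypothetical `node` helper (both are assumptions made for this note, not BioRuby's `Bio::Tree::Node`), and sums the edge lengths along the path between two leaves. It also mimics `nearest` as "leaves of the parent, minus self", which is how tree.rb defines it in this release.

```ruby
# Hypothetical stand-in for a tree node; NOT the BioRuby/bio-alignment class.
class SketchNode
  attr_reader :name, :edge, :children
  attr_accessor :parent

  # edge is the length of the branch that leads up to the parent
  def initialize(name, edge, children = [])
    @name = name
    @edge = edge
    @children = children
    children.each { |child| child.parent = self }
  end

  # All leaves below this node (the node itself when it is a leaf)
  def leaves
    children.empty? ? [self] : children.flat_map(&:leaves)
  end

  # This node followed by all its ancestors, up to the root
  def lineage
    parent ? [self] + parent.lineage : [self]
  end

  # Sum of the edge lengths on the path between two nodes
  def distance(other)
    shared = lineage & other.lineage
    ((lineage - shared) + (other.lineage - shared)).sum(&:edge)
  end

  # Leaves attached to the parent node, excluding this node
  def nearest
    parent.leaves - [self]
  end

  # Depth-first search for a node by name
  def find(wanted)
    return self if name == wanted
    children.each do |child|
      hit = child.find(wanted)
      return hit if hit
    end
    nil
  end
end

def node(name, edge, *children)
  SketchNode.new(name, edge, children)
end

# Tree taken from the Newick string in the split-tree feature
root = node(nil, 0.0,
  node(nil, 4.336735,
    node("seq6", 5.3571434),
    node(nil, 1.3095236,
      node("seq4", 4.04762),
      node(nil, 1.0714293,
        node(nil, 1.7857151, node("seq8", 1.1904755), node("seq5", 1.1904755)),
        node(nil, 1.7857151,
          node(nil, 1.1904755, node("seq3", 0.0), node("seq2", 0.0)),
          node("seq1", 1.1904755))))),
  node("seq7", 9.693878))

seq4 = root.find("seq4")
seq7 = root.find("seq7")
puts seq4.distance(seq7).round(4)              # 4.04762 + 1.3095236 + 4.336735 + 9.693878
puts seq4.nearest.map(&:name).sort.join(",")   # leaves that share seq4's parent
```

The exact last digits of the floating point sum depend on the order of addition, so a comparison with a small tolerance is more robust than `==` when reproducing the value outside BioRuby.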
@@ -249,18 +288,28 @@ where aln2 is a copy of aln with bridging columns deleted.
 
 ### See also
 
-
+For more on the design of bio-alignment, read the
+Bio::BioAlignment
+[design
+document](https://github.com/pjotrp/bioruby-alignment/blob/master/doc/bio-alignment-design.md).
+
+The API documentation can be found
+[online](http://rubygems.org/gems/bio-alignment). For examples see the files in
 [./spec/*.rb](https://github.com/pjotrp/bioruby-alignment/tree/master/spec) and
 [./features/*](https://github.com/pjotrp/bioruby-alignment/tree/master/features).
 
 ## Cite
 
-If you use this software, please cite
+If you use this software, please cite one of
+
+* [BioRuby: bioinformatics software for the Ruby programming language](http://dx.doi.org/10.1093/bioinformatics/btq475)
+* [Biogem: an effective tool-based approach for scaling up open source software development in bioinformatics](http://dx.doi.org/10.1093/bioinformatics/bts080)
+
+## Biogems.info
+
+This Biogem is published at [#bio-alignment](http://biogems.info/index.html)
 
 ## Copyright
 
 Copyright (c) 2012 Pjotr Prins. See LICENSE.txt for further details.
 
-## Biogems.info
-
-This exciting Ruby Biogem is published on http://biogems.info/
data/VERSION
CHANGED
@@ -1 +1 @@
-0.0.
+0.0.7
data/doc/bio-alignment-design.md
CHANGED
@@ -1,58 +1,59 @@
 # Bio-alignment design
 
-''A well designed library should be simple and elegant to use...''
+''A well designed library should be *simple* and elegant to use...''
 
 ## Introduction
 
-Biological multi-sequence alignments (MSA) are
-
-
-
-
-
-
-
-
-
-
-say we have a nucleotide sequence with pay load
+Biological multi-sequence alignments (MSA) are matrices of nucleotide or amino
+acid sequences with gaps. Despite this rather simple premise, most software
+fails to make it simple to access these structures. Also most implementations fail
+to support a 'pay load' of items in the matrix (this is because underlying
+sequences are String based). The result is that a developer has to track
+information in multiple places. For example to track a base pair quality score
+will be a second matrix of information. This makes code complex and therefore
+error prone. With the bio-alignment library, elements of the matrix can carry
+information, so called 'state'. When the alignment gets edited, i.e. the
+element gets moved or deleted, the information gets moved or deleted along. For
+example, say we have a nucleotide sequence with quality pay load
 
 A G T A
 | | | |
 5 9 * 1
 
-most library implementations will have two strings "AGTA" and "59*1".
-
-"
-
-
+most library implementations will have two strings "AGTA" and "59*1". Removing
+the third nucleotide would mean removing it twice, into first "AGA", and second
+"591". With bio-alignment this is one action because we have one object for
+each element that contains both values, e.g. the payload of 'T' is '*'. Moving
+'T' automatically moves '*'. Simple really.
 
-In addition the bio-alignment library deals with codons and
-Rather than track multiple matrices, the codon is viewed as
-and the translated codon as the pay load. Again, when an alignment
-reordered the code only has to do it in one place.
+In addition to carrying state, the bio-alignment library deals with codons and
+codon translation. Rather than track multiple matrices, the codon is viewed as
+an element, and the translated codon as the pay load. Again, when an alignment
+gets reordered the code only has to do it in one place.
 
-Likewise, an alignment column can have a pay load (e.g. quality score
-
-
-matrix element, column, or row 'attributes'.
+Likewise, an alignment column can have a pay load (e.g. quality score in a pile
+up), and an alignment row can have a pay load (e.g. the sequence name). The
+concept of pay load, normally referred to as 'state', is handled through
+generic matrix element, column, or row 'attributes'.
 
-Many of these ideas came from my work on the [BioScala
+Many of these ideas came from my earlier work on the [BioScala
 project](https://github.com/pjotrp/bioscala/blob/master/doc/design.txt),
 The BioScala library has the additional advantage of having type
-safety throughout
+safety throughout, but lacks many of the features I have added to the
+Ruby version.
 
 ## Row or Sequence
 
-Any sequence for an alignment is simply a list of objects. The
-
-
-is a good example.
+Any sequence for an alignment is simply a list of objects. The requirement for
+any such list is that it should be enumerable and can be indexed. In Ruby
+terms, the list has to include Enumerable and provide 'each' and '[]' methods.
+The CodonSequence list, included in this library, is a good example.
 
 In addition, elements in the list should respond to certain properties (see
 below).
 
 ```ruby
+# create a list of codons
 codons = CodonSequence.new(rec.id,rec.seq)
 print codons.id
 # get first codon
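The two requirements spelled out in this hunk (elements that carry their own pay load, and a sequence that includes Enumerable and provides `each` and `[]`) can be sketched together. `ScoredBase`, `ScoredSequence` and `delete_at` are hypothetical names chosen for this note and are not part of bio-alignment; the data is the "AGTA" / "59*1" example from the design document.

```ruby
# Hypothetical names throughout; this is not the bio-alignment implementation.
# One element holds the nucleotide together with its quality score.
ScoredBase = Struct.new(:base, :quality) do
  def to_s
    base
  end
end

# A minimal sequence: Enumerable plus [] is what the design document asks for.
class ScoredSequence
  include Enumerable
  attr_reader :id

  def initialize(id, bases, qualities)
    @id = id
    @elements = bases.chars.zip(qualities.chars).map { |b, q| ScoredBase.new(b, q) }
  end

  def each(&block)
    @elements.each(&block)
  end

  def [](index)
    @elements[index]
  end

  # Return a copy without the element at index; the pay load goes with it
  def delete_at(index)
    kept = @elements.each_with_index.reject { |_, i| i == index }.map(&:first)
    ScoredSequence.new(id, kept.map(&:base).join, kept.map(&:quality).join)
  end
end

seq = ScoredSequence.new("read1", "AGTA", "59*1")
shorter = seq.delete_at(2)
puts shorter.map(&:to_s).join      # AGA
puts shorter.map(&:quality).join   # 591
```

Because base and score live in one object, removing the third position is a single operation, which is the point the design document makes about avoiding a second, parallel matrix.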
@@ -70,7 +71,8 @@ acid with
 print codons.seq[0].to_aa
 ```
 
-in fact, because Sequence is index-able we can write
+in fact, because bio-alignment demands Sequence is index-able we can write
+directly
 
 ```ruby
 print codons[0].to_aa # 'M'
@@ -85,14 +87,17 @@ do a fancy
 aaseq = codons.map { | codon | codon.to_aa }.join("")
 ```
 
+this is getting interesting... Codons, which are three letter nucleotide base
+pairs, actually act as basic lists, and can be converted to amino acids.
+
 ## Element
 
 Elements in the list should respond to a gap? method, for an alignment
 gap, and the undefined? method for a position that is either an
 element or a gap. Also it should respond to the to_s method.
 
-
-
+It is important to note that an element can contain *any* pay load, or state. Ruby
+objects are 'open'. You can even add state at runtime.
 
 ## Elements and CodonSequence
 
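As an illustration of the Element contract in this hunk (`gap?`, `undefined?`, `to_s`, plus state added at runtime), here is a minimal sketch. `SketchElement` and `CarriesState` are hypothetical, and the choice of '-' for a gap and 'X' for an undefined position is an assumption of this note, not something the design document specifies.

```ruby
# Hypothetical element class; '-' as gap and 'X' as undefined are assumptions.
class SketchElement
  def initialize(char)
    @char = char
  end

  def gap?
    @char == '-'
  end

  def undefined?
    @char == 'X'
  end

  def to_s
    @char
  end
end

# Hypothetical mixin that gives a single object a state slot
module CarriesState
  attr_accessor :state
end

element = SketchElement.new('T')
element.extend(CarriesState)       # only this instance gains #state
element.state = { quality: '*' }   # any pay load will do

other = SketchElement.new('-')
puts other.gap?                    # true
puts other.respond_to?(:state)     # false
```

Extending one instance rather than reopening the class keeps every other element free of the extra accessor, which matches the "Ruby objects are 'open'" remark without changing the class for everyone.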
@@ -102,24 +107,25 @@ carry state.
 
 The third list type we normally use in an Alignment, next to Sequence and
 Elements, is the CodonSequence (remember, you can easily roll your own Sequence
-type).
+type, just make them Enumerable and indexed).
 
 ## Column
 
-The column list tracks the columns of the alignment.
-
-no elements,
+The column list tracks the columns of the alignment. Again, the requirement is
+that the list should be Enumerable and indexed. By default, the Column contains
+no elements, only when the alignment is transposed. Matrix elements are found
+by indexing on the sequences (rows).
 
-One of the
+One of the features of this library is that the Column access logic is
 split out into a separate module, which accesses the data in a lazy fashion.
 Also column state is stored as an 'any object'. I.e. a column can contain
-any state.
+any type of state.
 
-## Matrix
+## Matrix (MSA)
 
-The
-consisting of Elements. Accessing the matrix is by
-by Element
+The matrix (multi sequence alignment or MSA) consists of a Column list, and
+multiple Sequences, in turn consisting of Elements. Accessing the matrix is by
+Sequence, followed by Element, leading to a matrix style notation
 
 ```ruby
 require 'bio-alignment'
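The row-then-element notation and the lazily accessed columns described in this hunk could look roughly like the sketch below. `SketchAlignment` is a hypothetical stand-in, and using `Enumerator::Lazy` for the column view is an assumption about what "lazy fashion" might mean, not the gem's actual Columns module.

```ruby
# Hypothetical stand-in: rows are stored, columns are computed on demand.
class SketchAlignment
  attr_reader :rows

  def initialize(rows)
    @rows = rows
  end

  # Matrix notation: aln[row][col]
  def [](index)
    rows[index]
  end

  # Lazily enumerate columns; a column is only built when it is requested
  def columns
    (0...rows.first.size).lazy.map { |col| rows.map { |row| row[col] } }
  end
end

aln = SketchAlignment.new(%w[AGTA AG-A CGTA])
puts aln[1][2]                 # third element of the second row: -
puts aln.columns.first.join    # first column, top to bottom: AAC
```

Keeping the rows as the only stored data means an edit has one place to happen; the column view is derived and never needs to be kept in sync.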
@@ -130,31 +136,34 @@ by Element.
 fasta.each do | rec |
   aln.sequences << rec
 end
+# get first codon element of the fourth sequence
+codon = aln[3][0]
 ```
 
 note that MSA understands rec, as long as rec.id and rec.seq exist, and strings
-(rec.seq is a String). Alternatively we can convert to a Codon sequence by
+(rec.seq is a String). Alternatively we can first convert to a Codon sequence by
 
 ```ruby
 fasta.each do | rec |
   aln.sequences << CodonSequence.new(rec.id,rec.seq)
 end
+# get first codon element of the fourth sequence
+codon = aln[3][0]
 ```
 
 The Matrix can be accessed in transposed fashion, but accessing the normal
-matrix and transposed matrix at the same time is not supported.
-designed to be transaction safe -
-
+matrix and transposed matrix at the same time is not supported. Note that
+Matrix editing is not designed to be transaction safe - better to copy the
+Matrix when editing.
 
 ## Adding functionality
 
-To ascertain that the basic BioAlignment implementation does not get
-
-modules can be added at run time(!) One advantage is that there is
-
-
-
-named del_bridges:
+To ascertain that the basic BioAlignment implementation does not get polluted
+with heaps of methods, extra functionality is added by using modules. These
+modules can be added at run time(!) One advantage is that there is less name
+space pollution, the other is that different implementations can be plugged in
+- using the same interface. For example, here we are going to use an alignment
+editor named DelBridges, which has a method named del_bridges:
 
 ```ruby
 require 'bio-alignment/edit/del_bridges'
@@ -164,18 +173,19 @@ named del_bridges:
 aln2 = aln.del_bridges
 ```
 
-in other words, the functionality in DelBridges gets attached to the
-
-
-
-
-
+in other words, the functionality in DelBridges gets attached to the aln
+instance at run time, without affecting any other instantiated object(!) Also,
+when not requiring 'bio-alignment/edit/del_bridges', the functionality is never
+visible, and never added to the runtime environment. This type of runtime
+plugin is something you can only do in a dynamic language, such as Ruby. Ruby
+makes it rather convenient.
 
-
-deletion state,
+You may have created your own style sequence objects in an alignment. To register a
+prefab deletion state, extend the sequence with the RowState module:
 
 ```ruby
 require 'bio-alignment/state'
+# Use the standard BioRuby sequence object
 bioseq = Bio::Sequence::NA.new("AGCT")
 bioseq.extend(State) # add state
 bioseq.state = RowState.new # set state
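The per-instance plugin mechanism described in this hunk relies on Ruby's `Object#extend`. The sketch below shows the same pattern with a hypothetical `DropGapColumns` editor working on a plain Array of strings; it is deliberately a different, simpler editor than the gem's DelBridges, whose logic is not reproduced here.

```ruby
# Hypothetical editor module, used only to illustrate extend on one instance.
module DropGapColumns
  # Return new rows with every all-gap column removed; the receiver is not modified
  def drop_gap_columns
    keep = (0...first.size).reject { |i| all? { |row| row[i] == '-' } }
    map { |row| keep.map { |i| row[i] }.join }
  end
end

aln = ["A-GT", "A-GA", "C-GT"]
aln.extend(DropGapColumns)      # plugin attached to this instance only
aln2 = aln.drop_gap_columns     # functional style: aln itself is left untouched

untouched = ["A-GT"]
puts aln2.inspect
puts untouched.respond_to?(:drop_gap_columns)   # false
```

Returning a new object instead of editing in place is the same choice the design document makes for `del_bridges`, where aln2 is a copy of aln.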
@@ -183,10 +193,10 @@ deletion state, simply extend the sequence with the RowState module:
 > false
 ```
 
-That is impressive - the BioRuby Sequence has no deletion state facility
-just added that, and it can even be used in BioAlignment editors
-such a state object. See also the scenario "Give deletion state
-Bio::Sequence object" in the bioruby.feature.
+That is impressive - the BioRuby Sequence has no deletion state facility by
+itself. We just added that, and it can even be used in BioAlignment editors
+which require such a state object. See also the scenario "Give deletion state
+to a Bio::Sequence object" in the bioruby.feature.
 
 Note: if we wanted only to allow one plugin per instance at a time, we can
 create a generic interface with a method of the same name for every
@@ -195,9 +205,10 @@ multiple plugins (by default).
 
 ## Adding Phylogenetic support
 
-
-we extend BioAlignment with
-with the add_tree method.
+An MSA often comes with a phylogenetic tree. Similar to runtime adding of the
+delete state module, now we extend BioAlignment with
+BioAlignment::AlignmentTree. A tree is plugged in with the add_tree method. See
+the README and features directory for more examples.
 
 ## Methods returning alignments and concurrency
 
@@ -216,7 +227,7 @@ in functional style, such as
 ```
 
 where aln2 is a copy (of aln) with columns removed that were marked for
-deletion. In other words,
+deletion. In other words, applied ''Functional programming in Ruby.'' If
 functions can be easily 'piped', and code can be easily copy and pasted into
 different algorithms, it is likely that the module is written in a functional
 style.
data/features/phylogeny/split-tree-feature.rb
@@ -0,0 +1,31 @@
+require 'bio-alignment/edit/tree_splitter.rb'
+
+When /^I split the tree$/ do |string|
+  tree = @aln.attach_tree(@tree)
+  @aln.extend TreeSplitter
+  (aln1,aln2) = @aln.split_on_distance
+  aln2.size.should == 5
+  @split1 = aln1
+  @split2 = aln2
+end
+
+Then /^I should have found sub\-trees "([^"]*)" and "([^"]*)"$/ do |arg1, arg2|
+  @split2.ids.sort.join(",").should == arg2
+  @split1.ids.sort.join(",").should == arg1
+end
+
+When /^I split the tree with a target of (\d+)$/ do |arg1|
+  tree = @aln.attach_tree(@tree)
+  @aln.extend TreeSplitter
+  @split1,@split2 = @aln.split_on_distance(arg1.to_i)
+end
+
+Then /^I should have found low\-homology sub\-tree "([^"]*)"$/ do |arg1|
+  @split1.ids.sort.join(",").should == arg1
+end
+
+Then /^I should have found high\-homology sub\-tree "([^"]*)"$/ do |arg1|
+  @split2.ids.sort.join(",").should == arg1
+end
+
+
data/features/phylogeny/split-tree.feature
@@ -0,0 +1,66 @@
+@split
+Feature: Splitting alignments into equal sized branches using phylogenetic tree info
+
+  Sometimes we want to split a large alignment into sub-sets. When an
+  alignment is accompanied by a phylogenetic tree, we can greedily split the
+  tree. With a rooted tree, we start from the root, and walk the tree, taking
+  the shortest edge at every node (a tie may favour splitting). If the tree can
+  be split, so that both sides are similar sized, the job is done (if you want
+  more splits, just repeat the exercise). Essentially one subset shows
+  relatively high homology, the other relatively low homology. This is a crude
+  method, but has the advantage of being quick to calculate and reproducible.
+  If there is no root, we start from the point next to the longest edge.
+
+  We add one 'target_size' parameter to allow for leaving more sequences in the
+  high homology subset. 'target_size' sets the allowed size of the
+  high-homology alignment. For example, setting it to 10 in a 15 sequence
+  alignment, will stop the splitting at 5 sequences, leaving (approx.) 10
+  sequences in the high homology group. Likewise, setting it to 5 will continue
+  splitting until that number is reached.
+
+  In below example the tree will be split in a branch with similar sequences,
+  and a branch with sequences that are somewhat removed.
+
+  Scenario: Split a tree
+    Given I have a multiple sequence alignment (MSA)
+      """
+      seq1 ----SNSFSRPTIIFSGCSTACSGK--SELVCGFRSFMLSDV
+      seq2 SSIISNSFSRPTIIFSGCSTACSGK--SEQVCGFR---LSDV
+      seq3 SSIISNSFSRPTIIFSGCSTACSGKLTSEQVCGFR---LSDV
+      seq4 ----PKLFSRPTIIFSGCSTACSGK--SEPVCGFRSFMLSDV
+      seq5 ----------PTIIFSGCSKACSGKGLSELVCGFRSFMLSDV
+      seq6 ----------PTIIFSGCSKACSGK-----FRSFRSFMLSAV
+      seq7 ----------PTIIFSGCSKACSGK-----VCGIFHAVRSFM
+      seq8 ----------PTIIFSGCSKACSGK--SELVCGFRSFMLSAV
+      """
+    And I have a phylogenetic tree in Newick format
+      """
+      ((seq6:5.3571434,(seq4:4.04762,((seq8:1.1904755,seq5:1.1904755):1.7857151,((seq3:0.0,seq2:0.0):1.1904755,seq1:1.1904755):1.7857151):1.0714293):1.3095236):4.336735,seq7:9.693878);
+      """
+    When I split the tree
+      """
+      ,--9.69----------------------------------------- seq7 ----------PTIIFSGCSKACSGK-----VCGIFHAVRSFM
+      |                                   ,--1.19----- seq1 ----SNSFSRPTIIFSGCSTACSGK--SELVCGFRSFMLSDV
+      |                          ,--1.79--|        ,-- seq2 SSIISNSFSRPTIIFSGCSTACSGK--SEQVCGFR---LSDV
+      |                 ,--1.07--|        `--1.19--+-- seq3 SSIISNSFSRPTIIFSGCSTACSGKLTSEQVCGFR---LSDV
+      |                 |        `--1.79--+--1.19----- seq5 ----------PTIIFSGCSKACSGKGLSELVCGFRSFMLSDV
+      |        ,--1.31--|                 `--1.19----- seq8 ----------PTIIFSGCSKACSGK--SELVCGFRSFMLSAV
+      `--4.34--|        `--4.05----------------------- seq4 ----PKLFSRPTIIFSGCSTACSGK--SEPVCGFRSFMLSDV
+               `--5.36-------------------------------- seq6 ----------PTIIFSGCSKACSGK-----FRSFRSFMLSAV
+      """
+    Then I should have found sub-trees "seq4,seq6,seq7" and "seq1,seq2,seq3,seq5,seq8"
+    When I split the tree with a target of 2
+    Then I should have found high-homology sub-tree "seq5,seq8"
+    When I split the tree with a target of 3
+    Then I should have found high-homology sub-tree "seq1,seq2,seq3"
+    When I split the tree with a target of 4
+    Then I should have found high-homology sub-tree "seq1,seq2,seq3"
+    When I split the tree with a target of 5
+    Then I should have found high-homology sub-tree "seq1,seq2,seq3,seq5,seq8"
+    When I split the tree with a target of 6
+    Then I should have found high-homology sub-tree "seq1,seq2,seq3,seq4,seq5,seq8"
+    When I split the tree with a target of 7
+    Then I should have found low-homology sub-tree "seq7"
+    When I split the tree with a target of 6
+    Then I should have found low-homology sub-tree "seq6,seq7"
+
data/features/{tree-feature.rb → phylogeny/tree-feature.rb}
@@ -27,24 +27,29 @@ Then /^I should be able to traverse the tree$/ do
   root = @aln.root # get the root of the tree
   root.leaf?.should == false
   children = root.children
+  # root has one direct leaf
   children.map { |n| n.name }.sort.should == ["","seq7"]
   seq7 = children.last
   seq7.name.should == 'seq7'
   seq7.leaf?.should == true
   seq7.parent.should == root
+  # find leaf seq4
   seq4 = tree.find("seq4")
   seq4.leaf?.should == true
-
+  # total distance to seq7 9.69+4.34+1.31+4.05 ~ 19.38
+  seq4.distance(seq7).should == 19.387756600000003 # BioRuby does this!
 end
 
 Then /^fetch elements from the MSA from each end node in the tree$/ do
   # walk the tree
   tree = @aln.attach_tree(@tree)
   ids = []
+  # Walk the ordered tree and fetch the sequence from the alignment
   column20 = tree.map { | leaf |
     ids << leaf.name
+    # we have the ID, now find the alignment
     seq = @aln.find(leaf.name)
-    #
+    # Return the 20th element (index 19), just for show
     seq[19]
   }
   ids.should == ["seq6", "seq4", "seq8", "seq5", "seq3", "seq2", "seq1", "seq7"]
@@ -52,11 +57,35 @@ Then /^fetch elements from the MSA from each end node in the tree$/ do
 end
 
 Then /^calculate the phylogenetic distance between each element$/ do
-
+  # we did this earlier with
+  tree = @aln.attach_tree(@tree)
+  seq7 = tree.find("seq7")
+  seq4 = tree.find("seq4")
+  # total distance to seq7 9.69+4.34+1.31+4.05 ~ 19.38
+  seq4.distance(seq7).should == 19.387756600000003 # BioRuby does this!
+end
+
+Then /^find that the nearest sequence to "([^"]*)" is "([^"]*)"$/ do |arg1, arg2|
+  tree = @aln.attach_tree(@tree)
+  seq = tree.find(arg1)
+  seq.nearest.map{|n|n.to_s}.sort.join(',').should == arg2
 end
 
+Then /^find that "([^"]*)" is on the same branch as "([^"]*)"$/ do |arg1, arg2|
+  # really the same as the above
+  tree = @aln.attach_tree(@tree)
+  seq = tree.find(arg1)
+  seq.nearest.map{|n|n.to_s}.sort.join(',').should == arg2
+end
+
+
 Then /^draw the MSA with the tree$/ do | string |
   # textual drawing, like tabtree, or http://blade.nagaokaut.ac.jp/cgi-bin/scat.rb/ruby/ruby-talk/149701
+  # or BioPythons http://biopython.org/DIST/docs/api/Bio.Phylo._utils-pysrc.html#draw_ascii
+  # hg clone https://bitbucket.org/keesey/namesonnodes-sa
+  #
+  # http://cegg.unige.ch/newick_utils
+  # http://code.google.com/p/a3lbmonkeybrain-as3/source/browse/trunk/src/a3lbmonkeybrain/calculia/collections/graphs/exporters/TextCladogramExporter.as?spec=svn26&r=26
   print string
   pending # express the regexp above with the code you wish you had
 end
data/features/{tree.feature → phylogeny/tree.feature}
@@ -1,6 +1,8 @@
 @tree
 Feature: Tree support for alignments
-  Alignments are often accompanied by phylogenetic trees.
+  Alignments are often accompanied by phylogenetic trees. When we
+  have an alignment with its tree, we want to traverse the tree
+  and calculate distances.
 
   Scenario: Get ordered elements from a tree
     Given I have a multiple sequence alignment (MSA)
@@ -21,6 +23,11 @@ Feature: Tree support for alignments
     Then I should be able to traverse the tree
     And fetch elements from the MSA from each end node in the tree
     And calculate the phylogenetic distance between each element
+    And find that the nearest sequence to "seq2" is "seq3"
+    And find that the nearest sequence to "seq5" is "seq8"
+    And find that the nearest sequence to "seq1" is "seq2,seq3"
+    And find that "seq1" is on the same branch as "seq2,seq3"
+    And find that "seq4" is on the same branch as "seq1,seq2,seq3,seq5,seq8"
     And draw the MSA with the tree
       """
       ,--9.69----------------------------------------- seq7 ----------PTIIFSGCSKACSGK-----VCGIFHAVRSFM
data/lib/bio-alignment/alignment.rb
@@ -15,6 +15,7 @@ module Bio
       include Columns
 
       attr_accessor :sequences
+      attr_reader :tree
 
       # Create alignment. seqs can be a list of sequences. If these
       # are String types, they get converted to the library Sequence
@@ -41,6 +42,16 @@ module Bio
 
       alias rows sequences
 
+      # return an array of sequence ids
+      def ids
+        rows.map { |r| r.id }
+      end
+
+      def size
+        rows.size
+      end
+
+      # Return a sequence by index
       def [] index
         rows[index]
       end
@@ -59,7 +70,7 @@ module Bio
         each do | seq |
           return seq if seq.id == name
         end
-        raise "ERROR: Sequence not found by its name
+        raise "ERROR: Sequence not found by its name, looking for <#{name}>"
       end
 
       # copy alignment and allow updating elements
@@ -85,6 +96,8 @@ module Bio
           aln.sequences << seq.clone
         end
         aln.clone_columns! if @columns
+        # clone the tree
+        @tree = @tree.clone if @tree
         aln
       end
 
@@ -96,6 +109,19 @@ module Bio
         @tree = Tree::init(tree)
         @tree
       end
+
+      # Reduce an alignment, based on the new tree
+      def tree_reduce new_tree
+        names = new_tree.map { | node | node.name }.compact
+        # p names
+        nrows = []
+        names.each do | name |
+          nrows << find(name).clone
+        end
+        new_aln = Alignment.new(nrows)
+        new_aln.attach_tree(new_tree.clone)
+        new_aln
+      end
     end
   end
 end
@@ -0,0 +1,58 @@
|
|
1
|
+
module Bio
|
2
|
+
module BioAlignment
|
3
|
+
|
4
|
+
# Split an alignment based on its phylogeny
|
5
|
+
module TreeSplitter
|
6
|
+
|
7
|
+
# Split an alignment using a phylogeny tree. One half contains sequences
|
8
|
+
# that are relatively homologues, the other half contains the rest. This
|
9
|
+
# is described in the tree-split.feature in the features directory.
|
10
|
+
#
|
11
|
+
# The target_size parameter gives the size of the homologues sequence
|
12
|
+
# set. If target_size is nil, the set will be split in half.
|
13
|
+
#
|
14
|
+
# Returns two alignments with their matching trees attached
|
15
|
+
def split_on_distance target_size = nil
|
16
|
+
target_size = size/2+1 if not target_size
|
17
|
+
|
18
|
+
aln1 = clone
|
19
|
+
# Start from the root of the tree (FIXME: what if there is no root?)
|
20
|
+
prev_root = nil
|
21
|
+
new_root = aln1.tree.root
|
22
|
+
while new_root
|
23
|
+
# find the nearest child (shortest edge)
|
24
|
+
near_children = new_root.nearest_children
|
25
|
+
# We possibly have multiple matches, so we are going to split on the
|
26
|
+
# number of leafs, or we leave it like it is, if the split will be
|
27
|
+
# too far from the target
|
28
|
+
prev_root = new_root
|
29
|
+
new_root = near_children.first
|
30
|
+
near_children.each do |c|
|
31
|
+
next if c == new_root
|
32
|
+
# find the nearest match
|
33
|
+
if (c.leaves.size-target_size).abs < (new_root.leaves.size-target_size).abs
|
34
|
+
new_root = c
|
35
|
+
end
|
36
|
+
end
|
37
|
+
# Break out of the loop when we hit the target
|
38
|
+
break if new_root.leaves.size <= target_size
|
39
|
+
end
|
40
|
+
# Now see if whether the last step actually was an improvement, otherwise
|
41
|
+
# we take one node up
|
42
|
+
# p [(prev_root.leaves.size-target_size).abs,(new_root.leaves.size-target_size).abs]
|
43
|
+
new_root = prev_root if (prev_root.leaves.size-target_size).abs < (new_root.leaves.size-target_size).abs
|
44
|
+
branch = aln1.tree.clone_subtree(new_root)
|
45
|
+
reduced_tree = aln1.tree.clone_tree_without_branch(new_root)
|
46
|
+
# p branch.map { |n| n.name }.compact
|
47
|
+
# p reduced_tree.map { |n| n.name }.compact
|
48
|
+
|
49
|
+
# Now reduce the alignments themselves to match the trees
|
50
|
+
aln1 = tree_reduce(reduced_tree)
|
51
|
+
aln2 = tree_reduce(branch)
|
52
|
+
return aln1,aln2
|
53
|
+
end
|
54
|
+
|
55
|
+
end
|
56
|
+
end
|
57
|
+
end
|
58
|
+
|
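The node selection in `split_on_distance` above can be sketched on a toy tree. This is a minimal illustration, not the gem's code: `ToyNode`, `leaf`, and `pick_split_node` are hypothetical names, the tree is a plain Struct rather than `Bio::Tree`, and edge lengths are stored on the child as `dist`. Unlike the original loop, the sketch stops explicitly when it reaches a leaf.

```ruby
# Toy node (hypothetical, NOT Bio::Tree::Node). `dist` is the length of the
# edge to the parent; `children` is an Array of ToyNode.
ToyNode = Struct.new(:name, :dist, :children) do
  # All leaves below this node (a leaf returns itself).
  def leaves
    return [self] if children.empty?
    children.flat_map(&:leaves)
  end

  # All children that share the shortest edge length.
  def nearest_children
    min = children.map(&:dist).min
    children.select { |c| c.dist == min }
  end
end

# Hypothetical helper to build a leaf.
def leaf(name, dist)
  ToyNode.new(name, dist, [])
end

# Walk down from the root along the shortest edges and return the node whose
# leaf count is closest to target_size.
def pick_split_node(root, target_size)
  prev = nil
  node = root
  until node.children.empty?
    near = node.nearest_children
    prev = node
    # Among equally near children, prefer the leaf count closest to target.
    node = near.min_by { |c| (c.leaves.size - target_size).abs }
    # Stop as soon as the candidate branch is no larger than the target.
    break if node.leaves.size <= target_size
  end
  # Step back up one node if the last step moved away from the target.
  if prev && (prev.leaves.size - target_size).abs < (node.leaves.size - target_size).abs
    node = prev
  end
  node
end

inner = ToyNode.new("inner", 0.1, [leaf("s1", 0.1), leaf("s2", 0.1)])
clade = ToyNode.new("clade", 0.2, [inner, leaf("s3", 0.3)])
root  = ToyNode.new("root", 0.0, [clade, leaf("s4", 0.9), leaf("s5", 1.0)])

default_target = root.leaves.size / 2 + 1   # same default as size/2+1
chosen = pick_split_node(root, default_target)
puts chosen.name
puts chosen.leaves.map(&:name).join(",")
```

With five leaves the default target is 3, so the walk stops at the first branch reached through the shortest edge that has at most three leaves.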
data/lib/bio-alignment/tree.rb
CHANGED
@@ -37,27 +37,72 @@ module Bio
|
|
37
37
|
# Here we add to BioRuby's Bio::Tree classes
|
38
38
|
class Tree
|
39
39
|
class Node
|
40
|
+
# Add tree information to this node, so it can be queried
|
40
41
|
def inject_tree tree
|
41
42
|
@tree = tree
|
43
|
+
self
|
42
44
|
end
|
43
45
|
|
46
|
+
# Is this Node a leaf?
|
44
47
|
def leaf?
|
45
48
|
children.size == 0
|
46
49
|
end
|
47
50
|
|
51
|
+
# Get the children of this Node
|
48
52
|
def children
|
49
53
|
@tree.children(self)
|
50
54
|
end
|
51
55
|
|
56
|
+
def descendents
|
57
|
+
@tree.descendents(self)
|
58
|
+
end
|
59
|
+
|
60
|
+
# Get the parent of this Node
|
52
61
|
def parent
|
53
62
|
@tree.parent(self)
|
54
63
|
end
|
64
|
+
|
65
|
+
# Get the direct sibling nodes (i.e. parent.children minus self)
|
66
|
+
def siblings
|
67
|
+
parent.children - [self]
|
68
|
+
end
|
69
|
+
|
70
|
+
# Return the leaves of this node
|
71
|
+
def leaves
|
72
|
+
@tree.leaves(self)
|
73
|
+
end
|
74
|
+
|
75
|
+
# Find the nearest and dearest, i.e. the leaves attached to the parent
|
76
|
+
# node
|
77
|
+
def nearest
|
78
|
+
@tree.leaves(parent) - [self]
|
79
|
+
end
|
55
80
|
|
56
|
-
# Get the distance to another node
|
81
|
+
# Get the distance to another node
|
57
82
|
def distance other
|
58
83
|
@tree.distance(self,other)
|
59
84
|
end
|
60
|
-
|
85
|
+
|
86
|
+
# Get child node with the shortest edge - note that if there are more
|
87
|
+
# than one, the first will be picked
|
88
|
+
def nearest_child
|
89
|
+
c = nil
|
90
|
+
children.each do |n|
|
91
|
+
c=n if not c or distance(n)<distance(c)
|
92
|
+
end
|
93
|
+
c
|
94
|
+
end
|
95
|
+
|
96
|
+
# Get the child nodes with the shortest edge - returns an Array
|
97
|
+
def nearest_children
|
98
|
+
min_distance = distance(nearest_child)
|
99
|
+
cs = []
|
100
|
+
children.each do |n|
|
101
|
+
cs << n if distance(n) == min_distance
|
102
|
+
end
|
103
|
+
cs
|
104
|
+
end
|
105
|
+
end # End of injecting Node functionality
|
61
106
|
|
62
107
|
def find name
|
63
108
|
get_node_by_name(name)
|
@@ -65,12 +110,57 @@ module Bio
|
|
65
110
|
|
66
111
|
# Walk the ordered tree leaves, calling into the block, and return an array
|
67
112
|
def map
|
68
|
-
|
69
|
-
|
70
|
-
|
71
|
-
|
113
|
+
leaves.map { | leaf | yield leaf }
|
114
|
+
end
|
115
|
+
|
116
|
+
# Clone the subtree starting at start_node (node objects are shared, not copied)
|
117
|
+
def clone_subtree start_node
|
118
|
+
new_tree = self.class.new
|
119
|
+
list = [start_node] + start_node.descendents
|
120
|
+
list.each do |x|
|
121
|
+
new_tree.add_node(x)
|
122
|
+
end
|
123
|
+
each_edge do |node1, node2, edge|
|
124
|
+
if new_tree.include?(node1) and new_tree.include?(node2)
|
125
|
+
new_tree.add_edge(node1, node2, edge)
|
126
|
+
end
|
127
|
+
end
|
128
|
+
new_tree
|
129
|
+
end
|
130
|
+
|
131
|
+
# Clone a tree without the branch starting at node
|
132
|
+
def clone_tree_without_branch node
|
133
|
+
new_tree = self.class.new
|
134
|
+
original = [root] + root.descendents
|
135
|
+
# p "Original",original
|
136
|
+
skip = [node] + node.descendents
|
137
|
+
# p "Skip",skip
|
138
|
+
# p "Retain",root.descendents - skip
|
139
|
+
nodes.each do |x|
|
140
|
+
if not skip.include?(x)
|
141
|
+
new_tree.add_node(x)
|
142
|
+
else
|
143
|
+
end
|
144
|
+
end
|
145
|
+
each_edge do |node1, node2, edge|
|
146
|
+
if new_tree.include?(node1) and new_tree.include?(node2)
|
147
|
+
new_tree.add_edge(node1, node2, edge)
|
148
|
+
end
|
149
|
+
end
|
150
|
+
new_tree
|
151
|
+
end
|
152
|
+
|
153
|
+
def clone
|
154
|
+
new_tree = self.class.new
|
155
|
+
nodes.each do |x|
|
156
|
+
new_tree.add_node(x)
|
157
|
+
end
|
158
|
+
self.each_edge do |node1, node2, edge|
|
159
|
+
if new_tree.include?(node1) and new_tree.include?(node2) then
|
160
|
+
new_tree.add_edge(node1, node2, edge)
|
161
|
+
end
|
72
162
|
end
|
73
|
-
|
163
|
+
new_tree
|
74
164
|
end
|
75
165
|
|
76
166
|
end
|
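The edge filtering used by `clone_subtree` and `clone_tree_without_branch` above can be sketched with a plain Hash of parent to children. This is an illustration under stated assumptions: `CHILDREN`, `descendents`, and `split_edges` are hypothetical names, and nothing here calls `Bio::Tree`. The point it shows is that an edge is kept only when both of its endpoints ended up in the same new tree, so the edge that joined the branch to the rest appears in neither result.

```ruby
# Toy tree as parent => children (hypothetical data, not Bio::Tree).
CHILDREN = {
  "root"  => ["clade", "s4"],
  "clade" => ["s1", "s2"],
  "s1"    => [],
  "s2"    => [],
  "s4"    => []
}.freeze

# All nodes below `node`, depth first.
def descendents(children, node)
  children.fetch(node).flat_map { |c| [c] + descendents(children, c) }
end

# Split the edge list into [branch_edges, remaining_edges] around `node`.
def split_edges(children, node)
  branch_nodes = [node] + descendents(children, node)
  edges = children.flat_map { |parent, cs| cs.map { |c| [parent, c] } }
  # Keep an edge in the branch only if both endpoints are in the branch.
  branch = edges.select { |a, b| branch_nodes.include?(a) && branch_nodes.include?(b) }
  # Keep an edge in the rest only if neither endpoint is in the branch.
  rest = edges.reject { |a, b| branch_nodes.include?(a) || branch_nodes.include?(b) }
  [branch, rest]
end

branch, rest = split_edges(CHILDREN, "clade")
p branch
p rest
```

In this toy case the `root` to `clade` edge is dropped from both lists, which mirrors the `include?(node1) and include?(node2)` check in the methods above.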
metadata
CHANGED
@@ -1,7 +1,7 @@
|
|
1
1
|
--- !ruby/object:Gem::Specification
|
2
2
|
name: bio-alignment
|
3
3
|
version: !ruby/object:Gem::Version
|
4
|
-
version: 0.0.
|
4
|
+
version: 0.0.7
|
5
5
|
prerelease:
|
6
6
|
platform: ruby
|
7
7
|
authors:
|
@@ -9,11 +9,11 @@ authors:
|
|
9
9
|
autorequire:
|
10
10
|
bindir: bin
|
11
11
|
cert_chain: []
|
12
|
-
date: 2012-
|
12
|
+
date: 2012-06-25 00:00:00.000000000Z
|
13
13
|
dependencies:
|
14
14
|
- !ruby/object:Gem::Dependency
|
15
15
|
name: bio-logger
|
16
|
-
requirement: &
|
16
|
+
requirement: &83191660 !ruby/object:Gem::Requirement
|
17
17
|
none: false
|
18
18
|
requirements:
|
19
19
|
- - ! '>='
|
@@ -21,10 +21,10 @@ dependencies:
|
|
21
21
|
version: '0'
|
22
22
|
type: :runtime
|
23
23
|
prerelease: false
|
24
|
-
version_requirements: *
|
24
|
+
version_requirements: *83191660
|
25
25
|
- !ruby/object:Gem::Dependency
|
26
26
|
name: bio
|
27
|
-
requirement: &
|
27
|
+
requirement: &83191360 !ruby/object:Gem::Requirement
|
28
28
|
none: false
|
29
29
|
requirements:
|
30
30
|
- - ! '>='
|
@@ -32,10 +32,10 @@ dependencies:
|
|
32
32
|
version: 1.4.2
|
33
33
|
type: :runtime
|
34
34
|
prerelease: false
|
35
|
-
version_requirements: *
|
35
|
+
version_requirements: *83191360
|
36
36
|
- !ruby/object:Gem::Dependency
|
37
37
|
name: rake
|
38
|
-
requirement: &
|
38
|
+
requirement: &83190960 !ruby/object:Gem::Requirement
|
39
39
|
none: false
|
40
40
|
requirements:
|
41
41
|
- - ! '>='
|
@@ -43,10 +43,10 @@ dependencies:
|
|
43
43
|
version: '0'
|
44
44
|
type: :development
|
45
45
|
prerelease: false
|
46
|
-
version_requirements: *
|
46
|
+
version_requirements: *83190960
|
47
47
|
- !ruby/object:Gem::Dependency
|
48
48
|
name: bio-bigbio
|
49
|
-
requirement: &
|
49
|
+
requirement: &83190640 !ruby/object:Gem::Requirement
|
50
50
|
none: false
|
51
51
|
requirements:
|
52
52
|
- - ! '>'
|
@@ -54,10 +54,10 @@ dependencies:
|
|
54
54
|
version: 0.1.3
|
55
55
|
type: :development
|
56
56
|
prerelease: false
|
57
|
-
version_requirements: *
|
57
|
+
version_requirements: *83190640
|
58
58
|
- !ruby/object:Gem::Dependency
|
59
59
|
name: cucumber
|
60
|
-
requirement: &
|
60
|
+
requirement: &83190190 !ruby/object:Gem::Requirement
|
61
61
|
none: false
|
62
62
|
requirements:
|
63
63
|
- - ! '>='
|
@@ -65,21 +65,21 @@ dependencies:
|
|
65
65
|
version: '0'
|
66
66
|
type: :development
|
67
67
|
prerelease: false
|
68
|
-
version_requirements: *
|
68
|
+
version_requirements: *83190190
|
69
69
|
- !ruby/object:Gem::Dependency
|
70
70
|
name: rspec
|
71
|
-
requirement: &
|
71
|
+
requirement: &83095350 !ruby/object:Gem::Requirement
|
72
72
|
none: false
|
73
73
|
requirements:
|
74
74
|
- - ~>
|
75
75
|
- !ruby/object:Gem::Version
|
76
|
-
version: 2.
|
76
|
+
version: 2.10.0
|
77
77
|
type: :development
|
78
78
|
prerelease: false
|
79
|
-
version_requirements: *
|
79
|
+
version_requirements: *83095350
|
80
80
|
- !ruby/object:Gem::Dependency
|
81
81
|
name: bundler
|
82
|
-
requirement: &
|
82
|
+
requirement: &83094950 !ruby/object:Gem::Requirement
|
83
83
|
none: false
|
84
84
|
requirements:
|
85
85
|
- - ! '>='
|
@@ -87,10 +87,10 @@ dependencies:
|
|
87
87
|
version: 1.0.21
|
88
88
|
type: :development
|
89
89
|
prerelease: false
|
90
|
-
version_requirements: *
|
90
|
+
version_requirements: *83094950
|
91
91
|
- !ruby/object:Gem::Dependency
|
92
92
|
name: jeweler
|
93
|
-
requirement: &
|
93
|
+
requirement: &83094400 !ruby/object:Gem::Requirement
|
94
94
|
none: false
|
95
95
|
requirements:
|
96
96
|
- - ! '>='
|
@@ -98,7 +98,7 @@ dependencies:
|
|
98
98
|
version: '0'
|
99
99
|
type: :development
|
100
100
|
prerelease: false
|
101
|
-
version_requirements: *
|
101
|
+
version_requirements: *83094400
|
102
102
|
description: Alignment handler for multiple sequence alignments (MSA)
|
103
103
|
email: pjotr.public01@thebird.nl
|
104
104
|
executables:
|
@@ -138,10 +138,12 @@ files:
|
|
138
138
|
- features/edit/mask_serial_mutations.feature
|
139
139
|
- features/pal2nal-feature.rb
|
140
140
|
- features/pal2nal.feature
|
141
|
+
- features/phylogeny/split-tree-feature.rb
|
142
|
+
- features/phylogeny/split-tree.feature
|
143
|
+
- features/phylogeny/tree-feature.rb
|
144
|
+
- features/phylogeny/tree.feature
|
141
145
|
- features/rows-feature.rb
|
142
146
|
- features/rows.feature
|
143
|
-
- features/tree-feature.rb
|
144
|
-
- features/tree.feature
|
145
147
|
- lib/bio-alignment.rb
|
146
148
|
- lib/bio-alignment/alignment.rb
|
147
149
|
- lib/bio-alignment/bioruby.rb
|
@@ -154,6 +156,7 @@ files:
|
|
154
156
|
- lib/bio-alignment/edit/edit_rows.rb
|
155
157
|
- lib/bio-alignment/edit/mask_islands.rb
|
156
158
|
- lib/bio-alignment/edit/mask_serial_mutations.rb
|
159
|
+
- lib/bio-alignment/edit/tree_splitter.rb
|
157
160
|
- lib/bio-alignment/elements.rb
|
158
161
|
- lib/bio-alignment/pal2nal.rb
|
159
162
|
- lib/bio-alignment/rows.rb
|
@@ -183,7 +186,7 @@ required_ruby_version: !ruby/object:Gem::Requirement
|
|
183
186
|
version: '0'
|
184
187
|
segments:
|
185
188
|
- 0
|
186
|
-
hash:
|
189
|
+
hash: 900281341
|
187
190
|
required_rubygems_version: !ruby/object:Gem::Requirement
|
188
191
|
none: false
|
189
192
|
requirements:
|
@@ -192,7 +195,7 @@ required_rubygems_version: !ruby/object:Gem::Requirement
|
|
192
195
|
version: '0'
|
193
196
|
requirements: []
|
194
197
|
rubyforge_project:
|
195
|
-
rubygems_version: 1.8.
|
198
|
+
rubygems_version: 1.8.6
|
196
199
|
signing_key:
|
197
200
|
specification_version: 3
|
198
201
|
summary: Support for multiple sequence alignments (MSA)
|
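The version constraints in the metadata above (`~>`, `>=`, `>`) can be checked with RubyGems' own `Gem::Requirement` and `Gem::Version` classes. The sketch below is only an illustration: the helper name `allowed?` and the candidate versions are hypothetical, while the constraints `~> 2.10.0` (rspec), `>= 1.0.21` (bundler), and `> 0.1.3` (bio-bigbio) are taken from the gemspec.

```ruby
require "rubygems"

# Hypothetical helper: does `version` satisfy `constraint`?
def allowed?(constraint, version)
  Gem::Requirement.new(constraint).satisfied_by?(Gem::Version.new(version))
end

# "~> 2.10.0" accepts patch releases of 2.10 but not 2.11.
puts allowed?("~> 2.10.0", "2.10.3")
puts allowed?("~> 2.10.0", "2.11.0")

# ">=" includes the stated version, ">" excludes it.
puts allowed?(">= 1.0.21", "1.0.21")
puts allowed?("> 0.1.3", "0.1.3")
```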