bio 1.4.0 → 1.4.1
- data/ChangeLog +1712 -0
- data/KNOWN_ISSUES.rdoc +11 -1
- data/README.rdoc +3 -2
- data/RELEASE_NOTES.rdoc +65 -127
- data/bioruby.gemspec +38 -2
- data/doc/RELEASE_NOTES-1.4.0.rdoc +167 -0
- data/doc/Tutorial.rd +74 -16
- data/doc/Tutorial.rd.html +68 -16
- data/lib/bio.rb +2 -0
- data/lib/bio/appl/clustalw/report.rb +18 -0
- data/lib/bio/appl/paml/codeml/report.rb +579 -21
- data/lib/bio/command.rb +149 -21
- data/lib/bio/db/aaindex.rb +11 -1
- data/lib/bio/db/embl/sptr.rb +1 -1
- data/lib/bio/db/fasta/defline.rb +7 -2
- data/lib/bio/db/fasta/qual.rb +24 -0
- data/lib/bio/db/fasta/qual_to_biosequence.rb +29 -0
- data/lib/bio/db/fastq.rb +15 -0
- data/lib/bio/db/go.rb +2 -2
- data/lib/bio/db/kegg/common.rb +109 -5
- data/lib/bio/db/kegg/genes.rb +61 -15
- data/lib/bio/db/kegg/genome.rb +43 -38
- data/lib/bio/db/kegg/module.rb +158 -0
- data/lib/bio/db/kegg/orthology.rb +40 -1
- data/lib/bio/db/kegg/pathway.rb +254 -0
- data/lib/bio/db/medline.rb +6 -2
- data/lib/bio/io/flatfile/autodetection.rb +6 -0
- data/lib/bio/location.rb +39 -0
- data/lib/bio/reference.rb +24 -0
- data/lib/bio/sequence.rb +2 -0
- data/lib/bio/sequence/adapter.rb +1 -0
- data/lib/bio/sequence/format.rb +14 -0
- data/lib/bio/sequence/sequence_masker.rb +95 -0
- data/lib/bio/tree.rb +4 -4
- data/lib/bio/util/restriction_enzyme/double_stranded/aligned_strands.rb +5 -0
- data/lib/bio/version.rb +1 -1
- data/setup.rb +5 -0
- data/test/data/KEGG/K02338.orthology +180 -52
- data/test/data/KEGG/M00118.module +44 -0
- data/test/data/KEGG/T00005.genome +140 -0
- data/test/data/KEGG/T00070.genome +34 -0
- data/test/data/KEGG/b0529.gene +47 -0
- data/test/data/KEGG/ec00072.pathway +23 -0
- data/test/data/KEGG/hsa00790.pathway +59 -0
- data/test/data/KEGG/ko00312.pathway +16 -0
- data/test/data/KEGG/map00030.pathway +37 -0
- data/test/data/KEGG/map00052.pathway +13 -0
- data/test/data/KEGG/rn00250.pathway +114 -0
- data/test/data/clustalw/example1.aln +58 -0
- data/test/data/go/selected_component.ontology +12 -0
- data/test/data/go/selected_gene_association.sgd +31 -0
- data/test/data/go/selected_wikipedia2go +13 -0
- data/test/data/medline/20146148_modified.medline +54 -0
- data/test/data/paml/codeml/models/aa.aln +26 -0
- data/test/data/paml/codeml/models/aa.dnd +13 -0
- data/test/data/paml/codeml/models/aa.ph +13 -0
- data/test/data/paml/codeml/models/alignment.phy +49 -0
- data/test/data/paml/codeml/models/results0-3.txt +312 -0
- data/test/data/paml/codeml/models/results7-8.txt +340 -0
- data/test/functional/bio/io/test_togows.rb +8 -8
- data/test/functional/bio/test_command.rb +7 -6
- data/test/unit/bio/appl/clustalw/test_report.rb +80 -0
- data/test/unit/bio/appl/paml/codeml/test_rates.rb +6 -6
- data/test/unit/bio/appl/paml/codeml/test_report.rb +231 -24
- data/test/unit/bio/appl/paml/codeml/test_report_single.rb +46 -0
- data/test/unit/bio/db/embl/test_sptr.rb +1 -1
- data/test/unit/bio/db/fasta/test_defline.rb +160 -0
- data/test/unit/bio/db/fasta/test_defline_misc.rb +490 -0
- data/test/unit/bio/db/kegg/test_genes.rb +281 -1
- data/test/unit/bio/db/kegg/test_genome.rb +408 -0
- data/test/unit/bio/db/kegg/test_module.rb +246 -0
- data/test/unit/bio/db/kegg/test_orthology.rb +95 -0
- data/test/unit/bio/db/kegg/test_pathway.rb +1250 -0
- data/test/unit/bio/db/test_aaindex.rb +8 -7
- data/test/unit/bio/db/test_fastq.rb +36 -0
- data/test/unit/bio/db/test_go.rb +171 -0
- data/test/unit/bio/db/test_medline.rb +148 -0
- data/test/unit/bio/db/test_qual.rb +9 -2
- data/test/unit/bio/sequence/test_sequence_masker.rb +169 -0
- data/test/unit/bio/test_tree.rb +260 -1
- data/test/unit/bio/util/test_contingency_table.rb +7 -7
- metadata +53 -6
data/KNOWN_ISSUES.rdoc
CHANGED
@@ -4,7 +4,7 @@ License:: The Ruby License
 
 = Known issues and bugs in BioRuby
 
-Below are known issues and bugs in BioRuby.
+Below are known issues and bugs in BioRuby. Patches to fix them are welcome.
 We hope they will be fixed in the future.
 
 Items marked with (WONT_FIX) tags would not be fixed within BioRuby because
@@ -69,6 +69,16 @@ This indicates that br_bioflat.rb and Bio::FlatFileIndex may create
 incorrect indexes on mswin32, mingw32, and bccwin32. In addition,
 Bio::FlatFile may return incorrect data.
 
+==== String escaping of command-line arguments
+
+After BioRuby 1.4.1, in Ruby 1.9.X running on Windows, escaping of
+command-line arguments is processed by the Ruby interpreter. Before BioRuby
+1.4.0, the escaping was executed in Bio::Command#escape_shell_windows, and
+the behavior is different from the Ruby interpreter's one.
+
+Currently, due to the change, test/functional/bio/test_command.rb may fail
+on Windows with Ruby 1.9.X.
+
 ==== Windows 95/98/98SE/ME
 
 (WONT_FIX) Some methods that call external programs may not work in
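The note on command-line escaping above can be illustrated with a minimal sketch. It is not the code of Bio::Command; it only assumes the standard `IO.popen` array form and `RbConfig.ruby`, and the sample argument string is made up. The arguments are handed to the interpreter as separate array elements, so any platform-specific quoting is left to Ruby rather than to the script.

```ruby
require 'rbconfig'

# A sample argument (hypothetical) containing spaces, double quotes,
# and percent signs, which are the usual trouble spots for escaping.
tricky = 'a b "c" %PATH%'

# Each array element is passed as one argument; this script does not
# assemble or escape a command line by hand.
command = [RbConfig.ruby, "-e", "print ARGV[0]", tricky]
echoed = IO.popen(command) { |io| io.read }

# Report whether the child process saw the argument unchanged.
puts(echoed == tricky ? "round trip OK" : "argument was altered: #{echoed.inspect}")
```

Running this under different Ruby versions and platforms is one way to see whether an argument survives the round trip unchanged.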
data/README.rdoc
CHANGED
@@ -9,7 +9,7 @@ License:: The Ruby License
 
 = BioRuby
 
-Copyright (C) 2001-
+Copyright (C) 2001-2010 Toshiaki Katayama <k@bioruby.org>
 
 BioRuby is an open source Ruby library for developing bioinformatics
 software. Object oriented scripting language Ruby has many features
@@ -42,6 +42,7 @@ services including KEGG API can be easily utilized by BioRuby.
 README.rdoc:: This file. General information and installation procedure.
 RELEASE_NOTES.rdoc:: News and important changes in this release.
 KNOWN_ISSUES.rdoc:: Known issues and bugs in BioRuby.
+doc/RELEASE_NOTES-1.4.0.rdoc:: News and incompatible changes from 1.3.1 to 1.4.0.
 doc/Changes-1.3.rdoc:: News and incompatible changes from 1.2.1 to 1.3.0.
 doc/Changes-0.7.rd:: News and incompatible changes from 0.6.4 to 1.2.1.
 
@@ -116,7 +117,7 @@ and could be obtained by the following procedure.
 * Ruby 1.8.2 or later (except Ruby 1.9.0) -- http://www.ruby-lang.org/
 * Ruby 1.8.7-p174 or later, or Ruby 1.8.6-p383 or later is recommended.
 * Not yet fully ready with Ruby 1.9, although many components can now work
-in Ruby 1.9.1.
+in Ruby 1.9.1 and Ruby 1.9.2.
 
 == OPTIONAL REQUIREMENTS
 
data/RELEASE_NOTES.rdoc
CHANGED
@@ -1,166 +1,104 @@
-= BioRuby 1.4.
+= BioRuby 1.4.1 RELEASE NOTES
 
-A lot of changes have been made to the BioRuby 1.4.
+A lot of changes have been made to the BioRuby 1.4.1 after the version 1.4.0
 is released. This document describes important and/or incompatible changes
-since the BioRuby 1.
+since the BioRuby 1.4.0 release.
 
-
-
-=== PhyloXML support
-
-Support for reading and writing PhyloXML file format is added. New classes
-Bio::PhyloXML::Parser and Bio::PhyloXML::Writer are used to read and write
-a PhyloXML file, respectively.
-
-The code is developed by Diana Jaunzeikare, mentored by Christian M Zmasek
-and co-mentors, supported by Google Summer of Code 2009 in collaboration
-with the National Evolutionary Synthesis Center (NESCent).
-
-=== FASTQ file format support
+For known problems, see KNOWN_ISSUES.rdoc.
 
-
-three FASTQ format variants are supported.
+== New features
 
-
-of the FASTQ format is supported (although the three format variants should
-be specified later by users if quality scores are needed).
+=== PAML Codeml support is significantly improved
 
-
-
-"to_biosequence" method. Bio::Sequence#output now supports output of the
-FASTQ format.
+PAML Codeml result parser is completely rewritten and is significantly
+improved. The code is developed by Pjotr Prins.
 
-
-open-bio-l mailing list. The prototype of Bio::Fastq class was first
-developed during the BioHackathon 2009 held in Okinawa.
+=== KEGG PATHWAY and KEGG MODULE parser
 
-
+Parsers for KEGG PATHWAY and KEGG MODULE data are added. The code is developed
+by Kozo Nishida and Toshiaki Katayama.
 
-
-formats are supported. The code is developed by Anthony Underwood.
+=== Bio::KEGG improvements
 
-
+Following new methods are added.
 
-
-
-
+* Bio::KEGG::GENES#keggclass, keggclasses, names_as_array, names,
+motifs_as_strings, motifs_as_hash, motifs
+* Bio::KEGG::GENOME#original_databases
 
-===
+=== Test codes are added and improved.
 
-
-
-addition, return value types of some methods are also changed for unifying
-APIs among KEGG parser classes. See incompatible changes below for details.
+Test codes are added and improved. They are developed by Kazuhiro Hayashi,
+Kozo Nishida, John Prince, and Naohisa Goto.
 
-===
+=== Other new methods
 
-
-
-
+* Bio::Fastq#mask
+* Bio::Sequence#output_fasta
+* Bio::ClustalW::Report#get_sequence
+* Bio::Reference#==
+* Bio::Location#==
+* Bio::Locations#==
+* Bio::FastaNumericFormat#to_biosequence
 
-
+== Bug fixes
 
-
-and the library path and test data path can be specified with environment
-variables. BIORUBY_TEST_LIB is the path to be added to the Ruby's $LOAD_PATH.
-For example, to test BioRuby installed in
-/usr/local/lib/site_ruby/1.8, run
-env BIORUBY_TEST_LIB=/usr/local/lib/site_ruby/1.8 ruby test/runner.rb
+=== Bio::Tree
 
-
-flag to turn on debug of the tests.
+Following methods did not work correctly.
 
-
+* Bio::Tree#collect_edge!
+* Bio::Tree#remove_edge_if
 
-===
+=== Bio::KEGG::GENES and Bio::KEGG::GENOME
 
-
-
+* Fixed bugs in Bio::KEGG::GENES#pathway.
+* Fixed parser errors due to the format changes of KEGG GENES and KEGG GENOME.
 
-===
+=== Other bug fixes
 
-
-
-
+* In Bio::Command, changed not to call fork(2) on platforms that do not
+support it.
+* Bio::MEDLINE#initialize should handle continuation of lines.
+* Typo and a missing field in Bio::GO::GeneAssociation#to_str.
+* Bug fix of Bio::FastaNumericFormat#to_biosequence.
+* Fixed UniProt GN parsing issue in Bio::SPTR.
 
 == Incompatible changes
 
-=== Bio::
-
-NCBI announces that all Entrez E-utility requests must contain email and
-tool parameters, and requests without them will return error after June
-2010.
-
-To set default email address and tool name, following methods are added.
-* Bio::NCBI.default_email=(email)
-* Bio::NCBI.default_tool=(tool_name)
-
-For every query, Bio::NCBI::REST checks the email and tool parameters and
-raises error if they are empty.
-
-IMPORTANT NOTE: No default email address is preset in BioRuby. Programmers
-using BioRuby must set their own email address or implement to get user's
-email address in some way (from input form, configuration file, etc).
-
-Default tool name is set as "#{$0} (bioruby/#{Bio::BIORUBY_VERSION_ID})".
-For example, if you run "ruby my_script.rb" with BioRuby 1.4.0, the value is
-"my_script.rb (bioruby/1.4.0)".
-
-=== Bio::KEGG
-
-==== dblinks method
-
-In Bio::KEGG::COMPOUND, DRUG, ENZYME, GLYCAN and ORTHOLOGY, the method
-dblinks is changed to return a Hash. Each key of the hash is a database name
-and its value is an array of entry IDs in the database. If old behavior
-(returns raw entry lines as an array of strings) is needed, use
-dblinks_as_strings.
-
-==== pathways method
-
-In Bio::KEGG::COMPOUND, DRUG, ENZYME, GENES, GLYCAN and REACTION, the
-method pathways is changed to return a Hash. Each key of the hash is a
-pathway ID and its value is the description of the pathway.
-
-In Bio::KEGG::GENES, if old behavior (returns pathway IDs as an Array) is
-needed, use pathways.keys.
-
-In Bio::KEGG::COMPOUND, DRUG, ENZYME, GLYCAN, and REACTION, if old behavior
-(returns raw entry lines as an array of strings) is needed, use
-pathways_as_strings.
-
-Note that Bio::KEGG::ORTHOLOGY#pathways is not changed (returns an array
-containing pathway IDs).
+=== Bio::PAML::Codeml::Report
 
-
+The code is completely rewritten. See the RDoc for details.
 
-
-changed to return a Hash. Each key of the hash is an ortholog ID and its
-value is the name of the ortholog. If old behavior (returns raw entry lines
-as an array of strings) is needed, use orthologs_as_strings.
+=== Bio::KEGG::ORTHOLOGY
 
-
+Bio::KEGG::ORTHOLOGY#pathways is changed to return a hash. The old pathway
+method is renamed to pathways_in_keggclass for compatibility.
 
-
-return a Hash that is the same as Bio::KEGG::ORTHOLOGY#genes_as_hash.
-If old behavior (returns raw entry lines as an array of strings) is needed,
-use genes_as_strings.
+=== Bio::AAindex2
 
-
+Bio::AAindex2 now copies each symmetric element for lower triangular matrix
+to the upper right part, because the Matrix class in Ruby 1.9.2 no longer
+accepts any dimension mismatches. We think the previous behavior is a bug.
 
-Bio::
-hash is a KEGG Rpair ID and its value is an array containing name and type.
-If old behavior (returns as tokens) is needed, use rpairs_as_tokens.
+=== Bio::MEDLINE
 
-
+Bio::MEDLINE#reference no longer puts empty values in the returned
+Bio::Reference object. We think the previous behavior is a bug.
+We also think the effect is very small.
 
-
+== Known issues
 
-
+The following issues are added or updated. See KNOWN_ISSUES.rdoc for other
+already known issues.
 
-
+=== String escaping of command-line arguments in Ruby 1.9.X on Windows
 
-
+After BioRuby 1.4.1, in Ruby 1.9.X running on Windows, escaping of
+command-line arguments is processed by the Ruby interpreter. Before BioRuby
+1.4.0, the escaping was executed in Bio::Command#escape_shell_windows, and
+the behavior is different from the Ruby interpreter's one.
 
-
+Currently, due to the change, test/functional/bio/test_command.rb may fail
+on Windows with Ruby 1.9.X.
 
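The Bio::AAindex2 change described in the release notes above (copying each element of a lower triangular matrix to the upper right part) can be sketched as follows. This is an illustration only, not the BioRuby implementation: the helper name `symmetric_from_lower` and the sample values are hypothetical, and it assumes the input is given as rows of increasing length.

```ruby
require 'matrix'

# Hypothetical helper: expand the rows of a lower triangular matrix
# (row i holds i + 1 values) into a full symmetric Matrix.
def symmetric_from_lower(rows)
  n = rows.size
  full = Array.new(n) { Array.new(n) }
  rows.each_with_index do |row, i|
    row.each_with_index do |value, j|
      full[i][j] = value # lower left part, as given
      full[j][i] = value # mirrored copy in the upper right part
    end
  end
  Matrix.rows(full)
end

# Made-up sample data in lower triangular form.
lower = [[1], [2, 3], [4, 5, 6]]
m = symmetric_from_lower(lower)
puts m
```

Filling both halves before building the Matrix means every row has the same length, so no dimension mismatch can occur at construction time.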
data/bioruby.gemspec
CHANGED
@@ -3,7 +3,7 @@
 #
 Gem::Specification.new do |s|
 s.name = 'bio'
-s.version = "1.4.
+s.version = "1.4.1"
 
 s.author = "BioRuby project"
 s.email = "staff@bioruby.org"
@@ -37,6 +37,7 @@ Gem::Specification.new do |s|
 "doc/Changes-1.3.rdoc",
 "doc/KEGG_API.rd",
 "doc/KEGG_API.rd.ja",
+"doc/RELEASE_NOTES-1.4.0.rdoc",
 "doc/Tutorial.rd",
 "doc/Tutorial.rd.html",
 "doc/Tutorial.rd.ja",
@@ -124,6 +125,7 @@ Gem::Specification.new do |s|
 "lib/bio/db/fasta/format_fasta.rb",
 "lib/bio/db/fasta/format_qual.rb",
 "lib/bio/db/fasta/qual.rb",
+"lib/bio/db/fasta/qual_to_biosequence.rb",
 "lib/bio/db/fastq.rb",
 "lib/bio/db/fastq/fastq_to_biosequence.rb",
 "lib/bio/db/fastq/format_fastq.rb",
@@ -147,7 +149,9 @@ Gem::Specification.new do |s|
 "lib/bio/db/kegg/glycan.rb",
 "lib/bio/db/kegg/keggtab.rb",
 "lib/bio/db/kegg/kgml.rb",
+"lib/bio/db/kegg/module.rb",
 "lib/bio/db/kegg/orthology.rb",
+"lib/bio/db/kegg/pathway.rb",
 "lib/bio/db/kegg/reaction.rb",
 "lib/bio/db/kegg/taxonomy.rb",
 "lib/bio/db/lasergene.rb",
@@ -219,6 +223,7 @@ Gem::Specification.new do |s|
 "lib/bio/sequence/generic.rb",
 "lib/bio/sequence/na.rb",
 "lib/bio/sequence/quality_score.rb",
+"lib/bio/sequence/sequence_masker.rb",
 "lib/bio/shell.rb",
 "lib/bio/shell/core.rb",
 "lib/bio/shell/demo.rb",
@@ -370,7 +375,17 @@ Gem::Specification.new do |s|
 "test/data/KEGG/G00024.glycan",
 "test/data/KEGG/G01366.glycan",
 "test/data/KEGG/K02338.orthology",
+"test/data/KEGG/M00118.module",
 "test/data/KEGG/R00006.reaction",
+"test/data/KEGG/T00005.genome",
+"test/data/KEGG/T00070.genome",
+"test/data/KEGG/b0529.gene",
+"test/data/KEGG/ec00072.pathway",
+"test/data/KEGG/hsa00790.pathway",
+"test/data/KEGG/ko00312.pathway",
+"test/data/KEGG/map00030.pathway",
+"test/data/KEGG/map00052.pathway",
+"test/data/KEGG/rn00250.pathway",
 "test/data/SOSUI/sample.report",
 "test/data/TMHMM/sample.report",
 "test/data/aaindex/DAYM780301",
@@ -383,6 +398,7 @@ Gem::Specification.new do |s|
 "test/data/blast/b0002.faa.m7",
 "test/data/blast/b0002.faa.m8",
 "test/data/blast/blastp-multi.m7",
+"test/data/clustalw/example1.aln",
 "test/data/command/echoarg2.bat",
 "test/data/embl/AB090716.embl",
 "test/data/embl/AB090716.embl.rel89",
@@ -441,13 +457,23 @@ Gem::Specification.new do |s|
 "test/data/fastq/wrapping_original_sanger.fastq",
 "test/data/gcg/pileup-aa.msf",
 "test/data/genscan/sample.report",
+"test/data/go/selected_component.ontology",
+"test/data/go/selected_gene_association.sgd",
+"test/data/go/selected_wikipedia2go",
 "test/data/iprscan/merged.raw",
 "test/data/iprscan/merged.txt",
+"test/data/medline/20146148_modified.medline",
 "test/data/meme/db",
 "test/data/meme/mast",
 "test/data/meme/mast.out",
 "test/data/meme/meme.out",
 "test/data/paml/codeml/control_file.txt",
+"test/data/paml/codeml/models/aa.aln",
+"test/data/paml/codeml/models/aa.dnd",
+"test/data/paml/codeml/models/aa.ph",
+"test/data/paml/codeml/models/alignment.phy",
+"test/data/paml/codeml/models/results0-3.txt",
+"test/data/paml/codeml/models/results7-8.txt",
 "test/data/paml/codeml/output.txt",
 "test/data/paml/codeml/rates",
 "test/data/phyloxml/apaf.xml",
@@ -479,6 +505,7 @@ Gem::Specification.new do |s|
 "test/unit/bio/appl/blast/test_ncbioptions.rb",
 "test/unit/bio/appl/blast/test_report.rb",
 "test/unit/bio/appl/blast/test_rpsblast.rb",
+"test/unit/bio/appl/clustalw/test_report.rb",
 "test/unit/bio/appl/gcg/test_msf.rb",
 "test/unit/bio/appl/genscan/test_report.rb",
 "test/unit/bio/appl/hmmer/test_report.rb",
@@ -489,6 +516,7 @@ Gem::Specification.new do |s|
 "test/unit/bio/appl/meme/test_motif.rb",
 "test/unit/bio/appl/paml/codeml/test_rates.rb",
 "test/unit/bio/appl/paml/codeml/test_report.rb",
+"test/unit/bio/appl/paml/codeml/test_report_single.rb",
 "test/unit/bio/appl/paml/test_codeml.rb",
 "test/unit/bio/appl/sim4/test_report.rb",
 "test/unit/bio/appl/sosui/test_report.rb",
@@ -508,13 +536,18 @@ Gem::Specification.new do |s|
 "test/unit/bio/db/embl/test_embl_to_bioseq.rb",
 "test/unit/bio/db/embl/test_sptr.rb",
 "test/unit/bio/db/embl/test_uniprot.rb",
+"test/unit/bio/db/fasta/test_defline.rb",
+"test/unit/bio/db/fasta/test_defline_misc.rb",
 "test/unit/bio/db/fasta/test_format_qual.rb",
 "test/unit/bio/db/kegg/test_compound.rb",
 "test/unit/bio/db/kegg/test_drug.rb",
 "test/unit/bio/db/kegg/test_enzyme.rb",
 "test/unit/bio/db/kegg/test_genes.rb",
+"test/unit/bio/db/kegg/test_genome.rb",
 "test/unit/bio/db/kegg/test_glycan.rb",
+"test/unit/bio/db/kegg/test_module.rb",
 "test/unit/bio/db/kegg/test_orthology.rb",
+"test/unit/bio/db/kegg/test_pathway.rb",
 "test/unit/bio/db/kegg/test_reaction.rb",
 "test/unit/bio/db/pdb/test_pdb.rb",
 "test/unit/bio/db/sanger_chromatogram/test_abif.rb",
@@ -523,6 +556,7 @@ Gem::Specification.new do |s|
 "test/unit/bio/db/test_fasta.rb",
 "test/unit/bio/db/test_fastq.rb",
 "test/unit/bio/db/test_gff.rb",
+"test/unit/bio/db/test_go.rb",
 "test/unit/bio/db/test_lasergene.rb",
 "test/unit/bio/db/test_medline.rb",
 "test/unit/bio/db/test_newick.rb",
@@ -548,6 +582,7 @@ Gem::Specification.new do |s|
 "test/unit/bio/sequence/test_dblink.rb",
 "test/unit/bio/sequence/test_na.rb",
 "test/unit/bio/sequence/test_quality_score.rb",
+"test/unit/bio/sequence/test_sequence_masker.rb",
 "test/unit/bio/shell/plugin/test_seq.rb",
 "test/unit/bio/test_alignment.rb",
 "test/unit/bio/test_command.rb",
@@ -588,7 +623,8 @@ Gem::Specification.new do |s|
 "README.rdoc",
 "README_DEV.rdoc",
 "RELEASE_NOTES.rdoc",
-"doc/Changes-1.3.rdoc"
+"doc/Changes-1.3.rdoc",
+"doc/RELEASE_NOTES-1.4.0.rdoc"
 ]
 s.rdoc_options << '--main' << 'README.rdoc'
 s.rdoc_options << '--title' << 'BioRuby API documentation'
data/doc/RELEASE_NOTES-1.4.0.rdoc
ADDED
@@ -0,0 +1,167 @@
+= BioRuby 1.4.0 RELEASE NOTES
+
+A lot of changes have been made to the BioRuby 1.4.0 after the version 1.3.1
+is released. This document describes important and/or incompatible changes
+since the BioRuby 1.3.1 release.
+
+== New features
+
+=== PhyloXML support
+
+Support for reading and writing PhyloXML file format is added. New classes
+Bio::PhyloXML::Parser and Bio::PhyloXML::Writer are used to read and write
+a PhyloXML file, respectively.
+
+The code is developed by Diana Jaunzeikare, mentored by Christian M Zmasek
+and co-mentors, supported by Google Summer of Code 2009 in collaboration
+with the National Evolutionary Synthesis Center (NESCent).
+
+=== FASTQ file format support
+
+Support for reading and writing FASTQ file format is added. All of the
+three FASTQ format variants are supported.
+
+To read a FASTQ file, Bio::FlatFile can be used. File format auto-detection
+of the FASTQ format is supported (although the three format variants should
+be specified later by users if quality scores are needed).
+
+New class Bio::Fastq is the parser class for the FASTQ format. An object
+of the Bio::Fastq class can be converted to a Bio::Sequence object with the
+"to_biosequence" method. Bio::Sequence#output now supports output of the
+FASTQ format.
+
+The code is written by Naohisa Goto, with the help of discussions in the
+open-bio-l mailing list. The prototype of Bio::Fastq class was first
+developed during the BioHackathon 2009 held in Okinawa.
+
+=== DNA chromatogram support
+
+Support for reading DNA chromatogram files is added. SCF and ABIF file
+formats are supported. The code is developed by Anthony Underwood.
+
+=== MEME (motif-based sequence analysis tools) support
+
+Support for running MAST (Motif Alignment & Search Tool, part of the MEME
+Suite, motif-based sequence analysis tools) and parsing its results is
+added. The code is developed by Adam Kraut.
+
+=== Improvement of KEGG parser classes
+
+Some new methods are added to parse new fields added to some KEGG file
+formats. Unit tests for KEGG parsers are also added and improved. In
+addition, return value types of some methods are also changed for unifying
+APIs among KEGG parser classes. See incompatible changes below for details.
+Most of them are contributed by Kozo Nishida.
+
+=== Many sample scripts are added
+
+Many sample scripts showing demonstrations of usages of classes are added.
+They are moved from primitive test codes for the classes described in the
+"if __FILE__ == $0" convention in the library files.
+
+=== Unit tests can test installed BioRuby
+
+Mechanisms to load library and to find test data in the unit tests are changed,
+and the library path and test data path can be specified with environment
+variables. BIORUBY_TEST_LIB is the path to be added to the Ruby's $LOAD_PATH.
+For example, to test BioRuby installed in
+/usr/local/lib/site_ruby/1.8, run
+env BIORUBY_TEST_LIB=/usr/local/lib/site_ruby/1.8 ruby test/runner.rb
+
+BIORUBY_TEST_DATA is the path of the test data, and BIORUBY_TEST_DEBUG is a
+flag to turn on debug of the tests.
+
+== Deprecated features
+
+=== ChangeLog is replaced by git log
+
+ChangeLog is replaced by the output of git-log command, and ChangeLog before
+the 1.3.1 release is moved to doc/ChangeLog-before-1.3.1.
+
+=== "if __FILE__ == $0" convention
+
+Primitive test codes in the "if __FILE__ == $0" convention are removed and
+the codes are moved to the sample scripts named sample/demo_*.rb (except
+some older or deprecated files).
+
+== Incompatible changes
+
+=== Bio::NCBI::REST
+
+NCBI announces that all Entrez E-utility requests must contain email and
+tool parameters, and requests without them will return error after June
+2010.
+
+To set default email address and tool name, following methods are added.
+* Bio::NCBI.default_email=(email)
+* Bio::NCBI.default_tool=(tool_name)
+
+For every query, Bio::NCBI::REST checks the email and tool parameters and
+raises error if they are empty.
+
+IMPORTANT NOTE: No default email address is preset in BioRuby. Programmers
+using BioRuby must set their own email address or implement a way to get the
+user's email address (from input form, configuration file, etc).
+
+Default tool name is set as "#{$0} (bioruby/#{Bio::BIORUBY_VERSION_ID})".
+For example, if you run "ruby my_script.rb" with BioRuby 1.4.0, the value is
+"my_script.rb (bioruby/1.4.0)".
+
+=== Bio::KEGG
+
+==== dblinks method
+
+In Bio::KEGG::COMPOUND, DRUG, ENZYME, GLYCAN and ORTHOLOGY, the method
+dblinks is changed to return a Hash. Each key of the hash is a database name
+and its value is an array of entry IDs in the database. If old behavior
+(returns raw entry lines as an array of strings) is needed, use
+dblinks_as_strings.
+
+==== pathways method
+
+In Bio::KEGG::COMPOUND, DRUG, ENZYME, GENES, GLYCAN and REACTION, the
+method pathways is changed to return a Hash. Each key of the hash is a
+pathway ID and its value is the description of the pathway.
+
+In Bio::KEGG::GENES, if old behavior (returns pathway IDs as an Array) is
+needed, use pathways.keys.
+
+In Bio::KEGG::COMPOUND, DRUG, ENZYME, GLYCAN, and REACTION, if old behavior
+(returns raw entry lines as an array of strings) is needed, use
+pathways_as_strings.
+
+Note that Bio::KEGG::ORTHOLOGY#pathways is not changed (returns an array
+containing pathway IDs).
+
+==== orthologs method
+
+In Bio::KEGG::ENZYME, GENES, GLYCAN and REACTION, the method orthologs is
+changed to return a Hash. Each key of the hash is an ortholog ID and its
+value is the name of the ortholog. If old behavior (returns raw entry lines
+as an array of strings) is needed, use orthologs_as_strings.
+
+==== genes method
+
+Bio::KEGG::ENZYME#genes and Bio::KEGG::ORTHOLOGY#genes are changed to
+return a Hash that is the same as Bio::KEGG::ORTHOLOGY#genes_as_hash.
+If old behavior (returns raw entry lines as an array of strings) is needed,
+use genes_as_strings.
+
+==== Bio::KEGG::REACTION#rpairs
+
+Bio::KEGG::REACTION#rpairs is changed to return a Hash. Each key of the
+hash is a KEGG Rpair ID and its value is an array containing name and type.
+If old behavior (returns as tokens) is needed, use rpairs_as_tokens.
+
+==== Bio::KEGG::ORTHOLOGY
+
+Bio::KEGG::ORTHOLOGY#dblinks_as_hash does not lower-case database names.
+
+=== Bio::RestrictionEnzyme
+
+Format validation when creating an object is turned off for efficiency.
+
+== Known problems
+
+See KNOWN_ISSUES.rdoc for details.
+
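The Bio::NCBI::REST behavior described in the 1.4.0 notes above (a default tool name built from `$0` and the version, and an error when email or tool is empty) can be sketched without BioRuby itself. Everything here is a stand-in under stated assumptions: the module `SketchNCBI`, its method `request_parameters`, the choice of `ArgumentError`, and the placeholder email address are hypothetical and are not the library's API; only the tool-name format is taken from the notes.

```ruby
# Hypothetical stand-in for the defaults described in the notes;
# this is not Bio::NCBI.
module SketchNCBI
  BIORUBY_VERSION_ID = "1.4.0" # stand-in for Bio::BIORUBY_VERSION_ID

  class << self
    attr_accessor :default_email
    attr_writer :default_tool

    # Tool name format quoted in the notes: "#{$0} (bioruby/<version>)".
    def default_tool
      @default_tool ||= "#{$0} (bioruby/#{BIORUBY_VERSION_ID})"
    end

    # Returns the parameters to send, or raises if either one is empty.
    def request_parameters
      email = default_email.to_s
      tool = default_tool.to_s
      if email.empty? || tool.empty?
        raise ArgumentError, "email and tool parameters must not be empty"
      end
      { "email" => email, "tool" => tool }
    end
  end
end

SketchNCBI.default_email = "user@example.org" # placeholder address
params = SketchNCBI.request_parameters
puts params["tool"]
```

With real BioRuby 1.4.0 or later, the corresponding setters named in the notes are Bio::NCBI.default_email= and Bio::NCBI.default_tool=.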