bio 1.4.0 → 1.4.1
- data/ChangeLog +1712 -0
- data/KNOWN_ISSUES.rdoc +11 -1
- data/README.rdoc +3 -2
- data/RELEASE_NOTES.rdoc +65 -127
- data/bioruby.gemspec +38 -2
- data/doc/RELEASE_NOTES-1.4.0.rdoc +167 -0
- data/doc/Tutorial.rd +74 -16
- data/doc/Tutorial.rd.html +68 -16
- data/lib/bio.rb +2 -0
- data/lib/bio/appl/clustalw/report.rb +18 -0
- data/lib/bio/appl/paml/codeml/report.rb +579 -21
- data/lib/bio/command.rb +149 -21
- data/lib/bio/db/aaindex.rb +11 -1
- data/lib/bio/db/embl/sptr.rb +1 -1
- data/lib/bio/db/fasta/defline.rb +7 -2
- data/lib/bio/db/fasta/qual.rb +24 -0
- data/lib/bio/db/fasta/qual_to_biosequence.rb +29 -0
- data/lib/bio/db/fastq.rb +15 -0
- data/lib/bio/db/go.rb +2 -2
- data/lib/bio/db/kegg/common.rb +109 -5
- data/lib/bio/db/kegg/genes.rb +61 -15
- data/lib/bio/db/kegg/genome.rb +43 -38
- data/lib/bio/db/kegg/module.rb +158 -0
- data/lib/bio/db/kegg/orthology.rb +40 -1
- data/lib/bio/db/kegg/pathway.rb +254 -0
- data/lib/bio/db/medline.rb +6 -2
- data/lib/bio/io/flatfile/autodetection.rb +6 -0
- data/lib/bio/location.rb +39 -0
- data/lib/bio/reference.rb +24 -0
- data/lib/bio/sequence.rb +2 -0
- data/lib/bio/sequence/adapter.rb +1 -0
- data/lib/bio/sequence/format.rb +14 -0
- data/lib/bio/sequence/sequence_masker.rb +95 -0
- data/lib/bio/tree.rb +4 -4
- data/lib/bio/util/restriction_enzyme/double_stranded/aligned_strands.rb +5 -0
- data/lib/bio/version.rb +1 -1
- data/setup.rb +5 -0
- data/test/data/KEGG/K02338.orthology +180 -52
- data/test/data/KEGG/M00118.module +44 -0
- data/test/data/KEGG/T00005.genome +140 -0
- data/test/data/KEGG/T00070.genome +34 -0
- data/test/data/KEGG/b0529.gene +47 -0
- data/test/data/KEGG/ec00072.pathway +23 -0
- data/test/data/KEGG/hsa00790.pathway +59 -0
- data/test/data/KEGG/ko00312.pathway +16 -0
- data/test/data/KEGG/map00030.pathway +37 -0
- data/test/data/KEGG/map00052.pathway +13 -0
- data/test/data/KEGG/rn00250.pathway +114 -0
- data/test/data/clustalw/example1.aln +58 -0
- data/test/data/go/selected_component.ontology +12 -0
- data/test/data/go/selected_gene_association.sgd +31 -0
- data/test/data/go/selected_wikipedia2go +13 -0
- data/test/data/medline/20146148_modified.medline +54 -0
- data/test/data/paml/codeml/models/aa.aln +26 -0
- data/test/data/paml/codeml/models/aa.dnd +13 -0
- data/test/data/paml/codeml/models/aa.ph +13 -0
- data/test/data/paml/codeml/models/alignment.phy +49 -0
- data/test/data/paml/codeml/models/results0-3.txt +312 -0
- data/test/data/paml/codeml/models/results7-8.txt +340 -0
- data/test/functional/bio/io/test_togows.rb +8 -8
- data/test/functional/bio/test_command.rb +7 -6
- data/test/unit/bio/appl/clustalw/test_report.rb +80 -0
- data/test/unit/bio/appl/paml/codeml/test_rates.rb +6 -6
- data/test/unit/bio/appl/paml/codeml/test_report.rb +231 -24
- data/test/unit/bio/appl/paml/codeml/test_report_single.rb +46 -0
- data/test/unit/bio/db/embl/test_sptr.rb +1 -1
- data/test/unit/bio/db/fasta/test_defline.rb +160 -0
- data/test/unit/bio/db/fasta/test_defline_misc.rb +490 -0
- data/test/unit/bio/db/kegg/test_genes.rb +281 -1
- data/test/unit/bio/db/kegg/test_genome.rb +408 -0
- data/test/unit/bio/db/kegg/test_module.rb +246 -0
- data/test/unit/bio/db/kegg/test_orthology.rb +95 -0
- data/test/unit/bio/db/kegg/test_pathway.rb +1250 -0
- data/test/unit/bio/db/test_aaindex.rb +8 -7
- data/test/unit/bio/db/test_fastq.rb +36 -0
- data/test/unit/bio/db/test_go.rb +171 -0
- data/test/unit/bio/db/test_medline.rb +148 -0
- data/test/unit/bio/db/test_qual.rb +9 -2
- data/test/unit/bio/sequence/test_sequence_masker.rb +169 -0
- data/test/unit/bio/test_tree.rb +260 -1
- data/test/unit/bio/util/test_contingency_table.rb +7 -7
- metadata +53 -6
data/KNOWN_ISSUES.rdoc
CHANGED
@@ -4,7 +4,7 @@ License:: The Ruby License
|
|
4
4
|
|
5
5
|
= Known issues and bugs in BioRuby
|
6
6
|
|
7
|
-
Below are known issues and bugs in BioRuby.
|
7
|
+
Below are known issues and bugs in BioRuby. Patches to fix them are welcome.
|
8
8
|
We hope they will be fixed in the future.
|
9
9
|
|
10
10
|
Items marked with (WONT_FIX) tags would not be fixed within BioRuby because
|
@@ -69,6 +69,16 @@ This indicates that br_bioflat.rb and Bio::FlatFileIndex may create
|
|
69
69
|
incorrect indexes on mswin32, mingw32, and bccwin32. In addition,
|
70
70
|
Bio::FlatFile may return incorrect data.
|
71
71
|
|
72
|
+
==== String escaping of command-line arguments
|
73
|
+
|
74
|
+
After BioRuby 1.4.1, in Ruby 1.9.X running on Windows, escaping of
|
75
|
+
command-line arguments is processed by the Ruby interpreter. Before BioRuby
|
76
|
+
1.4.0, the escaping is executed in Bio::Command#escape_shell_windows, and
|
77
|
+
the behavior is different from the Ruby interpreter's one.
|
78
|
+
|
79
|
+
Currently, due to the change, test/functional/bio/test_command.rb may fail
|
80
|
+
on Windows with Ruby 1.9.X.
|
81
|
+
|
72
82
|
==== Windows 95/98/98SE/ME
|
73
83
|
|
74
84
|
(WONT_FIX) Some methods that call external programs may not work in
|
data/README.rdoc
CHANGED
@@ -9,7 +9,7 @@ License:: The Ruby License
|
|
9
9
|
|
10
10
|
= BioRuby
|
11
11
|
|
12
|
-
Copyright (C) 2001-
|
12
|
+
Copyright (C) 2001-2010 Toshiaki Katayama <k@bioruby.org>
|
13
13
|
|
14
14
|
BioRuby is an open source Ruby library for developing bioinformatics
|
15
15
|
software. Object oriented scripting language Ruby has many features
|
@@ -42,6 +42,7 @@ services including KEGG API can be easily utilized by BioRuby.
|
|
42
42
|
README.rdoc:: This file. General information and installation procedure.
|
43
43
|
RELEASE_NOTES.rdoc:: News and important changes in this release.
|
44
44
|
KNOWN_ISSUES.rdoc:: Known issues and bugs in BioRuby.
|
45
|
+
doc/RELEASE_NOTES-1.4.0.rdoc:: News and incompatible changes from 1.3.1 to 1.4.0.
|
45
46
|
doc/Changes-1.3.rdoc:: News and incompatible changes from 1.2.1 to 1.3.0.
|
46
47
|
doc/Changes-0.7.rd:: News and incompatible changes from 0.6.4 to 1.2.1.
|
47
48
|
|
@@ -116,7 +117,7 @@ and could be obtained by the following procedure.
|
|
116
117
|
* Ruby 1.8.2 or later (except Ruby 1.9.0) -- http://www.ruby-lang.org/
|
117
118
|
* Ruby 1.8.7-p174 or later, or Ruby 1.8.6-p383 or later is recommended.
|
118
119
|
* Not yet fully ready with Ruby 1.9, although many components can now work
|
119
|
-
in Ruby 1.9.1.
|
120
|
+
in Ruby 1.9.1 and Ruby 1.9.2.
|
120
121
|
|
121
122
|
== OPTIONAL REQUIREMENTS
|
122
123
|
|
data/RELEASE_NOTES.rdoc
CHANGED
@@ -1,166 +1,104 @@
|
|
1
|
-
= BioRuby 1.4.
|
1
|
+
= BioRuby 1.4.1 RELEASE NOTES
|
2
2
|
|
3
|
-
A lot of changes have been made to the BioRuby 1.4.
|
3
|
+
A lot of changes have been made to the BioRuby 1.4.1 after the version 1.4.0
|
4
4
|
is released. This document describes important and/or incompatible changes
|
5
|
-
since the BioRuby 1.
|
5
|
+
since the BioRuby 1.4.0 release.
|
6
6
|
|
7
|
-
|
8
|
-
|
9
|
-
=== PhyloXML support
|
10
|
-
|
11
|
-
Support for reading and writing PhyloXML file format is added. New classes
|
12
|
-
Bio::PhyloXML::Parser and Bio::PhyloXML::Writer are used to read and write
|
13
|
-
a PhyloXML file, respectively.
|
14
|
-
|
15
|
-
The code is developed by Diana Jaunzeikare, mentored by Christian M Zmasek
|
16
|
-
and co-mentors, supported by Google Summer of Code 2009 in collaboration
|
17
|
-
with the National Evolutionary Synthesis Center (NESCent).
|
18
|
-
|
19
|
-
=== FASTQ file format support
|
7
|
+
For known problems, see KNOWN_ISSUES.rdoc.
|
20
8
|
|
21
|
-
|
22
|
-
three FASTQ format variants are supported.
|
9
|
+
== New features
|
23
10
|
|
24
|
-
|
25
|
-
of the FASTQ format is supported (although the three format variants should
|
26
|
-
be specified later by users if quality scores are needed).
|
11
|
+
=== PAML Codeml support is significantly improved
|
27
12
|
|
28
|
-
|
29
|
-
|
30
|
-
"to_biosequence" method. Bio::Sequence#output now supports output of the
|
31
|
-
FASTQ format.
|
13
|
+
PAML Codeml result parser is completely rewritten and is significantly
|
14
|
+
improved. The code is developed by Pjotr Prins.
|
32
15
|
|
33
|
-
|
34
|
-
open-bio-l mailing list. The prototype of Bio::Fastq class was first
|
35
|
-
developed during the BioHackathon 2009 held in Okinawa.
|
16
|
+
=== KEGG PATHWAY and KEGG MODULE parser
|
36
17
|
|
37
|
-
|
18
|
+
Parsers for KEGG PATHWAY and KEGG MODULE data are added. The code is developed
|
19
|
+
by Kozo Nishida and Toshiaki Katayama.
|
38
20
|
|
39
|
-
|
40
|
-
formats are supported. The code is developed by Anthony Underwood.
|
21
|
+
=== Bio::KEGG improvements
|
41
22
|
|
42
|
-
|
23
|
+
Following new methods are added.
|
43
24
|
|
44
|
-
|
45
|
-
|
46
|
-
|
25
|
+
* Bio::KEGG::GENES#keggclass, keggclasses, names_as_array, names,
|
26
|
+
motifs_as_strings, motifs_as_hash, motifs
|
27
|
+
* Bio::KEGG::GENOME#original_databases
|
47
28
|
|
48
|
-
===
|
29
|
+
=== Test codes are added and improved.
|
49
30
|
|
50
|
-
|
51
|
-
|
52
|
-
addition, return value types of some methods are also changed for unifying
|
53
|
-
APIs among KEGG parser classes. See incompatible changes below for details.
|
31
|
+
Test codes are added and improved. They are developed by Kazuhiro Hayashi,
|
32
|
+
Kozo Nishida, John Prince, and Naohisa Goto.
|
54
33
|
|
55
|
-
===
|
34
|
+
=== Other new methods
|
56
35
|
|
57
|
-
|
58
|
-
|
59
|
-
|
36
|
+
* Bio::Fastq#mask
|
37
|
+
* Bio::Sequence#output_fasta
|
38
|
+
* Bio::ClustalW::Report#get_sequence
|
39
|
+
* Bio::Reference#==
|
40
|
+
* Bio::Location#==
|
41
|
+
* Bio::Locations#==
|
42
|
+
* Bio::FastaNumericFormat#to_biosequence
|
60
43
|
|
61
|
-
|
44
|
+
== Bug fixes
|
62
45
|
|
63
|
-
|
64
|
-
and the library path and test data path can be specified with environment
|
65
|
-
variables. BIORUBY_TEST_LIB is the path to be added to the Ruby's $LOAD_PATH.
|
66
|
-
For example, to test BioRuby installed in
|
67
|
-
/usr/local/lib/site_ruby/1.8, run
|
68
|
-
env BIORUBY_TEST_LIB=/usr/local/lib/site_ruby/1.8 ruby test/runner.rb
|
46
|
+
=== Bio::Tree
|
69
47
|
|
70
|
-
|
71
|
-
flag to turn on debug of the tests.
|
48
|
+
Following methods did not work correctly.
|
72
49
|
|
73
|
-
|
50
|
+
* Bio::Tree#collect_edge!
|
51
|
+
* Bio::Tree#remove_edge_if
|
74
52
|
|
75
|
-
===
|
53
|
+
=== Bio::KEGG::GENES and Bio::KEGG::GENOME
|
76
54
|
|
77
|
-
|
78
|
-
|
55
|
+
* Fixed bugs in Bio::KEGG::GENES#pathway.
|
56
|
+
* Fixed parser errors due to the format changes of KEGG GENES and KEGG GENOME.
|
79
57
|
|
80
|
-
===
|
58
|
+
=== Other bug fixes
|
81
59
|
|
82
|
-
|
83
|
-
|
84
|
-
|
60
|
+
* In Bio::Command, changed not to call fork(2) on platforms that do not
|
61
|
+
support it.
|
62
|
+
* Bio::MEDLINE#initialize should handle continuation of lines.
|
63
|
+
* Typo and a missing field in Bio::GO::GeneAssociation#to_str.
|
64
|
+
* Bug fix of Bio::FastaNumericFormat#to_biosequence.
|
65
|
+
* Fixed UniProt GN parsing issue in Bio::SPTR.
|
85
66
|
|
86
67
|
== Incompatible changes
|
87
68
|
|
88
|
-
=== Bio::
|
89
|
-
|
90
|
-
NCBI announces that all Entrez E-utility requests must contain email and
|
91
|
-
tool parameters, and requests without them will return error after June
|
92
|
-
2010.
|
93
|
-
|
94
|
-
To set default email address and tool name, following methods are added.
|
95
|
-
* Bio::NCBI.default_email=(email)
|
96
|
-
* Bio::NCBI.default_tool=(tool_name)
|
97
|
-
|
98
|
-
For every query, Bio::NCBI::REST checks the email and tool parameters and
|
99
|
-
raises error if they are empty.
|
100
|
-
|
101
|
-
IMPORTANT NOTE: No default email address is preset in BioRuby. Programmers
|
102
|
-
using BioRuby must set their own email address or implement to get user's
|
103
|
-
email address in some way (from input form, configuration file, etc).
|
104
|
-
|
105
|
-
Default tool name is set as "#{$0} (bioruby/#{Bio::BIORUBY_VERSION_ID})".
|
106
|
-
For example, if you run "ruby my_script.rb" with BioRuby 1.4.0, the value is
|
107
|
-
"my_script.rb (bioruby/1.4.0)".
|
108
|
-
|
109
|
-
=== Bio::KEGG
|
110
|
-
|
111
|
-
==== dblinks method
|
112
|
-
|
113
|
-
In Bio::KEGG::COMPOUND, DRUG, ENZYME, GLYCAN and ORTHOLOGY, the method
|
114
|
-
dblinks is changed to return a Hash. Each key of the hash is a database name
|
115
|
-
and its value is an array of entry IDs in the database. If old behavior
|
116
|
-
(returns raw entry lines as an array of strings) is needed, use
|
117
|
-
dblinks_as_strings.
|
118
|
-
|
119
|
-
==== pathways method
|
120
|
-
|
121
|
-
In Bio::KEGG::COMPOUND, DRUG, ENZYME, GENES, GLYCAN and REACTION, the
|
122
|
-
method pathways is changed to return a Hash. Each key of the hash is a
|
123
|
-
pathway ID and its value is the description of the pathway.
|
124
|
-
|
125
|
-
In Bio::KEGG::GENES, if old behavior (returns pathway IDs as an Array) is
|
126
|
-
needed, use pathways.keys.
|
127
|
-
|
128
|
-
In Bio::KEGG::COMPOUND, DRUG, ENZYME, GLYCAN, and REACTION, if old behavior
|
129
|
-
(returns raw entry lines as an array of strings) is needed, use
|
130
|
-
pathways_as_strings.
|
131
|
-
|
132
|
-
Note that Bio::KEGG::ORTHOLOGY#pathways is not changed (returns an array
|
133
|
-
containing pathway IDs).
|
69
|
+
=== Bio::PAML::Codeml::Report
|
134
70
|
|
135
|
-
|
71
|
+
The code is completely rewritten. See the RDoc for details.
|
136
72
|
|
137
|
-
|
138
|
-
changed to return a Hash. Each key of the hash is a ortholog ID and its
|
139
|
-
value is the name of the ortholog. If old behavior (returns raw entry lines
|
140
|
-
as an array of strings) is needed, use orthologs_as_strings.
|
73
|
+
=== Bio::KEGG::ORTHOLOGY
|
141
74
|
|
142
|
-
|
75
|
+
Bio::KEGG::ORTHOLOGY#pathways is changed to return a hash. The old pathway
|
76
|
+
method is renamed to pathways_in_keggclass for compatibility.
|
143
77
|
|
144
|
-
|
145
|
-
return a Hash that is the same as Bio::KEGG::ORTHOLOGY#genes_as_hash.
|
146
|
-
If old behavior (returns raw entry lines as an array of strings) is needed,
|
147
|
-
use genes_as_strings.
|
78
|
+
=== Bio::AAindex2
|
148
79
|
|
149
|
-
|
80
|
+
Bio::AAindex2 now copies each symmetric element for lower triangular matrix
|
81
|
+
to the upper right part, because the Matrix class in Ruby 1.9.2 no longer
|
82
|
+
accepts any dimension mismatches. We think the previous behavior is a bug.
|
150
83
|
|
151
|
-
Bio::
|
152
|
-
hash is a KEGG Rpair ID and its value is an array containing name and type.
|
153
|
-
If old behavior (returns as tokens) is needed, use rpairs_as_tokens.
|
84
|
+
=== Bio::MEDLINE
|
154
85
|
|
155
|
-
|
86
|
+
Bio::MEDLINE#reference no longer puts empty values in the returned
|
87
|
+
Bio::Reference object. We think the previous behavior is a bug.
|
88
|
+
We also think the effect is very small.
|
156
89
|
|
157
|
-
|
90
|
+
== Known issues
|
158
91
|
|
159
|
-
|
92
|
+
The following issues are added or updated. See KNOWN_ISSUES.rdoc for other
|
93
|
+
already known issues.
|
160
94
|
|
161
|
-
|
95
|
+
=== String escaping of command-line arguments in Ruby 1.9.X on Windows
|
162
96
|
|
163
|
-
|
97
|
+
After BioRuby 1.4.1, in Ruby 1.9.X running on Windows, escaping of
|
98
|
+
command-line arguments is processed by the Ruby interpreter. Before BioRuby
|
99
|
+
1.4.0, the escaping is executed in Bio::Command#escape_shell_windows, and
|
100
|
+
the behavior is different from the Ruby interpreter's one.
|
164
101
|
|
165
|
-
|
102
|
+
Currently, due to the change, test/functional/bio/test_command.rb may fail
|
103
|
+
on Windows with Ruby 1.9.X.
|
166
104
|
|
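The Bio::AAindex2 note in the release notes above says that each element of the lower triangular matrix is now copied to the upper right part, because Ruby's Matrix class rejects rows of differing lengths. The following is a minimal sketch of that idea using only Ruby's standard Matrix library. The helper name symmetrize_lower_triangular and the sample values are hypothetical and are not taken from BioRuby's source.

```ruby
require 'matrix'

# Hypothetical helper (not BioRuby's implementation): mirror a
# lower-triangular list of rows into a full square array so that
# every row has the same length.
# Assumes row i contains exactly i + 1 values.
def symmetrize_lower_triangular(rows)
  n = rows.size
  full = Array.new(n) { Array.new(n) }
  rows.each_with_index do |row, i|
    row.each_with_index do |value, j|
      full[i][j] = value
      full[j][i] = value # copy the symmetric element to the upper right
    end
  end
  full
end

# Made-up sample data in lower-triangular form.
lower = [
  [2],
  [-1, 3],
  [0, 1, 4]
]

matrix = Matrix.rows(symmetrize_lower_triangular(lower))
puts matrix[0, 2]       # element above the diagonal, mirrored from lower[2][0]
puts matrix.symmetric?
```

Building the full square array first keeps the Matrix constructor call unchanged, so only the parsing step needs to know about the triangular layout.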
data/bioruby.gemspec
CHANGED
@@ -3,7 +3,7 @@
|
|
3
3
|
#
|
4
4
|
Gem::Specification.new do |s|
|
5
5
|
s.name = 'bio'
|
6
|
-
s.version = "1.4.
|
6
|
+
s.version = "1.4.1"
|
7
7
|
|
8
8
|
s.author = "BioRuby project"
|
9
9
|
s.email = "staff@bioruby.org"
|
@@ -37,6 +37,7 @@ Gem::Specification.new do |s|
|
|
37
37
|
"doc/Changes-1.3.rdoc",
|
38
38
|
"doc/KEGG_API.rd",
|
39
39
|
"doc/KEGG_API.rd.ja",
|
40
|
+
"doc/RELEASE_NOTES-1.4.0.rdoc",
|
40
41
|
"doc/Tutorial.rd",
|
41
42
|
"doc/Tutorial.rd.html",
|
42
43
|
"doc/Tutorial.rd.ja",
|
@@ -124,6 +125,7 @@ Gem::Specification.new do |s|
|
|
124
125
|
"lib/bio/db/fasta/format_fasta.rb",
|
125
126
|
"lib/bio/db/fasta/format_qual.rb",
|
126
127
|
"lib/bio/db/fasta/qual.rb",
|
128
|
+
"lib/bio/db/fasta/qual_to_biosequence.rb",
|
127
129
|
"lib/bio/db/fastq.rb",
|
128
130
|
"lib/bio/db/fastq/fastq_to_biosequence.rb",
|
129
131
|
"lib/bio/db/fastq/format_fastq.rb",
|
@@ -147,7 +149,9 @@ Gem::Specification.new do |s|
|
|
147
149
|
"lib/bio/db/kegg/glycan.rb",
|
148
150
|
"lib/bio/db/kegg/keggtab.rb",
|
149
151
|
"lib/bio/db/kegg/kgml.rb",
|
152
|
+
"lib/bio/db/kegg/module.rb",
|
150
153
|
"lib/bio/db/kegg/orthology.rb",
|
154
|
+
"lib/bio/db/kegg/pathway.rb",
|
151
155
|
"lib/bio/db/kegg/reaction.rb",
|
152
156
|
"lib/bio/db/kegg/taxonomy.rb",
|
153
157
|
"lib/bio/db/lasergene.rb",
|
@@ -219,6 +223,7 @@ Gem::Specification.new do |s|
|
|
219
223
|
"lib/bio/sequence/generic.rb",
|
220
224
|
"lib/bio/sequence/na.rb",
|
221
225
|
"lib/bio/sequence/quality_score.rb",
|
226
|
+
"lib/bio/sequence/sequence_masker.rb",
|
222
227
|
"lib/bio/shell.rb",
|
223
228
|
"lib/bio/shell/core.rb",
|
224
229
|
"lib/bio/shell/demo.rb",
|
@@ -370,7 +375,17 @@ Gem::Specification.new do |s|
|
|
370
375
|
"test/data/KEGG/G00024.glycan",
|
371
376
|
"test/data/KEGG/G01366.glycan",
|
372
377
|
"test/data/KEGG/K02338.orthology",
|
378
|
+
"test/data/KEGG/M00118.module",
|
373
379
|
"test/data/KEGG/R00006.reaction",
|
380
|
+
"test/data/KEGG/T00005.genome",
|
381
|
+
"test/data/KEGG/T00070.genome",
|
382
|
+
"test/data/KEGG/b0529.gene",
|
383
|
+
"test/data/KEGG/ec00072.pathway",
|
384
|
+
"test/data/KEGG/hsa00790.pathway",
|
385
|
+
"test/data/KEGG/ko00312.pathway",
|
386
|
+
"test/data/KEGG/map00030.pathway",
|
387
|
+
"test/data/KEGG/map00052.pathway",
|
388
|
+
"test/data/KEGG/rn00250.pathway",
|
374
389
|
"test/data/SOSUI/sample.report",
|
375
390
|
"test/data/TMHMM/sample.report",
|
376
391
|
"test/data/aaindex/DAYM780301",
|
@@ -383,6 +398,7 @@ Gem::Specification.new do |s|
|
|
383
398
|
"test/data/blast/b0002.faa.m7",
|
384
399
|
"test/data/blast/b0002.faa.m8",
|
385
400
|
"test/data/blast/blastp-multi.m7",
|
401
|
+
"test/data/clustalw/example1.aln",
|
386
402
|
"test/data/command/echoarg2.bat",
|
387
403
|
"test/data/embl/AB090716.embl",
|
388
404
|
"test/data/embl/AB090716.embl.rel89",
|
@@ -441,13 +457,23 @@ Gem::Specification.new do |s|
|
|
441
457
|
"test/data/fastq/wrapping_original_sanger.fastq",
|
442
458
|
"test/data/gcg/pileup-aa.msf",
|
443
459
|
"test/data/genscan/sample.report",
|
460
|
+
"test/data/go/selected_component.ontology",
|
461
|
+
"test/data/go/selected_gene_association.sgd",
|
462
|
+
"test/data/go/selected_wikipedia2go",
|
444
463
|
"test/data/iprscan/merged.raw",
|
445
464
|
"test/data/iprscan/merged.txt",
|
465
|
+
"test/data/medline/20146148_modified.medline",
|
446
466
|
"test/data/meme/db",
|
447
467
|
"test/data/meme/mast",
|
448
468
|
"test/data/meme/mast.out",
|
449
469
|
"test/data/meme/meme.out",
|
450
470
|
"test/data/paml/codeml/control_file.txt",
|
471
|
+
"test/data/paml/codeml/models/aa.aln",
|
472
|
+
"test/data/paml/codeml/models/aa.dnd",
|
473
|
+
"test/data/paml/codeml/models/aa.ph",
|
474
|
+
"test/data/paml/codeml/models/alignment.phy",
|
475
|
+
"test/data/paml/codeml/models/results0-3.txt",
|
476
|
+
"test/data/paml/codeml/models/results7-8.txt",
|
451
477
|
"test/data/paml/codeml/output.txt",
|
452
478
|
"test/data/paml/codeml/rates",
|
453
479
|
"test/data/phyloxml/apaf.xml",
|
@@ -479,6 +505,7 @@ Gem::Specification.new do |s|
|
|
479
505
|
"test/unit/bio/appl/blast/test_ncbioptions.rb",
|
480
506
|
"test/unit/bio/appl/blast/test_report.rb",
|
481
507
|
"test/unit/bio/appl/blast/test_rpsblast.rb",
|
508
|
+
"test/unit/bio/appl/clustalw/test_report.rb",
|
482
509
|
"test/unit/bio/appl/gcg/test_msf.rb",
|
483
510
|
"test/unit/bio/appl/genscan/test_report.rb",
|
484
511
|
"test/unit/bio/appl/hmmer/test_report.rb",
|
@@ -489,6 +516,7 @@ Gem::Specification.new do |s|
|
|
489
516
|
"test/unit/bio/appl/meme/test_motif.rb",
|
490
517
|
"test/unit/bio/appl/paml/codeml/test_rates.rb",
|
491
518
|
"test/unit/bio/appl/paml/codeml/test_report.rb",
|
519
|
+
"test/unit/bio/appl/paml/codeml/test_report_single.rb",
|
492
520
|
"test/unit/bio/appl/paml/test_codeml.rb",
|
493
521
|
"test/unit/bio/appl/sim4/test_report.rb",
|
494
522
|
"test/unit/bio/appl/sosui/test_report.rb",
|
@@ -508,13 +536,18 @@ Gem::Specification.new do |s|
|
|
508
536
|
"test/unit/bio/db/embl/test_embl_to_bioseq.rb",
|
509
537
|
"test/unit/bio/db/embl/test_sptr.rb",
|
510
538
|
"test/unit/bio/db/embl/test_uniprot.rb",
|
539
|
+
"test/unit/bio/db/fasta/test_defline.rb",
|
540
|
+
"test/unit/bio/db/fasta/test_defline_misc.rb",
|
511
541
|
"test/unit/bio/db/fasta/test_format_qual.rb",
|
512
542
|
"test/unit/bio/db/kegg/test_compound.rb",
|
513
543
|
"test/unit/bio/db/kegg/test_drug.rb",
|
514
544
|
"test/unit/bio/db/kegg/test_enzyme.rb",
|
515
545
|
"test/unit/bio/db/kegg/test_genes.rb",
|
546
|
+
"test/unit/bio/db/kegg/test_genome.rb",
|
516
547
|
"test/unit/bio/db/kegg/test_glycan.rb",
|
548
|
+
"test/unit/bio/db/kegg/test_module.rb",
|
517
549
|
"test/unit/bio/db/kegg/test_orthology.rb",
|
550
|
+
"test/unit/bio/db/kegg/test_pathway.rb",
|
518
551
|
"test/unit/bio/db/kegg/test_reaction.rb",
|
519
552
|
"test/unit/bio/db/pdb/test_pdb.rb",
|
520
553
|
"test/unit/bio/db/sanger_chromatogram/test_abif.rb",
|
@@ -523,6 +556,7 @@ Gem::Specification.new do |s|
|
|
523
556
|
"test/unit/bio/db/test_fasta.rb",
|
524
557
|
"test/unit/bio/db/test_fastq.rb",
|
525
558
|
"test/unit/bio/db/test_gff.rb",
|
559
|
+
"test/unit/bio/db/test_go.rb",
|
526
560
|
"test/unit/bio/db/test_lasergene.rb",
|
527
561
|
"test/unit/bio/db/test_medline.rb",
|
528
562
|
"test/unit/bio/db/test_newick.rb",
|
@@ -548,6 +582,7 @@ Gem::Specification.new do |s|
|
|
548
582
|
"test/unit/bio/sequence/test_dblink.rb",
|
549
583
|
"test/unit/bio/sequence/test_na.rb",
|
550
584
|
"test/unit/bio/sequence/test_quality_score.rb",
|
585
|
+
"test/unit/bio/sequence/test_sequence_masker.rb",
|
551
586
|
"test/unit/bio/shell/plugin/test_seq.rb",
|
552
587
|
"test/unit/bio/test_alignment.rb",
|
553
588
|
"test/unit/bio/test_command.rb",
|
@@ -588,7 +623,8 @@ Gem::Specification.new do |s|
|
|
588
623
|
"README.rdoc",
|
589
624
|
"README_DEV.rdoc",
|
590
625
|
"RELEASE_NOTES.rdoc",
|
591
|
-
"doc/Changes-1.3.rdoc"
|
626
|
+
"doc/Changes-1.3.rdoc",
|
627
|
+
"doc/RELEASE_NOTES-1.4.0.rdoc"
|
592
628
|
]
|
593
629
|
s.rdoc_options << '--main' << 'README.rdoc'
|
594
630
|
s.rdoc_options << '--title' << 'BioRuby API documentation'
|
data/doc/RELEASE_NOTES-1.4.0.rdoc
ADDED
@@ -0,0 +1,167 @@
|
|
1
|
+
= BioRuby 1.4.0 RELEASE NOTES
|
2
|
+
|
3
|
+
A lot of changes have been made to the BioRuby 1.4.0 after the version 1.3.1
|
4
|
+
is released. This document describes important and/or incompatible changes
|
5
|
+
since the BioRuby 1.3.1 release.
|
6
|
+
|
7
|
+
== New features
|
8
|
+
|
9
|
+
=== PhyloXML support
|
10
|
+
|
11
|
+
Support for reading and writing PhyloXML file format is added. New classes
|
12
|
+
Bio::PhyloXML::Parser and Bio::PhyloXML::Writer are used to read and write
|
13
|
+
a PhyloXML file, respectively.
|
14
|
+
|
15
|
+
The code is developed by Diana Jaunzeikare, mentored by Christian M Zmasek
|
16
|
+
and co-mentors, supported by Google Summer of Code 2009 in collaboration
|
17
|
+
with the National Evolutionary Synthesis Center (NESCent).
|
18
|
+
|
19
|
+
=== FASTQ file format support
|
20
|
+
|
21
|
+
Support for reading and writing FASTQ file format is added. All of the
|
22
|
+
three FASTQ format variants are supported.
|
23
|
+
|
24
|
+
To read a FASTQ file, Bio::FlatFile can be used. File format auto-detection
|
25
|
+
of the FASTQ format is supported (although the three format variants should
|
26
|
+
be specified later by users if quality scores are needed).
|
27
|
+
|
28
|
+
New class Bio::Fastq is the parser class for the FASTQ format. An object
|
29
|
+
of the Bio::Fastq class can be converted to a Bio::Sequence object with the
|
30
|
+
"to_biosequence" method. Bio::Sequence#output now supports output of the
|
31
|
+
FASTQ format.
|
32
|
+
|
33
|
+
The code is written by Naohisa Goto, with the help of discussions in the
|
34
|
+
open-bio-l mailing list. The prototype of Bio::Fastq class was first
|
35
|
+
developed during the BioHackathon 2009 held in Okinawa.
|
36
|
+
|
37
|
+
=== DNA chromatogram support
|
38
|
+
|
39
|
+
Support for reading DNA chromatogram files is added. SCF and ABIF file
|
40
|
+
formats are supported. The code is developed by Anthony Underwood.
|
41
|
+
|
42
|
+
=== MEME (motif-based sequence analysis tools) support
|
43
|
+
|
44
|
+
Support for running MAST (Motif Alignment & Search Tool, part of the MEME
|
45
|
+
Suite, motif-based sequence analysis tools) and parsing its results are
|
46
|
+
added. The code is developed by Adam Kraut.
|
47
|
+
|
48
|
+
=== Improvement of KEGG parser classes
|
49
|
+
|
50
|
+
Some new methods are added to parse new fields added to some KEGG file
|
51
|
+
formats. Unit tests for KEGG parsers are also added and improved. In
|
52
|
+
addition, return value types of some methods are also changed for unifying
|
53
|
+
APIs among KEGG parser classes. See incompatible changes below for details.
|
54
|
+
Most of them are contributed by Kozo Nishida.
|
55
|
+
|
56
|
+
=== Many sample scripts are added
|
57
|
+
|
58
|
+
Many sample scripts showing demonstrations of usages of classes are added.
|
59
|
+
They are moved from primitive test codes for the classes described in the
|
60
|
+
"if __FILE__ == $0" convention in the library files.
|
61
|
+
|
62
|
+
=== Unit tests can test installed BioRuby
|
63
|
+
|
64
|
+
Mechanism to load library and to find test data in the unit tests are changed,
|
65
|
+
and the library path and test data path can be specified with environment
|
66
|
+
variables. BIORUBY_TEST_LIB is the path to be added to the Ruby's $LOAD_PATH.
|
67
|
+
For example, to test BioRuby installed in
|
68
|
+
/usr/local/lib/site_ruby/1.8, run
|
69
|
+
env BIORUBY_TEST_LIB=/usr/local/lib/site_ruby/1.8 ruby test/runner.rb
|
70
|
+
|
71
|
+
BIORUBY_TEST_DATA is the path of the test data, and BIORUBY_TEST_DEBUG is a
|
72
|
+
flag to turn on debug of the tests.
|
73
|
+
|
74
|
+
== Deprecated features
|
75
|
+
|
76
|
+
=== ChangeLog is replaced by git log
|
77
|
+
|
78
|
+
ChangeLog is replaced by the output of git-log command, and ChangeLog before
|
79
|
+
the 1.3.1 release is moved to doc/ChangeLog-before-1.3.1.
|
80
|
+
|
81
|
+
=== "if __FILE__ == $0" convention
|
82
|
+
|
83
|
+
Primitive test codes in the "if __FILE__ == $0" convention are removed and
|
84
|
+
the codes are moved to the sample scripts named sample/demo_*.rb (except
|
85
|
+
some older or deprecated files).
|
86
|
+
|
87
|
+
== Incompatible changes
|
88
|
+
|
89
|
+
=== Bio::NCBI::REST
|
90
|
+
|
91
|
+
NCBI announces that all Entrez E-utility requests must contain email and
|
92
|
+
tool parameters, and requests without them will return error after June
|
93
|
+
2010.
|
94
|
+
|
95
|
+
To set default email address and tool name, following methods are added.
|
96
|
+
* Bio::NCBI.default_email=(email)
|
97
|
+
* Bio::NCBI.default_tool=(tool_name)
|
98
|
+
|
99
|
+
For every query, Bio::NCBI::REST checks the email and tool parameters and
|
100
|
+
raises error if they are empty.
|
101
|
+
|
102
|
+
IMPORTANT NOTE: No default email address is preset in BioRuby. Programmers
|
103
|
+
using BioRuby must set their own email address or implement to get user's
|
104
|
+
email address in some way (from input form, configuration file, etc).
|
105
|
+
|
106
|
+
Default tool name is set as "#{$0} (bioruby/#{Bio::BIORUBY_VERSION_ID})".
|
107
|
+
For example, if you run "ruby my_script.rb" with BioRuby 1.4.0, the value is
|
108
|
+
"my_script.rb (bioruby/1.4.0)".
|
109
|
+
|
110
|
+
=== Bio::KEGG
|
111
|
+
|
112
|
+
==== dblinks method
|
113
|
+
|
114
|
+
In Bio::KEGG::COMPOUND, DRUG, ENZYME, GLYCAN and ORTHOLOGY, the method
|
115
|
+
dblinks is changed to return a Hash. Each key of the hash is a database name
|
116
|
+
and its value is an array of entry IDs in the database. If old behavior
|
117
|
+
(returns raw entry lines as an array of strings) is needed, use
|
118
|
+
dblinks_as_strings.
|
119
|
+
|
120
|
+
==== pathways method
|
121
|
+
|
122
|
+
In Bio::KEGG::COMPOUND, DRUG, ENZYME, GENES, GLYCAN and REACTION, the
|
123
|
+
method pathways is changed to return a Hash. Each key of the hash is a
|
124
|
+
pathway ID and its value is the description of the pathway.
|
125
|
+
|
126
|
+
In Bio::KEGG::GENES, if old behavior (returns pathway IDs as an Array) is
|
127
|
+
needed, use pathways.keys.
|
128
|
+
|
129
|
+
In Bio::KEGG::COMPOUND, DRUG, ENZYME, GLYCAN, and REACTION, if old behavior
|
130
|
+
(returns raw entry lines as an array of strings) is needed, use
|
131
|
+
pathways_as_strings.
|
132
|
+
|
133
|
+
Note that Bio::KEGG::ORTHOLOGY#pathways is not changed (returns an array
|
134
|
+
containing pathway IDs).
|
135
|
+
|
136
|
+
==== orthologs method
|
137
|
+
|
138
|
+
In Bio::KEGG::ENZYME, GENES, GLYCAN and REACTION, the method orthologs is
|
139
|
+
changed to return a Hash. Each key of the hash is a ortholog ID and its
|
140
|
+
value is the name of the ortholog. If old behavior (returns raw entry lines
|
141
|
+
as an array of strings) is needed, use orthologs_as_strings.
|
142
|
+
|
143
|
+
==== genes method
|
144
|
+
|
145
|
+
Bio::KEGG::ENZYME#genes and Bio::KEGG::ORTHOLOGY#genes are changed to
|
146
|
+
return a Hash that is the same as Bio::KEGG::ORTHOLOGY#genes_as_hash.
|
147
|
+
If old behavior (returns raw entry lines as an array of strings) is needed,
|
148
|
+
use genes_as_strings.
|
149
|
+
|
150
|
+
==== Bio::KEGG::REACTION#rpairs
|
151
|
+
|
152
|
+
Bio::KEGG::REACTION#rpairs is changed to return a Hash. Each key of the
|
153
|
+
hash is a KEGG Rpair ID and its value is an array containing name and type.
|
154
|
+
If old behavior (returns as tokens) is needed, use rpairs_as_tokens.
|
155
|
+
|
156
|
+
==== Bio::KEGG::ORTHOLOGY
|
157
|
+
|
158
|
+
Bio::KEGG::ORTHOLOGY#dblinks_as_hash does not lower-case database names.
|
159
|
+
|
160
|
+
=== Bio::RestrictionEnzyme
|
161
|
+
|
162
|
+
Format validation when creating an object is turned off because of efficiency.
|
163
|
+
|
164
|
+
== Known problems
|
165
|
+
|
166
|
+
See KNOWN_ISSUES.rdoc for details.
|
167
|
+
|
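The 1.4.0 notes above describe dblinks changing from an array of raw entry lines to a Hash keyed by database name, with an array of entry IDs as each value. The sketch below illustrates that shape of conversion in plain Ruby. The function dblinks_lines_to_hash and the sample lines are hypothetical and only approximate the idea. They are not BioRuby's parser, and the real KEGG field layout may differ.

```ruby
# Hypothetical helper (not part of BioRuby): turn raw "DB: id id ..."
# strings into a Hash of database name => array of entry IDs.
def dblinks_lines_to_hash(lines)
  hash = {}
  lines.each do |line|
    db, ids = line.split(/:\s*/, 2)
    next if ids.nil? # skip lines without a "name: ids" shape
    hash[db] = ids.split(/\s+/)
  end
  hash
end

# Made-up sample lines in the assumed "database: IDs" layout.
raw_lines = [
  "CAS: 56-65-5",
  "PubChem: 3304",
  "3DMET: B01125 B99999"
]

links = dblinks_lines_to_hash(raw_lines)
links.each do |db, ids|
  puts "#{db} => #{ids.join(', ')}"
end
```

Code that relied on the older array-of-strings return value can, per the notes above, call dblinks_as_strings instead of converting by hand.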