bio 1.4.0 → 1.4.1

Files changed (82)
  1. data/ChangeLog +1712 -0
  2. data/KNOWN_ISSUES.rdoc +11 -1
  3. data/README.rdoc +3 -2
  4. data/RELEASE_NOTES.rdoc +65 -127
  5. data/bioruby.gemspec +38 -2
  6. data/doc/RELEASE_NOTES-1.4.0.rdoc +167 -0
  7. data/doc/Tutorial.rd +74 -16
  8. data/doc/Tutorial.rd.html +68 -16
  9. data/lib/bio.rb +2 -0
  10. data/lib/bio/appl/clustalw/report.rb +18 -0
  11. data/lib/bio/appl/paml/codeml/report.rb +579 -21
  12. data/lib/bio/command.rb +149 -21
  13. data/lib/bio/db/aaindex.rb +11 -1
  14. data/lib/bio/db/embl/sptr.rb +1 -1
  15. data/lib/bio/db/fasta/defline.rb +7 -2
  16. data/lib/bio/db/fasta/qual.rb +24 -0
  17. data/lib/bio/db/fasta/qual_to_biosequence.rb +29 -0
  18. data/lib/bio/db/fastq.rb +15 -0
  19. data/lib/bio/db/go.rb +2 -2
  20. data/lib/bio/db/kegg/common.rb +109 -5
  21. data/lib/bio/db/kegg/genes.rb +61 -15
  22. data/lib/bio/db/kegg/genome.rb +43 -38
  23. data/lib/bio/db/kegg/module.rb +158 -0
  24. data/lib/bio/db/kegg/orthology.rb +40 -1
  25. data/lib/bio/db/kegg/pathway.rb +254 -0
  26. data/lib/bio/db/medline.rb +6 -2
  27. data/lib/bio/io/flatfile/autodetection.rb +6 -0
  28. data/lib/bio/location.rb +39 -0
  29. data/lib/bio/reference.rb +24 -0
  30. data/lib/bio/sequence.rb +2 -0
  31. data/lib/bio/sequence/adapter.rb +1 -0
  32. data/lib/bio/sequence/format.rb +14 -0
  33. data/lib/bio/sequence/sequence_masker.rb +95 -0
  34. data/lib/bio/tree.rb +4 -4
  35. data/lib/bio/util/restriction_enzyme/double_stranded/aligned_strands.rb +5 -0
  36. data/lib/bio/version.rb +1 -1
  37. data/setup.rb +5 -0
  38. data/test/data/KEGG/K02338.orthology +180 -52
  39. data/test/data/KEGG/M00118.module +44 -0
  40. data/test/data/KEGG/T00005.genome +140 -0
  41. data/test/data/KEGG/T00070.genome +34 -0
  42. data/test/data/KEGG/b0529.gene +47 -0
  43. data/test/data/KEGG/ec00072.pathway +23 -0
  44. data/test/data/KEGG/hsa00790.pathway +59 -0
  45. data/test/data/KEGG/ko00312.pathway +16 -0
  46. data/test/data/KEGG/map00030.pathway +37 -0
  47. data/test/data/KEGG/map00052.pathway +13 -0
  48. data/test/data/KEGG/rn00250.pathway +114 -0
  49. data/test/data/clustalw/example1.aln +58 -0
  50. data/test/data/go/selected_component.ontology +12 -0
  51. data/test/data/go/selected_gene_association.sgd +31 -0
  52. data/test/data/go/selected_wikipedia2go +13 -0
  53. data/test/data/medline/20146148_modified.medline +54 -0
  54. data/test/data/paml/codeml/models/aa.aln +26 -0
  55. data/test/data/paml/codeml/models/aa.dnd +13 -0
  56. data/test/data/paml/codeml/models/aa.ph +13 -0
  57. data/test/data/paml/codeml/models/alignment.phy +49 -0
  58. data/test/data/paml/codeml/models/results0-3.txt +312 -0
  59. data/test/data/paml/codeml/models/results7-8.txt +340 -0
  60. data/test/functional/bio/io/test_togows.rb +8 -8
  61. data/test/functional/bio/test_command.rb +7 -6
  62. data/test/unit/bio/appl/clustalw/test_report.rb +80 -0
  63. data/test/unit/bio/appl/paml/codeml/test_rates.rb +6 -6
  64. data/test/unit/bio/appl/paml/codeml/test_report.rb +231 -24
  65. data/test/unit/bio/appl/paml/codeml/test_report_single.rb +46 -0
  66. data/test/unit/bio/db/embl/test_sptr.rb +1 -1
  67. data/test/unit/bio/db/fasta/test_defline.rb +160 -0
  68. data/test/unit/bio/db/fasta/test_defline_misc.rb +490 -0
  69. data/test/unit/bio/db/kegg/test_genes.rb +281 -1
  70. data/test/unit/bio/db/kegg/test_genome.rb +408 -0
  71. data/test/unit/bio/db/kegg/test_module.rb +246 -0
  72. data/test/unit/bio/db/kegg/test_orthology.rb +95 -0
  73. data/test/unit/bio/db/kegg/test_pathway.rb +1250 -0
  74. data/test/unit/bio/db/test_aaindex.rb +8 -7
  75. data/test/unit/bio/db/test_fastq.rb +36 -0
  76. data/test/unit/bio/db/test_go.rb +171 -0
  77. data/test/unit/bio/db/test_medline.rb +148 -0
  78. data/test/unit/bio/db/test_qual.rb +9 -2
  79. data/test/unit/bio/sequence/test_sequence_masker.rb +169 -0
  80. data/test/unit/bio/test_tree.rb +260 -1
  81. data/test/unit/bio/util/test_contingency_table.rb +7 -7
  82. metadata +53 -6
@@ -4,7 +4,7 @@ License:: The Ruby License
 
  = Known issues and bugs in BioRuby
 
- Below are known issues and bugs in BioRuby. Pathes to fix them are welcomed.
+ Below are known issues and bugs in BioRuby. Patches to fix them are welcome.
  We hope they will be fixed in the future.
 
  Items marked with (WONT_FIX) tags would not be fixed within BioRuby because
@@ -69,6 +69,16 @@ This indicates that br_bioflat.rb and Bio::FlatFileIndex may create
  incorrect indexes on mswin32, mingw32, and bccwin32. In addition,
  Bio::FlatFile may return incorrect data.
 
+ ==== String escaping of command-line arguments
+
+ After BioRuby 1.4.1, in Ruby 1.9.X running on Windows, escaping of
+ command-line arguments are processed by the Ruby interpreter. Before BioRuby
+ 1.4.0, the escaping is executed in Bio::Command#escape_shell_windows, and
+ the behavior is different from the Ruby interpreter's one.
+
+ Curreltly, due to the change, test/functional/bio/test_command.rb may fail
+ on Windows with Ruby 1.9.X.
+
  ==== Windows 95/98/98SE/ME
 
  (WONT_FIX) Some methods that call external programs may not work in
@@ -9,7 +9,7 @@ License:: The Ruby License
 
  = BioRuby
 
- Copyright (C) 2001-2009 Toshiaki Katayama <k@bioruby.org>
+ Copyright (C) 2001-2010 Toshiaki Katayama <k@bioruby.org>
 
  BioRuby is an open source Ruby library for developing bioinformatics
  software. Object oriented scripting language Ruby has many features
@@ -42,6 +42,7 @@ services including KEGG API can be easily utilized by BioRuby.
  README.rdoc:: This file. General information and installation procedure.
  RELEASE_NOTES.rdoc:: News and important changes in this release.
  KNOWN_ISSUES.rdoc:: Known issues and bugs in BioRuby.
+ doc/RELEASE_NOTES-1.4.0.rdoc:: News and incompatible changes from 1.3.1 to 1.4.0.
  doc/Changes-1.3.rdoc:: News and incompatible changes from 1.2.1 to 1.3.0.
  doc/Changes-0.7.rd:: News and incompatible changes from 0.6.4 to 1.2.1.
 
@@ -116,7 +117,7 @@ and could be obtained by the following procedure.
  * Ruby 1.8.2 or later (except Ruby 1.9.0) -- http://www.ruby-lang.org/
  * Ruby 1.8.7-p174 or later, or Ruby 1.8.6-p383 or later is recommended.
  * Not yet fully ready with Ruby 1.9, although many components can now work
- in Ruby 1.9.1.
+ in Ruby 1.9.1 and Ruby 1.9.2.
 
  == OPTIONAL REQUIREMENTS
 
@@ -1,166 +1,104 @@
- = BioRuby 1.4.0 RELEASE NOTES
+ = BioRuby 1.4.1 RELEASE NOTES
 
- A lot of changes have been made to the BioRuby 1.4.0 after the version 1.3.1
+ A lot of changes have been made to the BioRuby 1.4.1 after the version 1.4.0
  is released. This document describes important and/or incompatible changes
- since the BioRuby 1.3.1 release.
+ since the BioRuby 1.4.0 release.
 
- == New features
-
- === PhyloXML support
-
- Support for reading and writing PhyloXML file format is added. New classes
- Bio::PhyloXML::Parser and Bio::PhyloXML::Writer are used to read and write
- a PhyloXML file, respectively.
-
- The code is developed by Diana Jaunzeikare, mentored by Christian M Zmasek
- and co-mentors, supported by Google Summer of Code 2009 in collaboration
- with the National Evolutionary Synthesis Center (NESCent).
-
- === FASTQ file format support
+ For known problems, see KNOWN_ISSUES.rdoc.
 
- Support for reading and writing FASTQ file format is added. All of the
- three FASTQ format variants are supported.
+ == New features
 
- To read a FASTQ file, Bio::FlatFile can be used. File format auto-detection
- of the FASTQ format is supported (although the three format variants should
- be specified later by users if quality scores are needed).
+ === PAML Codeml support is significantly improved
 
- New class Bio::Fastq is the parser class for the FASTQ format. An object
- of the Bio::Fastq class can be converted to a Bio::Sequence object with the
- "to_biosequnece" method. Bio::Sequence#output now supports output of the
- FASTQ format.
+ PAML Codeml result parser is completely rewritten and is significantly
+ improved. The code is developed by Pjotr Prins.
 
- The code is written by Naohisa Goto, with the help of discussions in the
- open-bio-l mailing list. The prototype of Bio::Fastq class was first
- developed during the BioHackathon 2009 held in Okinawa.
+ === KEGG PATHWAY and KEGG MODULE parser
 
- === DNA chromatogram support
+ Parsers for KEGG PATHWAY and KEGG MODULE data are added. The code is developed
+ by Kozo Nishida and Toshiaki Katayama.
 
- Support for reading DNA chromatogram files are added. SCF and and ABIF file
- formats are supported. The code is developed by Anthony Underwood.
+ === Bio::KEGG improvements
 
- === MEME (motif-based sequence analysis tools) support
+ Following new methods are added.
 
- Support for running MAST (Motif Aliginment & Search Tool, part of the MEME
- Suite, motif-based sequence analysis tools) and parsing its results are
- added. The code is developed by Adam Kraut.
+ * Bio::KEGG::GENES#keggclass, keggclasses, names_as_array, names,
+ motifs_as_strings, motifs_as_hash, motifs
+ * Bio::KEGG::GENOME#original_databases
 
- === Improvement of KEGG parser classes
+ === Test codes are added and improved.
 
- Some new methods are added to parse new fields added to some KEGG file
- formats. Unit tests for KEGG parsers are also added and improved. In
- addition, return value types of some methods are also changed for unifying
- APIs among KEGG parser classes. See incompatible changes below for details.
+ Test codes are added and improved. Tney are developed by Kazuhiro Hayashi,
+ Kozo Nishida, John Prince, and Naohisa Goto.
 
- === Many sample scripts are added
+ === Other new methods
 
- Many sample scripts showing demonstrations of usages of classes are added.
- They are moved from primitive test codes for the classes described in the
- "if __FILE__ == $0" convention in the library files.
+ * Bio::Fastq#mask
+ * Bio::Sequence#output_fasta
+ * Bio::ClustalW::Report#get_sequence
+ * Bio::Reference#==
+ * Bio::Location#==
+ * Bio::Locations#==
+ * Bio::FastaNumericFormat#to_biosequence
 
- === Unit tests can test installed BioRuby
+ == Bug fixes
 
- Mechanism to load library and to find test data in the unit tests are changed,
- and the library path and test data path can be specified with environment
- variables. BIORUBY_TEST_LIB is the path to be added to the Ruby's $LOAD_PATH.
- For example, to test BioRuby installed in
- /usr/local/lib/site_ruby/1.8, run
- env BIORUBY_TEST_LIB=/usr/local/lib/site_ruby/1.8 ruby test/runner.rb
+ === Bio::Tree
 
- BIORUBY_TEST_DATA is the path of the test data, and BIORUBY_TEST_DEBUG is a
- flag to turn on debug of the tests.
+ Following methods did not work correctly.
 
- == Deprecated features
+ * Bio::Tree#collect_edge!
+ * Bio::Tree#remove_edge_if
 
- === ChangeLog is replaced by git log
+ === Bio::KEGG::GENES and Bio::KEGG::GENOME
 
- ChangeLog is replaced by the output of git-log command, and ChangeLog before
- the 1.3.1 release is moved to doc/ChangeLog-before-1.3.1.
+ * Fixed bugs in Bio::KEGG::GENES#pathway.
+ * Fixed parser errors due to the format changes of KEGG GENES and KEGG GENOME.
 
- === "if __FILE__ == $0" convention
+ === Other bug fixes
 
- Primitive test codes in the "if __FILE__ == $0" convention are removed and
- the codes are moved to the sample scripts named sample/demo_*.rb (except
- some older or deprecated files).
+ * In Bio::Command, changed not to call fork(2) on platforms that do not
+ support it.
+ * Bio::MEDLINE#initialize should handle continuation of lines.
+ * Typo and a missing field in Bio::GO::GeneAssociation#to_str.
+ * Bug fix of Bio::FastaNumericFormat#to_biosequence.
+ * Fixed UniProt GN parsing issue in Bio::SPTR.
 
  == Incompatible changes
 
- === Bio::NCBI::REST
-
- NCBI announces that all Entrez E-utility requests must contain email and
- tool parameters, and requests without them will return error after June
- 2010.
-
- To set default email address and tool name, following methods are added.
- * Bio::NCBI.default_email=(email)
- * Bio::NCBI.default_tool=(tool_name)
-
- For every query, Bio::NCBI::REST checks the email and tool parameters and
- raises error if they are empty.
-
- IMPORTANT NOTE: No default email address is preset in BioRuby. Programmers
- using BioRuby must set their own email address or implement to get user's
- email address in some way (from input form, configuration file, etc).
-
- Default tool name is set as "#{$0} (bioruby/#{Bio::BIORUBY_VERSION_ID})".
- For example, if you run "ruby my_script.rb" with BioRuby 1.4.0, the value is
- "my_script.rb (bioruby/1.4.0)".
-
- === Bio::KEGG
-
- ==== dblinks method
-
- In Bio::KEGG::COMPOUND, DRUG, ENZYME, GLYCAN and ORTHOLOGY, the method
- dblinks is changed to return a Hash. Each key of the hash is a database name
- and its value is an array of entry IDs in the database. If old behavior
- (returns raw entry lines as an array of strings) is needed, use
- dblinks_as_strings.
-
- ==== pathways method
-
- In Bio::KEGG::COMPOUND, DRUG, ENZYME, GENES, GLYCAN and REACTION, the
- method pathways is changed to return a Hash. Each key of the hash is a
- pathway ID and its value is the description of the pathway.
-
- In Bio::KEGG::GENES, if old behavior (returns pathway IDs as an Array) is
- needed, use pathways.keys.
-
- In Bio::KEGG::COMPOUND, DRUG, ENZYME, GLYCAN, and REACTION, if old behavior
- (returns raw entry lines as an array of strings) is needed, use
- pathways_as_strings.
-
- Note that Bio::KEGG::ORTHOLOGY#pathways is not changed (returns an array
- containing pathway IDs).
+ === Bio::PAML::Codeml::Report
 
- ==== orthologs method
+ The code is completely rewritten. See the RDoc for details.
 
- In Bio::KEGG::ENZYME, GENES, GLYCAN and REACTION, the method orthologs is
- changed to return a Hash. Each key of the hash is a ortholog ID and its
- value is the name of the ortholog. If old behavior (returns raw entry lines
- as an array of strings) is needed, use orthologs_as_strings.
+ === Bio::KEGG::ORTHOLOGY
 
- ==== genes method
+ Bio::KEGG::ORTHOLOGY#pathways is changed to return a hash. The old pathway
+ method is renamed to pathways_in_keggclass for compatibility.
 
- In Bio::KEGG::ENZYME#genes and Bio::KEGG::ORTHOLOGY#genes is changed to
- return a Hash that is the same as Bio::KEGG::ORTHOLOGY#genes_as_hash.
- If old behavior (returns raw entry lines as an array of strings) is needed,
- use genes_as_strings.
+ === Bio::AAindex2
 
- ==== Bio::KEGG:REACTION#rpairs
+ Bio::AAindex2 now copies each symmetric element for lower triangular matrix
+ to the upper right part, because the Matrix class in Ruby 1.9.2 no longer
+ accepts any dimension mismatches. We think the previous behavior is a bug.
 
- Bio::KEGG::REACTION#rpairs is changed to return a Hash. Each key of the
- hash is a KEGG Rpair ID and its value is an array containing name and type.
- If old behavior (returns as tokens) is needed, use rpairs_as_tokens.
+ === Bio::MEDLINE
 
- ==== Bio::KEGG::ORTHOLOGY
+ Bio::MEDLINE#reference no longer puts empty values in the returned
+ Bio::Reference object. We think the previous behavior is a bug.
+ We also think the effect is very small.
 
- Bio::KEGG:ORTHOLOGY#dblinks_as_hash does not lower-case database names.
+ == Known issues
 
- === Bio::RestrictionEnzyme
+ The following issues are added or updated. See KNOWN_ISSUES.rdoc for other
+ already known issues.
 
- Format validation when creating an object is turned off because of efficiency.
+ === String escaping of command-line arguments in Ruby 1.9.X on Windows
 
- == Known problems
+ After BioRuby 1.4.1, in Ruby 1.9.X running on Windows, escaping of
+ command-line arguments are processed by the Ruby interpreter. Before BioRuby
+ 1.4.0, the escaping is executed in Bio::Command#escape_shell_windows, and
+ the behavior is different from the Ruby interpreter's one.
 
- See KNOWN_ISSUES.rdoc for details.
+ Curreltly, due to the change, test/functional/bio/test_command.rb may fail
+ on Windows with Ruby 1.9.X.
 
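Note on the Bio::AAindex2 entry in the RELEASE_NOTES.rdoc hunk above: the idea of copying each element of a lower triangular matrix to the upper right part can be sketched with Ruby's standard Matrix class. This is only an illustration under assumptions. The helper name symmetrize_lower_triangular and the sample rows are hypothetical, and the code is not BioRuby's actual implementation.

```ruby
require 'matrix'

# Hypothetical helper (not part of BioRuby): expands lower triangular rows
# such as [[a], [b, c], [d, e, f]] into a full symmetric Matrix.
def symmetrize_lower_triangular(rows)
  n = rows.size
  full = Array.new(n) do |i|
    Array.new(n) do |j|
      # Use the stored element when j <= i, otherwise its mirror image.
      j <= i ? rows[i][j] : rows[j][i]
    end
  end
  Matrix.rows(full)
end

lower = [[1], [2, 3], [4, 5, 6]]
matrix = symmetrize_lower_triangular(lower)
puts matrix.to_a.inspect
```

Building every row to the same length before calling Matrix.rows avoids handing the Matrix class rows of unequal size, which is the kind of dimension mismatch the release note refers to.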
@@ -3,7 +3,7 @@
  #
  Gem::Specification.new do |s|
  s.name = 'bio'
- s.version = "1.4.0"
+ s.version = "1.4.1"
 
  s.author = "BioRuby project"
  s.email = "staff@bioruby.org"
@@ -37,6 +37,7 @@ Gem::Specification.new do |s|
  "doc/Changes-1.3.rdoc",
  "doc/KEGG_API.rd",
  "doc/KEGG_API.rd.ja",
+ "doc/RELEASE_NOTES-1.4.0.rdoc",
  "doc/Tutorial.rd",
  "doc/Tutorial.rd.html",
  "doc/Tutorial.rd.ja",
@@ -124,6 +125,7 @@ Gem::Specification.new do |s|
  "lib/bio/db/fasta/format_fasta.rb",
  "lib/bio/db/fasta/format_qual.rb",
  "lib/bio/db/fasta/qual.rb",
+ "lib/bio/db/fasta/qual_to_biosequence.rb",
  "lib/bio/db/fastq.rb",
  "lib/bio/db/fastq/fastq_to_biosequence.rb",
  "lib/bio/db/fastq/format_fastq.rb",
@@ -147,7 +149,9 @@ Gem::Specification.new do |s|
  "lib/bio/db/kegg/glycan.rb",
  "lib/bio/db/kegg/keggtab.rb",
  "lib/bio/db/kegg/kgml.rb",
+ "lib/bio/db/kegg/module.rb",
  "lib/bio/db/kegg/orthology.rb",
+ "lib/bio/db/kegg/pathway.rb",
  "lib/bio/db/kegg/reaction.rb",
  "lib/bio/db/kegg/taxonomy.rb",
  "lib/bio/db/lasergene.rb",
@@ -219,6 +223,7 @@ Gem::Specification.new do |s|
  "lib/bio/sequence/generic.rb",
  "lib/bio/sequence/na.rb",
  "lib/bio/sequence/quality_score.rb",
+ "lib/bio/sequence/sequence_masker.rb",
  "lib/bio/shell.rb",
  "lib/bio/shell/core.rb",
  "lib/bio/shell/demo.rb",
@@ -370,7 +375,17 @@ Gem::Specification.new do |s|
  "test/data/KEGG/G00024.glycan",
  "test/data/KEGG/G01366.glycan",
  "test/data/KEGG/K02338.orthology",
+ "test/data/KEGG/M00118.module",
  "test/data/KEGG/R00006.reaction",
+ "test/data/KEGG/T00005.genome",
+ "test/data/KEGG/T00070.genome",
+ "test/data/KEGG/b0529.gene",
+ "test/data/KEGG/ec00072.pathway",
+ "test/data/KEGG/hsa00790.pathway",
+ "test/data/KEGG/ko00312.pathway",
+ "test/data/KEGG/map00030.pathway",
+ "test/data/KEGG/map00052.pathway",
+ "test/data/KEGG/rn00250.pathway",
  "test/data/SOSUI/sample.report",
  "test/data/TMHMM/sample.report",
  "test/data/aaindex/DAYM780301",
@@ -383,6 +398,7 @@ Gem::Specification.new do |s|
  "test/data/blast/b0002.faa.m7",
  "test/data/blast/b0002.faa.m8",
  "test/data/blast/blastp-multi.m7",
+ "test/data/clustalw/example1.aln",
  "test/data/command/echoarg2.bat",
  "test/data/embl/AB090716.embl",
  "test/data/embl/AB090716.embl.rel89",
@@ -441,13 +457,23 @@ Gem::Specification.new do |s|
  "test/data/fastq/wrapping_original_sanger.fastq",
  "test/data/gcg/pileup-aa.msf",
  "test/data/genscan/sample.report",
+ "test/data/go/selected_component.ontology",
+ "test/data/go/selected_gene_association.sgd",
+ "test/data/go/selected_wikipedia2go",
  "test/data/iprscan/merged.raw",
  "test/data/iprscan/merged.txt",
+ "test/data/medline/20146148_modified.medline",
  "test/data/meme/db",
  "test/data/meme/mast",
  "test/data/meme/mast.out",
  "test/data/meme/meme.out",
  "test/data/paml/codeml/control_file.txt",
+ "test/data/paml/codeml/models/aa.aln",
+ "test/data/paml/codeml/models/aa.dnd",
+ "test/data/paml/codeml/models/aa.ph",
+ "test/data/paml/codeml/models/alignment.phy",
+ "test/data/paml/codeml/models/results0-3.txt",
+ "test/data/paml/codeml/models/results7-8.txt",
  "test/data/paml/codeml/output.txt",
  "test/data/paml/codeml/rates",
  "test/data/phyloxml/apaf.xml",
@@ -479,6 +505,7 @@ Gem::Specification.new do |s|
  "test/unit/bio/appl/blast/test_ncbioptions.rb",
  "test/unit/bio/appl/blast/test_report.rb",
  "test/unit/bio/appl/blast/test_rpsblast.rb",
+ "test/unit/bio/appl/clustalw/test_report.rb",
  "test/unit/bio/appl/gcg/test_msf.rb",
  "test/unit/bio/appl/genscan/test_report.rb",
  "test/unit/bio/appl/hmmer/test_report.rb",
@@ -489,6 +516,7 @@ Gem::Specification.new do |s|
  "test/unit/bio/appl/meme/test_motif.rb",
  "test/unit/bio/appl/paml/codeml/test_rates.rb",
  "test/unit/bio/appl/paml/codeml/test_report.rb",
+ "test/unit/bio/appl/paml/codeml/test_report_single.rb",
  "test/unit/bio/appl/paml/test_codeml.rb",
  "test/unit/bio/appl/sim4/test_report.rb",
  "test/unit/bio/appl/sosui/test_report.rb",
@@ -508,13 +536,18 @@ Gem::Specification.new do |s|
  "test/unit/bio/db/embl/test_embl_to_bioseq.rb",
  "test/unit/bio/db/embl/test_sptr.rb",
  "test/unit/bio/db/embl/test_uniprot.rb",
+ "test/unit/bio/db/fasta/test_defline.rb",
+ "test/unit/bio/db/fasta/test_defline_misc.rb",
  "test/unit/bio/db/fasta/test_format_qual.rb",
  "test/unit/bio/db/kegg/test_compound.rb",
  "test/unit/bio/db/kegg/test_drug.rb",
  "test/unit/bio/db/kegg/test_enzyme.rb",
  "test/unit/bio/db/kegg/test_genes.rb",
+ "test/unit/bio/db/kegg/test_genome.rb",
  "test/unit/bio/db/kegg/test_glycan.rb",
+ "test/unit/bio/db/kegg/test_module.rb",
  "test/unit/bio/db/kegg/test_orthology.rb",
+ "test/unit/bio/db/kegg/test_pathway.rb",
  "test/unit/bio/db/kegg/test_reaction.rb",
  "test/unit/bio/db/pdb/test_pdb.rb",
  "test/unit/bio/db/sanger_chromatogram/test_abif.rb",
@@ -523,6 +556,7 @@ Gem::Specification.new do |s|
  "test/unit/bio/db/test_fasta.rb",
  "test/unit/bio/db/test_fastq.rb",
  "test/unit/bio/db/test_gff.rb",
+ "test/unit/bio/db/test_go.rb",
  "test/unit/bio/db/test_lasergene.rb",
  "test/unit/bio/db/test_medline.rb",
  "test/unit/bio/db/test_newick.rb",
@@ -548,6 +582,7 @@ Gem::Specification.new do |s|
  "test/unit/bio/sequence/test_dblink.rb",
  "test/unit/bio/sequence/test_na.rb",
  "test/unit/bio/sequence/test_quality_score.rb",
+ "test/unit/bio/sequence/test_sequence_masker.rb",
  "test/unit/bio/shell/plugin/test_seq.rb",
  "test/unit/bio/test_alignment.rb",
  "test/unit/bio/test_command.rb",
@@ -588,7 +623,8 @@ Gem::Specification.new do |s|
  "README.rdoc",
  "README_DEV.rdoc",
  "RELEASE_NOTES.rdoc",
- "doc/Changes-1.3.rdoc"
+ "doc/Changes-1.3.rdoc",
+ "doc/RELEASE_NOTES-1.4.0.rdoc"
  ]
  s.rdoc_options << '--main' << 'README.rdoc'
  s.rdoc_options << '--title' << 'BioRuby API documentation'
@@ -0,0 +1,167 @@
+ = BioRuby 1.4.0 RELEASE NOTES
+
+ A lot of changes have been made to the BioRuby 1.4.0 after the version 1.3.1
+ is released. This document describes important and/or incompatible changes
+ since the BioRuby 1.3.1 release.
+
+ == New features
+
+ === PhyloXML support
+
+ Support for reading and writing PhyloXML file format is added. New classes
+ Bio::PhyloXML::Parser and Bio::PhyloXML::Writer are used to read and write
+ a PhyloXML file, respectively.
+
+ The code is developed by Diana Jaunzeikare, mentored by Christian M Zmasek
+ and co-mentors, supported by Google Summer of Code 2009 in collaboration
+ with the National Evolutionary Synthesis Center (NESCent).
+
+ === FASTQ file format support
+
+ Support for reading and writing FASTQ file format is added. All of the
+ three FASTQ format variants are supported.
+
+ To read a FASTQ file, Bio::FlatFile can be used. File format auto-detection
+ of the FASTQ format is supported (although the three format variants should
+ be specified later by users if quality scores are needed).
+
+ New class Bio::Fastq is the parser class for the FASTQ format. An object
+ of the Bio::Fastq class can be converted to a Bio::Sequence object with the
+ "to_biosequnece" method. Bio::Sequence#output now supports output of the
+ FASTQ format.
+
+ The code is written by Naohisa Goto, with the help of discussions in the
+ open-bio-l mailing list. The prototype of Bio::Fastq class was first
+ developed during the BioHackathon 2009 held in Okinawa.
+
+ === DNA chromatogram support
+
+ Support for reading DNA chromatogram files are added. SCF and and ABIF file
+ formats are supported. The code is developed by Anthony Underwood.
+
+ === MEME (motif-based sequence analysis tools) support
+
+ Support for running MAST (Motif Aliginment & Search Tool, part of the MEME
+ Suite, motif-based sequence analysis tools) and parsing its results are
+ added. The code is developed by Adam Kraut.
+
+ === Improvement of KEGG parser classes
+
+ Some new methods are added to parse new fields added to some KEGG file
+ formats. Unit tests for KEGG parsers are also added and improved. In
+ addition, return value types of some methods are also changed for unifying
+ APIs among KEGG parser classes. See incompatible changes below for details.
+ Most of them are contributed by Kozo Nishida.
+
+ === Many sample scripts are added
+
+ Many sample scripts showing demonstrations of usages of classes are added.
+ They are moved from primitive test codes for the classes described in the
+ "if __FILE__ == $0" convention in the library files.
+
+ === Unit tests can test installed BioRuby
+
+ Mechanism to load library and to find test data in the unit tests are changed,
+ and the library path and test data path can be specified with environment
+ variables. BIORUBY_TEST_LIB is the path to be added to the Ruby's $LOAD_PATH.
+ For example, to test BioRuby installed in
+ /usr/local/lib/site_ruby/1.8, run
+ env BIORUBY_TEST_LIB=/usr/local/lib/site_ruby/1.8 ruby test/runner.rb
+
+ BIORUBY_TEST_DATA is the path of the test data, and BIORUBY_TEST_DEBUG is a
+ flag to turn on debug of the tests.
+
+ == Deprecated features
+
+ === ChangeLog is replaced by git log
+
+ ChangeLog is replaced by the output of git-log command, and ChangeLog before
+ the 1.3.1 release is moved to doc/ChangeLog-before-1.3.1.
+
+ === "if __FILE__ == $0" convention
+
+ Primitive test codes in the "if __FILE__ == $0" convention are removed and
+ the codes are moved to the sample scripts named sample/demo_*.rb (except
+ some older or deprecated files).
+
+ == Incompatible changes
+
+ === Bio::NCBI::REST
+
+ NCBI announces that all Entrez E-utility requests must contain email and
+ tool parameters, and requests without them will return error after June
+ 2010.
+
+ To set default email address and tool name, following methods are added.
+ * Bio::NCBI.default_email=(email)
+ * Bio::NCBI.default_tool=(tool_name)
+
+ For every query, Bio::NCBI::REST checks the email and tool parameters and
+ raises error if they are empty.
+
+ IMPORTANT NOTE: No default email address is preset in BioRuby. Programmers
+ using BioRuby must set their own email address or implement to get user's
+ email address in some way (from input form, configuration file, etc).
+
+ Default tool name is set as "#{$0} (bioruby/#{Bio::BIORUBY_VERSION_ID})".
+ For example, if you run "ruby my_script.rb" with BioRuby 1.4.0, the value is
+ "my_script.rb (bioruby/1.4.0)".
+
+ === Bio::KEGG
+
+ ==== dblinks method
+
+ In Bio::KEGG::COMPOUND, DRUG, ENZYME, GLYCAN and ORTHOLOGY, the method
+ dblinks is changed to return a Hash. Each key of the hash is a database name
+ and its value is an array of entry IDs in the database. If old behavior
+ (returns raw entry lines as an array of strings) is needed, use
+ dblinks_as_strings.
+
+ ==== pathways method
+
+ In Bio::KEGG::COMPOUND, DRUG, ENZYME, GENES, GLYCAN and REACTION, the
+ method pathways is changed to return a Hash. Each key of the hash is a
+ pathway ID and its value is the description of the pathway.
+
+ In Bio::KEGG::GENES, if old behavior (returns pathway IDs as an Array) is
+ needed, use pathways.keys.
+
+ In Bio::KEGG::COMPOUND, DRUG, ENZYME, GLYCAN, and REACTION, if old behavior
+ (returns raw entry lines as an array of strings) is needed, use
+ pathways_as_strings.
+
+ Note that Bio::KEGG::ORTHOLOGY#pathways is not changed (returns an array
+ containing pathway IDs).
+
+ ==== orthologs method
+
+ In Bio::KEGG::ENZYME, GENES, GLYCAN and REACTION, the method orthologs is
+ changed to return a Hash. Each key of the hash is a ortholog ID and its
+ value is the name of the ortholog. If old behavior (returns raw entry lines
+ as an array of strings) is needed, use orthologs_as_strings.
+
+ ==== genes method
+
+ In Bio::KEGG::ENZYME#genes and Bio::KEGG::ORTHOLOGY#genes is changed to
+ return a Hash that is the same as Bio::KEGG::ORTHOLOGY#genes_as_hash.
+ If old behavior (returns raw entry lines as an array of strings) is needed,
+ use genes_as_strings.
+
+ ==== Bio::KEGG:REACTION#rpairs
+
+ Bio::KEGG::REACTION#rpairs is changed to return a Hash. Each key of the
+ hash is a KEGG Rpair ID and its value is an array containing name and type.
+ If old behavior (returns as tokens) is needed, use rpairs_as_tokens.
+
+ ==== Bio::KEGG::ORTHOLOGY
+
+ Bio::KEGG:ORTHOLOGY#dblinks_as_hash does not lower-case database names.
+
+ === Bio::RestrictionEnzyme
+
+ Format validation when creating an object is turned off because of efficiency.
+
+ == Known problems
+
+ See KNOWN_ISSUES.rdoc for details.
+
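Note on the Bio::NCBI::REST entry in doc/RELEASE_NOTES-1.4.0.rdoc above: the described behavior (a default tool name derived from $0 and the version, no preset email, and an error when either parameter is empty) can be sketched without loading BioRuby. Everything in the block is a hypothetical stand-in. The module NcbiDefaultsSketch, its VERSION_ID constant, and the check_params! method are invented for illustration and do not reflect BioRuby's internal code.

```ruby
# Hypothetical stand-in for the behavior described in the release notes.
# It does not use or reproduce BioRuby's own classes.
module NcbiDefaultsSketch
  VERSION_ID = "1.4.0" # stands in for Bio::BIORUBY_VERSION_ID

  class << self
    # No default email is preset, so this starts out as nil.
    attr_accessor :default_email
    attr_writer :default_tool

    # Falls back to "<script name> (bioruby/<version>)" when no tool is set.
    def default_tool
      @default_tool || "#{$0} (bioruby/#{VERSION_ID})"
    end

    # Returns the query parameters, or raises if email or tool is empty.
    def check_params!
      email = default_email.to_s.strip
      tool = default_tool.to_s.strip
      if email.empty? || tool.empty?
        raise ArgumentError, "email and tool parameters must not be empty"
      end
      { "email" => email, "tool" => tool }
    end
  end
end

# The address below is a placeholder. A real script would supply its own.
NcbiDefaultsSketch.default_email = "user@example.org"
puts NcbiDefaultsSketch.check_params!.inspect
```

With the real library, the release notes name Bio::NCBI.default_email= and Bio::NCBI.default_tool= as the setters to call before issuing queries.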