bio 1.4.0 → 1.4.1

Files changed (82)
  1. data/ChangeLog +1712 -0
  2. data/KNOWN_ISSUES.rdoc +11 -1
  3. data/README.rdoc +3 -2
  4. data/RELEASE_NOTES.rdoc +65 -127
  5. data/bioruby.gemspec +38 -2
  6. data/doc/RELEASE_NOTES-1.4.0.rdoc +167 -0
  7. data/doc/Tutorial.rd +74 -16
  8. data/doc/Tutorial.rd.html +68 -16
  9. data/lib/bio.rb +2 -0
  10. data/lib/bio/appl/clustalw/report.rb +18 -0
  11. data/lib/bio/appl/paml/codeml/report.rb +579 -21
  12. data/lib/bio/command.rb +149 -21
  13. data/lib/bio/db/aaindex.rb +11 -1
  14. data/lib/bio/db/embl/sptr.rb +1 -1
  15. data/lib/bio/db/fasta/defline.rb +7 -2
  16. data/lib/bio/db/fasta/qual.rb +24 -0
  17. data/lib/bio/db/fasta/qual_to_biosequence.rb +29 -0
  18. data/lib/bio/db/fastq.rb +15 -0
  19. data/lib/bio/db/go.rb +2 -2
  20. data/lib/bio/db/kegg/common.rb +109 -5
  21. data/lib/bio/db/kegg/genes.rb +61 -15
  22. data/lib/bio/db/kegg/genome.rb +43 -38
  23. data/lib/bio/db/kegg/module.rb +158 -0
  24. data/lib/bio/db/kegg/orthology.rb +40 -1
  25. data/lib/bio/db/kegg/pathway.rb +254 -0
  26. data/lib/bio/db/medline.rb +6 -2
  27. data/lib/bio/io/flatfile/autodetection.rb +6 -0
  28. data/lib/bio/location.rb +39 -0
  29. data/lib/bio/reference.rb +24 -0
  30. data/lib/bio/sequence.rb +2 -0
  31. data/lib/bio/sequence/adapter.rb +1 -0
  32. data/lib/bio/sequence/format.rb +14 -0
  33. data/lib/bio/sequence/sequence_masker.rb +95 -0
  34. data/lib/bio/tree.rb +4 -4
  35. data/lib/bio/util/restriction_enzyme/double_stranded/aligned_strands.rb +5 -0
  36. data/lib/bio/version.rb +1 -1
  37. data/setup.rb +5 -0
  38. data/test/data/KEGG/K02338.orthology +180 -52
  39. data/test/data/KEGG/M00118.module +44 -0
  40. data/test/data/KEGG/T00005.genome +140 -0
  41. data/test/data/KEGG/T00070.genome +34 -0
  42. data/test/data/KEGG/b0529.gene +47 -0
  43. data/test/data/KEGG/ec00072.pathway +23 -0
  44. data/test/data/KEGG/hsa00790.pathway +59 -0
  45. data/test/data/KEGG/ko00312.pathway +16 -0
  46. data/test/data/KEGG/map00030.pathway +37 -0
  47. data/test/data/KEGG/map00052.pathway +13 -0
  48. data/test/data/KEGG/rn00250.pathway +114 -0
  49. data/test/data/clustalw/example1.aln +58 -0
  50. data/test/data/go/selected_component.ontology +12 -0
  51. data/test/data/go/selected_gene_association.sgd +31 -0
  52. data/test/data/go/selected_wikipedia2go +13 -0
  53. data/test/data/medline/20146148_modified.medline +54 -0
  54. data/test/data/paml/codeml/models/aa.aln +26 -0
  55. data/test/data/paml/codeml/models/aa.dnd +13 -0
  56. data/test/data/paml/codeml/models/aa.ph +13 -0
  57. data/test/data/paml/codeml/models/alignment.phy +49 -0
  58. data/test/data/paml/codeml/models/results0-3.txt +312 -0
  59. data/test/data/paml/codeml/models/results7-8.txt +340 -0
  60. data/test/functional/bio/io/test_togows.rb +8 -8
  61. data/test/functional/bio/test_command.rb +7 -6
  62. data/test/unit/bio/appl/clustalw/test_report.rb +80 -0
  63. data/test/unit/bio/appl/paml/codeml/test_rates.rb +6 -6
  64. data/test/unit/bio/appl/paml/codeml/test_report.rb +231 -24
  65. data/test/unit/bio/appl/paml/codeml/test_report_single.rb +46 -0
  66. data/test/unit/bio/db/embl/test_sptr.rb +1 -1
  67. data/test/unit/bio/db/fasta/test_defline.rb +160 -0
  68. data/test/unit/bio/db/fasta/test_defline_misc.rb +490 -0
  69. data/test/unit/bio/db/kegg/test_genes.rb +281 -1
  70. data/test/unit/bio/db/kegg/test_genome.rb +408 -0
  71. data/test/unit/bio/db/kegg/test_module.rb +246 -0
  72. data/test/unit/bio/db/kegg/test_orthology.rb +95 -0
  73. data/test/unit/bio/db/kegg/test_pathway.rb +1250 -0
  74. data/test/unit/bio/db/test_aaindex.rb +8 -7
  75. data/test/unit/bio/db/test_fastq.rb +36 -0
  76. data/test/unit/bio/db/test_go.rb +171 -0
  77. data/test/unit/bio/db/test_medline.rb +148 -0
  78. data/test/unit/bio/db/test_qual.rb +9 -2
  79. data/test/unit/bio/sequence/test_sequence_masker.rb +169 -0
  80. data/test/unit/bio/test_tree.rb +260 -1
  81. data/test/unit/bio/util/test_contingency_table.rb +7 -7
  82. metadata +53 -6
@@ -4,7 +4,7 @@ License:: The Ruby License
4
4
 
5
5
  = Known issues and bugs in BioRuby
6
6
 
7
- Below are known issues and bugs in BioRuby. Pathes to fix them are welcomed.
7
+ Below are known issues and bugs in BioRuby. Patches to fix them are welcome.
8
8
  We hope they will be fixed in the future.
9
9
 
10
10
  Items marked with (WONT_FIX) tags would not be fixed within BioRuby because
@@ -69,6 +69,16 @@ This indicates that br_bioflat.rb and Bio::FlatFileIndex may create
69
69
  incorrect indexes on mswin32, mingw32, and bccwin32. In addition,
70
70
  Bio::FlatFile may return incorrect data.
71
71
 
72
+ ==== String escaping of command-line arguments
73
+
74
+ After BioRuby 1.4.1, in Ruby 1.9.X running on Windows, escaping of
75
+ command-line arguments are processed by the Ruby interpreter. Before BioRuby
76
+ 1.4.0, the escaping is executed in Bio::Command#escape_shell_windows, and
77
+ the behavior is different from the Ruby interpreter's one.
78
+
79
+ Currently, due to the change, test/functional/bio/test_command.rb may fail
80
+ on Windows with Ruby 1.9.X.
81
+
72
82
  ==== Windows 95/98/98SE/ME
73
83
 
74
84
  (WONT_FIX) Some methods that call external programs may not work in
@@ -9,7 +9,7 @@ License:: The Ruby License
9
9
 
10
10
  = BioRuby
11
11
 
12
- Copyright (C) 2001-2009 Toshiaki Katayama <k@bioruby.org>
12
+ Copyright (C) 2001-2010 Toshiaki Katayama <k@bioruby.org>
13
13
 
14
14
  BioRuby is an open source Ruby library for developing bioinformatics
15
15
  software. Object oriented scripting language Ruby has many features
@@ -42,6 +42,7 @@ services including KEGG API can be easily utilized by BioRuby.
42
42
  README.rdoc:: This file. General information and installation procedure.
43
43
  RELEASE_NOTES.rdoc:: News and important changes in this release.
44
44
  KNOWN_ISSUES.rdoc:: Known issues and bugs in BioRuby.
45
+ doc/RELEASE_NOTES-1.4.0.rdoc:: News and incompatible changes from 1.3.1 to 1.4.0.
45
46
  doc/Changes-1.3.rdoc:: News and incompatible changes from 1.2.1 to 1.3.0.
46
47
  doc/Changes-0.7.rd:: News and incompatible changes from 0.6.4 to 1.2.1.
47
48
 
@@ -116,7 +117,7 @@ and could be obtained by the following procedure.
116
117
  * Ruby 1.8.2 or later (except Ruby 1.9.0) -- http://www.ruby-lang.org/
117
118
  * Ruby 1.8.7-p174 or later, or Ruby 1.8.6-p383 or later is recommended.
118
119
  * Not yet fully ready with Ruby 1.9, although many components can now work
119
- in Ruby 1.9.1.
120
+ in Ruby 1.9.1 and Ruby 1.9.2.
120
121
 
121
122
  == OPTIONAL REQUIREMENTS
122
123
 
@@ -1,166 +1,104 @@
1
- = BioRuby 1.4.0 RELEASE NOTES
1
+ = BioRuby 1.4.1 RELEASE NOTES
2
2
 
3
- A lot of changes have been made to the BioRuby 1.4.0 after the version 1.3.1
3
+ A lot of changes have been made to the BioRuby 1.4.1 after the version 1.4.0
4
4
  is released. This document describes important and/or incompatible changes
5
- since the BioRuby 1.3.1 release.
5
+ since the BioRuby 1.4.0 release.
6
6
 
7
- == New features
8
-
9
- === PhyloXML support
10
-
11
- Support for reading and writing PhyloXML file format is added. New classes
12
- Bio::PhyloXML::Parser and Bio::PhyloXML::Writer are used to read and write
13
- a PhyloXML file, respectively.
14
-
15
- The code is developed by Diana Jaunzeikare, mentored by Christian M Zmasek
16
- and co-mentors, supported by Google Summer of Code 2009 in collaboration
17
- with the National Evolutionary Synthesis Center (NESCent).
18
-
19
- === FASTQ file format support
7
+ For known problems, see KNOWN_ISSUES.rdoc.
20
8
 
21
- Support for reading and writing FASTQ file format is added. All of the
22
- three FASTQ format variants are supported.
9
+ == New features
23
10
 
24
- To read a FASTQ file, Bio::FlatFile can be used. File format auto-detection
25
- of the FASTQ format is supported (although the three format variants should
26
- be specified later by users if quality scores are needed).
11
+ === PAML Codeml support is significantly improved
27
12
 
28
- New class Bio::Fastq is the parser class for the FASTQ format. An object
29
- of the Bio::Fastq class can be converted to a Bio::Sequence object with the
30
- "to_biosequnece" method. Bio::Sequence#output now supports output of the
31
- FASTQ format.
13
+ PAML Codeml result parser is completely rewritten and is significantly
14
+ improved. The code is developed by Pjotr Prins.
32
15
 
33
- The code is written by Naohisa Goto, with the help of discussions in the
34
- open-bio-l mailing list. The prototype of Bio::Fastq class was first
35
- developed during the BioHackathon 2009 held in Okinawa.
16
+ === KEGG PATHWAY and KEGG MODULE parser
36
17
 
37
- === DNA chromatogram support
18
+ Parsers for KEGG PATHWAY and KEGG MODULE data are added. The code is developed
19
+ by Kozo Nishida and Toshiaki Katayama.
38
20
 
39
- Support for reading DNA chromatogram files are added. SCF and and ABIF file
40
- formats are supported. The code is developed by Anthony Underwood.
21
+ === Bio::KEGG improvements
41
22
 
42
- === MEME (motif-based sequence analysis tools) support
23
+ Following new methods are added.
43
24
 
44
- Support for running MAST (Motif Aliginment & Search Tool, part of the MEME
45
- Suite, motif-based sequence analysis tools) and parsing its results are
46
- added. The code is developed by Adam Kraut.
25
+ * Bio::KEGG::GENES#keggclass, keggclasses, names_as_array, names,
26
+ motifs_as_strings, motifs_as_hash, motifs
27
+ * Bio::KEGG::GENOME#original_databases
47
28
 
48
- === Improvement of KEGG parser classes
29
+ === Test codes are added and improved.
49
30
 
50
- Some new methods are added to parse new fields added to some KEGG file
51
- formats. Unit tests for KEGG parsers are also added and improved. In
52
- addition, return value types of some methods are also changed for unifying
53
- APIs among KEGG parser classes. See incompatible changes below for details.
31
+ Test codes are added and improved. They are developed by Kazuhiro Hayashi,
32
+ Kozo Nishida, John Prince, and Naohisa Goto.
54
33
 
55
- === Many sample scripts are added
34
+ === Other new methods
56
35
 
57
- Many sample scripts showing demonstrations of usages of classes are added.
58
- They are moved from primitive test codes for the classes described in the
59
- "if __FILE__ == $0" convention in the library files.
36
+ * Bio::Fastq#mask
37
+ * Bio::Sequence#output_fasta
38
+ * Bio::ClustalW::Report#get_sequence
39
+ * Bio::Reference#==
40
+ * Bio::Location#==
41
+ * Bio::Locations#==
42
+ * Bio::FastaNumericFormat#to_biosequence
60
43
 
61
- === Unit tests can test installed BioRuby
44
+ == Bug fixes
62
45
 
63
- Mechanism to load library and to find test data in the unit tests are changed,
64
- and the library path and test data path can be specified with environment
65
- variables. BIORUBY_TEST_LIB is the path to be added to the Ruby's $LOAD_PATH.
66
- For example, to test BioRuby installed in
67
- /usr/local/lib/site_ruby/1.8, run
68
- env BIORUBY_TEST_LIB=/usr/local/lib/site_ruby/1.8 ruby test/runner.rb
46
+ === Bio::Tree
69
47
 
70
- BIORUBY_TEST_DATA is the path of the test data, and BIORUBY_TEST_DEBUG is a
71
- flag to turn on debug of the tests.
48
+ Following methods did not work correctly.
72
49
 
73
- == Deprecated features
50
+ * Bio::Tree#collect_edge!
51
+ * Bio::Tree#remove_edge_if
74
52
 
75
- === ChangeLog is replaced by git log
53
+ === Bio::KEGG::GENES and Bio::KEGG::GENOME
76
54
 
77
- ChangeLog is replaced by the output of git-log command, and ChangeLog before
78
- the 1.3.1 release is moved to doc/ChangeLog-before-1.3.1.
55
+ * Fixed bugs in Bio::KEGG::GENES#pathway.
56
+ * Fixed parser errors due to the format changes of KEGG GENES and KEGG GENOME.
79
57
 
80
- === "if __FILE__ == $0" convention
58
+ === Other bug fixes
81
59
 
82
- Primitive test codes in the "if __FILE__ == $0" convention are removed and
83
- the codes are moved to the sample scripts named sample/demo_*.rb (except
84
- some older or deprecated files).
60
+ * In Bio::Command, changed not to call fork(2) on platforms that do not
61
+ support it.
62
+ * Bio::MEDLINE#initialize should handle continuation of lines.
63
+ * Typo and a missing field in Bio::GO::GeneAssociation#to_str.
64
+ * Bug fix of Bio::FastaNumericFormat#to_biosequence.
65
+ * Fixed UniProt GN parsing issue in Bio::SPTR.
85
66
 
86
67
  == Incompatible changes
87
68
 
88
- === Bio::NCBI::REST
89
-
90
- NCBI announces that all Entrez E-utility requests must contain email and
91
- tool parameters, and requests without them will return error after June
92
- 2010.
93
-
94
- To set default email address and tool name, following methods are added.
95
- * Bio::NCBI.default_email=(email)
96
- * Bio::NCBI.default_tool=(tool_name)
97
-
98
- For every query, Bio::NCBI::REST checks the email and tool parameters and
99
- raises error if they are empty.
100
-
101
- IMPORTANT NOTE: No default email address is preset in BioRuby. Programmers
102
- using BioRuby must set their own email address or implement to get user's
103
- email address in some way (from input form, configuration file, etc).
104
-
105
- Default tool name is set as "#{$0} (bioruby/#{Bio::BIORUBY_VERSION_ID})".
106
- For example, if you run "ruby my_script.rb" with BioRuby 1.4.0, the value is
107
- "my_script.rb (bioruby/1.4.0)".
108
-
109
- === Bio::KEGG
110
-
111
- ==== dblinks method
112
-
113
- In Bio::KEGG::COMPOUND, DRUG, ENZYME, GLYCAN and ORTHOLOGY, the method
114
- dblinks is changed to return a Hash. Each key of the hash is a database name
115
- and its value is an array of entry IDs in the database. If old behavior
116
- (returns raw entry lines as an array of strings) is needed, use
117
- dblinks_as_strings.
118
-
119
- ==== pathways method
120
-
121
- In Bio::KEGG::COMPOUND, DRUG, ENZYME, GENES, GLYCAN and REACTION, the
122
- method pathways is changed to return a Hash. Each key of the hash is a
123
- pathway ID and its value is the description of the pathway.
124
-
125
- In Bio::KEGG::GENES, if old behavior (returns pathway IDs as an Array) is
126
- needed, use pathways.keys.
127
-
128
- In Bio::KEGG::COMPOUND, DRUG, ENZYME, GLYCAN, and REACTION, if old behavior
129
- (returns raw entry lines as an array of strings) is needed, use
130
- pathways_as_strings.
131
-
132
- Note that Bio::KEGG::ORTHOLOGY#pathways is not changed (returns an array
133
- containing pathway IDs).
69
+ === Bio::PAML::Codeml::Report
134
70
 
135
- ==== orthologs method
71
+ The code is completely rewritten. See the RDoc for details.
136
72
 
137
- In Bio::KEGG::ENZYME, GENES, GLYCAN and REACTION, the method orthologs is
138
- changed to return a Hash. Each key of the hash is a ortholog ID and its
139
- value is the name of the ortholog. If old behavior (returns raw entry lines
140
- as an array of strings) is needed, use orthologs_as_strings.
73
+ === Bio::KEGG::ORTHOLOGY
141
74
 
142
- ==== genes method
75
+ Bio::KEGG::ORTHOLOGY#pathways is changed to return a hash. The old pathway
76
+ method is renamed to pathways_in_keggclass for compatibility.
143
77
 
144
- In Bio::KEGG::ENZYME#genes and Bio::KEGG::ORTHOLOGY#genes is changed to
145
- return a Hash that is the same as Bio::KEGG::ORTHOLOGY#genes_as_hash.
146
- If old behavior (returns raw entry lines as an array of strings) is needed,
147
- use genes_as_strings.
78
+ === Bio::AAindex2
148
79
 
149
- ==== Bio::KEGG:REACTION#rpairs
80
+ Bio::AAindex2 now copies each symmetric element for lower triangular matrix
81
+ to the upper right part, because the Matrix class in Ruby 1.9.2 no longer
82
+ accepts any dimension mismatches. We think the previous behavior is a bug.
150
83
 
151
- Bio::KEGG::REACTION#rpairs is changed to return a Hash. Each key of the
152
- hash is a KEGG Rpair ID and its value is an array containing name and type.
153
- If old behavior (returns as tokens) is needed, use rpairs_as_tokens.
84
+ === Bio::MEDLINE
154
85
 
155
- ==== Bio::KEGG::ORTHOLOGY
86
+ Bio::MEDLINE#reference no longer puts empty values in the returned
87
+ Bio::Reference object. We think the previous behavior is a bug.
88
+ We also think the effect is very small.
156
89
 
157
- Bio::KEGG:ORTHOLOGY#dblinks_as_hash does not lower-case database names.
90
+ == Known issues
158
91
 
159
- === Bio::RestrictionEnzyme
92
+ The following issues are added or updated. See KNOWN_ISSUES.rdoc for other
93
+ already known issues.
160
94
 
161
- Format validation when creating an object is turned off because of efficiency.
95
+ === String escaping of command-line arguments in Ruby 1.9.X on Windows
162
96
 
163
- == Known problems
97
+ After BioRuby 1.4.1, in Ruby 1.9.X running on Windows, escaping of
98
+ command-line arguments are processed by the Ruby interpreter. Before BioRuby
99
+ 1.4.0, the escaping is executed in Bio::Command#escape_shell_windows, and
100
+ the behavior is different from the Ruby interpreter's one.
164
101
 
165
- See KNOWN_ISSUES.rdoc for details.
102
+ Currently, due to the change, test/functional/bio/test_command.rb may fail
103
+ on Windows with Ruby 1.9.X.
166
104
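The known issue above is about which layer escapes command-line arguments. The following is a minimal sketch of the general idea, not code taken from Bio::Command: it assumes a modern Ruby in which IO.popen accepts an array, and the helper name echo_first_argument is hypothetical. When the command is given as an array, Ruby hands each element to the child process as a separate argument, so the caller does no quoting of its own. On Windows the interpreter still has to build a single command line internally, which is where behavior may differ from a hand-written escaping routine.

```ruby
require 'rbconfig'

# Hypothetical helper (not part of BioRuby): runs a child Ruby process that
# prints its first command-line argument, and returns what it printed.
def echo_first_argument(arg)
  # Array form of IO.popen: the caller does not quote or escape the
  # argument; passing it on correctly is left to the Ruby interpreter.
  cmd = [RbConfig.ruby, '-e', 'print ARGV[0]', arg]
  IO.popen(cmd) { |io| io.read }
end

puts echo_first_argument('two words; $HOME "quoted"')
```

A round-trip check of this kind, where an argument with spaces and quotes is sent to a child process and read back, is one way to see whether escaping behaves the same on a given platform.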
 
@@ -3,7 +3,7 @@
3
3
  #
4
4
  Gem::Specification.new do |s|
5
5
  s.name = 'bio'
6
- s.version = "1.4.0"
6
+ s.version = "1.4.1"
7
7
 
8
8
  s.author = "BioRuby project"
9
9
  s.email = "staff@bioruby.org"
@@ -37,6 +37,7 @@ Gem::Specification.new do |s|
37
37
  "doc/Changes-1.3.rdoc",
38
38
  "doc/KEGG_API.rd",
39
39
  "doc/KEGG_API.rd.ja",
40
+ "doc/RELEASE_NOTES-1.4.0.rdoc",
40
41
  "doc/Tutorial.rd",
41
42
  "doc/Tutorial.rd.html",
42
43
  "doc/Tutorial.rd.ja",
@@ -124,6 +125,7 @@ Gem::Specification.new do |s|
124
125
  "lib/bio/db/fasta/format_fasta.rb",
125
126
  "lib/bio/db/fasta/format_qual.rb",
126
127
  "lib/bio/db/fasta/qual.rb",
128
+ "lib/bio/db/fasta/qual_to_biosequence.rb",
127
129
  "lib/bio/db/fastq.rb",
128
130
  "lib/bio/db/fastq/fastq_to_biosequence.rb",
129
131
  "lib/bio/db/fastq/format_fastq.rb",
@@ -147,7 +149,9 @@ Gem::Specification.new do |s|
147
149
  "lib/bio/db/kegg/glycan.rb",
148
150
  "lib/bio/db/kegg/keggtab.rb",
149
151
  "lib/bio/db/kegg/kgml.rb",
152
+ "lib/bio/db/kegg/module.rb",
150
153
  "lib/bio/db/kegg/orthology.rb",
154
+ "lib/bio/db/kegg/pathway.rb",
151
155
  "lib/bio/db/kegg/reaction.rb",
152
156
  "lib/bio/db/kegg/taxonomy.rb",
153
157
  "lib/bio/db/lasergene.rb",
@@ -219,6 +223,7 @@ Gem::Specification.new do |s|
219
223
  "lib/bio/sequence/generic.rb",
220
224
  "lib/bio/sequence/na.rb",
221
225
  "lib/bio/sequence/quality_score.rb",
226
+ "lib/bio/sequence/sequence_masker.rb",
222
227
  "lib/bio/shell.rb",
223
228
  "lib/bio/shell/core.rb",
224
229
  "lib/bio/shell/demo.rb",
@@ -370,7 +375,17 @@ Gem::Specification.new do |s|
370
375
  "test/data/KEGG/G00024.glycan",
371
376
  "test/data/KEGG/G01366.glycan",
372
377
  "test/data/KEGG/K02338.orthology",
378
+ "test/data/KEGG/M00118.module",
373
379
  "test/data/KEGG/R00006.reaction",
380
+ "test/data/KEGG/T00005.genome",
381
+ "test/data/KEGG/T00070.genome",
382
+ "test/data/KEGG/b0529.gene",
383
+ "test/data/KEGG/ec00072.pathway",
384
+ "test/data/KEGG/hsa00790.pathway",
385
+ "test/data/KEGG/ko00312.pathway",
386
+ "test/data/KEGG/map00030.pathway",
387
+ "test/data/KEGG/map00052.pathway",
388
+ "test/data/KEGG/rn00250.pathway",
374
389
  "test/data/SOSUI/sample.report",
375
390
  "test/data/TMHMM/sample.report",
376
391
  "test/data/aaindex/DAYM780301",
@@ -383,6 +398,7 @@ Gem::Specification.new do |s|
383
398
  "test/data/blast/b0002.faa.m7",
384
399
  "test/data/blast/b0002.faa.m8",
385
400
  "test/data/blast/blastp-multi.m7",
401
+ "test/data/clustalw/example1.aln",
386
402
  "test/data/command/echoarg2.bat",
387
403
  "test/data/embl/AB090716.embl",
388
404
  "test/data/embl/AB090716.embl.rel89",
@@ -441,13 +457,23 @@ Gem::Specification.new do |s|
441
457
  "test/data/fastq/wrapping_original_sanger.fastq",
442
458
  "test/data/gcg/pileup-aa.msf",
443
459
  "test/data/genscan/sample.report",
460
+ "test/data/go/selected_component.ontology",
461
+ "test/data/go/selected_gene_association.sgd",
462
+ "test/data/go/selected_wikipedia2go",
444
463
  "test/data/iprscan/merged.raw",
445
464
  "test/data/iprscan/merged.txt",
465
+ "test/data/medline/20146148_modified.medline",
446
466
  "test/data/meme/db",
447
467
  "test/data/meme/mast",
448
468
  "test/data/meme/mast.out",
449
469
  "test/data/meme/meme.out",
450
470
  "test/data/paml/codeml/control_file.txt",
471
+ "test/data/paml/codeml/models/aa.aln",
472
+ "test/data/paml/codeml/models/aa.dnd",
473
+ "test/data/paml/codeml/models/aa.ph",
474
+ "test/data/paml/codeml/models/alignment.phy",
475
+ "test/data/paml/codeml/models/results0-3.txt",
476
+ "test/data/paml/codeml/models/results7-8.txt",
451
477
  "test/data/paml/codeml/output.txt",
452
478
  "test/data/paml/codeml/rates",
453
479
  "test/data/phyloxml/apaf.xml",
@@ -479,6 +505,7 @@ Gem::Specification.new do |s|
479
505
  "test/unit/bio/appl/blast/test_ncbioptions.rb",
480
506
  "test/unit/bio/appl/blast/test_report.rb",
481
507
  "test/unit/bio/appl/blast/test_rpsblast.rb",
508
+ "test/unit/bio/appl/clustalw/test_report.rb",
482
509
  "test/unit/bio/appl/gcg/test_msf.rb",
483
510
  "test/unit/bio/appl/genscan/test_report.rb",
484
511
  "test/unit/bio/appl/hmmer/test_report.rb",
@@ -489,6 +516,7 @@ Gem::Specification.new do |s|
489
516
  "test/unit/bio/appl/meme/test_motif.rb",
490
517
  "test/unit/bio/appl/paml/codeml/test_rates.rb",
491
518
  "test/unit/bio/appl/paml/codeml/test_report.rb",
519
+ "test/unit/bio/appl/paml/codeml/test_report_single.rb",
492
520
  "test/unit/bio/appl/paml/test_codeml.rb",
493
521
  "test/unit/bio/appl/sim4/test_report.rb",
494
522
  "test/unit/bio/appl/sosui/test_report.rb",
@@ -508,13 +536,18 @@ Gem::Specification.new do |s|
508
536
  "test/unit/bio/db/embl/test_embl_to_bioseq.rb",
509
537
  "test/unit/bio/db/embl/test_sptr.rb",
510
538
  "test/unit/bio/db/embl/test_uniprot.rb",
539
+ "test/unit/bio/db/fasta/test_defline.rb",
540
+ "test/unit/bio/db/fasta/test_defline_misc.rb",
511
541
  "test/unit/bio/db/fasta/test_format_qual.rb",
512
542
  "test/unit/bio/db/kegg/test_compound.rb",
513
543
  "test/unit/bio/db/kegg/test_drug.rb",
514
544
  "test/unit/bio/db/kegg/test_enzyme.rb",
515
545
  "test/unit/bio/db/kegg/test_genes.rb",
546
+ "test/unit/bio/db/kegg/test_genome.rb",
516
547
  "test/unit/bio/db/kegg/test_glycan.rb",
548
+ "test/unit/bio/db/kegg/test_module.rb",
517
549
  "test/unit/bio/db/kegg/test_orthology.rb",
550
+ "test/unit/bio/db/kegg/test_pathway.rb",
518
551
  "test/unit/bio/db/kegg/test_reaction.rb",
519
552
  "test/unit/bio/db/pdb/test_pdb.rb",
520
553
  "test/unit/bio/db/sanger_chromatogram/test_abif.rb",
@@ -523,6 +556,7 @@ Gem::Specification.new do |s|
523
556
  "test/unit/bio/db/test_fasta.rb",
524
557
  "test/unit/bio/db/test_fastq.rb",
525
558
  "test/unit/bio/db/test_gff.rb",
559
+ "test/unit/bio/db/test_go.rb",
526
560
  "test/unit/bio/db/test_lasergene.rb",
527
561
  "test/unit/bio/db/test_medline.rb",
528
562
  "test/unit/bio/db/test_newick.rb",
@@ -548,6 +582,7 @@ Gem::Specification.new do |s|
548
582
  "test/unit/bio/sequence/test_dblink.rb",
549
583
  "test/unit/bio/sequence/test_na.rb",
550
584
  "test/unit/bio/sequence/test_quality_score.rb",
585
+ "test/unit/bio/sequence/test_sequence_masker.rb",
551
586
  "test/unit/bio/shell/plugin/test_seq.rb",
552
587
  "test/unit/bio/test_alignment.rb",
553
588
  "test/unit/bio/test_command.rb",
@@ -588,7 +623,8 @@ Gem::Specification.new do |s|
588
623
  "README.rdoc",
589
624
  "README_DEV.rdoc",
590
625
  "RELEASE_NOTES.rdoc",
591
- "doc/Changes-1.3.rdoc"
626
+ "doc/Changes-1.3.rdoc",
627
+ "doc/RELEASE_NOTES-1.4.0.rdoc"
592
628
  ]
593
629
  s.rdoc_options << '--main' << 'README.rdoc'
594
630
  s.rdoc_options << '--title' << 'BioRuby API documentation'
@@ -0,0 +1,167 @@
1
+ = BioRuby 1.4.0 RELEASE NOTES
2
+
3
+ A lot of changes have been made to the BioRuby 1.4.0 after the version 1.3.1
4
+ is released. This document describes important and/or incompatible changes
5
+ since the BioRuby 1.3.1 release.
6
+
7
+ == New features
8
+
9
+ === PhyloXML support
10
+
11
+ Support for reading and writing PhyloXML file format is added. New classes
12
+ Bio::PhyloXML::Parser and Bio::PhyloXML::Writer are used to read and write
13
+ a PhyloXML file, respectively.
14
+
15
+ The code is developed by Diana Jaunzeikare, mentored by Christian M Zmasek
16
+ and co-mentors, supported by Google Summer of Code 2009 in collaboration
17
+ with the National Evolutionary Synthesis Center (NESCent).
18
+
19
+ === FASTQ file format support
20
+
21
+ Support for reading and writing FASTQ file format is added. All of the
22
+ three FASTQ format variants are supported.
23
+
24
+ To read a FASTQ file, Bio::FlatFile can be used. File format auto-detection
25
+ of the FASTQ format is supported (although the three format variants should
26
+ be specified later by users if quality scores are needed).
27
+
28
+ New class Bio::Fastq is the parser class for the FASTQ format. An object
29
+ of the Bio::Fastq class can be converted to a Bio::Sequence object with the
30
+ "to_biosequence" method. Bio::Sequence#output now supports output of the
31
+ FASTQ format.
32
+
33
+ The code is written by Naohisa Goto, with the help of discussions in the
34
+ open-bio-l mailing list. The prototype of Bio::Fastq class was first
35
+ developed during the BioHackathon 2009 held in Okinawa.
36
+
37
+ === DNA chromatogram support
38
+
39
+ Support for reading DNA chromatogram files are added. SCF and ABIF file
40
+ formats are supported. The code is developed by Anthony Underwood.
41
+
42
+ === MEME (motif-based sequence analysis tools) support
43
+
44
+ Support for running MAST (Motif Alignment & Search Tool, part of the MEME
45
+ Suite, motif-based sequence analysis tools) and parsing its results are
46
+ added. The code is developed by Adam Kraut.
47
+
48
+ === Improvement of KEGG parser classes
49
+
50
+ Some new methods are added to parse new fields added to some KEGG file
51
+ formats. Unit tests for KEGG parsers are also added and improved. In
52
+ addition, return value types of some methods are also changed for unifying
53
+ APIs among KEGG parser classes. See incompatible changes below for details.
54
+ Most of them are contributed by Kozo Nishida.
55
+
56
+ === Many sample scripts are added
57
+
58
+ Many sample scripts showing demonstrations of usages of classes are added.
59
+ They are moved from primitive test codes for the classes described in the
60
+ "if __FILE__ == $0" convention in the library files.
61
+
62
+ === Unit tests can test installed BioRuby
63
+
64
+ Mechanism to load library and to find test data in the unit tests are changed,
65
+ and the library path and test data path can be specified with environment
66
+ variables. BIORUBY_TEST_LIB is the path to be added to the Ruby's $LOAD_PATH.
67
+ For example, to test BioRuby installed in
68
+ /usr/local/lib/site_ruby/1.8, run
69
+ env BIORUBY_TEST_LIB=/usr/local/lib/site_ruby/1.8 ruby test/runner.rb
70
+
71
+ BIORUBY_TEST_DATA is the path of the test data, and BIORUBY_TEST_DEBUG is a
72
+ flag to turn on debug of the tests.
73
+
74
+ == Deprecated features
75
+
76
+ === ChangeLog is replaced by git log
77
+
78
+ ChangeLog is replaced by the output of git-log command, and ChangeLog before
79
+ the 1.3.1 release is moved to doc/ChangeLog-before-1.3.1.
80
+
81
+ === "if __FILE__ == $0" convention
82
+
83
+ Primitive test codes in the "if __FILE__ == $0" convention are removed and
84
+ the codes are moved to the sample scripts named sample/demo_*.rb (except
85
+ some older or deprecated files).
86
+
87
+ == Incompatible changes
88
+
89
+ === Bio::NCBI::REST
90
+
91
+ NCBI announces that all Entrez E-utility requests must contain email and
92
+ tool parameters, and requests without them will return error after June
93
+ 2010.
94
+
95
+ To set default email address and tool name, following methods are added.
96
+ * Bio::NCBI.default_email=(email)
97
+ * Bio::NCBI.default_tool=(tool_name)
98
+
99
+ For every query, Bio::NCBI::REST checks the email and tool parameters and
100
+ raises error if they are empty.
101
+
102
+ IMPORTANT NOTE: No default email address is preset in BioRuby. Programmers
103
+ using BioRuby must set their own email address or implement to get user's
104
+ email address in some way (from input form, configuration file, etc).
105
+
106
+ Default tool name is set as "#{$0} (bioruby/#{Bio::BIORUBY_VERSION_ID})".
107
+ For example, if you run "ruby my_script.rb" with BioRuby 1.4.0, the value is
108
+ "my_script.rb (bioruby/1.4.0)".
109
+
110
+ === Bio::KEGG
111
+
112
+ ==== dblinks method
113
+
114
+ In Bio::KEGG::COMPOUND, DRUG, ENZYME, GLYCAN and ORTHOLOGY, the method
115
+ dblinks is changed to return a Hash. Each key of the hash is a database name
116
+ and its value is an array of entry IDs in the database. If old behavior
117
+ (returns raw entry lines as an array of strings) is needed, use
118
+ dblinks_as_strings.
119
+
120
+ ==== pathways method
121
+
122
+ In Bio::KEGG::COMPOUND, DRUG, ENZYME, GENES, GLYCAN and REACTION, the
123
+ method pathways is changed to return a Hash. Each key of the hash is a
124
+ pathway ID and its value is the description of the pathway.
125
+
126
+ In Bio::KEGG::GENES, if old behavior (returns pathway IDs as an Array) is
127
+ needed, use pathways.keys.
128
+
129
+ In Bio::KEGG::COMPOUND, DRUG, ENZYME, GLYCAN, and REACTION, if old behavior
130
+ (returns raw entry lines as an array of strings) is needed, use
131
+ pathways_as_strings.
132
+
133
+ Note that Bio::KEGG::ORTHOLOGY#pathways is not changed (returns an array
134
+ containing pathway IDs).
135
+
136
+ ==== orthologs method
137
+
138
+ In Bio::KEGG::ENZYME, GENES, GLYCAN and REACTION, the method orthologs is
139
+ changed to return a Hash. Each key of the hash is a ortholog ID and its
140
+ value is the name of the ortholog. If old behavior (returns raw entry lines
141
+ as an array of strings) is needed, use orthologs_as_strings.
142
+
143
+ ==== genes method
144
+
145
+ In Bio::KEGG::ENZYME#genes and Bio::KEGG::ORTHOLOGY#genes is changed to
146
+ return a Hash that is the same as Bio::KEGG::ORTHOLOGY#genes_as_hash.
147
+ If old behavior (returns raw entry lines as an array of strings) is needed,
148
+ use genes_as_strings.
149
+
150
+ ==== Bio::KEGG::REACTION#rpairs
151
+
152
+ Bio::KEGG::REACTION#rpairs is changed to return a Hash. Each key of the
153
+ hash is a KEGG Rpair ID and its value is an array containing name and type.
154
+ If old behavior (returns as tokens) is needed, use rpairs_as_tokens.
155
+
156
+ ==== Bio::KEGG::ORTHOLOGY
157
+
158
+ Bio::KEGG::ORTHOLOGY#dblinks_as_hash does not lower-case database names.
159
+
160
+ === Bio::RestrictionEnzyme
161
+
162
+ Format validation when creating an object is turned off because of efficiency.
163
+
164
+ == Known problems
165
+
166
+ See KNOWN_ISSUES.rdoc for details.
167
+