RubyGems - bio - Versions diffs - 1.4.0 → 1.4.1 - Mend

bio 1.4.0 → 1.4.1

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (82)

data/ChangeLog +1712 -0
data/KNOWN_ISSUES.rdoc +11 -1
data/README.rdoc +3 -2
data/RELEASE_NOTES.rdoc +65 -127
data/bioruby.gemspec +38 -2
data/doc/RELEASE_NOTES-1.4.0.rdoc +167 -0
data/doc/Tutorial.rd +74 -16
data/doc/Tutorial.rd.html +68 -16
data/lib/bio.rb +2 -0
data/lib/bio/appl/clustalw/report.rb +18 -0
data/lib/bio/appl/paml/codeml/report.rb +579 -21
data/lib/bio/command.rb +149 -21
data/lib/bio/db/aaindex.rb +11 -1
data/lib/bio/db/embl/sptr.rb +1 -1
data/lib/bio/db/fasta/defline.rb +7 -2
data/lib/bio/db/fasta/qual.rb +24 -0
data/lib/bio/db/fasta/qual_to_biosequence.rb +29 -0
data/lib/bio/db/fastq.rb +15 -0
data/lib/bio/db/go.rb +2 -2
data/lib/bio/db/kegg/common.rb +109 -5
data/lib/bio/db/kegg/genes.rb +61 -15
data/lib/bio/db/kegg/genome.rb +43 -38
data/lib/bio/db/kegg/module.rb +158 -0
data/lib/bio/db/kegg/orthology.rb +40 -1
data/lib/bio/db/kegg/pathway.rb +254 -0
data/lib/bio/db/medline.rb +6 -2
data/lib/bio/io/flatfile/autodetection.rb +6 -0
data/lib/bio/location.rb +39 -0
data/lib/bio/reference.rb +24 -0
data/lib/bio/sequence.rb +2 -0
data/lib/bio/sequence/adapter.rb +1 -0
data/lib/bio/sequence/format.rb +14 -0
data/lib/bio/sequence/sequence_masker.rb +95 -0
data/lib/bio/tree.rb +4 -4
data/lib/bio/util/restriction_enzyme/double_stranded/aligned_strands.rb +5 -0
data/lib/bio/version.rb +1 -1
data/setup.rb +5 -0
data/test/data/KEGG/K02338.orthology +180 -52
data/test/data/KEGG/M00118.module +44 -0
data/test/data/KEGG/T00005.genome +140 -0
data/test/data/KEGG/T00070.genome +34 -0
data/test/data/KEGG/b0529.gene +47 -0
data/test/data/KEGG/ec00072.pathway +23 -0
data/test/data/KEGG/hsa00790.pathway +59 -0
data/test/data/KEGG/ko00312.pathway +16 -0
data/test/data/KEGG/map00030.pathway +37 -0
data/test/data/KEGG/map00052.pathway +13 -0
data/test/data/KEGG/rn00250.pathway +114 -0
data/test/data/clustalw/example1.aln +58 -0
data/test/data/go/selected_component.ontology +12 -0
data/test/data/go/selected_gene_association.sgd +31 -0
data/test/data/go/selected_wikipedia2go +13 -0
data/test/data/medline/20146148_modified.medline +54 -0
data/test/data/paml/codeml/models/aa.aln +26 -0
data/test/data/paml/codeml/models/aa.dnd +13 -0
data/test/data/paml/codeml/models/aa.ph +13 -0
data/test/data/paml/codeml/models/alignment.phy +49 -0
data/test/data/paml/codeml/models/results0-3.txt +312 -0
data/test/data/paml/codeml/models/results7-8.txt +340 -0
data/test/functional/bio/io/test_togows.rb +8 -8
data/test/functional/bio/test_command.rb +7 -6
data/test/unit/bio/appl/clustalw/test_report.rb +80 -0
data/test/unit/bio/appl/paml/codeml/test_rates.rb +6 -6
data/test/unit/bio/appl/paml/codeml/test_report.rb +231 -24
data/test/unit/bio/appl/paml/codeml/test_report_single.rb +46 -0
data/test/unit/bio/db/embl/test_sptr.rb +1 -1
data/test/unit/bio/db/fasta/test_defline.rb +160 -0
data/test/unit/bio/db/fasta/test_defline_misc.rb +490 -0
data/test/unit/bio/db/kegg/test_genes.rb +281 -1
data/test/unit/bio/db/kegg/test_genome.rb +408 -0
data/test/unit/bio/db/kegg/test_module.rb +246 -0
data/test/unit/bio/db/kegg/test_orthology.rb +95 -0
data/test/unit/bio/db/kegg/test_pathway.rb +1250 -0
data/test/unit/bio/db/test_aaindex.rb +8 -7
data/test/unit/bio/db/test_fastq.rb +36 -0
data/test/unit/bio/db/test_go.rb +171 -0
data/test/unit/bio/db/test_medline.rb +148 -0
data/test/unit/bio/db/test_qual.rb +9 -2
data/test/unit/bio/sequence/test_sequence_masker.rb +169 -0
data/test/unit/bio/test_tree.rb +260 -1
data/test/unit/bio/util/test_contingency_table.rb +7 -7
metadata +53 -6

data/doc/Tutorial.rd CHANGED

@@ -21,8 +21,12 @@
 #   cat Tutorial.rd | sed -e "s,bioruby>,>>," | sed "s,==>,=>," > Tutorial.rd.tmp
 #   rubydoctest Tutorial.rd.tmp
 #
-# Rubydoctest is useful to verify an example in this document (still) works
+# alternatively, the Ruby way is
 #
+#   ruby -p -e '$_.sub!(/bioruby\>/, ">>"); $_.sub!(/\=\=\>/, "=>")' Tutorial.rd > Tutorial.rd.tmp
+#   rubydoctest Tutorial.rd.tmp
+#
+# Rubydoctest is useful to verify an example in this document (still) works
 #
 #
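The prompt rewrite performed by the sed pipeline and by the `ruby -p` one-liner in the hunk above can be sketched as a plain Ruby method. This is only an illustration: `to_doctest` is a hypothetical helper name, not part of BioRuby or rubydoctest, and the sample lines are made up.

```ruby
# Rewrite one tutorial line into the form rubydoctest expects:
# the "bioruby>" prompt becomes ">>" and the "==>" result marker becomes "=>".
# Like the one-liner, only the first match on each line is replaced.
def to_doctest(line)
  line.sub(/bioruby>/, ">>").sub(/==>/, "=>")
end

# Made-up sample lines in the tutorial's prompt style.
sample = [
  "bioruby> 1 + 2",
  "==> 3"
]

puts sample.map { |line| to_doctest(line) }
# prints:
# >> 1 + 2
# => 3
```

Lines that contain neither marker pass through unchanged, which matches the behaviour of `sub!` on `$_` in the one-liner.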
@@ -34,9 +38,9 @@ bioruby> $: << '../lib'
 = BioRuby Tutorial
 * Copyright (C) 2001-2003 KATAYAMA Toshiaki <k .at. bioruby.org>
-* Copyright (C) 2005-2009 Pjotr Prins, Naohisa Goto and others
+* Copyright (C) 2005-2010 Pjotr Prins, Naohisa Goto and others
-This document was last modified: 2009/12/27
+This document was last modified: 2010/01/08
 Current editor: Pjotr Prins <p .at. bioruby.org>
 The latest version resides in the GIT source code repository:  ./doc/((<Tutorial.rd|URL:http://github.com/pjotrp/bioruby/raw/documentation/doc/Tutorial.rd>)).
@@ -46,8 +50,8 @@ The latest version resides in the GIT source code repository:  ./doc/((<Tutorial
 This is a tutorial for using Bioruby. A basic knowledge of Ruby is required.
 If you want to know more about the programming langauge Ruby we recommend the
 latest Ruby book ((<Programming Ruby|URL:http://www.pragprog.com/titles/ruby>))
-by Dave Thomas and Andy Hunt - some of it is online
-((<here|URL:http://www.rubycentral.com/pickaxe/>)).
+by Dave Thomas and Andy Hunt - the first edition is online
+((<here|URL:http://www.ruby-doc.org/docs/ProgrammingRuby/>)).
 For BioRuby you need to install Ruby and the BioRuby package on your computer
@@ -64,8 +68,13 @@ If you see no such thing you'll have to install Ruby using your installation
 manager. For more information see the
 ((<Ruby|URL:http://www.ruby-lang.org/en/>)) website.
-Once Ruby is works download and install Bioruby using the links on the
-((<Bioruby|URL:http://bioruby.org/>)) website.
+With Ruby download and install Bioruby using the links on the
+((<Bioruby|URL:http://bioruby.org/>)) website. The recommended installation is via
+Ruby gems:
+  gem install bio
+See also the Bioruby ((<wiki|URL:http://bioruby.open-bio.org/wiki/Installation>)).
 A lot of BioRuby's documentation exists in the source code and unit tests. To
 really dive in you will need the latest source code tree. The embedded rdoc
@@ -1211,7 +1220,64 @@ Bio::Fetch.query method.)
 == BioSQL
-to be written...
+BioSQL is a well known schema to store and retrive biological sequences using a RDBMS like PostgreSQL or MySQL; note that SQLite is not supported.
+First of all, you must install a database engine or have access to a remote one. Then create the schema and populate with the taxonomy. You can follow the ((<Official Guide|URL:http://code.open-bio.org/svnweb/index.cgi/biosql/view/biosql-schema/trunk/INSTALL>)) .
+Next step is to install these gems:
+* ActiveRecord
+* CompositePrimaryKeys (Rails doesn't handle by default composite primary keys)
+* The layer to comunicate with you preferred RDBMS (postgresql, mysql, jdbcmysql in case you are running JRuby )
+You can find ActiveRecord's models in /bioruby/lib/bio/io/biosql
+When you have your database up and running, you can connect to it in this way:
+    #!/usr/bin/env ruby
+    require 'bio'
+    connection = Bio::SQL.establish_connection({'development'=>{'hostname'=>"YourHostname",
+                                                   'database'=>"CoolBioSeqDB",
+                                                   'adapter'=>"jdbcmysql",
+                                                   'username'=>"YourUser",
+                                                   'password'=>"YouPassword"
+                                                  }
+                                  },
+                                  'development')
+    #The first parameter is the hash contaning the description of the configuration similar to database.yml in Rails application, you can declare different environment. The second parameter is the environment to use: 'development', 'test', 'production'.
+    #To store a sequence into the database you simply need a biosequence object.
+    biosql_database = Bio::SQL::Biodatabase.find(:first)
+    ff = Bio::GenBank.open("gbvrl1.seq")
+    ff.each_entry do |gb|
+      Bio::SQL::Sequence.new(:biosequence=>gb.to_biosequence, :biodatabase=>biosql_database
+    end
+    #You can list all the entries into every database
+    Bio::SQL.list_entries
+    #list databases:
+    Bio::SQL.list_databases
+    #retriving a generic accession
+    bioseq = Bio::SQL.fetch_accession("YouAccession")
+    #If you use biosequence objects, you will find all its method mapped to BioSQL sequences. But you can also access to the models directly:
+    #get the raw sequence associated with you accession
+    bioseq.entry.biosequence
+    #get the length of your sequence, this is the explicit form of bioseq.length
+    bioseq.entry.biosequence.length
+    #convert the sequence in GenBank format
+    bioseq.to_biosequence.output(:genbank)
+BioSQL' ((<schema|URL:http://www.biosql.org/wiki/Schema_Overview>)) is not so intuitive at the beginning, spend some time on understanding it, in the end if you know a little bit of rails everything will go smootly. You can find information to Annotation ((<here|URL:http://www.biosql.org/wiki/Annotation_Mapping>))
+ToDo: add exemaples from George. I remember he did some cool post on BioSQL and Rails.
 = PhyloXML
@@ -1400,14 +1466,6 @@ Gene Ontologies can be fetched through the Ruby Ensembl API package:
 Prints each mosq. accession/uniq identifier and the GO terms from the Drosphila
 homologues.
-== Comparing BioProjects
-For a quick functional comparison of BioRuby, BioPerl, BioPython and Bioconductor (R) see ((<URL:http://sciruby.codeforpeople.com/sr.cgi/BioProjects>))
-== Using BioRuby with R
-Using Ruby with R Pjotr wrote a section on SciRuby. See ((<URL:http://sciruby.codeforpeople.com/sr.cgi/RubyWithRlang>))
 == Using BioPerl or BioPython from Ruby
 At the moment there is no easy way of accessing BioPerl from Ruby. The best way, perhaps, is to create a Perl server that gets accessed through XML/RPC or SOAP.

data/doc/Tutorial.rd.html CHANGED

@@ -11,17 +11,17 @@
 <h1><a name="label-0" id="label-0">BioRuby Tutorial</a></h1><!-- RDLabel: "BioRuby Tutorial" -->
 <ul>
 <li>Copyright (C) 2001-2003 KATAYAMA Toshiaki &lt;k .at. bioruby.org&gt;</li>
-<li>Copyright (C) 2005-2009 Pjotr Prins, Naohisa Goto and others</li>
+<li>Copyright (C) 2005-2010 Pjotr Prins, Naohisa Goto and others</li>
 </ul>
-<p>This document was last modified: 2009/12/27
+<p>This document was last modified: 2010/01/08
 Current editor: Pjotr Prins &lt;p .at. bioruby.org&gt;</p>
 <p>The latest version resides in the GIT source code repository:  ./doc/<a href="http://github.com/pjotrp/bioruby/raw/documentation/doc/Tutorial.rd">Tutorial.rd</a>.</p>
 <h2><a name="label-1" id="label-1">Introduction</a></h2><!-- RDLabel: "Introduction" -->
 <p>This is a tutorial for using Bioruby. A basic knowledge of Ruby is required.
 If you want to know more about the programming langauge Ruby we recommend the
 latest Ruby book <a href="http://www.pragprog.com/titles/ruby">Programming Ruby</a>
-by Dave Thomas and Andy Hunt - some of it is online
-<a href="http://www.rubycentral.com/pickaxe/">here</a>.</p>
+by Dave Thomas and Andy Hunt - the first edition is online
+<a href="http://www.ruby-doc.org/docs/ProgrammingRuby/">here</a>.</p>
 <p>For BioRuby you need to install Ruby and the BioRuby package on your computer</p>
 <p>You can check whether Ruby is installed on your computer and what
 version it has with the</p>
@@ -31,8 +31,11 @@ version it has with the</p>
 <p>If you see no such thing you'll have to install Ruby using your installation
 manager. For more information see the
 <a href="http://www.ruby-lang.org/en/">Ruby</a> website.</p>
-<p>Once Ruby is works download and install Bioruby using the links on the
-<a href="http://bioruby.org/">Bioruby</a> website.</p>
+<p>With Ruby download and install Bioruby using the links on the
+<a href="http://bioruby.org/">Bioruby</a> website. The recommended installation is via
+Ruby gems:</p>
+<pre>gem install bio</pre>
+<p>See also the Bioruby <a href="http://bioruby.open-bio.org/wiki/Installation">wiki</a>.</p>
 <p>A lot of BioRuby's documentation exists in the source code and unit tests. To
 really dive in you will need the latest source code tree. The embedded rdoc
 documentation can be viewed online at
@@ -946,7 +949,60 @@ Because the KEGG/GENES database and AAindex database are not available
 from other BioFetch servers, we used bioruby.org server with
 Bio::Fetch.query method.)</p>
 <h2><a name="label-22" id="label-22">BioSQL</a></h2><!-- RDLabel: "BioSQL" -->
-<p>to be written...</p>
+<p>BioSQL is a well known schema to store and retrive biological sequences using a RDBMS like PostgreSQL or MySQL; note that SQLite is not supported.
+First of all, you must install a database engine or have access to a remote one. Then create the schema and populate with the taxonomy. You can follow the <a href="http://code.open-bio.org/svnweb/index.cgi/biosql/view/biosql-schema/trunk/INSTALL">Official Guide</a> .
+Next step is to install these gems:</p>
+<ul>
+<li>ActiveRecord</li>
+<li>CompositePrimaryKeys (Rails doesn't handle by default composite primary keys)</li>
+<li>The layer to comunicate with you preferred RDBMS (postgresql, mysql, jdbcmysql in case you are running JRuby )</li>
+</ul>
+<p>You can find ActiveRecord's models in /bioruby/lib/bio/io/biosql</p>
+<p>When you have your database up and running, you can connect to it in this way:</p>
+<pre>#!/usr/bin/env ruby
+require 'bio'
+connection = Bio::SQL.establish_connection({'development'=&gt;{'hostname'=&gt;"YourHostname",
+                                               'database'=&gt;"CoolBioSeqDB",
+                                               'adapter'=&gt;"jdbcmysql",
+                                               'username'=&gt;"YourUser",
+                                               'password'=&gt;"YouPassword"
+                                              }
+                              },
+                              'development')
+#The first parameter is the hash contaning the description of the configuration similar to database.yml in Rails application, you can declare different environment. The second parameter is the environment to use: 'development', 'test', 'production'.
+#To store a sequence into the database you simply need a biosequence object.
+biosql_database = Bio::SQL::Biodatabase.find(:first)
+ff = Bio::GenBank.open("gbvrl1.seq")
+ff.each_entry do |gb|
+  Bio::SQL::Sequence.new(:biosequence=&gt;gb.to_biosequence, :biodatabase=&gt;biosql_database
+end
+#You can list all the entries into every database
+Bio::SQL.list_entries
+#list databases:
+Bio::SQL.list_databases
+#retriving a generic accession
+bioseq = Bio::SQL.fetch_accession("YouAccession")
+#If you use biosequence objects, you will find all its method mapped to BioSQL sequences. But you can also access to the models directly:
+#get the raw sequence associated with you accession
+bioseq.entry.biosequence
+#get the length of your sequence, this is the explicit form of bioseq.length
+bioseq.entry.biosequence.length
+#convert the sequence in GenBank format
+bioseq.to_biosequence.output(:genbank)</pre>
+<p>BioSQL' <a href="http://www.biosql.org/wiki/Schema_Overview">schema</a> is not so intuitive at the beginning, spend some time on understanding it, in the end if you know a little bit of rails everything will go smootly. You can find information to Annotation <a href="http://www.biosql.org/wiki/Annotation_Mapping">here</a>
+ToDo: add exemaples from George. I remember he did some cool post on BioSQL and Rails.</p>
 <h1><a name="label-23" id="label-23">PhyloXML</a></h1><!-- RDLabel: "PhyloXML" -->
 <p>PhyloXML is an XML language for saving, analyzing and exchanging data of
 annotated phylogenetic trees. PhyloXML parser in BioRuby is implemented in
@@ -1087,13 +1143,9 @@ infile.each do |line|
 end</pre>
 <p>Prints each mosq. accession/uniq identifier and the GO terms from the Drosphila
 homologues.</p>
-<h2><a name="label-38" id="label-38">Comparing BioProjects</a></h2><!-- RDLabel: "Comparing BioProjects" -->
-<p>For a quick functional comparison of BioRuby, BioPerl, BioPython and Bioconductor (R) see <a href="http://sciruby.codeforpeople.com/sr.cgi/BioProjects">&lt;URL:http://sciruby.codeforpeople.com/sr.cgi/BioProjects&gt;</a></p>
-<h2><a name="label-39" id="label-39">Using BioRuby with R</a></h2><!-- RDLabel: "Using BioRuby with R" -->
-<p>Using Ruby with R Pjotr wrote a section on SciRuby. See <a href="http://sciruby.codeforpeople.com/sr.cgi/RubyWithRlang">&lt;URL:http://sciruby.codeforpeople.com/sr.cgi/RubyWithRlang&gt;</a></p>
-<h2><a name="label-40" id="label-40">Using BioPerl or BioPython from Ruby</a></h2><!-- RDLabel: "Using BioPerl or BioPython from Ruby" -->
+<h2><a name="label-38" id="label-38">Using BioPerl or BioPython from Ruby</a></h2><!-- RDLabel: "Using BioPerl or BioPython from Ruby" -->
 <p>At the moment there is no easy way of accessing BioPerl from Ruby. The best way, perhaps, is to create a Perl server that gets accessed through XML/RPC or SOAP.</p>
-<h2><a name="label-41" id="label-41">Installing required external library</a></h2><!-- RDLabel: "Installing required external library" -->
+<h2><a name="label-39" id="label-39">Installing required external library</a></h2><!-- RDLabel: "Installing required external library" -->
 <p>At this point for using BioRuby no additional libraries are needed, except if
 you are using Bio::PhyloXML module. Then you have to install libxml-ruby.</p>
 <p>This may change, so keep an eye on the Bioruby website. Also when
@@ -1102,20 +1154,20 @@ a package is missing BioRuby should show an informative message.</p>
 painful, as the gem standard for packages evolved late and some still
 force you to copy things by hand. Therefore read the README's
 carefully that come with each package.</p>
-<h3><a name="label-42" id="label-42">Installing libxml-ruby</a></h3><!-- RDLabel: "Installing libxml-ruby" -->
+<h3><a name="label-40" id="label-40">Installing libxml-ruby</a></h3><!-- RDLabel: "Installing libxml-ruby" -->
 <p>The simplest way is to use gem packaging system.</p>
 <pre>gem install -r libxml-ruby</pre>
 <p>If you get `require': no such file to load - mkmf (LoadError) error then do</p>
 <pre>sudo apt-get install ruby-dev</pre>
 <p>If you have other problems with installation, then see <a href="http://libxml.rubyforge.org/install.xml">&lt;URL:http://libxml.rubyforge.org/install.xml&gt;</a>  </p>
-<h2><a name="label-43" id="label-43">Trouble shooting</a></h2><!-- RDLabel: "Trouble shooting" -->
+<h2><a name="label-41" id="label-41">Trouble shooting</a></h2><!-- RDLabel: "Trouble shooting" -->
 <ul>
 <li>Error: in `require': no such file to load -- bio (LoadError)</li>
 </ul>
 <p>Ruby fails to find the BioRuby libraries - add it to the RUBYLIB path, or pass
 it to the interpeter. For example:</p>
 <pre>ruby -I$BIORUBYPATH/lib yourprogram.rb</pre>
-<h2><a name="label-44" id="label-44">Modifying this page</a></h2><!-- RDLabel: "Modifying this page" -->
+<h2><a name="label-42" id="label-42">Modifying this page</a></h2><!-- RDLabel: "Modifying this page" -->
 <p>IMPORTANT NOTICE: This page is maintained in the BioRuby source code
 repository. Please edit the file there otherwise changes may get
 lost. See <!-- Reference, RDLabel "BioRuby Developer Information" doesn't exist --><em class="label-not-found">BioRuby Developer Information</em><!-- Reference end --> for repository and mailing list

data/lib/bio.rb CHANGED

@@ -107,6 +107,8 @@ module Bio
     autoload :EXPRESSION,   'bio/db/kegg/expression'
     autoload :ORTHOLOGY,    'bio/db/kegg/orthology'
     autoload :KGML,         'bio/db/kegg/kgml'
+    autoload :PATHWAY,      'bio/db/kegg/pathway'
+    autoload :MODULE,       'bio/db/kegg/module'
     autoload :Taxonomy,     'bio/db/kegg/taxonomy'
   end
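The two added `autoload` lines register constants that are only loaded on first use. A minimal, self-contained sketch of that mechanism follows; the `Demo` module and the `demo_pathway.rb` file are hypothetical stand-ins written to a temporary directory, not BioRuby code.

```ruby
require 'tmpdir'

# Write a throwaway library file that defines Demo::PATHWAY.
dir = Dir.mktmpdir
File.write(File.join(dir, 'demo_pathway.rb'), <<~RUBY)
  module Demo
    class PATHWAY
    end
  end
RUBY
$LOAD_PATH.unshift(dir)

module Demo
  # Same form as the lines in the hunk: constant name, then the path to require.
  autoload :PATHWAY, 'demo_pathway'
end

before = Demo.autoload?(:PATHWAY)  # registered path while the file is not yet loaded
klass  = Demo::PATHWAY             # first reference triggers the require
after  = Demo.autoload?(:PATHWAY)  # nil once the constant has been loaded

puts before.inspect  # "demo_pathway"
puts klass.name      # Demo::PATHWAY
puts after.inspect   # nil
```

Under this scheme, requiring the top-level library stays cheap because each parser file is read only when its constant is first referenced.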

data/lib/bio/appl/clustalw/report.rb CHANGED

@@ -2,6 +2,8 @@
 # = bio/appl/clustalw/report.rb - CLUSTAL W format data (*.aln) class
 #
 # Copyright:: Copyright (C) 2003 GOTO Naohisa <ngoto@gen-info.osaka-u.ac.jp>
+#             Copyright (C) 2010 Pjotr Prins <pjotr.prins@thebird.nl>
+#
 # License::   The Ruby License
 #
 #  $Id: report.rb,v 1.13 2007/07/18 08:47:39 ngoto Exp $
@@ -72,6 +74,22 @@ module Bio
         @header or (do_parse or @header)
       end
+      # Returns the Bio::Sequence in the matrix at row 'row' as
+      # Bio::Sequence object. When _row_ is out of range a nil is returned.
+      # ---
+      # *Arguments*:
+      # * (required) _row_: Integer
+      # *Returns*:: Bio::Sequence
+      def get_sequence(row)
+        a = alignment
+        return nil if row < 0 or row >= a.keys.size
+        id  = a.keys[row]
+        seq = a.to_hash[id]
+        s = Bio::Sequence.new(seq.seq)
+        s.definition = id
+        s
+      end
       # Shows "match line" of CLUSTAL's alignment result, for example,
       # ':* :* .*   *       .*::*.   ** :* . *    .        '.
       # Returns a string.
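The bounds check in the added `get_sequence` method can be illustrated without BioRuby. The sketch below assumes the alignment behaves like an insertion-ordered Hash of id to sequence string; `row_lookup` and the sample ids are hypothetical, and the result is a plain Hash rather than a Bio::Sequence object.

```ruby
# Mirror of the row lookup: return nil for any row outside 0...size,
# otherwise pair the id found at that row with its sequence.
def row_lookup(alignment, row)
  keys = alignment.keys
  return nil if row < 0 || row >= keys.size
  id = keys[row]
  { definition: id, seq: alignment[id] }
end

# Made-up two-row alignment; Ruby hashes keep insertion order.
aln = { 'seq1' => 'MKV-LA', 'seq2' => 'MKKNTL' }

p row_lookup(aln, 0)   # first row: id 'seq1' with its sequence
p row_lookup(aln, 2)   # nil, past the last row
p row_lookup(aln, -1)  # nil, negative rows are rejected
```

One consequence of the explicit `row < 0` test is that a negative row yields nil instead of counting back from the end, as plain Array indexing would.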

data/lib/bio/appl/paml/codeml/report.rb CHANGED

@@ -1,41 +1,369 @@
 #
 # = bio/appl/paml/codeml/report.rb - Codeml report parser
 #
-# Copyright::  Copyright (C) 2008 Michael D. Barton <mail@michaelbarton.me.uk>
+# Copyright::  Copyright (C) 2008-2010
+#              Michael D. Barton <mail@michaelbarton.me.uk>,
+#              Pjotr Prins <pjotr.prins@thebird.nl>
 #
 # License::    The Ruby License
 #
-# == Description
-#
-# This file contains a class that implement a simple interface to Codeml output file
-#
-# == References
-#
-# * http://abacus.gene.ucl.ac.uk/software/paml.html
-#
 require 'bio/appl/paml/codeml'
 module Bio::PAML
   class Codeml
     # == Description
     #
-    # A simple class for parsing codeml output.
+    # Run PAML codeml and get the results from the output file. The
+    # Codeml::Report object is returned by Bio::PAML::Codeml.query. For
+    # example
+    #
+    #   codeml = Bio::PAML::Codeml.new('codeml', :runmode => 0,
+    #       :RateAncestor => 1, :alpha => 0.5, :fix_alpha => 0)
+    #   result = codeml.query(alignment, tree)
+    #
+    # where alignment and tree are Bioruby objects. This class assumes we have a
+    # buffer containing the output of codeml.
+    #
+    # == References
+    #
+    # Phylogenetic Analysis by Maximum Likelihood (PAML) is a package of
+    # programs for phylogenetic analyses of DNA or protein sequences using
+    # maximum likelihood. It is maintained and distributed for academic use
+    # free of charge by Ziheng Yang. Suggestion citation
+    #
+    #   Yang, Z. 1997
+    #   PAML: a program package for phylogenetic analysis by maximum likelihood
+    #   CABIOS 13:555-556
+    #
+    # http://abacus.gene.ucl.ac.uk/software/paml.html
+    #
+    # == Examples
+    #
+    #--
+    # The following is not shown in the documentation
+    #
+    #   >> require 'bio'
+    #   >> require 'bio/test/biotestfile'
+    #   >> buf = BioTestFile.read('paml/codeml/models/results0-3.txt')
+    #++
+    #
+    # Invoke Bioruby's PAML codeml parser, after having read the contents
+    # of the codeml result file into _buf_ (for example using File.read)
+    #
+    #   >> c = Bio::PAML::Codeml::Report.new(buf)
+    #
+    # Do we have two models?
+    #
+    #   >> c.models.size
+    #   => 2
+    #   >> c.models[0].name
+    #   => "M0"
+    #   >> c.models[1].name
+    #   => "M3"
+    #
+    # Check the general information
+    #
+    #   >> c.num_sequences
+    #   => 6
+    #   >> c.num_codons
+    #   => 134
+    #   >> c.descr
+    #   => "M0-3"
+    #
+    # Test whether the second model M3 is significant over M0
+    #
+    #   >> c.significant
+    #   => true
+    #
+    # Now fetch the results of the first model M0, and check its values
+    #
+    #   >> m0 = c.models[0]
+    #   >> m0.tree_length
+    #   => 1.90227
+    #   >> m0.lnL
+    #   => -1125.800375
+    #   >> m0.omega
+    #   => 0.58589
+    #   >> m0.dN_dS
+    #   => 0.58589
+    #   >> m0.kappa
+    #   => 2.14311
+    #   >> m0.alpha
+    #   => nil
+    #
+    # We also have a tree (as a string)
+    #
+    #   >> m0.tree
+    #   => "((((PITG_23265T0: 0.000004, PITG_23253T0: 0.400074): 0.000004, PITG_23257T0: 0.952614): 0.000004, PITG_23264T0: 0.445507): 0.000004, PITG_23267T0: 0.011814, PITG_23293T0: 0.092242);"
+    #
+    # Check the M3 and its specific values
+    #
+    #   >> m3 = c.models[1]
+    #   >> m3.lnL
+    #   => -1070.964046
+    #   >> m3.classes.size
+    #   => 3
+    #   >> m3.classes[0]
+    #   => {:w=>0.00928, :p=>0.56413}
+    #
+    # And the tree
+    #
+    #   >> m3.tree
+    #   => "((((PITG_23265T0: 0.000004, PITG_23253T0: 0.762597): 0.000004, PITG_23257T0: 2.721710): 0.000004, PITG_23264T0: 0.924326): 0.014562, PITG_23267T0: 0.000004, PITG_23293T0: 0.237433);"
+    #
+    # Next take the overall posterior analysis
+    #
+    #   >> c.nb_sites.size
+    #   => 44
+    #   >> c.nb_sites[0].to_a
+    #   => [17, "I", 0.988, 3.293]
+    #
+    # or by field
+    #
+    #   >> codon = c.nb_sites[0]
+    #   >> codon.position
+    #   => 17
+    #   >> codon.probability
+    #   => 0.988
+    #   >> codon.dN_dS
+    #   => 3.293
+    #
+    # with aliases
+    #
+    #   >> codon.p
+    #   => 0.988
+    #   >> codon.w
+    #   => 3.293
+    #
+    # Now we generate special string 'graph' for positive selection. The
+    # following returns a string the length of the input alignment and
+    # shows the locations of positive selection:
+    #
+    #   >> c.nb_sites.graph[0..32]
+    #   => "                **    *       * *"
+    #
+    # And with dN/dS (high values are still an asterisk *)
+    #
+    #   >> c.nb_sites.graph_omega[0..32]
+    #   => "                3*    6       6 2"
+    #
+    # We also provide the raw buffers to adhere to the principle of
+    # unexpected use. Test the raw buffers for content:
+    #
+    #   >> c.header.to_s =~ /seed/
+    #   => 1
+    #   >> m0.to_s =~ /one-ratio/
+    #   => 3
+    #   >> m3.to_s =~ /discrete/
+    #   => 3
+    #   >> c.footer.to_s =~ /Bayes/
+    #   => 16
     #
-    # WARNING: This data is parsed using a regex from the output file, and
-    # so will take the first result found. If using multiple tree's, your
-    # milage may vary. See the source for the regular expressions.
+    # Finally we do a test on an M7+M8 run. Again, after loading the
+    # results file into _buf_
     #
-    # require 'bio'
+    #--
+    #   >> buf78 = BioTestFile.read('paml/codeml/models/results7-8.txt')
+    #
+    #
+    #++
+    #
+    # Invoke Bioruby's PAML codeml parser
+    #
+    #   >> c = Bio::PAML::Codeml::Report.new(buf78)
+    #
+    # Do we have two models?
+    #
+    #   >> c.models.size
+    #   => 2
+    #   >> c.models[0].name
+    #   => "M7"
+    #   >> c.models[1].name
+    #   => "M8"
+    #
+    # Assert the results are significant
+    #
+    #   >> c.significant
+    #   => true
+    #
+    # Compared to M0/M3 there are some differences. The important ones
+    # are the parameters and the full Bayesian result available for M7/M8.
+    # This is the naive Bayesian result:
+    #
+    #   >> c.nb_sites.size
+    #   => 10
+    #
+    # And this is the full Bayesian result:
+    #
+    #   >> c.sites.size
+    #   => 30
+    #   >> c.sites[0].to_a
+    #   => [17, "I", 0.672, 2.847]
+    #   >> c.sites.graph[0..32]
+    #   => "                **    *       * *"
+    #
+    # Note the differences of omega with earlier M0-M3 naive Bayesian
+    # analysis:
+    #
+    #   >> c.sites.graph_omega[0..32]
+    #   => "                24    3       3 2"
+    #
+    # The locations are the same, but the omega differs.
     #
-    # report = Bio::PAML::Codeml::Report.new(File.open(codeml_output_file).read)
-    # report.gene_rate  # => Rate of gene evolution as defined be alpha
-    # report.tree_lengh # => Estimated phylogetic tree length
     class Report < Bio::PAML::Common::Report
+      attr_reader :models, :header, :footer
+      # Parse codeml output file passed with +buf+, where buf contains
+      # the content of a codeml result file
+      def initialize buf
+        # split the main buffer into sections for each model, header and footer.
+        sections = buf.split("\nModel ")
+        model_num = sections.size-1
+        raise ReportError,"Incorrect codeml data models=#{model_num}" if model_num > 2
+        foot2 = sections[model_num].split("\nNaive ")
+        if foot2.size == 2
+          # We have a dual model
+          sections[model_num] = foot2[0]
+          @footer = 'Naive '+foot2[1]
+          @models = []
+          sections[1..-1].each do | model_buf |
+            @models.push Model.new(model_buf)
+          end
+        else
+          # A single model is run
+          sections = buf.split("\nTREE #")
+          model_num = sections.size-1
+          raise ReportError,"Can not parse single model file" if model_num != 1
+          @models = []
+          @models.push sections[1]
+          @footer = sections[1][/Time used/,1]
+          @single = ReportSingle.new(buf)
+        end
+        @header = sections[0]
+      end
+      # Give a short description of the models, for example 'M0-3'
+      def descr
+        num = @models.size
+        case num
+          when 0
+            'No model'
+          when 1
+            @models[0].name
+          else
+            @models[0].name + '-' + @models[1].modelnum.to_s
+        end
+      end
+      # Return the number of condons in the codeml alignment
+      def num_codons
+        @header.scan(/seed used = \d+\n\s+\d+\s+\d+/).to_s.split[5].to_i/3
+      end
+      # Return the number of sequences in the codeml alignment
+      def num_sequences
+        @header.scan(/seed used = \d+\n\s+\d+\s+\d+/).to_s.split[4].to_i
+      end
+      # Return a PositiveSites (naive empirical bayesian) object
+      def nb_sites
+        PositiveSites.new("Naive Empirical Bayes (NEB)",@footer,num_codons)
+      end
+      # Return a PositiveSites Bayes Empirical Bayes (BEB) analysis
+      def sites
+        PositiveSites.new("Bayes Empirical Bayes (BEB)",@footer,num_codons)
+      end
+      # If the number of models is two we can calculate whether the result is
+      # statistically significant, or not, at the 1% significance level. For
+      # example, for M7-8 the LRT statistic, or twice the log likelihood
+      # difference between the two compared models, may be compared against
+      # chi-square, with critical value 9.21 at the 1% significance level.
+      #
+      # Here we support a few likely combinations, M0-3, M1-2 and M7-8, used
+      # most often in literature. For other combinations, or a different
+      # significance level, you'll have to calculate chi-square yourself.
+      #
+      # Returns true or false. If no result is calculated this method
+      # raises an error
+      def significant
+        raise ReportError,"Wrong number of models #{@models.size}" if @models.size != 2
+        lnL1 = @models[0].lnL
+        model1 = @models[0].modelnum
+        lnL2 = @models[1].lnL
+        model2 = @models[1].modelnum
+        case [model1, model2]
+          when [0,3]
+            2*(lnL2-lnL1) > 13.2767   # chi2: p=0.01, df=4
+          when [1,2]
+            2*(lnL2-lnL1) >  9.2103   # chi2: p=0.01, df=2
+          when [7,8]
+            2*(lnL2-lnL1) >  9.2103   # chi2: p=0.01, df=2
+          else
+            raise ReportError,"Significance calculation for #{descr} not supported"
+        end
+      end
+      #:stopdoc:
+      # compatibility call for older interface (single models only)
+      def tree_log_likelihood
+        @single.tree_log_likelihood
+      end
+      # compatibility call for older interface (single models only)
+      def tree_length
+        @single.tree_length
+      end
+      # compatibility call for older interface (single models only)
+      def alpha
+        @single.alpha
+      end
+      # compatibility call for older interface (single models only)
+      def tree
+        @single.tree
+      end
+      #:startdoc:
+    end  # Report
+    #   ReportSingle is a simpler parser for a codeml report
+    #   containing a single run. It is retained mostly for
+    #   backward compatibility.
+    #
+    #   Reading the results of a single model (old-style report parser):
+    #
+    #--
+    #     >> buf = BioTestFile.read('paml/codeml/output.txt')
+    #++
+    #
+    #     >> single = Bio::PAML::Codeml::Report.new(buf)
+    #
+    #     >> single.tree_log_likelihood
+    #     => -1817.465211
+    #
+    #     >> single.tree_length
+    #     => 0.77902
+    #
+    #     >> single.alpha
+    #     => 0.58871
+    #
+    #     >> single.tree
+    #     => "(((rabbit: 0.082889, rat: 0.187866): 0.038008, human: 0.055050): 0.033639, goat-cow: 0.096992, marsupial: 0.284574);"
+    #
+    class ReportSingle < Bio::PAML::Common::Report
       attr_reader :tree_log_likelihood, :tree_length, :alpha, :tree
+      # Do not use
       def initialize(codeml_report)
         @tree_log_likelihood = pull_tree_log_likelihood(codeml_report)
         @tree_length = pull_tree_length(codeml_report)
@@ -45,23 +373,253 @@ module Bio::PAML
       private
+      # Do not use
       def pull_tree_log_likelihood(text)
         text[/lnL\(.+\):\s+(-?\d+(\.\d+)?)/,1].to_f
       end
+      # Do not use
       def pull_tree_length(text)
         text[/tree length\s+=\s+ (-?\d+(\.\d+)?)/,1].to_f
       end
+      # Do not use
       def pull_alpha(text)
         text[/alpha .+ =\s+(-?\d+(\.\d+)?)/,1].to_f
       end
+      # Do not use
       def pull_tree(text)
         text[/([^\n]+)\n\nDetailed/m,1]
       end
-    end # End Report
-  end # End Codeml
-end # End Bio::PAML
+    end # ReportSingle
+    # Model class contains one of the models of a codeml run (e.g. M0)
+    # which is used as a test hypothesis for positive selection. This
+    # class is used by Codeml::Report.
+    class Model
+      # Create a model using the relevant information from the codeml
+      # result data (text buffer)
+      def initialize buf
+        @buf = buf
+      end
+      # Return the model number
+      def modelnum
+        @buf[0..0].to_i
+      end
+      # Return the model name, e.g. 'M0' or 'M7'
+      def name
+        "M#{modelnum}"
+      end
+      # Return codeml log likelihood of model
+      def lnL
+        @buf[/lnL\(.+\):\s+(-?\d+(\.\d+)?)/,1].to_f
+      end
+      # Return codeml omega of model
+      def omega
+        @buf[/omega \(dN\/dS\)\s+=\s+ (-?\d+(\.\d+)?)/,1].to_f
+      end
+      alias dN_dS omega
+      # Return codeml kappa of model, when available
+      def kappa
+        return nil if @buf !~ /kappa/
+        @buf[/kappa \(ts\/tv\)\s+=\s+ (-?\d+(\.\d+)?)/,1].to_f
+      end
+      # Return codeml alpha of model, when available
+      def alpha
+        return nil if @buf !~ /alpha/
+        @buf[/alpha .+ =\s+(-?\d+(\.\d+)?)/,1].to_f
+      end
+      # Return codeml tree length
+      def tree_length
+        @buf[/tree length\s+=\s+ (-?\d+(\.\d+)?)/,1].to_f
+      end
+      # Return codeml tree
+      def tree
+        @buf[/([^\n]+)\n\nDetailed/m,1]
+      end
+      # Return classes when available. For M3 it parses
+      #
+      # dN/dS (w) for site classes (K=3)
+      # p:   0.56413  0.35613  0.07974
+      # w:   0.00928  1.98252 23.44160
+      #
+      # and turns it into an array of Hashes
+      #
+      #   >> m3.classes[0]
+      #   => {:w=>0.00928, :p=>0.56413}
+      def classes
+        return nil if @buf !~ /classes/
+        # probs = @buf.scan(/\np:\s+(\w+)\s+(\S+)\s+(\S+)/)
+        probs = @buf.scan(/\np:.*?\n/).to_s.split[1..3].map { |f| f.to_f }
+        ws = @buf.scan(/\nw:.*?\n/).to_s.split[1..3].map { |f| f.to_f }
+        ret = []
+        probs.each_with_index do | prob, i |
+          ret.push  :p => prob, :w => ws[i]
+        end
+        ret
+      end
+      # Return the model information as a String
+      def to_s
+        @buf
+      end
+    end
+    # A single codon site, across the sequences in the alignment,
+    # showing evidence of positive selection.
+    #
+    # This class stores results from both codeml's Bayes Empirical Bayes
+    # (BEB) and Naive Empirical Bayes (NEB) analyses.
+    class PositiveSite
+      attr_reader :position
+      attr_reader :aaref
+      attr_reader :probability
+      attr_reader :omega
+      def initialize fields
+        @position    = fields[0].to_i
+        @aaref       = fields[1].to_s
+        @probability = fields[2].to_f
+        @omega       = fields[3].to_f
+      end
+      # Return dN/dS (or omega) for this codon
+      def dN_dS
+        omega
+      end
+      alias w dN_dS
+      alias p probability
+      # Return contents as Array - useful for printing
+      def to_a
+        [ @position, @aaref, @probability, @omega ]
+      end
+    end
+    # List for the positive selection sites. PAML returns:
+    #
+    # Naive Empirical Bayes (NEB) analysis
+    # Positively selected sites (*: P>95%; **: P>99%)
+    # (amino acids refer to 1st sequence: PITG_23265T0)
+    #
+    #             Pr(w>1)     post mean +- SE for w
+    #
+    #     17 I      0.988*        3.293
+    #     18 H      1.000**       17.975
+    #     23 F      0.991**       6.283
+    # (...)
+    #    131 V      1.000**       22.797
+    #    132 R      1.000**       10.800
+    # (newline)
+    #
+    # These can be accessed using the normal Array iterators. Special
+    # methods are also available for presenting this data.
+    #
+    class PositiveSites < Array
+      attr_reader :descr
+      def initialize search, buf, num_codons
+        @num_codons = num_codons
+        if buf.index(search)==nil
+          raise ReportError,"No sites found for #{search}"
+        end
+        # Set description of this class
+        @descr = search
+        lines = buf.split("\n")
+        # find location of 'search'
+        start = 0
+        lines.each_with_index do | line, i |
+          if line.index(search) != nil
+            start = i
+            break
+          end
+        end
+        raise ReportError,"Out of bound error for <#{buf}>" if lines[start+6]==nil
+        lines[start+6..-1].each do | line |
+          break if line.strip == ""
+          fields = line.split
+          push PositiveSite.new(fields)
+        end
+        num = size()
+        @buf = lines[start..start+num+7].join("\n")
+      end
+      # Generate a graph - which is a simple string pointing out the positions
+      # showing evidence of positive selection pressure.
+      #
+      #   >> c.sites.graph[0..32]
+      #   => "                **    *       * *"
+      #
+      def graph
+        graph_to_s(lambda { |site| "*" })
+      end
+      # Generate a graph - which is a simple string pointing out the positions
+      # showing evidence of positive selection pressure, with dN/dS values
+      # truncated to a single digit (values of 10 or more are shown as an
+      # asterisk *, so every site takes exactly one character)
+      #
+      #   >> c.sites.graph_omega[0..32]
+      #   => "                24    3       3 2"
+      #
+      def graph_omega
+        graph_to_s(lambda { |site|
+            symbol = "*"
+            symbol = site.omega.to_i.to_s if site.omega.abs < 10.0
+            symbol
+        })
+      end
+      end
+      # Graph of the amino acids of the first sequence at the positively
+      # selected positions
+      def graph_seq
+        graph_to_s(lambda { |site| site.aaref })
+      end
+      # Return the positive selection information as a String
+      def to_s
+        @buf
+      end
+      # :nodoc:
+      # Creates a graph of sites, adjusting for gaps. This generator
+      # is also called from HtmlPositiveSites. The _fill_ is used
+      # to fill out the gaps
+      def graph_to_s func, fill=' '
+        ret = ""
+        pos = 0
+        each do | site |
+          symbol = func.call(site)
+          gapsize = site.position-pos-1
+          ret += fill*gapsize + symbol
+          pos = site.position
+        end
+        gapsize = @num_codons - pos - 1
+        ret += fill*gapsize if gapsize > 0
+        ret
+      end
+    end
+    # Supporting error class
+    class ReportError < RuntimeError
+    end
+  end # Codeml
+end # Bio::PAML
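
The `significant` method above hard-codes chi-square critical values for three model pairs and tells the reader to compute chi-square themselves for anything else. As a rough sketch of what that could look like for a test with two degrees of freedom (as in M1-2 and M7-8), the snippet below uses the closed form of the chi-square distribution for df = 2, where the critical value is `-2 * ln(alpha)`. The method names `lrt_statistic`, `chi2_critical_df2` and `lrt_significant_df2?` are hypothetical helpers, not part of BioRuby, and the log-likelihood values in the usage note are made up for illustration.

```ruby
# Hypothetical helpers (not part of BioRuby): likelihood ratio test for two
# nested codeml models, restricted to 2 degrees of freedom.

# LRT statistic: twice the log likelihood difference between the
# alternative model and the null model.
def lrt_statistic(lnl_null, lnl_alt)
  2.0 * (lnl_alt - lnl_null)
end

# For df = 2 the chi-square survival function is exp(-x/2), so the
# critical value at significance level alpha is -2 * ln(alpha).
# This shortcut only holds for df = 2.
def chi2_critical_df2(alpha)
  -2.0 * Math.log(alpha)
end

# True when the LRT statistic exceeds the df = 2 critical value.
def lrt_significant_df2?(lnl_null, lnl_alt, alpha = 0.01)
  lrt_statistic(lnl_null, lnl_alt) > chi2_critical_df2(alpha)
end
```

With illustrative values, `lrt_significant_df2?(-1820.0, -1810.0)` compares a statistic of 20.0 against the 1% critical value of about 9.2103, the same constant used for M1-2 and M7-8 above. Other degrees of freedom (such as df = 4 for M0-3) have no such simple closed form and would need a statistics library or a table.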