bioruby-phyloxml 1.0.0

@@ -0,0 +1,214 @@
1
+ # bio-phyloxml
2
+
3
+ [![Build Status](https://secure.travis-ci.org/bioruby/bioruby-phyloxml.png)](http://travis-ci.org/bioruby/bioruby-phyloxml)
4
+
5
+ bio-phyloxml (the package name on RubyGems.org is bioruby-phyloxml)
6
+ is a [phyloXML](http://www.phyloxml.org/) plugin for
7
+ [BioRuby](http://bioruby.org/), an open source bioinformatics
8
+ library for Ruby.
9
+
10
+ phyloXML is an XML language for saving, analyzing and exchanging data
11
+ of annotated phylogenetic trees. The phyloXML parser in BioRuby is
12
+ implemented in Bio::PhyloXML::Parser, and its writer in
13
+ Bio::PhyloXML::Writer. More information can be found at
14
+ [phyloxml.org](http://www.phyloxml.org).
15
+
16
+ This phyloXML code has historically been part of the core BioRuby
17
+ [gem](https://github.com/bioruby/bioruby), but has been split into its
18
+ own gem as part of an effort to
19
+ [modularize](http://bioruby.open-bio.org/wiki/Plugins)
20
+ BioRuby. bio-phyloxml and many more plugins are available at
21
+ [biogems.info](http://www.biogems.info/).
22
+
23
+ This code was originally written by Diana Jaunzeikare during the
24
+ Google Summer of Code 2009 for the
25
+ [Implementing phyloXML support in BioRuby](http://informatics.nescent.org/wiki/Phyloinformatics_Summer_of_Code_2009#Implementing_phyloXML_support_in_BioRuby)
26
+ project with NESCent, mentored by Christian Zmasek et al. For details
27
+ of development, see
28
+ [github.com/latvianlinuxgirl/bioruby](https://github.com/latvianlinuxgirl/bioruby)
29
+ and the BioRuby mailing list archives.
30
+
31
+ ## Requirements
32
+
33
+ bio-phyloxml uses [libxml-ruby](http://xml4r.github.com/libxml-ruby/),
34
+ which requires several C libraries and their headers to be installed:
35
+ * `zlib`
36
+ * `libiconv`
37
+ * `libxml`
38
+
39
+ With these installed, the `libxml-ruby` gem can be installed:
40
+
41
+ ```sh
42
+ gem install libxml-ruby
43
+ ```
44
+
45
+ If you see "ERROR: Failed to build gem native extension", the above
46
+ C libraries and their headers may be missing. See doc/Tutorial.rd
47
+ for instructions on installing them on some systems.
48
+
49
+ bio-phyloxml also uses the `bio` gem. It will automatically be installed
50
+ during the installation of `bio-phyloxml` in normal cases.
51
+
52
+ For more information see the
53
+ [libxml page](https://rubygems.org/gems/libxml-ruby) and
54
+ the [BioRuby installation page](http://bioruby.open-bio.org/wiki/Installation).
55
+
56
+
57
+ ## Installation
58
+
59
+ ```sh
60
+ gem install bioruby-phyloxml
61
+ ```
62
+
63
+ Note: Please uninstall the old bio-phyloxml gem, which has not been maintained
64
+ since 2012. It was created in 2012 as a preliminary trial of splitting
65
+ BioRuby components into separate gems.
66
+ We tried to contact the author of the old bio-phyloxml gem but received no response.
67
+
68
+ ```sh
69
+ gem uninstall bio-phyloxml
70
+ ```
71
+
72
+ ## Migration
73
+
74
+ Users who were previously using the phyloXML support in the core
75
+ BioRuby gem should be able to migrate to using this gem very
76
+ easily. Simply install the `bioruby-phyloxml` gem as described above, and
77
+ add `require 'bio-phyloxml'` to the relevant application code.
78
+
79
+ ## Usage
80
+
81
+ ```ruby
82
+ require 'bio-phyloxml'
83
+ ```
84
+
85
+ ### Parsing a file
86
+
87
+ ```ruby
88
+ require 'bio-phyloxml'
89
+
90
+ # Create new phyloxml parser
91
+ phyloxml = Bio::PhyloXML::Parser.open('example.xml')
92
+
93
+ # Print the names of all trees in the file
94
+ phyloxml.each do |tree|
95
+   puts tree.name
96
+ end
97
+ ```
98
+
99
+ If there are several trees in the file, you can access the one you wish by specifying its index:
100
+
101
+ ```ruby
102
+ tree = phyloxml[3]
103
+ ```
104
+ You can use all Bio::Tree methods on the tree, since PhyloXML::Tree inherits from Bio::Tree. For example,
105
+
106
+ ```ruby
107
+ tree.leaves.each do |node|
108
+   puts node.name
109
+ end
110
+ ```
111
+
112
+ PhyloXML files can hold additional information besides phylogenies at the end of the file. This info can be accessed through the 'other' array of the parser object.
113
+
114
+ ```ruby
115
+ phyloxml = Bio::PhyloXML::Parser.open('example.xml')
116
+ while tree = phyloxml.next_tree
117
+   # do stuff with trees
118
+ end
119
+
120
+ puts phyloxml.other
121
+ ```
122
+
123
+ ### Writing a file
124
+
125
+ ```ruby
126
+ # Create new phyloxml writer
127
+ writer = Bio::PhyloXML::Writer.new('tree.xml')
128
+
129
+ # Write tree to the file tree.xml
130
+ writer.write(tree1)
131
+
132
+ # Add another tree to the file
133
+ writer.write(tree2)
134
+ ```
135
+
136
+ ### Retrieving data
137
+
138
+ Here is an example of how to retrieve the scientific name of the clades included in each tree.
139
+
140
+ ```ruby
141
+ require 'bio-phyloxml'
142
+
143
+ phyloxml = Bio::PhyloXML::Parser.open('ncbi_taxonomy_mollusca.xml')
144
+ phyloxml.each do |tree|
145
+   tree.each_node do |node|
146
+     print "Scientific name: ", node.taxonomies[0].scientific_name, "\n"
147
+   end
148
+ end
149
+ ```
150
+
151
+ ### Retrieving 'other' data
152
+
153
+ ```ruby
154
+ require 'bio-phyloxml'
155
+
156
+ phyloxml = Bio::PhyloXML::Parser.open('phyloxml_examples.xml')
157
+ while tree = phyloxml.next_tree
158
+   # do something with the trees
159
+ end
160
+
161
+ p phyloxml.other
162
+ puts "\n"
163
+ #=> output is an object representation
164
+
165
+ #Print in a readable way
166
+ puts phyloxml.other[0].to_xml, "\n"
167
+ #=>:
168
+ #
169
+ #<align:alignment xmlns:align="http://example.org/align">
170
+ # <seq name="A">acgtcgcggcccgtggaagtcctctcct</seq>
171
+ # <seq name="B">aggtcgcggcctgtggaagtcctctcct</seq>
172
+ # <seq name="C">taaatcgc--cccgtgg-agtccc-cct</seq>
173
+ #</align:alignment>
174
+
175
+ # Once we know what's there, let's output just the sequences
176
+ phyloxml.other[0].children.each do |node|
177
+   puts node.value
178
+ end
179
+ #=>
180
+ #
181
+ #acgtcgcggcccgtggaagtcctctcct
182
+ #aggtcgcggcctgtggaagtcctctcct
183
+ #taaatcgc--cccgtgg-agtccc-cct
184
+ ```
185
+
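To make the notion of 'other' data more concrete, here is a minimal sketch of a phyloXML document whose root holds one phylogeny followed by an element from a foreign namespace. It deliberately uses REXML from Ruby's standard distribution instead of this gem's libxml-based parser, so it only illustrates the file layout and says nothing about how `Bio::PhyloXML::Parser` works internally. The sample document and the helper name `non_phylogeny_elements` are assumptions made for illustration.

```ruby
require 'rexml/document'

# Hypothetical sample: one phylogeny followed by an alignment in a foreign namespace.
SAMPLE_PHYLOXML = <<~XML
  <phyloxml xmlns="http://www.phyloxml.org">
    <phylogeny rooted="true">
      <name>example tree</name>
      <clade>
        <clade><name>A</name></clade>
        <clade><name>B</name></clade>
      </clade>
    </phylogeny>
    <align:alignment xmlns:align="http://example.org/align">
      <seq name="A">acgtcgcggcccgtggaagtcctctcct</seq>
      <seq name="B">aggtcgcggcctgtggaagtcctctcct</seq>
    </align:alignment>
  </phyloxml>
XML

# Return the child elements of the root that are not <phylogeny> elements.
# (Hypothetical helper; it mimics the idea of 'other' data, not the gem's code.)
def non_phylogeny_elements(xml)
  doc = REXML::Document.new(xml)
  doc.root.children.select do |node|
    node.is_a?(REXML::Element) && node.name != 'phylogeny'
  end
end

others = non_phylogeny_elements(SAMPLE_PHYLOXML)

# Collect [name, residues] pairs from the <seq> children of the first 'other' element.
sequences = others.first.children
                  .select { |node| node.is_a?(REXML::Element) }
                  .map { |seq| [seq.attributes['name'], seq.text] }

sequences.each { |name, residues| puts "#{name}: #{residues}" }
```

With the gem itself, the equivalent information is what `phyloxml.other` exposes once all trees have been read.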
186
+ The API doc is online. (TODO: generate and link) For more code
187
+ examples see the test files in the source tree.
188
+
189
+ ## Project home page
190
+
191
+ For information on the source tree, documentation, examples, issues and
192
+ how to contribute, see
193
+
194
+ http://github.com/bioruby/bioruby-phyloxml
195
+
196
+ The BioRuby community is on IRC server: irc.freenode.org, channel: #bioruby.
197
+
198
+ ## Cite
199
+
200
+ If you use this software, please cite one of
201
+
202
+ * [BioRuby: bioinformatics software for the Ruby programming language](http://dx.doi.org/10.1093/bioinformatics/btq475)
203
+ * [Biogem: an effective tool-based approach for scaling up open source software development in bioinformatics](http://dx.doi.org/10.1093/bioinformatics/bts080)
204
+
205
+ ## Biogems.info
206
+
207
+ This Biogem is published at [#bio-phyloxml](http://biogems.info/index.html)
208
+
209
+ ## Copyright
210
+
211
+ Copyright (c) 2009 Diana Jaunzeikare and BioRuby project.
212
+ See COPYING or COPYING.ja for further details.
213
+
214
+ This README.md was first written by Clayton Wheeler.
@@ -0,0 +1,20 @@
1
+ require "bundler/gem_tasks"
2
+ require 'rdoc/task'
3
+ require 'rake/testtask'
4
+
5
+ task :default => "test"
6
+
7
+ Rake::TestTask.new do |t|
8
+   t.test_files = FileList["test/unit/**/test_*.rb"]
9
+ end
10
+
11
+ Rake::RDocTask.new do |r|
12
+   r.rdoc_dir = "rdoc"
13
+   r.rdoc_files.include("README.md",
14
+                        "COPYING", "COPYING.ja", "BSDL", "LGPL", "GPL",
15
+                        "doc/Tutorial.rd",
16
+                        "lib/**/*.rb")
17
+   r.main = "README.md"
18
+   r.options << '--title' << 'Bio::PhyloXML API documentation'
19
+   r.options << '--line-numbers'
20
+ end
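The test task above selects files with the pattern `test/unit/**/test_*.rb`. As a rough illustration of what that pattern picks up, the sketch below applies the same pattern with `Dir.glob` to a throwaway directory tree. It assumes that `FileList` follows `Dir.glob`-style matching; the file names and the helper `matching_test_files` are hypothetical.

```ruby
require 'tmpdir'
require 'fileutils'

# Pattern copied from the Rakefile above.
PATTERN = "test/unit/**/test_*.rb"

# Build a temporary tree of hypothetical files and return the relative
# paths selected by the pattern, sorted for stable output.
def matching_test_files(pattern)
  Dir.mktmpdir do |dir|
    %w[
      test/unit/test_top.rb
      test/unit/bio/test_parser.rb
      test/unit/bio/helper.rb
      test/functional/test_slow.rb
    ].each do |path|
      full = File.join(dir, path)
      FileUtils.mkdir_p(File.dirname(full))
      FileUtils.touch(full)
    end
    Dir.chdir(dir) { Dir.glob(pattern).sort }
  end
end

matched = matching_test_files(PATTERN)
puts matched
```

Because `**/` matches zero or more directory levels, files directly under `test/unit` are selected as well as files in its subdirectories, while helpers without the `test_` prefix and anything outside `test/unit` are skipped.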
@@ -0,0 +1,36 @@
1
+ # coding: utf-8
2
+ lib = File.expand_path('../lib', __FILE__)
3
+ $LOAD_PATH.unshift(lib) unless $LOAD_PATH.include?(lib)
4
+ require 'bio-phyloxml/version'
5
+
6
+ Gem::Specification.new do |spec|
7
+ spec.name = "bioruby-phyloxml"
8
+ spec.version = Bio::PhyloXML::VERSION
9
+ spec.authors = [ "Diana Jaunzeikare", "Clayton Wheeler",
10
+ "BioRuby project" ]
11
+ spec.email = [ "staff@bioruby.org" ]
12
+
13
+ spec.summary = %q{PhyloXML plugin for BioRuby}
14
+ spec.description = %q{Provides PhyloXML support for BioRuby. This bioruby-phyloxml gem replaces the old, unmaintained bio-phyloxml gem.}
15
+ spec.homepage = "http://github.com/bioruby/bioruby-phyloxml"
16
+ spec.license = "Ruby"
17
+
18
+ spec.files = `git ls-files -z`.split("\x0").reject { |f| f.match(%r{^(test|spec|features)/}) }
19
+ spec.bindir = "exe"
20
+ spec.executables = spec.files.grep(%r{^exe/}) { |f| File.basename(f) }
21
+ spec.require_paths = ["lib"]
22
+
23
+ spec.extra_rdoc_files = [ "README.md",
24
+ "COPYING", "COPYING.ja", "BSDL", "LGPL", "GPL",
25
+ "doc/Tutorial.rd" ]
26
+ spec.rdoc_options << '--main' << 'README.md'
27
+ spec.rdoc_options << '--title' << 'Bio::PhyloXML API documentation'
28
+ spec.rdoc_options << '--line-numbers'
29
+
30
+ spec.add_runtime_dependency "bio", ">= 1.5.0"
31
+ spec.add_runtime_dependency "libxml-ruby", "~> 2.8"
32
+
33
+ spec.add_development_dependency "bundler", "~> 1.10"
34
+ spec.add_development_dependency "rake", "~> 10.0"
35
+ spec.add_development_dependency "rdoc", "~> 4"
36
+ end
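The version constraints and the file filter in the gemspec above can be checked interactively. The following sketch uses RubyGems' own `Gem::Requirement` and `Gem::Version` classes to show which versions a constraint accepts, and applies the same `reject` expression to a list of paths. The candidate version numbers, the path list, and the helper name `allowed?` are assumptions chosen for illustration.

```ruby
require 'rubygems'

# Constraints copied from the gemspec above.
CONSTRAINTS = {
  "bio"         => ">= 1.5.0",
  "libxml-ruby" => "~> 2.8",
  "bundler"     => "~> 1.10"
}.freeze

# True when the given version string satisfies the constraint string.
def allowed?(constraint, version)
  Gem::Requirement.new(constraint).satisfied_by?(Gem::Version.new(version))
end

# "~> 2.8" accepts 2.8 and later 2.x releases, but not 3.0.
puts allowed?(CONSTRAINTS["libxml-ruby"], "2.9.0")   # => true
puts allowed?(CONSTRAINTS["libxml-ruby"], "3.0.0")   # => false

# "~> 1.10" accepts later 1.x releases, but not 2.0.
puts allowed?(CONSTRAINTS["bundler"], "1.17.3")      # => true
puts allowed?(CONSTRAINTS["bundler"], "2.0.0")       # => false

# ">= 1.5.0" has no upper bound.
puts allowed?(CONSTRAINTS["bio"], "2.0.0")           # => true

# The spec.files filter drops anything under test/, spec/ or features/.
# These paths are hypothetical stand-ins for the output of `git ls-files`.
paths = %w[lib/bio-phyloxml.rb test/unit/test_x.rb README.md spec/foo_spec.rb]
packaged = paths.reject { |f| f.match(%r{^(test|spec|features)/}) }
puts packaged.inspect
```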
@@ -0,0 +1,152 @@
1
+ # This document is generated with a version of rd2html (part of Hiki)
2
+ #
3
+ # A possible test run could be from rdtool (on Debian package rdtool)
4
+ #
5
+ # rd2 $BIORUBYPATH/doc/Tutorial.rd
6
+ #
7
+ # or with style sheet:
8
+ #
9
+ # rd2 -r rd/rd2html-lib.rb --with-css=bioruby.css $BIORUBYPATH/doc/Tutorial.rd > ~/bioruby.html
10
+ #
11
+ # in Debian:
12
+ #
13
+ # rd2 -r rd/rd2html-lib --with-css="../lib/bio/shell/rails/vendor/plugins/bioruby/generators/bioruby/templates/bioruby.css" Tutorial.rd > Tutorial.rd.html
14
+ #
15
+ # A common problem is tabs in the text file! TABs are not allowed.
16
+ #
17
+ # To add tests run Toshiaki's bioruby shell and paste in the query plus
18
+ # results.
19
+ #
20
+ # To run the embedded Ruby doctests you can use the rubydoctest tool, part
21
+ # of the bioruby-support repository at http://github.com/pjotrp/bioruby-support/
22
+ #
23
+
24
+ =begin
25
+ #doctest Testing bioruby
26
+
27
+ = Bio::PhyloXML Tutorial
28
+
29
+ * Copyright (C) 2001-2003 KATAYAMA Toshiaki <k .at. bioruby.org>
30
+ * Copyright (C) 2005-2009 Pjotr Prins, Naohisa Goto and others
31
+
32
+ = PhyloXML
33
+
34
+ PhyloXML is an XML language for saving, analyzing and exchanging data of
35
+ annotated phylogenetic trees. The PhyloXML parser in BioRuby is implemented in
36
+ Bio::PhyloXML::Parser, and the writer in Bio::PhyloXML::Writer.
37
+ More information is available at www.phyloxml.org.
38
+
39
+ == Install
40
+
41
+ % gem install bioruby-phyloxml
42
+
43
+ In addition to bioruby-phyloxml, dependent gems such as bio and libxml-ruby
44
+ will automatically be installed.
45
+
46
+ == Parsing a file
47
+
48
+ require 'bio-phyloxml'
49
+
50
+ # Create new phyloxml parser
51
+ phyloxml = Bio::PhyloXML::Parser.open('example.xml')
52
+
53
+ # Print the names of all trees in the file
54
+ phyloxml.each do |tree|
55
+ puts tree.name
56
+ end
57
+
58
+ If there are several trees in the file, you can access the one you want by its index:
59
+
60
+ tree = phyloxml[3]
61
+
62
+ You can use all Bio::Tree methods on the tree, since PhyloXML::Tree inherits from Bio::Tree. For example,
63
+
64
+ tree.leaves.each do |node|
65
+ puts node.name
66
+ end
67
+
68
+ PhyloXML files can hold additional information besides phylogenies at the end of the file. This info can be accessed through the 'other' array of the parser object.
69
+
70
+ phyloxml = Bio::PhyloXML::Parser.open('example.xml')
71
+ while tree = phyloxml.next_tree
72
+ # do stuff with trees
73
+ end
74
+
75
+ puts phyloxml.other
76
+
77
+ == Writing a file
78
+
79
+ # Create new phyloxml writer
80
+ writer = Bio::PhyloXML::Writer.new('tree.xml')
81
+
82
+ # Write tree to the file tree.xml
83
+ writer.write(tree1)
84
+
85
+ # Add another tree to the file
86
+ writer.write(tree2)
87
+
88
+ == Retrieving data
89
+
90
+ Here is an example of how to retrieve the scientific name of the clades.
91
+
92
+ require 'bio-phyloxml'
93
+
94
+ phyloxml = Bio::PhyloXML::Parser.open('ncbi_taxonomy_mollusca.xml')
95
+ phyloxml.each do |tree|
96
+ tree.each_node do |node|
97
+ print "Scientific name: ", node.taxonomies[0].scientific_name, "\n"
98
+ end
99
+ end
100
+
101
+ == Retrieving 'other' data
102
+
103
+ require 'bio-phyloxml'
104
+
105
+ phyloxml = Bio::PhyloXML::Parser.open('phyloxml_examples.xml')
106
+ while tree = phyloxml.next_tree
107
+ #do something with the trees
108
+ end
109
+
110
+ p phyloxml.other
111
+ puts "\n"
112
+ #=> output is an object representation
113
+
114
+ #Print in a readable way
115
+ puts phyloxml.other[0].to_xml, "\n"
116
+ #=>:
117
+ #
118
+ #<align:alignment xmlns:align="http://example.org/align">
119
+ # <seq name="A">acgtcgcggcccgtggaagtcctctcct</seq>
120
+ # <seq name="B">aggtcgcggcctgtggaagtcctctcct</seq>
121
+ # <seq name="C">taaatcgc--cccgtgg-agtccc-cct</seq>
122
+ #</align:alignment>
123
+
124
+ # Once we know what's there, let's output just the sequences
125
+ phyloxml.other[0].children.each do |node|
126
+ puts node.value
127
+ end
128
+ #=>
129
+ #
130
+ #acgtcgcggcccgtggaagtcctctcct
131
+ #aggtcgcggcctgtggaagtcctctcct
132
+ #taaatcgc--cccgtgg-agtccc-cct
133
+
134
+
135
+ = APPENDIX
136
+
137
+ === Troubleshooting libxml-ruby installation problems
138
+
139
+ If you get a "Failed to build gem native extension" error, you may need to
140
+ install the GNOME Libxml2 XML toolkit library and development files.
141
+
142
+ On Debian or Ubuntu,
143
+
144
+ sudo aptitude install libxml2-dev
145
+
146
+ On RedHat or CentOS,
147
+
148
+ sudo yum install libxml2-devel
149
+
150
+ On other platforms, see ((<URL:http://www.xmlsoft.org/>)).
151
+
152
+ =end
@@ -0,0 +1,27 @@
1
+ # Please require your code below, respecting the naming conventions in the
2
+ # bioruby directory tree.
3
+ #
4
+ # For example, say you have a plugin named bio-plugin, the only uncommented
5
+ # line in this file would be
6
+ #
7
+ # require 'bio/bio-plugin/plugin'
8
+ #
9
+ # In this file only require other files. Avoid other source code.
10
+
11
+ require 'bio-phyloxml/compat/cleanup.rb'
12
+ require 'bio-phyloxml/version.rb'
13
+ require 'bio-phyloxml/phyloxml_elements.rb'
14
+ require 'bio-phyloxml/phyloxml_parser.rb'
15
+ require 'bio-phyloxml/phyloxml_writer.rb'
16
+
17
+ if require 'bio-phyloxml/compat/stub_phyloxml_elements.rb'
18
+   require_relative 'bio/db/phyloxml/phyloxml_elements.rb'
19
+ end
20
+
21
+ if require 'bio-phyloxml/compat/stub_phyloxml_parser.rb'
22
+   require_relative 'bio/db/phyloxml/phyloxml_parser.rb'
23
+ end
24
+
25
+ if require 'bio-phyloxml/compat/stub_phyloxml_writer.rb'
26
+   require_relative 'bio/db/phyloxml/phyloxml_writer.rb'
27
+ end
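The `if require ...` blocks above rely on the return value of `require`: it returns `true` when the file is loaded for the first time in the current process and `false` when it has already been loaded. The guarded `require_relative` therefore runs only on a first load of the stub. What the stub files themselves contain is not shown here, so the sketch below only demonstrates the return-value behaviour, using a temporary file with a hypothetical name.

```ruby
require 'tmpdir'

# Require the same path twice and return both results.
def require_twice(path)
  [require(path), require(path)]
end

results = Dir.mktmpdir do |dir|
  # Hypothetical stand-in for one of the stub files.
  stub = File.join(dir, 'hypothetical_stub.rb')
  File.write(stub, "HYPOTHETICAL_STUB_LOADED = true\n")
  require_twice(stub)
end

puts results.inspect   # => [true, false]
```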