bioruby-phyloxml 1.0.0

@@ -0,0 +1,214 @@
1
+ # bio-phyloxml
2
+
3
+ [![Build Status](https://secure.travis-ci.org/bioruby/bioruby-phyloxml.png)](http://travis-ci.org/bioruby/bioruby-phyloxml)
4
+
5
+ bio-phyloxml (the package name on RubyGems.org is bioruby-phyloxml)
6
+ is a [phyloXML](http://www.phyloxml.org/) plugin for
7
+ [BioRuby](http://bioruby.org/), an open source bioinformatics
8
+ library for Ruby.
9
+
10
+ phyloXML is an XML language for saving, analyzing and exchanging data
11
+ of annotated phylogenetic trees. The phyloXML parser in BioRuby is
12
+ implemented in Bio::PhyloXML::Parser, and its writer in
13
+ Bio::PhyloXML::Writer. More information can be found at
14
+ [phyloxml.org](http://www.phyloxml.org).
15
+
16
+ This phyloXML code has historically been part of the core BioRuby
17
+ [gem](https://github.com/bioruby/bioruby), but has been split into its
18
+ own gem as part of an effort to
19
+ [modularize](http://bioruby.open-bio.org/wiki/Plugins)
20
+ BioRuby. bio-phyloxml and many more plugins are available at
21
+ [biogems.info](http://www.biogems.info/).
22
+
23
+ This code was originally written by Diana Jaunzeikare during the
24
+ Google Summer of Code 2009 for the
25
+ [Implementing phyloXML support in BioRuby](http://informatics.nescent.org/wiki/Phyloinformatics_Summer_of_Code_2009#Implementing_phyloXML_support_in_BioRuby)
26
+ project with NESCent, mentored by Christian Zmasek et al. For details
27
+ of development, see
28
+ [github.com/latvianlinuxgirl/bioruby](https://github.com/latvianlinuxgirl/bioruby)
29
+ and the BioRuby mailing list archives.
30
+
31
+ ## Requirements
32
+
33
+ bio-phyloxml uses [libxml-ruby](http://xml4r.github.com/libxml-ruby/),
34
+ which requires several C libraries and their headers to be installed:
35
+ * `zlib`
36
+ * `libiconv`
37
+ * `libxml`
38
+
39
+ With these installed, install the `libxml-ruby` gem:
40
+
41
+ ```sh
42
+ gem install libxml-ruby
43
+ ```
44
+
45
+ If you see "ERROR: Failed to build gem native extension", the above
46
+ C libraries and their headers may be missing. See doc/Tutorial.rd
47
+ for how to install them on some systems.
48
+
49
+ bio-phyloxml also uses the `bio` gem. It will automatically be installed
50
+ during the installation of `bioruby-phyloxml` in normal cases.
51
+
52
+ For more information see the
53
+ [libxml page](https://rubygems.org/gems/libxml-ruby) and
54
+ the [BioRuby installation page](http://bioruby.open-bio.org/wiki/Installation).
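To check whether the native extension actually built, one option is to try loading it from Ruby. The following is a minimal sketch, assuming that libxml-ruby is loaded with `require 'libxml'` and exposes its version as `LibXML::XML::VERSION`; the method name `libxml_status` is hypothetical and used only for illustration.

```ruby
# Hypothetical helper: report whether libxml-ruby can be loaded.
# A LoadError here usually means the gem or its native extension is missing.
def libxml_status
  require 'libxml'
  "libxml-ruby #{LibXML::XML::VERSION} loaded"
rescue LoadError => e
  "libxml-ruby is not usable: #{e.message}"
end

puts libxml_status
```

If the message reports that the library is not usable, installing the C libraries listed above and re-running `gem install libxml-ruby` is the first thing to try.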
55
+
56
+
57
+ ## Installation
58
+
59
+ ```sh
60
+ gem install bioruby-phyloxml
61
+ ```
62
+
63
+ Note: Please uninstall the old bio-phyloxml gem, which has not been maintained
64
+ since 2012. The old bio-phyloxml gem was created in 2012 as a preliminary
65
+ trial of splitting BioRuby components into separate gems.
66
+ We tried to contact the author of the old bio-phyloxml gem, but received no response.
67
+
68
+ ```sh
69
+ gem uninstall bio-phyloxml
70
+ ```
71
+
72
+ ## Migration
73
+
74
+ Users who were previously using the phyloXML support in the core
75
+ BioRuby gem should be able to migrate to using this gem very
76
+ easily. Simply install the `bioruby-phyloxml` gem as described above, and
77
+ add `require 'bio-phyloxml'` to the relevant application code.
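During a transition, application code may need to run both where this gem is installed and where only an older core BioRuby is available. The sketch below shows one way to handle that, not a recommendation from the project. The helper `require_first` is hypothetical, and the fallback path `bio/db/phyloxml/phyloxml_parser` is an assumption about where older core BioRuby releases kept the parser.

```ruby
# Hypothetical helper: try each feature name in order and return the first
# one that loads, or nil when none of them can be required.
def require_first(*features)
  features.each do |feature|
    begin
      require feature
      return feature
    rescue LoadError
      next
    end
  end
  nil
end

# Prefer the plugin; fall back to the assumed legacy core path.
loaded = require_first('bio-phyloxml', 'bio/db/phyloxml/phyloxml_parser')
if loaded
  puts "phyloXML support loaded via #{loaded}"
else
  warn "phyloXML support not found; try: gem install bioruby-phyloxml"
end
```

Once every deployment has the new gem, the helper can be dropped in favour of a plain `require 'bio-phyloxml'`.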
78
+
79
+ ## Usage
80
+
81
+ ```ruby
82
+ require 'bio-phyloxml'
83
+ ```
84
+
85
+ ### Parsing a file
86
+
87
+ ```ruby
88
+ require 'bio-phyloxml'
89
+
90
+ # Create new phyloxml parser
91
+ phyloxml = Bio::PhyloXML::Parser.open('example.xml')
92
+
93
+ # Print the names of all trees in the file
94
+ phyloxml.each do |tree|
95
+   puts tree.name
96
+ end
97
+ ```
98
+
99
+ If there are several trees in the file, you can access the one you wish by specifying its index:
100
+
101
+ ```ruby
102
+ tree = phyloxml[3]
103
+ ```
104
+ You can use all Bio::Tree methods on the tree, since PhyloXML::Tree inherits from Bio::Tree. For example,
105
+
106
+ ```ruby
107
+ tree.leaves.each do |node|
108
+   puts node.name
109
+ end
110
+ ```
111
+
112
+ PhyloXML files can hold additional information besides phylogenies at the end of the file. This info can be accessed through the 'other' array of the parser object.
113
+
114
+ ```ruby
115
+ phyloxml = Bio::PhyloXML::Parser.open('example.xml')
116
+ while tree = phyloxml.next_tree
117
+   # do stuff with trees
118
+ end
119
+
120
+ puts phyloxml.other
121
+ ```
122
+
123
+ ### Writing a file
124
+
125
+ ```ruby
126
+ # Create new phyloxml writer
127
+ writer = Bio::PhyloXML::Writer.new('tree.xml')
128
+
129
+ # Write tree to the file tree.xml
130
+ writer.write(tree1)
131
+
132
+ # Add another tree to the file
133
+ writer.write(tree2)
134
+ ```
135
+
136
+ ### Retrieving data
137
+
138
+ Here is an example of how to retrieve the scientific name of the clades included in each tree.
139
+
140
+ ```ruby
141
+ require 'bio-phyloxml'
142
+
143
+ phyloxml = Bio::PhyloXML::Parser.open('ncbi_taxonomy_mollusca.xml')
144
+ phyloxml.each do |tree|
145
+   tree.each_node do |node|
146
+     print "Scientific name: ", node.taxonomies[0].scientific_name, "\n"
147
+   end
148
+ end
149
+ ```
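The example above assumes every node has at least one taxonomy. When a node's `taxonomies` array is empty, `node.taxonomies[0]` is `nil` and calling `scientific_name` on it raises `NoMethodError`. A defensive variant might look like the sketch below. To keep it runnable without the gem, it uses two stand-in structs, `DemoTaxonomy` and `DemoNode`, which are hypothetical and only mimic the `taxonomies` and `scientific_name` accessors used above. The helper `scientific_name_or` is likewise an illustration, not part of the library.

```ruby
# Hypothetical stand-ins that mimic the accessors used in the example above.
DemoTaxonomy = Struct.new(:scientific_name)
DemoNode = Struct.new(:name, :taxonomies)

# Return the first taxonomy's scientific name, or a fallback string when the
# node has no taxonomy or the taxonomy has no scientific name.
def scientific_name_or(node, fallback = "(no taxonomy)")
  taxonomy = node.taxonomies.first
  name = taxonomy && taxonomy.scientific_name
  name || fallback
end

nodes = [
  DemoNode.new("n1", [DemoTaxonomy.new("Octopus vulgaris")]),
  DemoNode.new("n2", [])
]

nodes.each do |node|
  puts "Scientific name: #{scientific_name_or(node)}"
end
```

With real parsed trees, the same helper could be called inside `tree.each_node`, provided the nodes respond to `taxonomies` as in the example above.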
150
+
151
+ ### Retrieving 'other' data
152
+
153
+ ```ruby
154
+ require 'bio-phyloxml'
155
+
156
+ phyloxml = Bio::PhyloXML::Parser.open('phyloxml_examples.xml')
157
+ while tree = phyloxml.next_tree
158
+   # do something with the trees
159
+ end
160
+
161
+ p phyloxml.other
162
+ puts "\n"
163
+ #=> output is an object representation
164
+
165
+ #Print in a readable way
166
+ puts phyloxml.other[0].to_xml, "\n"
167
+ #=>:
168
+ #
169
+ #<align:alignment xmlns:align="http://example.org/align">
170
+ # <seq name="A">acgtcgcggcccgtggaagtcctctcct</seq>
171
+ # <seq name="B">aggtcgcggcctgtggaagtcctctcct</seq>
172
+ # <seq name="C">taaatcgc--cccgtgg-agtccc-cct</seq>
173
+ #</align:alignment>
174
+
175
+ # Once we know what's there, let's output just the sequences
176
+ phyloxml.other[0].children.each do |node|
177
+   puts node.value
178
+ end
179
+ #=>
180
+ #
181
+ #acgtcgcggcccgtggaagtcctctcct
182
+ #aggtcgcggcctgtggaagtcctctcct
183
+ #taaatcgc--cccgtgg-agtccc-cct
184
+ ```
185
+
186
+ The API doc is online. (TODO: generate and link) For more code
187
+ examples see the test files in the source tree.
188
+
189
+ ## Project home page
190
+
191
+ For information on the source tree, documentation, examples, issues and
192
+ how to contribute, see
193
+
194
+ http://github.com/bioruby/bioruby-phyloxml
195
+
196
+ The BioRuby community is on IRC server: irc.freenode.org, channel: #bioruby.
197
+
198
+ ## Cite
199
+
200
+ If you use this software, please cite one of
201
+
202
+ * [BioRuby: bioinformatics software for the Ruby programming language](http://dx.doi.org/10.1093/bioinformatics/btq475)
203
+ * [Biogem: an effective tool-based approach for scaling up open source software development in bioinformatics](http://dx.doi.org/10.1093/bioinformatics/bts080)
204
+
205
+ ## Biogems.info
206
+
207
+ This Biogem is published at [#bio-phyloxml](http://biogems.info/index.html)
208
+
209
+ ## Copyright
210
+
211
+ Copyright (c) 2009 Diana Jaunzeikare and BioRuby project.
212
+ See COPYING or COPYING.ja for further details.
213
+
214
+ This README.md was first written by Clayton Wheeler.
@@ -0,0 +1,20 @@
1
+ require "bundler/gem_tasks"
2
+ require 'rdoc/task'
3
+ require 'rake/testtask'
4
+
5
+ task :default => "test"
6
+
7
+ Rake::TestTask.new do |t|
8
+   t.test_files = FileList["test/unit/**/test_*.rb"]
9
+ end
10
+
11
+ Rake::RDocTask.new do |r|
12
+   r.rdoc_dir = "rdoc"
13
+   r.rdoc_files.include("README.md",
14
+                        "COPYING", "COPYING.ja", "BSDL", "LGPL", "GPL",
15
+                        "doc/Tutorial.rd",
16
+                        "lib/**/*.rb")
17
+   r.main = "README.md"
18
+   r.options << '--title' << 'Bio::PhyloXML API documentation'
19
+   r.options << '--line-numbers'
20
+ end
@@ -0,0 +1,36 @@
1
+ # coding: utf-8
2
+ lib = File.expand_path('../lib', __FILE__)
3
+ $LOAD_PATH.unshift(lib) unless $LOAD_PATH.include?(lib)
4
+ require 'bio-phyloxml/version'
5
+
6
+ Gem::Specification.new do |spec|
7
+ spec.name = "bioruby-phyloxml"
8
+ spec.version = Bio::PhyloXML::VERSION
9
+ spec.authors = [ "Diana Jaunzeikare", "Clayton Wheeler",
10
+ "BioRuby project" ]
11
+ spec.email = [ "staff@bioruby.org" ]
12
+
13
+ spec.summary = %q{PhyloXML plugin for BioRuby}
14
+ spec.description = %q{Provides PhyloXML support for BioRuby. This bioruby-phyloxml gem replaces old unmaintained bio-phyloxml gem.}
15
+ spec.homepage = "http://github.com/bioruby/bioruby-phyloxml"
16
+ spec.license = "Ruby"
17
+
18
+ spec.files = `git ls-files -z`.split("\x0").reject { |f| f.match(%r{^(test|spec|features)/}) }
19
+ spec.bindir = "exe"
20
+ spec.executables = spec.files.grep(%r{^exe/}) { |f| File.basename(f) }
21
+ spec.require_paths = ["lib"]
22
+
23
+ spec.extra_rdoc_files = [ "README.md",
24
+ "COPYING", "COPYING.ja", "BSDL", "LGPL", "GPL",
25
+ "doc/Tutorial.rd" ]
26
+ spec.rdoc_options << '--main' << 'README.md'
27
+ spec.rdoc_options << '--title' << 'Bio::PhyloXML API documentation'
28
+ spec.rdoc_options << '--line-numbers'
29
+
30
+ spec.add_runtime_dependency "bio", ">= 1.5.0"
31
+ spec.add_runtime_dependency "libxml-ruby", "~> 2.8"
32
+
33
+ spec.add_development_dependency "bundler", "~> 1.10"
34
+ spec.add_development_dependency "rake", "~> 10.0"
35
+ spec.add_development_dependency "rdoc", "~> 4"
36
+ end
@@ -0,0 +1,152 @@
1
+ # This document is generated with a version of rd2html (part of Hiki)
2
+ #
3
+ # A possible test run could be from rdtool (on Debian package rdtool)
4
+ #
5
+ # rd2 $BIORUBYPATH/doc/Tutorial.rd
6
+ #
7
+ # or with style sheet:
8
+ #
9
+ # rd2 -r rd/rd2html-lib.rb --with-css=bioruby.css $BIORUBYPATH/doc/Tutorial.rd > ~/bioruby.html
10
+ #
11
+ # in Debian:
12
+ #
13
+ # rd2 -r rd/rd2html-lib --with-css="../lib/bio/shell/rails/vendor/plugins/bioruby/generators/bioruby/templates/bioruby.css" Tutorial.rd > Tutorial.rd.html
14
+ #
15
+ # A common problem is tabs in the text file! TABs are not allowed.
16
+ #
17
+ # To add tests run Toshiaki's bioruby shell and paste in the query plus
18
+ # results.
19
+ #
20
+ # To run the embedded Ruby doctests you can use the rubydoctest tool, part
21
+ # of the bioruby-support repository at http://github.com/pjotrp/bioruby-support/
22
+ #
23
+
24
+ =begin
25
+ #doctest Testing bioruby
26
+
27
+ = Bio::PhyloXML Tutorial
28
+
29
+ * Copyright (C) 2001-2003 KATAYAMA Toshiaki <k .at. bioruby.org>
30
+ * Copyright (C) 2005-2009 Pjotr Prins, Naohisa Goto and others
31
+
32
+ = PhyloXML
33
+
34
+ PhyloXML is an XML language for saving, analyzing and exchanging data of
35
+ annotated phylogenetic trees. PhyloXML parser in BioRuby is implemented in
36
+ Bio::PhyloXML::Parser and writer in Bio::PhyloXML::Writer.
37
+ More information at www.phyloxml.org
38
+
39
+ == Install
40
+
41
+ % gem install bioruby-phyloxml
42
+
43
+ In addition to bioruby-phyloxml, dependent gems such as bio and libxml-ruby
44
+ will automatically be installed.
45
+
46
+ == Parsing a file
47
+
48
+ require 'bio-phyloxml'
49
+
50
+ # Create new phyloxml parser
51
+ phyloxml = Bio::PhyloXML::Parser.open('example.xml')
52
+
53
+ # Print the names of all trees in the file
54
+ phyloxml.each do |tree|
55
+ puts tree.name
56
+ end
57
+
58
+ If there are several trees in the file, you can access the one you wish by specifying its index:
59
+
60
+ tree = phyloxml[3]
61
+
62
+ You can use all Bio::Tree methods on the tree, since PhyloXML::Tree inherits from Bio::Tree. For example,
63
+
64
+ tree.leaves.each do |node|
65
+ puts node.name
66
+ end
67
+
68
+ PhyloXML files can hold additional information besides phylogenies at the end of the file. This info can be accessed through the 'other' array of the parser object.
69
+
70
+ phyloxml = Bio::PhyloXML::Parser.open('example.xml')
71
+ while tree = phyloxml.next_tree
72
+ # do stuff with trees
73
+ end
74
+
75
+ puts phyloxml.other
76
+
77
+ == Writing a file
78
+
79
+ # Create new phyloxml writer
80
+ writer = Bio::PhyloXML::Writer.new('tree.xml')
81
+
82
+ # Write tree to the file tree.xml
83
+ writer.write(tree1)
84
+
85
+ # Add another tree to the file
86
+ writer.write(tree2)
87
+
88
+ == Retrieving data
89
+
90
+ Here is an example of how to retrieve the scientific name of the clades.
91
+
92
+ require 'bio-phyloxml'
93
+
94
+ phyloxml = Bio::PhyloXML::Parser.open('ncbi_taxonomy_mollusca.xml')
95
+ phyloxml.each do |tree|
96
+ tree.each_node do |node|
97
+ print "Scientific name: ", node.taxonomies[0].scientific_name, "\n"
98
+ end
99
+ end
100
+
101
+ == Retrieving 'other' data
102
+
103
+ require 'bio-phyloxml'
104
+
105
+ phyloxml = Bio::PhyloXML::Parser.open('phyloxml_examples.xml')
106
+ while tree = phyloxml.next_tree
107
+ #do something with the trees
108
+ end
109
+
110
+ p phyloxml.other
111
+ puts "\n"
112
+ #=> output is an object representation
113
+
114
+ #Print in a readable way
115
+ puts phyloxml.other[0].to_xml, "\n"
116
+ #=>:
117
+ #
118
+ #<align:alignment xmlns:align="http://example.org/align">
119
+ # <seq name="A">acgtcgcggcccgtggaagtcctctcct</seq>
120
+ # <seq name="B">aggtcgcggcctgtggaagtcctctcct</seq>
121
+ # <seq name="C">taaatcgc--cccgtgg-agtccc-cct</seq>
122
+ #</align:alignment>
123
+
124
+ # Once we know what's there, let's output just the sequences
125
+ phyloxml.other[0].children.each do |node|
126
+ puts node.value
127
+ end
128
+ #=>
129
+ #
130
+ #acgtcgcggcccgtggaagtcctctcct
131
+ #aggtcgcggcctgtggaagtcctctcct
132
+ #taaatcgc--cccgtgg-agtccc-cct
133
+
134
+
135
+ = APPENDIX
136
+
137
+ === Troubleshooting libxml-ruby installation problem
138
+
139
+ If you get "Failed to build gem native extension" error, you may need to
140
+ install the GNOME Libxml2 XML toolkit library and development files.
141
+
142
+ On Debian or Ubuntu,
143
+
144
+ sudo aptitude install libxml2-dev
145
+
146
+ On RedHat or CentOS,
147
+
148
+ sudo yum install libxml2-devel
149
+
150
+ On other platforms, see ((<URL:http://www.xmlsoft.org/>)).
151
+
152
+ =end
@@ -0,0 +1,27 @@
1
+ # Please require your code below, respecting the naming conventions in the
2
+ # bioruby directory tree.
3
+ #
4
+ # For example, say you have a plugin named bio-plugin, the only uncommented
5
+ # line in this file would be
6
+ #
7
+ # require 'bio/bio-plugin/plugin'
8
+ #
9
+ # In this file only require other files. Avoid other source code.
10
+
11
+ require 'bio-phyloxml/compat/cleanup.rb'
12
+ require 'bio-phyloxml/version.rb'
13
+ require 'bio-phyloxml/phyloxml_elements.rb'
14
+ require 'bio-phyloxml/phyloxml_parser.rb'
15
+ require 'bio-phyloxml/phyloxml_writer.rb'
16
+
17
+ if require 'bio-phyloxml/compat/stub_phyloxml_elements.rb'
18
+ require_relative 'bio/db/phyloxml/phyloxml_elements.rb'
19
+ end
20
+
21
+ if require 'bio-phyloxml/compat/stub_phyloxml_parser.rb'
22
+ require_relative 'bio/db/phyloxml/phyloxml_parser.rb'
23
+ end
24
+
25
+ if require 'bio-phyloxml/compat/stub_phyloxml_writer.rb'
26
+ require_relative 'bio/db/phyloxml/phyloxml_writer.rb'
27
+ end