biointerchange 0.1.0

Files changed (57)
  1. data/.document +5 -0
  2. data/.rspec +1 -0
  3. data/.travis.yml +12 -0
  4. data/Gemfile +17 -0
  5. data/LICENSE.txt +8 -0
  6. data/README.md +166 -0
  7. data/Rakefile +50 -0
  8. data/VERSION +1 -0
  9. data/bin/biointerchange +6 -0
  10. data/docs/exceptions_readme.txt +13 -0
  11. data/examples/BovineGenomeChrX.gff3.gz +0 -0
  12. data/examples/gb-2007-8-3-R40.xml +243 -0
  13. data/examples/pubannotation.json +1 -0
  14. data/generators/rdfxml.rb +104 -0
  15. data/lib/biointerchange/core.rb +195 -0
  16. data/lib/biointerchange/exceptions.rb +38 -0
  17. data/lib/biointerchange/genomics/gff3_feature.rb +82 -0
  18. data/lib/biointerchange/genomics/gff3_feature_set.rb +37 -0
  19. data/lib/biointerchange/genomics/gff3_rdf_ntriples.rb +107 -0
  20. data/lib/biointerchange/genomics/gff3_reader.rb +86 -0
  21. data/lib/biointerchange/gff3.rb +135 -0
  22. data/lib/biointerchange/reader.rb +25 -0
  23. data/lib/biointerchange/registry.rb +29 -0
  24. data/lib/biointerchange/sio.rb +7124 -0
  25. data/lib/biointerchange/sofa.rb +1566 -0
  26. data/lib/biointerchange/textmining/content.rb +69 -0
  27. data/lib/biointerchange/textmining/document.rb +36 -0
  28. data/lib/biointerchange/textmining/pdfx_xml_reader.rb +161 -0
  29. data/lib/biointerchange/textmining/process.rb +57 -0
  30. data/lib/biointerchange/textmining/pubannos_json_reader.rb +72 -0
  31. data/lib/biointerchange/textmining/text_mining_rdf_ntriples.rb +197 -0
  32. data/lib/biointerchange/textmining/text_mining_reader.rb +41 -0
  33. data/lib/biointerchange/writer.rb +23 -0
  34. data/lib/biointerchange.rb +3 -0
  35. data/spec/exceptions_spec.rb +27 -0
  36. data/spec/gff3_rdfwriter_spec.rb +67 -0
  37. data/spec/text_mining_pdfx_xml_reader_spec.rb +89 -0
  38. data/spec/text_mining_pubannos_json_reader_spec.rb +71 -0
  39. data/spec/text_mining_rdfwriter_spec.rb +57 -0
  40. data/web/about.html +89 -0
  41. data/web/biointerchange.js +133 -0
  42. data/web/bootstrap/css/bootstrap-responsive.css +1040 -0
  43. data/web/bootstrap/css/bootstrap-responsive.min.css +9 -0
  44. data/web/bootstrap/css/bootstrap.css +5624 -0
  45. data/web/bootstrap/css/bootstrap.min.css +9 -0
  46. data/web/bootstrap/img/glyphicons-halflings-white.png +0 -0
  47. data/web/bootstrap/img/glyphicons-halflings.png +0 -0
  48. data/web/bootstrap/js/bootstrap.js +2027 -0
  49. data/web/bootstrap/js/bootstrap.min.js +6 -0
  50. data/web/bootstrap/js/jquery-1.8.1.min.js +2 -0
  51. data/web/css/rdoc-style.css +5786 -0
  52. data/web/css/rdoc.css +716 -0
  53. data/web/images/BioInterchange300.png +0 -0
  54. data/web/index.html +109 -0
  55. data/web/service/rdfizer.fcgi +68 -0
  56. data/web/webservices.html +123 -0
  57. metadata +240 -0
data/.document ADDED
@@ -0,0 +1,5 @@
1
+ lib/**/*.rb
2
+ bin/*
3
+ -
4
+ features/**/*.feature
5
+ LICENSE.txt
data/.rspec ADDED
@@ -0,0 +1 @@
1
+ --color
data/.travis.yml ADDED
@@ -0,0 +1,12 @@
1
+ language: ruby
2
+ rvm:
3
+ - 1.9.2
4
+ - 1.9.3
5
+ - jruby-19mode # JRuby in 1.9 mode
6
+ - rbx-19mode
7
+ # - 1.8.7
8
+ # - jruby-18mode # JRuby in 1.8 mode
9
+ # - rbx-18mode
10
+
11
+ # uncomment this line if your project needs to run something other than `rake`:
12
+ # script: bundle exec rspec spec
data/Gemfile ADDED
@@ -0,0 +1,17 @@
1
+ source "http://rubygems.org"
2
+ # Add dependencies required to use your gem here.
3
+ # Example:
4
+ # gem "activesupport", ">= 2.3.5"
5
+ gem "rdf", ">= 0.3.4.1"
6
+ gem "json", ">= 1.6.4"
7
+ gem "getopt", ">= 1.4.1"
8
+
9
+ # Add dependencies to develop your gem here.
10
+ # Include everything needed to run rake, tests, features, etc.
11
+ group :development do
12
+ gem "rspec", "~> 2.8.0"
13
+ gem "bundler", "~> 1.2.1"
14
+ gem "jeweler", "~> 1.8.4"
15
+ gem "bio", ">= 1.4.2"
16
+ gem "rdoc", "~> 3.12"
17
+ end
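The Gemfile above mixes two kinds of version constraint: open-ended minimums (`>=`) and pessimistic constraints (`~>`). As a minimal sketch of what each one admits, the snippet below checks a few version numbers with RubyGems' own `Gem::Requirement`. The helper name `admits?` and the sample version numbers are illustrative assumptions, not part of the gem.

```ruby
require 'rubygems'

# Hypothetical helper (not part of BioInterchange): does a constraint
# string, as written in the Gemfile, admit the given version?
def admits?(constraint, version)
  Gem::Requirement.new(constraint).satisfied_by?(Gem::Version.new(version))
end

# "~> 2.8.0" pins the minor version: any 2.8.x, but not 2.9.
rspec_patch_ok = admits?('~> 2.8.0', '2.8.5')
rspec_minor_ok = admits?('~> 2.8.0', '2.9.0')

# "~> 3.12" pins only the major version: 3.12 or later, below 4.0.
rdoc_minor_ok = admits?('~> 3.12', '3.13')
rdoc_major_ok = admits?('~> 3.12', '4.0')

# ">= 0.3.4.1" is an open-ended minimum with no upper bound.
rdf_newer_ok = admits?('>= 0.3.4.1', '1.0.0')
rdf_older_ok = admits?('>= 0.3.4.1', '0.3.4')

puts [rspec_patch_ok, rspec_minor_ok, rdoc_minor_ok,
      rdoc_major_ok, rdf_newer_ok, rdf_older_ok].inspect
```

The runtime dependencies (`rdf`, `json`, `getopt`) use open-ended minimums, while most development tools are pinned pessimistically.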
data/LICENSE.txt ADDED
@@ -0,0 +1,8 @@
1
+ Copyright (c) 2012 Joachim Baran, Geraint Duck
2
+
3
+ Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:
4
+
5
+ The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.
6
+
7
+ THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
8
+
data/README.md ADDED
@@ -0,0 +1,166 @@
1
+ BioInterchange
2
+ ==============
3
+
4
+ [![Build Status](https://secure.travis-ci.org/joejimbo/bioruby-biointerchange.png)](http://travis-ci.org/joejimbo/bioruby-biointerchange)
5
+
6
+ BioInterchange is a tool for generating interchangeable RDF from non-RDF data sources.
7
+
8
+ Supported input file formats (see examples directory):
9
+
10
+ * [Pubannos JSON](http://pubannotation.dbcls.jp/)
11
+ * [PDFx XML](http://pdfx.cs.man.ac.uk/)
12
+
13
+ Supported RDF output formats:
14
+
15
+ * [RDF N-Triples](http://www.w3.org/TR/rdf-testcases/#ntriples)
16
+
17
+ Ontologies used in the RDF output:
18
+
19
+ * [Semanticscience Integrated Ontology](http://code.google.com/p/semanticscience/wiki/SIO) (SIO)
20
+
21
+ Usage
22
+ -----
23
+
24
+ Four interfaces to BioInterchange are available:
25
+
26
+ 1. command-line tool-suite
27
+ 2. Ruby API/Ruby gem
28
+ 3. RESTful web-service
29
+ 4. interactive web-site
30
+
31
+ ### Command-Line Tool-Suite
32
+
33
+ BioInterchange's command-line tool `biointerchange` can be installed as follows:
34
+
35
+ gem install biointerchange
36
+
37
+ #### Usage
38
+
39
+ Examples:
40
+
41
+ biointerchange --input dbcls.catanns.json --rdf rdf.bh12.sio --file examples/pubannotation.json --name 'Peter Smith' --name_id 'peter.smith@example.com'
42
+ biointerchange --input uk.ac.man.pdfx --rdf rdf.bh12.sio --file examples/gb-2007-8-3-R40.xml --name 'Peter Smith' --name_id 'peter.smith@example.com'
43
+
44
+ Input formats:
45
+
46
+ * `biointerchange.gff3`
47
+ * `dbcls.catanns.json`
48
+ * `uk.ac.man.pdfx`
49
+
50
+ Output formats:
51
+
52
+ * `rdf.biointerchange.gff3`
53
+ * `rdf.bh12.sio`
54
+
55
+
56
+ ### Ruby API/Ruby gem
57
+
58
+ The Ruby gem is under active development, so the following may or may not work out of the box.
59
+
60
+ gem install biointerchange
61
+
62
+ To use BioInterchange in your Ruby projects, include the following line in your code:
63
+
64
+ require 'biointerchange'
65
+
66
+ ### RESTful Web-Service
67
+
68
+ TODO
69
+
70
+ ### Interactive Web-Site
71
+
72
+ TODO
73
+
74
+ Build Notes
75
+ -----------
76
+
77
+ This section is only relevant if you are building newer versions of BioInterchange yourself. If you are using the Ruby gem, web-service or interactive web-site, then you can safely skip the steps explained here.
78
+
79
+ Note that the following set-up only works with Ruby 1.9.2p290 or newer.
80
+
81
+ Building a new version of the Ruby vocabulary classes for GFF3, SIO, SOFA (requires that the OBO files are saved as RDF/XML using [Protege](http://protege.stanford.edu)):
82
+
83
+ sudo gem install rdf
84
+ sudo gem install rdf-rdfxml
85
+ echo -e "module BioInterchange\n" > lib/biointerchange/gff3.rb
86
+ ruby generators/rdfxml.rb <path-to-rdf/xml-version-of-gff3> GFF3 >> lib/biointerchange/gff3.rb
87
+ echo -e "\nend" >> lib/biointerchange/gff3.rb
88
+ echo -e "module BioInterchange\n" > lib/biointerchange/sio.rb
89
+ ruby generators/rdfxml.rb <path-to-rdf/xml-version-of-sio> SIO >> lib/biointerchange/sio.rb
90
+ echo -e "\nend" >> lib/biointerchange/sio.rb
91
+ echo -e "module BioInterchange\n" > lib/biointerchange/sofa.rb
92
+ ruby generators/rdfxml.rb <path-to-rdf/xml-version-of-sofa> SOFA >> lib/biointerchange/sofa.rb
93
+ echo -e "\nend" >> lib/biointerchange/sofa.rb
94
+
95
+ ### Gem Bundling/Installing
96
+
97
+ sudo bundle exec rake install
98
+
99
+ If you encounter problems with gem dependencies, then you can try to explicitly use Ruby 1.9:
100
+
101
+ sudo bundle exec rake1.9 install
102
+
103
+ ### Unit Testing
104
+
105
+ BioInterchange is unit tested with [RSpec](http://rspec.info); the unit tests are located in the `spec` directory.
106
+
107
+ Using bundler, a quick check can be carried out using:
108
+
109
+ bundle exec rake spec
110
+
111
+ More verbose output is produced by calling `rspec` directly:
112
+
113
+ rspec -c -f d
114
+
115
+ ### Generating RDocs
116
+
117
+ bundle exec rake rdoc
118
+
119
+ ### Troubleshooting
120
+
121
+ #### GCC: No such file or directory
122
+
123
+ On Mac OS X, you might see this error:
124
+
125
+ make: /usr/bin/gcc-4.2: No such file or directory
126
+ make: *** [generator.o] Error 1
127
+
128
+ This can be solved by executing:
129
+
130
+ sudo ln -s /usr/bin/llvm-gcc-4.2 /usr/bin/gcc-4.2
131
+
132
+
133
+ Contributors
134
+ ------------
135
+
136
+ In alphabetical order of the last name:
137
+
138
+ * [Joachim Baran](http://joachimbaran.wordpress.com)
139
+ * [Kevin B. Cohen](http://compbio.ucdenver.edu/Hunter_lab/Cohen/index.shtml)
140
+ * [Geraint Duck](http://www.cs.man.ac.uk/~duckg)
141
+ * [Michel Dumontier](http://dumontierlab.com)
142
+
143
+ Cite
144
+ ----
145
+
146
+ If you use this software, please cite
147
+
148
+ * BioInterchange: An Open Source Framework for Transforming Heterogeneous Data Formats Into RDF (_in preparation_)
149
+
150
+ and one of the following Biogem publications
151
+
152
+ * [BioRuby: bioinformatics software for the Ruby programming language](http://dx.doi.org/10.1093/bioinformatics/btq475)
153
+ * [Biogem: an effective tool-based approach for scaling up open source software development in bioinformatics](http://dx.doi.org/10.1093/bioinformatics/bts080)
154
+
155
+ Biogems.info
156
+ ------------
157
+
158
+ This Biogem is published at [#biointerchange](http://biogems.info/index.html) and hosted on its primary site [www.biointerchange.org](http://www.biointerchange.org).
159
+
160
+ The BioRuby community is on IRC server: irc.freenode.org, channel: #bioruby.
161
+
162
+ License/Copyright
163
+ -----------------
164
+
165
+ See [LICENSE](https://raw.github.com/BioInterchange/BioInterchange/master/LICENSE.txt) file.
166
+
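The vocabulary regeneration steps in the README's Build Notes wrap the output of `generators/rdfxml.rb` in a `module BioInterchange ... end` block using three shell commands per ontology. A minimal Ruby sketch of the same wrapping step follows; `wrap_in_module` is a hypothetical helper that is not part of the gem, and the sample generator output is made up for illustration.

```ruby
# Hypothetical helper (not part of BioInterchange): reproduces the text that
#   echo -e "module BioInterchange\n" > file
#   ruby generators/rdfxml.rb ... >> file
#   echo -e "\nend" >> file
# would leave in the file, given the generator's output as a string.
def wrap_in_module(generated_source, module_name = 'BioInterchange')
  # Each `echo` appends its own trailing newline, hence the blank line
  # after the module header and the newline after `end`.
  "module #{module_name}\n\n#{generated_source}\nend\n"
end

# Made-up stand-in for what the generator might print.
sample_output = "class GFF3\nend\n"
wrapped = wrap_in_module(sample_output)
puts wrapped
```

Doing the wrapping in Ruby avoids depending on the shell's `echo -e` behaviour, which differs between shells.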
data/Rakefile ADDED
@@ -0,0 +1,50 @@
1
+ # encoding: utf-8
2
+
3
+ require 'rubygems'
4
+ require 'bundler'
5
+ begin
6
+ Bundler.setup(:default, :development)
7
+ rescue Bundler::BundlerError => e
8
+ $stderr.puts e.message
9
+ $stderr.puts "Run `bundle install` to install missing gems"
10
+ exit e.status_code
11
+ end
12
+ require 'rake'
13
+
14
+ require 'jeweler'
15
+ Jeweler::Tasks.new do |gem|
16
+ # gem is a Gem::Specification... see http://docs.rubygems.org/read/chapter/20 for more options
17
+ gem.name = "biointerchange"
18
+ gem.homepage = "http://www.biointerchange.org"
19
+ gem.license = "MIT"
20
+ gem.summary = %Q{An open source framework for transforming heterogeneous data formats into RDF.}
21
+ gem.description = %Q{BioInterchange is a Ruby gem, command-line tool, web-service for turning heterogeneous data formats such as JSON, XML, GFF3, etc., into RDF.}
22
+ gem.email = "joachim.baran@gmail.com"
23
+ gem.authors = ["Joachim Baran", "Kevin B. Cohen", "Geraint Duck", "Michel Dumontier"]
24
+ gem.executable = 'biointerchange'
25
+ # dependencies defined in Gemfile
26
+ end
27
+ Jeweler::RubygemsDotOrgTasks.new
28
+
29
+ require 'rspec/core'
30
+ require 'rspec/core/rake_task'
31
+ RSpec::Core::RakeTask.new(:spec) do |spec|
32
+ spec.pattern = FileList['spec/**/*_spec.rb']
33
+ end
34
+
35
+ RSpec::Core::RakeTask.new(:rcov) do |spec|
36
+ spec.pattern = 'spec/**/*_spec.rb'
37
+ spec.rcov = true
38
+ end
39
+
40
+ task :default => :spec
41
+
42
+ require 'rdoc/task'
43
+ Rake::RDocTask.new do |rdoc|
44
+ version = File.exist?('VERSION') ? File.read('VERSION') : ""
45
+
46
+ rdoc.rdoc_dir = 'rdoc'
47
+ rdoc.title = "BioInterchange #{version}"
48
+ # rdoc.rdoc_files.include('README*')
49
+ rdoc.rdoc_files.include('lib/**/*.rb')
50
+ end
data/VERSION ADDED
@@ -0,0 +1 @@
1
+ 0.1.0
data/bin/biointerchange ADDED
@@ -0,0 +1,6 @@
1
+ #!/usr/bin/ruby
2
+
3
+ require 'biointerchange'
4
+
5
+ BioInterchange::cli
6
+
data/docs/exceptions_readme.txt ADDED
@@ -0,0 +1,13 @@
1
+ BioInterchange Exception and Error handling.
2
+
3
+ All BioInterchange errors come under two main categories:
4
+ 1. InputFormatError
5
+ 2. ImplementationError
6
+
7
+ Each of these classes is handled differently within the BioInterchange framework.
8
+
9
+ InputFormatErrors are those that could be raised at any time during runtime, and that the user could therefore be exposed to. Examples include errors relating to missing files or incorrect file formats. These errors are rescued at the highest level of the program and turned into a user-friendly error message without a stacktrace (backtrace). For this reason, we strongly advise that any such error is raised with a user-friendly message that is as specific as possible.
10
+
11
+ ImplementationErrors are those that could only be raised during program implementation and extension. For example, a method is passed something it isn't expecting or can't handle (e.g., a writer is passed an invalid model). These errors are not caught by the main program; they are left to propagate in the normal fashion, stacktrace included, so that a developer can debug the program.
12
+
13
+ ImplementationErrors have three sub-categories, according to whether the error is raised from the Reader (ImplementationReaderError), the Model (ImplementationModelError), or the Writer (ImplementationWriterError). Please use the ImplementationError sub-class that matches where the error is being raised. This makes it easy to see at which stage of program execution an error occurred.
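The two error categories described in the exceptions readme can be sketched in Ruby as follows. The class names come from that text; the enclosing module `ExceptionSketch`, the choice of `StandardError` as the base class, and the `run` entry point are assumptions made for illustration and may differ from what `lib/biointerchange/exceptions.rb` actually defines.

```ruby
# Illustrative sketch only; not the gem's actual exceptions.rb.
module ExceptionSketch
  # User-facing errors, e.g. missing files or malformed input.
  class InputFormatError < StandardError; end

  # Developer-facing errors, split by the stage that raised them.
  class ImplementationError < StandardError; end
  class ImplementationReaderError < ImplementationError; end
  class ImplementationModelError < ImplementationError; end
  class ImplementationWriterError < ImplementationError; end

  # Hypothetical top-level entry point: InputFormatError is rescued and
  # turned into a short message without a backtrace, while any
  # ImplementationError propagates normally for debugging.
  def self.run
    yield
    nil
  rescue InputFormatError => e
    "Error: #{e.message}"
  end
end

message = ExceptionSketch.run do
  raise ExceptionSketch::InputFormatError, 'input file not found'
end
puts message

propagated = false
begin
  ExceptionSketch.run do
    raise ExceptionSketch::ImplementationWriterError, 'invalid model'
  end
rescue ExceptionSketch::ImplementationError
  propagated = true
end
puts propagated
```

Because the three stage-specific classes share `ImplementationError` as a parent, a developer can rescue one stage or all of them with a single clause.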
Binary file