biointerchange 0.1.0 (RubyGems)

Files changed (57)

data/.document +5 -0
data/.rspec +1 -0
data/.travis.yml +12 -0
data/Gemfile +17 -0
data/LICENSE.txt +8 -0
data/README.md +166 -0
data/Rakefile +50 -0
data/VERSION +1 -0
data/bin/biointerchange +6 -0
data/docs/exceptions_readme.txt +13 -0
data/examples/BovineGenomeChrX.gff3.gz +0 -0
data/examples/gb-2007-8-3-R40.xml +243 -0
data/examples/pubannotation.json +1 -0
data/generators/rdfxml.rb +104 -0
data/lib/biointerchange/core.rb +195 -0
data/lib/biointerchange/exceptions.rb +38 -0
data/lib/biointerchange/genomics/gff3_feature.rb +82 -0
data/lib/biointerchange/genomics/gff3_feature_set.rb +37 -0
data/lib/biointerchange/genomics/gff3_rdf_ntriples.rb +107 -0
data/lib/biointerchange/genomics/gff3_reader.rb +86 -0
data/lib/biointerchange/gff3.rb +135 -0
data/lib/biointerchange/reader.rb +25 -0
data/lib/biointerchange/registry.rb +29 -0
data/lib/biointerchange/sio.rb +7124 -0
data/lib/biointerchange/sofa.rb +1566 -0
data/lib/biointerchange/textmining/content.rb +69 -0
data/lib/biointerchange/textmining/document.rb +36 -0
data/lib/biointerchange/textmining/pdfx_xml_reader.rb +161 -0
data/lib/biointerchange/textmining/process.rb +57 -0
data/lib/biointerchange/textmining/pubannos_json_reader.rb +72 -0
data/lib/biointerchange/textmining/text_mining_rdf_ntriples.rb +197 -0
data/lib/biointerchange/textmining/text_mining_reader.rb +41 -0
data/lib/biointerchange/writer.rb +23 -0
data/lib/biointerchange.rb +3 -0
data/spec/exceptions_spec.rb +27 -0
data/spec/gff3_rdfwriter_spec.rb +67 -0
data/spec/text_mining_pdfx_xml_reader_spec.rb +89 -0
data/spec/text_mining_pubannos_json_reader_spec.rb +71 -0
data/spec/text_mining_rdfwriter_spec.rb +57 -0
data/web/about.html +89 -0
data/web/biointerchange.js +133 -0
data/web/bootstrap/css/bootstrap-responsive.css +1040 -0
data/web/bootstrap/css/bootstrap-responsive.min.css +9 -0
data/web/bootstrap/css/bootstrap.css +5624 -0
data/web/bootstrap/css/bootstrap.min.css +9 -0
data/web/bootstrap/img/glyphicons-halflings-white.png +0 -0
data/web/bootstrap/img/glyphicons-halflings.png +0 -0
data/web/bootstrap/js/bootstrap.js +2027 -0
data/web/bootstrap/js/bootstrap.min.js +6 -0
data/web/bootstrap/js/jquery-1.8.1.min.js +2 -0
data/web/css/rdoc-style.css +5786 -0
data/web/css/rdoc.css +716 -0
data/web/images/BioInterchange300.png +0 -0
data/web/index.html +109 -0
data/web/service/rdfizer.fcgi +68 -0
data/web/webservices.html +123 -0
metadata +240 -0

data/spec/gff3_rdfwriter_spec.rb ADDED

@@ -0,0 +1,67 @@
+require 'rspec'
+load 'lib/biointerchange/core.rb'
+load 'lib/biointerchange/gff3.rb'
+load 'lib/biointerchange/sofa.rb'
+load 'lib/biointerchange/reader.rb'
+load 'lib/biointerchange/writer.rb'
+load 'lib/biointerchange/genomics/gff3_rdf_ntriples.rb'
+load 'lib/biointerchange/genomics/gff3_feature_set.rb'
+load 'lib/biointerchange/genomics/gff3_feature.rb'
+describe BioInterchange::Genomics::RDFWriter do
+  describe 'serialization of GFF3 models' do
+    it 'empty document' do
+      istream, ostream = IO.pipe
+      BioInterchange::Genomics::RDFWriter.new(ostream).serialize(BioInterchange::Genomics::GFF3FeatureSet.new())
+      ostream.close
+      istream.read.lines.count.should eq(1)
+    end
+    it 'model with three features' do
+      istream, ostream = IO.pipe
+      set = BioInterchange::Genomics::GFF3FeatureSet.new()
+      feature = BioInterchange::Genomics::GFF3Feature.new(
+          'GRCh37.1',
+          'NCBI',
+          BioInterchange::SOFA.CDS,
+          32890598,
+          32890664,
+          0.1,
+          BioInterchange::Genomics::GFF3Feature::POSITIVE,
+          nil,
+          { 'ID' => [ 'BRCA2' ], 'annotation' => [ 'manual' ] }
+        )
+      set.add(feature)
+      feature = BioInterchange::Genomics::GFF3Feature.new(
+          'GRCh37.1',
+          'NCBI',
+          BioInterchange::SOFA.modified_base,
+          32890599,
+          32890599,
+          0.8,
+          BioInterchange::Genomics::GFF3Feature::POSITIVE,
+          nil,
+          { 'ID' => [ 'aModifiedBase' ], 'Parent' => [ 'BRCA2' ] }
+        )
+      set.add(feature)
+      feature = BioInterchange::Genomics::GFF3Feature.new(
+          'GRCh37.1',
+          'NCBI',
+          BioInterchange::SOFA.modified_base,
+          32890599,
+          32890599,
+          0.8,
+          BioInterchange::Genomics::GFF3Feature::POSITIVE,
+          nil,
+          { 'Parent' => [ 'BRCA2', 'aModifiedBase' ] }
+        )
+      set.add(feature)
+      BioInterchange::Genomics::RDFWriter.new(ostream).serialize(set)
+      ostream.close
+      istream.read.lines.count.should be == 43
+    end
+  end
+end
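
The nine positional arguments passed to `GFF3Feature.new` in the spec above appear to line up with the nine tab-separated columns of a GFF3 record (seqid, source, type, start, end, score, strand, phase, attributes). As a rough illustration only, the sketch below splits one such line into a plain Hash with array-valued attributes, similar in shape to the `{ 'ID' => [ 'BRCA2' ] }` argument used in the spec. `parse_gff3_line` and `parse_attributes` are hypothetical helpers, not part of the gem, and percent-decoding of attribute values is deliberately left out.

```ruby
# Hypothetical helper (not the gem's reader): turn the attributes column
# "key=v1,v2;key2=v3" into a Hash that maps each key to an Array of values.
def parse_attributes(field)
  return {} if field.nil? || field == '.'

  field.split(';').each_with_object({}) do |pair, attrs|
    key, value = pair.split('=', 2)
    attrs[key] = value.to_s.split(',')
  end
end

# Hypothetical helper: split one GFF3 line into its nine columns.
# A "." in the score or phase column means "not available" and becomes nil.
def parse_gff3_line(line)
  seqid, source, type, start, stop, score, strand, phase, attributes =
    line.chomp.split("\t", 9)

  {
    seqid: seqid,
    source: source,
    type: type,
    start: Integer(start),
    end: Integer(stop),
    score: score == '.' ? nil : Float(score),
    strand: strand,
    phase: phase == '.' ? nil : Integer(phase),
    attributes: parse_attributes(attributes)
  }
end

record = parse_gff3_line(
  "ChrX.38\tbovine_complete_cds_gmap_perfect\tgene\t15870\t16254\t.\t+\t.\tID=BC109609_ChrX.38"
)
puts record[:attributes]['ID'].first
```

Keeping every attribute value as an Array, even when only one value is present, means a multi-valued `Parent` (as in the third feature of the spec) needs no special case.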

data/spec/text_mining_pdfx_xml_reader_spec.rb ADDED

@@ -0,0 +1,89 @@
+require 'rspec'
+load 'lib/biointerchange/core.rb'
+load 'lib/biointerchange/reader.rb'
+load 'lib/biointerchange/textmining/text_mining_reader.rb'
+load 'lib/biointerchange/textmining/pdfx_xml_reader.rb'
+load 'lib/biointerchange/textmining/document.rb'
+load 'lib/biointerchange/textmining/content.rb'
+load 'lib/biointerchange/textmining/process.rb'
+describe BioInterchange::TextMining::PdfxXmlReader do
+  describe 'deserialization of pdfx text-mining documents' do
+    describe 'IO check' do
+      before :all do
+        @reader = BioInterchange::TextMining::PdfxXmlReader.new("Test", "http://test.com", "00-00-0000", BioInterchange::TextMining::Process::UNSPECIFIED, "0.0")
+      end
+      it 'read pdfx from string' do
+        model = @reader.deserialize("<pdfx><job>text</job></pdfx>")
+        model.should be_an_instance_of BioInterchange::TextMining::Document
+      end
+      it 'read pdfx from file' do
+        model = @reader.deserialize(File.new('examples/gb-2007-8-3-R40.xml'))
+        model.should be_an_instance_of BioInterchange::TextMining::Document
+      end
+    end
+    describe 'generated model check' do
+      before :all do
+        reader = BioInterchange::TextMining::PdfxXmlReader.new("Test", "http://test.com", "00-00-0000", BioInterchange::TextMining::Process::UNSPECIFIED, "0.0")
+        @model = reader.deserialize("<pdfx><job>rspec_test</job><article><article-title>TITLE</article-title><abstract>ABSTRACT</abstract><body>BODY TEXT<section>SECTION LEVEL 1<section>SECTION LEVEL 2.1</section><section>SECTION LEVEL 2.2</section>END SECTION LEVEL 1</section></body></article></pdfx>")
+        #puts "Document Model: #{@model.uri}"
+        #  @model.contents.each do |c|
+        #  puts "\tContent: #{c.type}, #{c.offset}, #{c.length}"
+        #end
+      end
+      it 'model is of type document' do
+        @model.should be_an_instance_of BioInterchange::TextMining::Document
+      end
+      it 'document uri (job id read)' do
+        @model.uri.should eql "http://pdfx.cs.man.ac.uk/rspec_test"
+      end
+      it 'document has content' do
+        @model.contents.size.should eql 7
+      end
+      it 'document document' do
+        @model.contents[6].type.should eql BioInterchange::TextMining::Content::DOCUMENT and @model.contents[6].offset.should eql 0 and @model.contents[6].length.should eql 90
+      end
+      it 'document title' do
+        @model.contents[0].type.should eql BioInterchange::TextMining::Content::TITLE and @model.contents[0].offset.should eql 0 and @model.contents[0].length.should eql 5
+      end
+      it 'document abstract' do
+        @model.contents[1].type.should eql BioInterchange::TextMining::Content::ABSTRACT and @model.contents[1].offset.should eql 5 and @model.contents[1].length.should eql 8
+      end
+      it 'document body' do
+        @model.contents[5].type.should eql BioInterchange::TextMining::Content::SECTION and @model.contents[5].offset.should eql 13 and @model.contents[5].length.should eql 77
+      end
+      it 'document sections' do
+        @model.contents[2].type.should eql BioInterchange::TextMining::Content::SECTION and
+        @model.contents[2].offset.should eql 37 and
+        @model.contents[2].length.should eql 17 and
+        @model.contents[3].type.should eql BioInterchange::TextMining::Content::SECTION and
+        @model.contents[3].offset.should eql 54 and
+        @model.contents[3].length.should eql 17 and
+        @model.contents[4].type.should eql BioInterchange::TextMining::Content::SECTION and @model.contents[4].offset.should eql 22 and @model.contents[4].length.should eql 68
+      end
+    end
+  end
+end
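
The offsets and lengths asserted in the 'generated model check' group are consistent with concatenating the character data of the title, abstract, and body in document order (the `<job>` text is not counted) and recording one span per element when it closes. The sketch below reproduces that arithmetic on a nested Array instead of XML. `TextSpan` and `collect_spans` are hypothetical names for illustration, and this is not a description of how `PdfxXmlReader` is actually implemented.

```ruby
# Hypothetical span record; the gem's own Content class is not used here.
TextSpan = Struct.new(:type, :offset, :length)

# A node is [type, *children]; each child is either a String (character
# data) or another node. Spans are appended when a node closes, so nested
# sections come out before the section that encloses them.
def collect_spans(node, offset = 0, spans = [])
  type, *children = node
  cursor = offset
  children.each do |child|
    if child.is_a?(String)
      cursor += child.length
    else
      collect_spans(child, cursor, spans)
      cursor += spans.last.length # the child's own span was appended last
    end
  end
  spans << TextSpan.new(type, offset, cursor - offset)
  spans
end

# Same text as the XML string in the spec, without the <job> element.
document = [:document,
            [:title, 'TITLE'],
            [:abstract, 'ABSTRACT'],
            [:section, 'BODY TEXT',
             [:section, 'SECTION LEVEL 1',
              [:section, 'SECTION LEVEL 2.1'],
              [:section, 'SECTION LEVEL 2.2'],
              'END SECTION LEVEL 1']]]

collect_spans(document).each_with_index do |span, index|
  puts "#{index}: #{span.type} offset=#{span.offset} length=#{span.length}"
end
```

Under these assumptions the seven spans come out in the same index order the spec checks: title, abstract, the two level-2 sections, the level-1 section, the body, and finally the whole document.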

data/spec/text_mining_pubannos_json_reader_spec.rb ADDED

@@ -0,0 +1,71 @@
+require 'rspec'
+load 'lib/biointerchange/core.rb'
+load 'lib/biointerchange/reader.rb'
+load 'lib/biointerchange/textmining/text_mining_reader.rb'
+load 'lib/biointerchange/textmining/pubannos_json_reader.rb'
+load 'lib/biointerchange/textmining/document.rb'
+load 'lib/biointerchange/textmining/content.rb'
+load 'lib/biointerchange/textmining/process.rb'
+describe BioInterchange::TextMining::PubannosJsonReader do
+  describe 'deserialization of pubannos json text-mining documents' do
+    describe 'IO check' do
+      before :all do
+        @reader = BioInterchange::TextMining::PubannosJsonReader.new("Test", "http://test.com", "00-00-0000", BioInterchange::TextMining::Process::UNSPECIFIED, "0.0")
+      end
+      it 'read json from string' do
+        model = @reader.deserialize('{"docurl":"http://example.org/test","text":""}')
+        model.should be_an_instance_of BioInterchange::TextMining::Document
+      end
+      it 'read json from file' do
+        model = @reader.deserialize(File.new('examples/pubannotation.json'))
+        model.should be_an_instance_of BioInterchange::TextMining::Document
+      end
+    end
+    describe 'generated model check' do
+      before :all do
+        reader = BioInterchange::TextMining::PubannosJsonReader.new("Test", "http://test.com", "00-00-0000", BioInterchange::TextMining::Process::UNSPECIFIED, "0.0")
+        @model = reader.deserialize('{ "name": "Peter Smith", "name_id": "<peter.smith@example.json>", "date": "2012-08-12", "version": "3", "docurl":"http://example.org/example_json", "text":"Some document text. With two annotations of type protein.\n", "catanns":[{"annset_id":1,"begin":0,"category":"Protein","doc_id":9,"end":10,"id":139},{"annset_id":1,"begin":20,"category":"Protein","doc_id":9,"end":42,"id":138}]}')
+        #puts "Document Model: #{@model.uri}"
+        #  @model.contents.each do |c|
+        #  puts "\tContent: #{c.type}, #{c.offset}, #{c.length}"
+        #end
+      end
+      it 'model is of type document' do
+        @model.should be_an_instance_of BioInterchange::TextMining::Document
+      end
+      it 'document uri (docurl read)' do
+        @model.uri.should eql "http://example.org/example_json"
+      end
+      it 'document has content' do
+        @model.contents.size.should eql 3
+      end
+      it 'document document' do
+        @model.contents[0].type.should eql BioInterchange::TextMining::Content::DOCUMENT and @model.contents[0].offset.should eql 0 and @model.contents[0].length.should eql 58
+      end
+      it 'document phrase' do
+        @model.contents[1].type.should eql BioInterchange::TextMining::Content::PHRASE and @model.contents[1].offset.should eql 0 and @model.contents[1].length.should eql 10 and
+        @model.contents[2].type.should eql BioInterchange::TextMining::Content::PHRASE and @model.contents[2].offset.should eql 20 and @model.contents[2].length.should eql 22
+      end
+    end
+  end
+end
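
The expectations above suggest a simple mapping from the JSON input to the model: one document-wide span covering the whole `text`, followed by one phrase span per `catanns` entry, with the length taken as `end - begin`. The following sketch shows that mapping using only Ruby's bundled `json` library. `AnnotationSpan` and `spans_from_catanns` are hypothetical names, and the real `PubannosJsonReader` may well differ in its details.

```ruby
require 'json'

# Hypothetical span record; the gem's own Content class is not used here.
AnnotationSpan = Struct.new(:type, :offset, :length)

# Build one :document span for the full text, then one :phrase span per
# annotation. A missing "catanns" key is treated as "no annotations".
def spans_from_catanns(json_text)
  data = JSON.parse(json_text)
  spans = [AnnotationSpan.new(:document, 0, data['text'].length)]
  data.fetch('catanns', []).each do |annotation|
    first = annotation['begin']
    spans << AnnotationSpan.new(:phrase, first, annotation['end'] - first)
  end
  spans
end

# Trimmed copy of the JSON used in the spec (only the fields read above).
json_text = '{ "docurl":"http://example.org/example_json", ' \
            '"text":"Some document text. With two annotations of type protein.\n", ' \
            '"catanns":[{"begin":0,"category":"Protein","end":10},' \
            '{"begin":20,"category":"Protein","end":42}] }'

spans_from_catanns(json_text).each do |span|
  puts "#{span.type} offset=#{span.offset} length=#{span.length}"
end
```

Because the JSON sits in a single-quoted Ruby string, the `\n` reaches the parser as a JSON escape and is decoded to one newline character, which is why the document span is 58 characters long rather than 59.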

data/spec/text_mining_rdfwriter_spec.rb ADDED

@@ -0,0 +1,57 @@
+require 'rspec'
+load 'lib/biointerchange/core.rb'
+load 'lib/biointerchange/sio.rb'
+load 'lib/biointerchange/reader.rb'
+load 'lib/biointerchange/writer.rb'
+load 'lib/biointerchange/textmining/text_mining_rdf_ntriples.rb'
+load 'lib/biointerchange/textmining/document.rb'
+load 'lib/biointerchange/textmining/content.rb'
+load 'lib/biointerchange/textmining/process.rb'
+describe BioInterchange::TextMining::RDFWriter do
+  describe 'serialization of text-mining documents' do
+    it 'empty document' do
+      istream, ostream = IO.pipe
+      BioInterchange::TextMining::RDFWriter.new(ostream).serialize(BioInterchange::TextMining::Document.new('http://example.org'))
+      ostream.close
+      istream.read.lines.count.should eq(1)
+    end
+    it 'document with two entities' do
+      istream, ostream = IO.pipe
+      document = BioInterchange::TextMining::Document.new('http://example.org')
+      content = BioInterchange::TextMining::Content.new(
+          3,
+          11,
+          BioInterchange::TextMining::Content::PHRASE,
+          BioInterchange::TextMining::Process.new(
+            'Peter Smith',
+            'peter.smith@some.example.address.org',
+            BioInterchange::TextMining::Process::MANUAL
+          )
+        )
+      content.setContext(document)
+      document.add(content)
+      content = BioInterchange::TextMining::Content.new(
+          42,
+          9,
+          BioInterchange::TextMining::Content::PHRASE,
+          BioInterchange::TextMining::Process.new(
+            'GENIA',
+            'http://www.nactem.ac.uk/GENIA/tagger',
+            BioInterchange::TextMining::Process::SOFTWARE,
+            {},
+            '2012-09-28'
+          )
+        )
+      content.setContext(document)
+      document.add(content)
+      BioInterchange::TextMining::RDFWriter.new(ostream).serialize(document)
+      ostream.close
+      istream.read.lines.count.should be > 1
+    end
+  end
+end
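
Both RDF writer specs capture output the same way: they hand the write end of an `IO.pipe` to the writer, close it, and count the lines read from the other end. The sketch below isolates that pattern. `LineWriter` is a hypothetical stand-in that simply writes one line per item; it says nothing about what the gem's `RDFWriter` classes actually emit.

```ruby
# Hypothetical stand-in for a writer that serializes to an output stream.
class LineWriter
  def initialize(ostream)
    @ostream = ostream
  end

  # Writes one line per item; a real writer would emit triples instead.
  def serialize(items)
    items.each { |item| @ostream.puts(item) }
  end
end

# Runs the writer against the write end of a pipe and returns the lines
# that arrive at the read end.
def capture_lines(items)
  istream, ostream = IO.pipe
  LineWriter.new(ostream).serialize(items)
  ostream.close # signals end-of-file; without it, read would never return
  istream.read.lines
ensure
  # Close whichever ends are still open, even if serialize raised.
  [istream, ostream].each { |io| io.close if io && !io.closed? }
end

puts capture_lines(%w[first second third]).count
```

One caveat with this pattern: the writer finishes before anything is read, so an output larger than the operating system's pipe buffer would block the write. For small test fixtures that is unlikely to matter; for larger ones, reading from a separate thread or writing to a `StringIO` (if the writer accepts one) may be safer.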

data/web/about.html ADDED

@@ -0,0 +1,89 @@
+<!DOCTYPE html>
+<html lang="en">
+  <head>
+    <meta charset="utf-8">
+    <title>BioInterchange</title>
+    <meta name="viewport" content="width=device-width, initial-scale=1.0">
+    <meta name="description" content="BioInterchange about page">
+    <meta name="author" content="Joachim Baran">
+    <!-- Le styles -->
+    <link href="bootstrap/css/bootstrap.css" rel="stylesheet">
+    <style type="text/css">
+      body {
+        padding-top: 60px;
+        padding-bottom: 40px;
+      }
+    </style>
+    <link href="bootstrap/css/bootstrap-responsive.css" rel="stylesheet">
+    <!-- Le HTML5 shim, for IE6-8 support of HTML5 elements -->
+    <!--[if lt IE 9]>
+      <script src="http://html5shim.googlecode.com/svn/trunk/html5.js"></script>
+    <![endif]-->
+    <!-- Le fav and touch icons -->
+    <!-- <link rel="shortcut icon" href="../assets/ico/favicon.ico"> -->
+  </head>
+  <body>
+    <div class="navbar navbar-inverse navbar-fixed-top">
+      <div class="navbar-inner">
+        <div class="container">
+          <a class="btn btn-navbar" data-toggle="collapse" data-target=".nav-collapse">
+            <span class="icon-bar"></span>
+            <span class="icon-bar"></span>
+            <span class="icon-bar"></span>
+          </a>
+          <a class="brand" href="index.html">BioInterchange</a>
+          <div class="nav-collapse collapse">
+            <ul class="nav">
+              <li><a href="index.html">Home</a></li>
+              <li class="active"><a href="about.html">About</a></li>
+              <!--
+              <li class="dropdown">
+                <a href="#" class="dropdown-toggle" data-toggle="dropdown">Documentation <b class="caret"></b></a>
+                <ul class="dropdown-menu">
+                  <li><a href="#">To</a></li>
+                  <li><a href="#">Be</a></li>
+                  <li><a href="#">Done</a></li>
+                  <li><a href="#">Dude</a></li>
+                  <li class="divider"></li>
+                  <li class="nav-header">API</li>
+                  <li><a href="#">API usage information</a></li>
+                  <li><a href="#">Implementing new readers/writers</a></li>
+                </ul>
+              </li>
+              -->
+            </ul>
+          </div><!--/.nav-collapse -->
+        </div>
+      </div>
+    </div>
+    <div class="container">
+      <!-- Example row of columns -->
+      <div class="row">
+        <div class="span12">
+          <h2>About BioInterchange</h2>
+          <p>BioInterchange was conceived and designed during <a href="http://biosciencedbc.jp">NBDC</a>/<a href="http://dbcls.rois.ac.jp">DBCLS</a>'s <a href="http://2012.biohackathon.org">BioHackathon 2012</a>. Architecture and RDF serialization implementations were provided by <a href="http://joachimbaran.wordpress.com">Joachim Baran</a>; <a href="http://www.cs.man.ac.uk/~duckg">Geraint Duck</a> provided JSON and XML deserialization implementations and contributed to architecture decisions; guidance on ontology use and applications was given by <a href="http://compbio.ucdenver.edu/Hunter_lab/Cohen/index.shtml">Kevin B. Cohen</a> and <a href="http://dumontierlab.com">Michel Dumontier</a>, and Michel additionally brought forward and extended the <a href="http://code.google.com/p/semanticscience/wiki/SIO">Semanticscience Integrated Ontology</a> (SIO).</p>
+        </div>
+      </div>
+      <hr>
+      <footer>
+        <p>&copy; <a href="https://github.com/BioInterchange/BioInterchange#contributors">The BioInterchange Contributors</a> 2012</p>
+      </footer>
+    </div> <!-- /container -->
+    <!-- Le javascript
+    ================================================== -->
+    <!-- Placed at the end of the document so the pages load faster -->
+    <!-- <script src="bootstrap/js/jquery-1.8.1.min.js"></script> -->
+    <script src="bootstrap/js/bootstrap.min.js"></script>
+  </body>
+</html>

data/web/biointerchange.js ADDED

@@ -0,0 +1,133 @@
+function generateRDF() {
+  if ($('#inputformat').val() == 'biointerchange.gff3' || $('#inputformat').val() == 'dbcls.catanns.json' || $('#inputformat').val() == 'uk.ac.man.pdfx') {
+    var request = '{ "parameters" : "' + escape($('#metainput').val()) + '", "data" : "' + escape($('#maininput').val()) + '" }';
+    $.ajax({
+      type: 'POST',
+      url: 'service/rdfizer.fcgi',
+      data: request,
+      success: function(data) {
+        if ($('#output')[0].innerHTML.substring(0, 7) == '<i>RDF ')
+          $('#output').empty();
+        $('#output').append(data.replace(/</g, '&lt;').replace(/>/g, '&gt;'));
+      },
+      contentType: 'biointerchange/json',
+      dataType: 'text'
+    });
+  }
+}
+function selectDbclsCatannsJson() {
+  var outputFormats = $('#outputformat')[0];
+  for (var i = 0; i < outputFormats.length; i++)
+  if ($('#inputformat').val() == 'biointerchange.gff3') {
+    if (outputFormats[i].value == 'rdf.biointerchange.gff3') {
+      outputFormats[i].selected = true;
+      outputFormats[i].disabled = false;
+    } else {
+      outputFormats[i].selected = false;
+      outputFormats[i].disabled = true;
+    }
+  } else if ($('#inputformat').val() == 'dbcls.catanns.json') {
+    if (outputFormats[i].value == 'rdf.bh12.sio') {
+      outputFormats[i].selected = true;
+      outputFormats[i].disabled = false;
+    } else {
+      outputFormats[i].selected = false;
+      outputFormats[i].disabled = true;
+    }
+  } else if ($('#inputformat').val() == 'uk.ac.man.pdfx') {
+    if (outputFormats[i].value == 'rdf.bh12.sio') {
+      outputFormats[i].selected = true;
+      outputFormats[i].disabled = false;
+    } else {
+      outputFormats[i].selected = false;
+      outputFormats[i].disabled = true;
+    }
+  } else {
+    // Woopsie.
+  }
+}
+function pasteExample() {
+  if ($('#inputformat').val() == 'biointerchange.gff3') {
+    $('#metainput').val(
+      "{\n" +
+      "  \"input\" : \"biointerchange.gff3\",\n" +
+      "  \"output\" : \"rdf.biointerchange.gff3\",\n" +
+      "  \"name\" : \"Peter Smith\",\n" +
+      "  \"name_id\" : \"peter.smith@some.example.domain\",\n" +
+      "  \"date\" : \"2012-07-19\"\n" +
+      "}\n"
+    );
+    $('#maininput').val(
+      "ChrX.38\tbovine_complete_cds_gmap_perfect\tgene\t15870\t16254\t.\t+\t.\tID=BC109609_ChrX.38\n" +
+      "ChrX.38\tbovine_complete_cds_gmap_perfect\tRNA\t15870\t16254\t.\t+\t.\tID=bovine_complete_cds_gmap_perfect_BC109609_ChrX.38;Parent=BC109609_ChrX.38\n" +
+      "ChrX.38\tbovine_complete_cds_gmap_perfect\tCDS\t15870\t16254\t.\t+\t0\tParent=bovine_complete_cds_gmap_perfect_BC109609_ChrX.38\n" +
+      "ChrX.38\tbovine_complete_cds_gmap_perfect\texon\t15870\t16254\t.\t+\t0\tParent=bovine_complete_cds_gmap_perfect_BC109609_ChrX.38\n"
+    );
+  } else if ($('#inputformat').val() == 'dbcls.catanns.json') {
+    $('#metainput').val(
+      "{\n" +
+      "  \"input\" : \"dbcls.catanns.json\",\n" +
+      "  \"output\" : \"rdf.bh12.sio\",\n" +
+      "  \"name\" : \"Peter Smith\",\n" +
+      "  \"name_id\" : \"peter.smith@some.example.domain\",\n" +
+      "  \"date\" : \"2012-07-19\"\n" +
+      "}\n"
+    );
+    $('#maininput').val(
+      "{\n" +
+      "  \"docurl\" : \"http://www.ncbi.nlm.nih.gov/pubmed/10096561\",\n" +
+      "  \"text\" : \"Stimulation of CD40 on immunogenic human malignant melanomas augments their cytotoxic T lymphocyte-mediated lysis and induces apoptosis.\",\n" +
+      "  \"catanns\" : [\n" +
+      "    {\n" +
+      "      \"annset_id\" : 1,\n" +
+      "      \"begin\" : 15,\n" +
+      "      \"category\" : \"Protein\",\n" +
+      "      \"created_at\" : \"2012-07-18T06:11:50Z\",\n" +
+      "      \"doc_id\" : 9,\n" +
+      "      \"end\" : 19,\n" +
+      "      \"id\" : 110,\n" +
+      "      \"updated_at\" : \"2012-07-18T06:11:50Z\"\n" +
+      "    }\n" +
+      "  ]\n" +
+      "}\n"
+    );
+  } else if ($('#inputformat').val() == 'uk.ac.man.pdfx') {
+    $('#metainput').val(
+      "{\n" +
+      "  \"input\" : \"uk.ac.man.pdfx\",\n" +
+      "  \"output\" : \"rdf.bh12.sio\",\n" +
+      "  \"name\" : \"Peter Smith\",\n" +
+      "  \"name_id\" : \"peter.smith@some.example.domain\",\n" +
+      "  \"date\" : \"2012-07-19\"\n" +
+      "}\n"
+    );
+    $('#maininput').val(
+      "<?xml version='1.0' encoding='UTF-8'?>\n" +
+      "<pdfx xmlns:xsi=\"http://www.w3.org/2001/XMLSchema-instance\" xsi:noNamespaceSchemaLocation=\"http://pdfx.cs.man.ac.uk/static/article-schema.xsd\">\n" +
+      "  <meta>\n" +
+      "    <job>b85333761ae8955f8abcb6a067628f50faf58fc247d92e8aaa604991e28ccfe1</job>\n" +
+      "  </meta>\n" +
+      "  <article>\n" +
+      "    <front class=\"DoCO:FrontMatter\">\n" +
+      "      <region class=\"unknown\" id=\"1\">Volume et Galante 2007 Research al. 8, Issue 3, Article R40</region>\n" +
+      "      <title-group>\n" +
+      "        <article-title class=\"DoCO:Title\" id=\"2\">Sense-antisense pairs in mammals: functional and evolutionary</article-title>\n" +
+      "      </title-group>\n" +
+      "      <abstract class=\"DoCO:Abstract\" id=\"20\">Background: A significant number of genes in mammalian genomes are being found to have natural antisense transcripts (NATs). These sense-antisense (S-AS) pairs are believed to be involved in several cellular phenomena. Results: Here, we generated a catalog of S-AS pairs occurring in the human and mouse genomes by analyzing different sources of expressed sequences available in the public domain plus 122 massively parallel signature sequencing (MPSS) libraries from a variety of human and mouse tissues. Using this dataset of almost 20,000 S-AS pairs in both genomes we investigated, in a computational and experimental way, several putative roles that have been assigned to NATs, including gene expression regulation. Furthermore, these global analyses allowed us to better dissect and propose new roles for NATs. Surprisingly, we found that a significant fraction of NATs are artifacts produced by genomic priming during cDNA library construction. Conclusion: We propose an evolutionary and functional model in which alternative polyadenylation and retroposition account for the origin of a significant number of functional S-AS pairs in mammalian genomes.</abstract>\n" +
+      "    </front>\n" +
+      "    <body class=\"DoCO:BodyMatter\">\n" +
+      "      <section class=\"deo:Results\">\n" +
+      "        <h1 class=\"DoCO:SectionTitle\" id=\"34\" page=\"2\" column=\"2\">Results and discussion</h1>\n" +
+      "        <region class=\"DoCO:TextChunk\" id=\"151\" page=\"2\" column=\"2\">Overall distribution of S-AS pairs in human and mouse genomes To identify transcripts that derive from opposite strands of the same locus, we used a modified version of an in-house knowledgebase previously described for humans [26-28]. This knowledgebase contains more than 6 million expressed sequences mapped onto the human genome sequence and clustered in approximately 111,000 groups. [...]</region>\n" +
+      "      </section>\n" +
+      "    </body>\n" +
+      "  </article>\n" +
+      "</pdfx>\n"
+    );
+  } else {
+    $('#metainput').val('');
+    $('#maininput').val('');
+  }
+}