intermine-bio 0.98.1

data/Gemfile ADDED
@@ -0,0 +1,4 @@
1
+ source "http://rubygems.org"
2
+
3
+ # Specify your gem's dependencies in intermine.gemspec
4
+ gemspec
data/LICENCE ADDED
@@ -0,0 +1,165 @@
1
+ GNU LESSER GENERAL PUBLIC LICENSE
2
+ Version 3, 29 June 2007
3
+
4
+ Copyright (C) 2007 Free Software Foundation, Inc. <http://fsf.org/>
5
+ Everyone is permitted to copy and distribute verbatim copies
6
+ of this license document, but changing it is not allowed.
7
+
8
+
9
+ This version of the GNU Lesser General Public License incorporates
10
+ the terms and conditions of version 3 of the GNU General Public
11
+ License, supplemented by the additional permissions listed below.
12
+
13
+ 0. Additional Definitions.
14
+
15
+ As used herein, "this License" refers to version 3 of the GNU Lesser
16
+ General Public License, and the "GNU GPL" refers to version 3 of the GNU
17
+ General Public License.
18
+
19
+ "The Library" refers to a covered work governed by this License,
20
+ other than an Application or a Combined Work as defined below.
21
+
22
+ An "Application" is any work that makes use of an interface provided
23
+ by the Library, but which is not otherwise based on the Library.
24
+ Defining a subclass of a class defined by the Library is deemed a mode
25
+ of using an interface provided by the Library.
26
+
27
+ A "Combined Work" is a work produced by combining or linking an
28
+ Application with the Library. The particular version of the Library
29
+ with which the Combined Work was made is also called the "Linked
30
+ Version".
31
+
32
+ The "Minimal Corresponding Source" for a Combined Work means the
33
+ Corresponding Source for the Combined Work, excluding any source code
34
+ for portions of the Combined Work that, considered in isolation, are
35
+ based on the Application, and not on the Linked Version.
36
+
37
+ The "Corresponding Application Code" for a Combined Work means the
38
+ object code and/or source code for the Application, including any data
39
+ and utility programs needed for reproducing the Combined Work from the
40
+ Application, but excluding the System Libraries of the Combined Work.
41
+
42
+ 1. Exception to Section 3 of the GNU GPL.
43
+
44
+ You may convey a covered work under sections 3 and 4 of this License
45
+ without being bound by section 3 of the GNU GPL.
46
+
47
+ 2. Conveying Modified Versions.
48
+
49
+ If you modify a copy of the Library, and, in your modifications, a
50
+ facility refers to a function or data to be supplied by an Application
51
+ that uses the facility (other than as an argument passed when the
52
+ facility is invoked), then you may convey a copy of the modified
53
+ version:
54
+
55
+ a) under this License, provided that you make a good faith effort to
56
+ ensure that, in the event an Application does not supply the
57
+ function or data, the facility still operates, and performs
58
+ whatever part of its purpose remains meaningful, or
59
+
60
+ b) under the GNU GPL, with none of the additional permissions of
61
+ this License applicable to that copy.
62
+
63
+ 3. Object Code Incorporating Material from Library Header Files.
64
+
65
+ The object code form of an Application may incorporate material from
66
+ a header file that is part of the Library. You may convey such object
67
+ code under terms of your choice, provided that, if the incorporated
68
+ material is not limited to numerical parameters, data structure
69
+ layouts and accessors, or small macros, inline functions and templates
70
+ (ten or fewer lines in length), you do both of the following:
71
+
72
+ a) Give prominent notice with each copy of the object code that the
73
+ Library is used in it and that the Library and its use are
74
+ covered by this License.
75
+
76
+ b) Accompany the object code with a copy of the GNU GPL and this license
77
+ document.
78
+
79
+ 4. Combined Works.
80
+
81
+ You may convey a Combined Work under terms of your choice that,
82
+ taken together, effectively do not restrict modification of the
83
+ portions of the Library contained in the Combined Work and reverse
84
+ engineering for debugging such modifications, if you also do each of
85
+ the following:
86
+
87
+ a) Give prominent notice with each copy of the Combined Work that
88
+ the Library is used in it and that the Library and its use are
89
+ covered by this License.
90
+
91
+ b) Accompany the Combined Work with a copy of the GNU GPL and this license
92
+ document.
93
+
94
+ c) For a Combined Work that displays copyright notices during
95
+ execution, include the copyright notice for the Library among
96
+ these notices, as well as a reference directing the user to the
97
+ copies of the GNU GPL and this license document.
98
+
99
+ d) Do one of the following:
100
+
101
+ 0) Convey the Minimal Corresponding Source under the terms of this
102
+ License, and the Corresponding Application Code in a form
103
+ suitable for, and under terms that permit, the user to
104
+ recombine or relink the Application with a modified version of
105
+ the Linked Version to produce a modified Combined Work, in the
106
+ manner specified by section 6 of the GNU GPL for conveying
107
+ Corresponding Source.
108
+
109
+ 1) Use a suitable shared library mechanism for linking with the
110
+ Library. A suitable mechanism is one that (a) uses at run time
111
+ a copy of the Library already present on the user's computer
112
+ system, and (b) will operate properly with a modified version
113
+ of the Library that is interface-compatible with the Linked
114
+ Version.
115
+
116
+ e) Provide Installation Information, but only if you would otherwise
117
+ be required to provide such information under section 6 of the
118
+ GNU GPL, and only to the extent that such information is
119
+ necessary to install and execute a modified version of the
120
+ Combined Work produced by recombining or relinking the
121
+ Application with a modified version of the Linked Version. (If
122
+ you use option 4d0, the Installation Information must accompany
123
+ the Minimal Corresponding Source and Corresponding Application
124
+ Code. If you use option 4d1, you must provide the Installation
125
+ Information in the manner specified by section 6 of the GNU GPL
126
+ for conveying Corresponding Source.)
127
+
128
+ 5. Combined Libraries.
129
+
130
+ You may place library facilities that are a work based on the
131
+ Library side by side in a single library together with other library
132
+ facilities that are not Applications and are not covered by this
133
+ License, and convey such a combined library under terms of your
134
+ choice, if you do both of the following:
135
+
136
+ a) Accompany the combined library with a copy of the same work based
137
+ on the Library, uncombined with any other library facilities,
138
+ conveyed under the terms of this License.
139
+
140
+ b) Give prominent notice with the combined library that part of it
141
+ is a work based on the Library, and explaining where to find the
142
+ accompanying uncombined form of the same work.
143
+
144
+ 6. Revised Versions of the GNU Lesser General Public License.
145
+
146
+ The Free Software Foundation may publish revised and/or new versions
147
+ of the GNU Lesser General Public License from time to time. Such new
148
+ versions will be similar in spirit to the present version, but may
149
+ differ in detail to address new problems or concerns.
150
+
151
+ Each version is given a distinguishing version number. If the
152
+ Library as you received it specifies that a certain numbered version
153
+ of the GNU Lesser General Public License "or any later version"
154
+ applies to it, you have the option of following the terms and
155
+ conditions either of that published version or of any later version
156
+ published by the Free Software Foundation. If the Library as you
157
+ received it does not specify a version number of the GNU Lesser
158
+ General Public License, you may choose any version of the GNU Lesser
159
+ General Public License ever published by the Free Software Foundation.
160
+
161
+ If the Library as you received it specifies that a proxy can decide
162
+ whether future versions of the GNU Lesser General Public License shall
163
+ apply, that proxy's public statement of acceptance of any version is
164
+ permanent authorization for you to choose that version for the
165
+ Library.
@@ -0,0 +1,50 @@
1
+ = Biological Extensions to the InterMine Webservice Client Library
2
+
3
+ This library is a set of extensions to the InterMine Webservices client,
4
+ providing access for data in biological formats. It directly extends the
5
+ InterMine classes, providing extra methods to the Query class.
6
+
7
+ == Example
8
+
9
+ Get the sequences of all proteins for the genes "h", "r", "eve", "bib" and "zen":
10
+
11
+ require "rubygems"
12
+ require "intermine/service"
13
+ require "intermine/bio"
14
+
15
+ s = Service.new("www.flymine.org/query")
16
+
17
+ puts s.query("Gene").select("proteins").where(:symbol => %w{h r eve bib zen}).fasta
18
+
19
+ Process the locations of these genes one at a time:
20
+
21
+ s.query.select("Gene").where(:symbol => %w{h r eve bib zen}).bed do |line|
22
+ process line
23
+ end
24
+
25
+ == Who is this for?
26
+
27
+ InterMine data warehouses are typically constructed to hold
28
+ biological data, and as this library facilitates programmatic
29
+ access to these data, this library is primarily aimed at
30
+ bioinformaticians. In particular, users of the following services
31
+ may find it especially useful:
32
+ * FlyMine (http://www.flymine.org/query)
33
+ * YeastMine (http://yeastmine.yeastgenome.org/yeastmine)
34
+ * RatMine (http://ratmine.mcw.edu/ratmine)
35
+ * modMine (http://intermine.modencode.org/release-23)
36
+ * metabolicMine (http://www.metabolicmine.org/beta)
37
+
38
+ These extensions are aimed at bioinformaticians looking to integrate
39
+ these sources of data into other workflows.
40
+
41
+ For details on constructing queries, see the InterMine documentation.
42
+
43
+ == Support
44
+
45
+ Support is available on our development mailing list: dev@intermine.org
46
+
47
+ == License
48
+
49
+ This code is Open Source under the LGPL. Source code for all InterMine code
50
+ can be checked out from svn://subversion.flymine.org/flymine
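The README examples above use the same method in two ways: without a block it returns the whole result as a string, and with a block it yields one line at a time. A minimal sketch of that calling pattern follows, assuming a plain array of lines in place of the web service; `bed_like` and its `lines` argument are hypothetical stand-ins, not part of the library.

```ruby
# Hypothetical stand-in for Query#bed: `lines` replaces the results reader.
def bed_like(lines)
  if block_given?
    header = String.new
    lines.each do |line|
      if line =~ /^\s*(#|track)/
        header << line # comment and track lines are collected, not yielded
      else
        yield line     # feature lines go to the caller one at a time
      end
    end
    header
  else
    lines.join         # no block: return everything as one string
  end
end

lines = [
  "# UCSC BED format\n",
  "track name=example\n",
  "chr2R 5866745 5868284 eve 0 +\n"
]

features = []
header = bed_like(lines) { |line| features << line }
puts "#{features.size} feature line(s)"
puts header
puts bed_like(lines)
```

In block mode the return value is the header text, so a caller can stream features and still inspect the track definition afterwards.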
@@ -0,0 +1,62 @@
1
+ require 'rubygems'
2
+ require 'rubygems/specification' unless defined?(Gem::Specification)
3
+ require 'rake/testtask'
4
+ require 'rake/gempackagetask'
5
+
6
+ gem 'rdoc', '=2.1.0'
7
+ require 'rdoc/rdoc'
8
+ require 'rake/rdoctask'
9
+
10
+ gem 'darkfish-rdoc'
11
+ require 'darkfish-rdoc'
12
+
13
+ def gemspec
14
+ @gemspec ||= begin
15
+ Gem::Specification.load(File.expand_path('intermine-bio.gemspec'))
16
+ end
17
+ end
18
+
19
+ task :default => :test
20
+
21
+ desc 'Start a console session'
22
+ task :console do
23
+ system 'irb -I lib -r rubygems -r intermine/service -r intermine/bio'
24
+ end
25
+
26
+ desc 'Displays the current version'
27
+ task :version do
28
+ puts "Current version: #{gemspec.version}"
29
+ end
30
+
31
+ desc 'Installs the gem locally'
32
+ task :install => :package do
33
+ sh "gem install pkg/#{gemspec.name}-#{gemspec.version}"
34
+ end
35
+
36
+ desc 'Release the gem'
37
+ task :release => :package do
38
+ sh "gem push pkg/#{gemspec.name}-#{gemspec.version}.gem"
39
+ end
40
+
41
+ Rake::GemPackageTask.new(gemspec) do |pkg|
42
+ pkg.need_zip = true
43
+ pkg.need_tar = true
44
+
45
+ end
46
+
47
+ Rake::TestTask.new do |t|
48
+ t.libs << "test"
49
+ t.test_files = FileList['test/unit_tests.rb']
50
+ t.verbose = true
51
+ t.warning = true
52
+ end
53
+
54
+ Rake::RDocTask.new do |t|
55
+ t.title = 'Bio Extensions to the InterMine Webservice Client'
56
+ t.rdoc_files.include 'README.rdoc'
57
+ t.rdoc_files.include 'lib/**/*rb'
58
+ t.main = 'README.rdoc'
59
+ t.options += ['-SHN', '-f', 'darkfish']
60
+ end
61
+
62
+
@@ -0,0 +1,10 @@
1
+ #
2
+ # [author] Alex Kalderimis dev@intermine.org
3
+ # [homepage] http://www.intermine.org
4
+ # [Licence] Copyright (C) 2002-2011 FlyMine
5
+ #
6
+ # This code may be freely distributed and modified under the
7
+ # terms of the GNU Lesser General Public Licence. This should
8
+ # be distributed with the code. See the LICENCE file for more
9
+ # information or http://www.gnu.org/copyleft/lesser.html.
10
+ #
@@ -0,0 +1,221 @@
1
+ require "net/http"
2
+
3
+ # == Biologically specific Extensions to the InterMine Data-Warehousing Webservices
4
+ #
5
+ # These modules provide interfaces to the biologically specific elements of the
6
+ # InterMine webservices. At present this consists of the ability to return results
7
+ # from queries in Biologically specific formats (GFF3, UCSC-BED, FASTA). The
8
+ # methods for accessing these formats provide mechanisms for iterating over the contents
9
+ # in logical chunks.
10
+ #
11
+ #:include:contact_header.rdoc
12
+ #
13
+ module InterMine
14
+
15
+ # The Metadata for these extensions
16
+ #
17
+ #:include:contact_header.rdoc
18
+ #
19
+ module Bio
20
+
21
+ # The library version
22
+ VERSION = "0.98.1"
23
+
24
+ # The library name
25
+ NAME = "intermine-bio"
26
+
27
+ # The project's homepage
28
+ HOMEPAGE = "http://www.intermine.org"
29
+
30
+ # The authors of this library
31
+ AUTHORS = ["Alex Kalderimis"]
32
+
33
+ # An email address to seek support at
34
+ EMAIL = ["dev@intermine.org"]
35
+ end
36
+
37
+ # Extensions to the PathQuery module.
38
+ #
39
+ #:include:contact_header.rdoc
40
+ #
41
+ module PathQuery
42
+
43
+ class BioError < RuntimeError
44
+ end
45
+
46
+ # Biologically specific Extensions to the Query class.
47
+ #
48
+ # These methods provide mechanisms for accessing results in Biologically
49
+ # appropriate formats, being at present GFF3, FASTA, and UCSC-BED.
50
+ #
51
+ # The methods provided here can be used to both return the data as a string,
52
+ # and to iterate over the data in logical chunks, appropriate to the format in question.
53
+ #
54
+ #:include:contact_header.rdoc
55
+ #
56
+ class Query
57
+
58
+ # Return the results from this query as GFF3
59
+ #
60
+ # If a block is given, each line of the GFF3 will be
61
+ # yielded in turn, omitting any header lines, which are
62
+ # returned at the end of the iteration, otherwise
63
+ # the content of the GFF3 results will be returned as one string.
64
+ #
65
+ # header = query.gff3 do |line|
66
+ # process line
67
+ # end
68
+ #
69
+ # puts query.gff3
70
+ #
71
+ def gff3
72
+ if block_given?
73
+ header = ""
74
+ results_reader.each_gff3 do |gff3|
75
+ if gff3.start_with? "#"
76
+ header << gff3
77
+ else
78
+ yield gff3
79
+ end
80
+ end
81
+ return header
82
+ else
83
+ buffer = ""
84
+ results_reader.each_gff3 do |gff3|
85
+ buffer.concat(gff3)
86
+ end
87
+ return buffer
88
+ end
89
+ end
90
+
91
+ # Return the results from this query as FASTA
92
+ #
93
+ # If a block is given, each FASTA record will be yielded
94
+ # in turn, and the query will be returned, otherwise
95
+ # the content of the FASTA results will be returned as one string.
96
+ #
97
+ # query.fasta do |record|
98
+ # process record
99
+ # end
100
+ #
101
+ # puts query.fasta
102
+ #
103
+ def fasta # :yields: record
104
+ if block_given?
105
+ buffer = nil
106
+ results_reader.each_fasta do |line|
107
+ if line.start_with? ">"
108
+ yield buffer unless buffer.nil?
109
+ buffer = line
110
+ else
111
+ raise BioError, "Incorrect fasta - no header line" if buffer.nil?
112
+ buffer << line
113
+ end
114
+ end
115
+ yield buffer unless buffer.nil?
116
+ return self
117
+ else
118
+ buffer = ""
119
+ results_reader.each_fasta do |line|
120
+ buffer.concat(line)
121
+ end
122
+ return buffer
123
+ end
124
+ end
125
+
126
+ # Return the results from this query as BED
127
+ #
128
+ # If a block is given, each line of the BED results will be
129
+ # yielded in turn, and the header will be returned, otherwise
130
+ # the content of the BED results will be returned as one string.
131
+ #
132
+ # If the optional parameter is set to false, then the "chr" prefix on
133
+ # the chromosome id will be omitted.
134
+ #
135
+ # header = query.bed do |line|
136
+ # process line
137
+ # end
138
+ #
139
+ # puts query.bed(false)
140
+ #
141
+ def bed(ucscCompatible=true)
142
+ if block_given?
143
+ header = ""
144
+ results_reader.each_bed(ucscCompatible) do |bed|
145
+ if bed =~ /^\s*(#|track)/
146
+ header << bed
147
+ else
148
+ yield bed
149
+ end
150
+ end
151
+ return header
152
+ else
153
+ buffer = ""
154
+ results_reader.each_bed(ucscCompatible) do |bed|
155
+ buffer.concat(bed)
156
+ end
157
+ return buffer
158
+ end
159
+ end
160
+ end
161
+ end
162
+
163
+ # Extensions to the Result processing code
164
+ #
165
+ #:include:contact_header.rdoc
166
+ #
167
+ module Results
168
+
169
+ # Extensions to the ResultsReader object
170
+ #
171
+ # These methods provide mechanisms for accessing the results in raw
172
+ # format, and iterating over them line by line in a memory efficient
173
+ # manner. They are in no way content aware.
174
+ #
175
+ #:include:contact_header.rdoc
176
+ #
177
+ class ResultsReader
178
+
179
+ # The path to use to get the resource paths for the query variants
180
+ RESOURCE_PATH = "/check/"
181
+
182
+ # Yield results as GFF3
183
+ def each_gff3
184
+ adjust_path(:gff3)
185
+ each_line(params("gff3")) do |line|
186
+ yield line
187
+ end
188
+ end
189
+
190
+ # Yield results as fasta
191
+ def each_fasta
192
+ adjust_path(:fasta)
193
+ each_line(params("fasta")) do |line|
194
+ yield line
195
+ end
196
+ end
197
+
198
+ # Yield results as UCSC-BED
199
+ def each_bed(ucscCompatible=true)
200
+ adjust_path(:bed)
201
+ p = params("bed")
202
+ p["ucscCompatible"] = "no" unless ucscCompatible
203
+ each_line(p) do |line|
204
+ yield line
205
+ end
206
+ end
207
+
208
+ # Adjust the path of this query to suit the currently selected format.
209
+ def adjust_path(variant)
210
+ @resources ||= {}
211
+ root = @query.service.root
212
+ uri = URI.parse(root + RESOURCE_PATH + "query." + variant.to_s)
213
+ @resources[variant] ||= Net::HTTP.get(uri.host, uri.path)
214
+ path = @resources[variant]
215
+ @uri = URI.parse(root + path)
216
+ end
217
+
218
+ end
219
+ end
220
+ end
221
+
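The block form of `fasta` in the file above groups raw lines into records: a line starting with ">" opens a new record, and every following line is appended to it until the next header. The sketch below isolates that grouping so it can be run without a web service. `each_fasta_record` is a hypothetical helper, it reads from a `StringIO` rather than a results reader, and it raises `ArgumentError` where the library raises `BioError`.

```ruby
require "stringio"

# Hypothetical helper (not part of intermine-bio): groups FASTA lines into
# records in the same way as the block branch of Query#fasta.
def each_fasta_record(io)
  buffer = nil
  io.each_line do |line|
    if line.start_with?(">")
      yield buffer unless buffer.nil? # emit the previous record, if any
      buffer = line.dup
    else
      raise ArgumentError, "Incorrect fasta - no header line" if buffer.nil?
      buffer << line
    end
  end
  yield buffer unless buffer.nil?     # emit the final record
end

sample = StringIO.new(">seq1 1\nMSSV\nMHYY\n>seq2 2\nMAST\n")
records = []
each_fasta_record(sample) { |record| records << record }
puts records.length
puts records.first
```

The final `yield` after the loop matters: without it the last record in the stream would never reach the caller.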
@@ -0,0 +1,8 @@
1
+ # UCSC BED format
2
+ # Source: FlyMine
3
+ # Genome Build: dm3
4
+ track name=FlyMine_newtpreview_Custom_Track description="FlyMine newtpreview Custom Track" useScore=0
5
+ chr2R 5866745 5868284 eve 0 +
6
+ chr2R 5866745 5867058 eve:1 0 +
7
+ chr2R 5866745 5868284 eve-RA 0 +
8
+ chr2R 5867129 5868284 eve:2 0 +
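Lines like those in the test data above can be split into header and feature records with the same regular expression that `Query#bed` uses. This is a sketch only: `BedFeature` and `parse_bed` are hypothetical names, and fields are split on any whitespace on the assumption that the service's column separator may not have survived this rendering.

```ruby
# Hypothetical record type for the six BED columns shown in the test data.
BedFeature = Struct.new(:chrom, :start, :stop, :name, :score, :strand)

def parse_bed(text)
  header = []
  features = []
  text.each_line do |line|
    next if line.strip.empty?
    if line =~ /^\s*(#|track)/
      header << line.chomp # comment and track definition lines
    else
      chrom, start, stop, name, score, strand = line.split
      features << BedFeature.new(chrom, Integer(start), Integer(stop),
                                 name, Integer(score), strand)
    end
  end
  [header, features]
end

sample = <<~BED
  # UCSC BED format
  track name=example useScore=0
  chr2R 5866745 5868284 eve 0 +
  chr2R 5866745 5867058 eve:1 0 +
BED

header, features = parse_bed(sample)
features.each { |f| puts "#{f.name}: #{f.stop - f.start} bp" }
```

BED start coordinates are zero-based and the end is exclusive, so the feature length is simply `stop - start`.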
@@ -0,0 +1,84 @@
1
+ >ZEN1_DROME 362011840
2
+ MSSVMHYYPVHQAKVGSYSADPSEVKYSDLIYGHHHDVNPIGLPPNYNQMNSNPTTLNDH
3
+ CSPQHVHQQHVSSDENLPSQPNHDSQRVKLKRSRTAFTSVQLVELENEFKSNMYLYRTRR
4
+ IEIAQRLSLCERQVKIWFQNRRMKFKKDIQGHREPKSNAKLAQPQAEQSAHRGIVKRLMS
5
+ YSQDPREGTAAAEKRPMMAVAPVNPKPDYQASQKMKTEASTNNGMCSSADLSEILEHLAQ
6
+ TTAAPQVSTATSSTGTSTNSASSSSSGHYSYNVDLVLQSIKQDLEAAAQAWSKSKSAPIL
7
+ ATQSWHPSSQSQVPTSVHAAPSMNLSWGEPAAKSRKLSVNHMNPCVTSYNYPN
8
+ >PYR1_DROME 362022982
9
+ MASTDCYLALEDGTVLPGYSFGYVPSENESKVGFGGEVVFQTGMVGYTEALTDRSYSGQI
10
+ LVLTYPLIGNYGVPAPDEDEHGLPLHFEWMKGVVQATALVVGEVAEEAFHWRKWKTLPDW
11
+ LKQHKVPGIQDIDTRALTKKLREQGSMLGKIVYEKPPVEGLPKSSFVDPNVRNLAKECSV
12
+ KERQVYGNPNGKGPRIAILDCGLKLNQLRCLLQRGASVTLLPWSARLEDEQFDALFLSNG
13
+ PGNPESCDQIVQQVRKVIEEGQKPVFGICLGHQLLAKAIGCSTYKMKYGNRGHNLPCLHR
14
+ ATGRCLMTSQNHGYAVDLEQLPDGWSELFVNANDGTNEGIVHASKPYFSVQFHPEHHAGP
15
+ QDTEFLFDVFMESIQQKDLTIPQLIEQRLRPTTPAIDSAPVMPRKVLILGSGGLSIGQAG
16
+ EFDYSGSQAIKAMRESNIQTVLINPNIATVQTSKGMADKCYFLPLTPHYVEQVIKSERPN
17
+ GVLLTFGGQTALNCGVQLERAGVFSKYNVRILGTPIQSIIETEDRKLFAERVNEIGEQVA
18
+ PSEAVYSVAQALDAASRLGYPVMARAAFSLGGLGSGFANNEEELQSLAQQALAHSSQLIV
19
+ DKSLKGWKEVEYEVVRDAYNNCITVCNMENFDPLGIHTGESIVVAPSQTLSDREYQMLRS
20
+ TALKVIRHFGVVGECNIQYALCPHSEQYYIIEVNARLSRSSALASKATGYPLAYVAAKLA
21
+ LGLPLPDIKNSVTGNTTACFEPSLDYCVVKIPRWDLAKFVRVSKHIGSSMKSVGEVMAIG
22
+ RNFEEAFQKALRMVDSDVLGFDPDVVPLNKEQLAEQLSEPTDRRPFVIAAALQLGMSLRE
23
+ LHQLTNIDYWFLEKLERIILLQSLLTRNGSRTDAALLLKAKRFGFSDKQIAKYIKSTELA
24
+ VRHQRQEFGIRPHVKQIDTVAGEWPASTNYLYHTYNGSEHDVDFPGGHTIVVGSGVYRIG
25
+ SSVEFDWCAVGCLRELRKLQRPTIMINYNPETVSTDYDMCDRLYFEEISFEVVMDIYEME
26
+ NSEGIILSMGGQLPNNIAMDLHRQQAKVLGTSPESIDCAENRFKFSRMLDRKGILQPRWK
27
+ ELTNLQSAIEFCEEVGYPCLVRPSYVLSGAAMNVAYSNQDLETYLNAASEVSREHPVVIS
28
+ KFLTEAKEIDVDAVASDGRILCMAVSEHVENAGVHSGDATLVTPPQDLNAETLEAIKRIT
29
+ CDLASVLDVTGPFNMQLIAKNNELKVIECNVRVSRSFPFVSKTLDHDFVATATRAIVGLD
30
+ VEPLDVLHGVGKVGVKVPQFSFSRLAGADVQLGVEMASTGEVACFGDNRYEAYLKAMMST
31
+ GFQIPKNAVLLSIGSFKHKMELLPSIRDLAKMGYKLYASMGTGDFYAEHGVNVESVQWTF
32
+ DKTTPDDINGELRHLAEFLANKQFDLVINLPMSGGGARRVSSFMTHGYRTRRLAVDYSIP
33
+ LVTDVKCTKLLVESMRMNGGKPPMKTHTDCMTSRRIVKLPGFIDVHVHLREPGATHKEDF
34
+ ASGTAAALAGGVTLVCAMPNTNPSIVDRETFTQFQELAKAGARCDYALYVGASDDNWAQV
35
+ NELASHACGLKMYLNDTFGTLKLSDMTSWQRHLSHWPKRSPIVCHAERQSTAAVIMLAHL
36
+ LDRSVHICHVARKEEIQLIRSAKEKGVKVTCEVCPHHLFLSTKDVERLGHGMSEVRPLLC
37
+ SPEDQEALWENIDYIDVFATDHAPHTLAEKRSERPPPGFPGVETILPLLLQAVHEGRLTM
38
+ EDIKRKFHRNPKIIFNLPDQAQTYVEVDLDEEWTITGNEMKSKSGWTPFEGTKVKGRVHR
39
+ VVLRGEVAFVDGQVLVQPGFGQNVRPKQSPLASEASQDLLPSDNDANDTFTRLLTSEGPG
40
+ GGVHGISTKVHFVDGANFLRPNSPSPRIRLDSASNTTLREYLQRTTNSNPVAHSLMGKHI
41
+ LAVDMFNKDHLNDIFNLAQLLKLRGTKDRPVDELLPGKIMASVFYEVSTRTQCSFAAAML
42
+ RLGGRVISMDNITSSVKKGESLEDSIKVVSSYADVVVLRHPSPGAVARAATFSRKPLINA
43
+ GDGVGEHPTQALLDIFTIREEFGTVNGLTITMVGDLKNGRTVHSLARLLTLYNVNLQYVA
44
+ PNSLQMPDEVVQFVHQRGVKQLFARDLKNVLPDTDVLYMTRIQRERFDNVEDYEKCCGHL
45
+ VLTPEHMMRAKKRSIVLHPLPRLNEISREIDSDPRAAYFRQAEYGMYIRMALLAMVVGGR
46
+ NTAL
47
+ >FBpp0088674 362022983
48
+ MASTDCYLALEDGTVLPGYSFGYVPSENESKVGFGGEVVFQTGMVGYTEALTDRSYSGQI
49
+ LVLTYPLIGNYGVPAPDEDEHGLPLHFEWMKGVVQATALVVGEVAEEAFHWRKWKTLPDW
50
+ LKQHKVPGIQDIDTRALTKKLREQGSMLGKIVYEKPPVEGLPKSSFVDPNVRNLAKECSV
51
+ KERQVYGNPNGKGPRIAILDCGLKLNQLRCLLQRGASVTLLPWSARLEDEQFDALFLSNG
52
+ PGNPESCDQIVQQVRKVIEEGQKPVFGICLGHQLLAKAIGCSTYKMKYGNRGHNLPCLHR
53
+ ATGRCLMTSQNHGYAVDLEQLPDGWSELFVNANDGTNEGIVHASKPYFSVQFHPEHHAGP
54
+ QDTEFLFDVFMESIQQKDLTIPQLIEQRLRPTTPAIDSAPVMPRKVLILGSGGLSIGQAG
55
+ EFDYSGSQAIKAMRESNIQTVLINPNIATVQTSKGMADKCYFLPLTPHYVEQVIKSERPN
56
+ GVLLTFGGQTALNCGVQLERAGVFSKYNVRILGTPIQSIIETEDRKLFAERVNEIGEQVA
57
+ PSEAVYSVAQALDAASRLGYPVMARAAFSLGGLGSGFANNEEELQSLAQQALAHSSQLIV
58
+ DKSLKGWKEVEYEVVRDAYNNCITVCNMENFDPLGIHTGESIVVAPSQTLSDREYQMLRS
59
+ TALKVIRHFGVVGECNIQYALCPHSEQYYIIEVNARLSRSSALASKATGYPLAYVAAKLA
60
+ LGLPLPDIKNSVTGNTTACFEPSLDYCVVKIPRWDLAKFVRVSKHIGSSMKSVGEVMAIG
61
+ RNFEEAFQKALRMVDSDVLGFDPDVVPLNKEQLAEQLSEPTDRRPFVIAAALQLGMSLRE
62
+ LHQLTNIDYWFLEKLERIILLQSLLTRNGSRTDAALLLKAKRFGFSDKQIAKYIKSTELA
63
+ VRHQRQEFGIRPHVKQIDTVAGEWPASTNYLYHTYNGSEHDVDFPGGHTIVVGSGVYRIG
64
+ SSVEFDWCAVGCLRELRKLQRPTIMINYNPETVSTDYDMCDRLYFEEISFEVVMDIYEME
65
+ NSEGIILSMGGQLPNNIAMDLHRQQAKVLGTSPESIDCAENRFKFSRMLDRKGILQPRWK
66
+ ELTNLQSAIEFCEEVGYPCLVRPSYVLSGAAMNVAYSNQDLETYLNAASEVSREHPVVIS
67
+ KFLTEAKEIDVDAVASDGRILCMAVSEHVENAGVHSGDATLVTPPQDLNAETLEAIKRIT
68
+ CDLASVLDVTGPFNMQLIAKNNELKVIECNVRVSRSFPFVSKTLDHDFVATATRAIVGLD
69
+ VEPLDVLHGVGKVGVKVPQFSFSRLAGADVQLGVEMASTGEVACFGDNRYEAYLKAMMST
70
+ GFQIPKNAVLLSIGSFKHKMELLPSIRDLAKMGYKLYASMGTGDFYAEHGVNVSIFMSVC
71
+ LLYIMQL
72
+ >BIB_DROME 362030490
73
+ MADESLHTVPLEHNIDYHIVTLFERLEAMRKDSHGGGHGVNNRLSSTLQAPKRSMQAEIR
74
+ TLEFWRSIISECLASFMYVFIVCGAAAGVGVGASVSSVLLATALASGLAMATLTQCFLHI
75
+ SGAHINPAVTLALCVVRSISPIRAAMYITAQCGGGIAGAALLYGVTVPGYQGNLQAAISH
76
+ SAALAAWERFGVEFILTFLVVLCYFVSTDPMKKFMGNSAASIGCAYSACCFVSMPYLNPA
77
+ RSLGPSFVLNKWDSHWVYWFGPLVGGMASGLVYEYIFNSRNRNLRHNKGSIDNDSSSIHS
78
+ EDELNYDMDMEKPNKYQQSQGTYPRGQSNGNGGGQAAGNGQHQAANMGQMPGVVANAGQG
79
+ NYCQNLYTAPPLSSKYDQQQEPLYGGTRSLYCRSPTLTRSNLNRSQSVYAKSNTAINRDI
80
+ VPRPGPLVPAQSLYPMRTQQQQQQQQQQQQQVAPAPQSSHLQNQNVQNQMQQRSESIYGM
81
+ RGSMRGQQQPIQQQQQQQQQQQLQQQQPNMGVQQQQMQPPPQMMSDPQQQPQGFQPVYGT
82
+ RTNPTPMDGNHKYDRRDPQQMYGVTGPRNRGQSAQSDDSSYGSYHGSAVTPPARHPSVEP
83
+ SPPPPPMLMYAPPPQPNAAHPQPIRTQSERKVSAPVVVSQPAACAVTYTTSQGSAVTAQQ
84
+ QQQQQQQQQQQQQQQQQQMMMQQQQQHYGMLPLRPN
@@ -0,0 +1,5 @@
1
+ ##gff-version 3
2
+ 2R FlyMine gene 5866746 5868284 . + . ID=FBgn0000606
3
+ 2R FlyMine exon 5866746 5867058 . + . ID=CG2328%3A1;Parent=FBgn0000606
4
+ 2R FlyMine mRNA 5866746 5868284 . + . ID=FBtr0088390;Parent=FBgn0000606
5
+ 2R FlyMine exon 5867130 5868284 . + . ID=CG2328%3A2;Parent=FBgn0000606
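The ninth GFF3 column in the test data above holds `key=value` pairs separated by semicolons, with reserved characters percent-encoded (`%3A` stands for ":"). A short sketch of decoding one such line follows. `parse_gff3_attributes` is a hypothetical helper, and the sample line assumes tab-separated columns, as GFF3 files use, although the rendering above shows spaces.

```ruby
# Hypothetical helper: turns "ID=CG2328%3A1;Parent=FBgn0000606" into a Hash,
# decoding %XX escapes without treating "+" as a space.
def parse_gff3_attributes(column)
  column.split(";").each_with_object({}) do |pair, attrs|
    key, value = pair.split("=", 2)
    attrs[key] = value.to_s.gsub(/%([0-9A-Fa-f]{2})/) { $1.hex.chr }
  end
end

line = "2R\tFlyMine\texon\t5866746\t5867058\t.\t+\t.\tID=CG2328%3A1;Parent=FBgn0000606"
fields = line.chomp.split("\t")
attrs = parse_gff3_attributes(fields[8])

puts fields[2]
puts attrs["ID"]
puts attrs["Parent"]
```

GFF3 coordinates are one-based and inclusive, which is why the start here (5866746) is one greater than the start of the matching BED line (5866745).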
@@ -0,0 +1,67 @@
1
+ $LOAD_PATH << File.expand_path( File.dirname(__FILE__) + '/../lib' )
2
+ require "rexml/document"
3
+ require "test/unit"
4
+
5
+ include Test::Unit::Assertions
6
+
7
+ def compare_xml(a, b)
8
+ require "rexml/document"
9
+ docA = REXML::Document.new(a.to_s)
10
+ docB = REXML::Document.new(b.to_s)
11
+
12
+ a_elems = docA.elements.to_a
13
+ b_elems = docB.elements.to_a
14
+
15
+ (0 ... a_elems.size).each do |idx|
16
+ compare_elements(a_elems[idx], b_elems[idx])
17
+ end
18
+ end
19
+
20
+ private
21
+
22
+ def fail_xml_compare(elemA, elemB, problem, e)
23
+ formatter = REXML::Formatters::Pretty.new
24
+ elemA_str = String.new
25
+ elemB_str = String.new
26
+ formatter.write(elemA, elemA_str)
27
+ formatter.write(elemB, elemB_str)
28
+ first_part = "#{elemA_str}\nis not equal to\n#{elemB_str}\n"
29
+
30
+ raise Test::Unit::AssertionFailedError, "#{first_part}because #{problem} - #{e.message}"
31
+ end
32
+
33
+ def compare_elements(elemA, elemB)
34
+
35
+ begin
36
+ assert_equal(elemA.name, elemB.name)
37
+ rescue Test::Unit::AssertionFailedError => e
38
+ fail_xml_compare(elemA, elemB, "names of element differ", e)
39
+ end
40
+
41
+ begin
42
+ assert_equal(elemA.attributes, elemB.attributes)
43
+ rescue Test::Unit::AssertionFailedError => e
44
+ fail_xml_compare(elemA, elemB, "attributes of element differ", e)
45
+ end
46
+
47
+ begin
48
+ assert_equal(elemA.text, elemB.text)
49
+ rescue Test::Unit::AssertionFailedError => e
50
+ fail_xml_compare(elemA, elemB, "text contents of element differ", e)
51
+ end
52
+
53
+ begin
54
+ assert_equal(elemA.elements.size, elemB.elements.size)
55
+ rescue Test::Unit::AssertionFailedError => e
56
+ fail_xml_compare(elemA, elemB, "number of children of element differ", e)
57
+ end
58
+
59
+ a_elems = elemA.elements.to_a
60
+ b_elems = elemB.elements.to_a
61
+
62
+ (0 ... a_elems.size).each do |idx|
63
+ compare_elements(a_elems[idx], b_elems[idx])
64
+ end
65
+ end
66
+
67
+
@@ -0,0 +1,112 @@
1
+ require File.dirname(__FILE__) + "/test_helper.rb"
2
+
3
+ require "intermine/bio"
4
+ require "test/unit"
5
+
6
+ class MockQuery
7
+ def service
8
+ return MockService.new
9
+ end
10
+ end
11
+
12
+ class MockService
13
+ def root
14
+ return "http://www.flymine.org/query/service"
15
+ end
16
+ end
17
+
18
+ module InterMine
19
+
20
+ module PathQuery
21
+
22
+ class Query
23
+
24
+ def results_reader
25
+ return InterMine::Results::ResultsReader.new(MockQuery.new)
26
+ end
27
+
28
+ end
29
+
30
+ end
31
+
32
+ module Results
33
+
34
+ class ResultsReader
35
+
36
+ attr_reader :uri
37
+
38
+ def initialize(mock_query)
39
+ @query = mock_query
40
+ end
41
+
42
+ def each_line(params)
43
+ f = File.new(File.dirname(__FILE__) + "/data/test.#{params[:format]}", "r")
44
+ f.each_line {|line| yield line}
45
+ end
46
+
47
+ def params(format)
48
+ return {:format => format}
49
+ end
50
+
51
+ unless ENV['LIVE_TESTING'] == "1"
52
+ def adjust_path(*args)
53
+ # disable, as it makes requests
54
+ end
55
+ end
56
+
57
+ end
58
+ end
59
+ end
60
+
61
+ class TestBio < Test::Unit::TestCase
62
+
63
+ def setup
64
+ @rr = InterMine::Results::ResultsReader.new(MockQuery.new)
65
+ end
66
+
67
+ def testServiceResolution
68
+ if ENV['LIVE_TESTING'] == "1"
69
+ @rr.adjust_path("bed")
70
+ assert_equal("/query/service/query/results/bed", @rr.uri.path)
71
+ @rr.adjust_path("gff3")
72
+ assert_equal("/query/service/query/results/gff3", @rr.uri.path)
73
+ @rr.adjust_path("fasta")
74
+ assert_equal("/query/service/query/results/fasta", @rr.uri.path)
75
+ end
76
+ end
77
+
78
+ def testBedParsing
79
+ assert_equal(InterMine::PathQuery::Query.new.bed, File.new(File.dirname(__FILE__) + "/data/test.bed", "r").read)
80
+
81
+ c = 0
82
+ header = InterMine::PathQuery::Query.new.bed {|b| c += 1}
83
+ assert_equal(4, c)
84
+ assert_match(/Source: FlyMine/, header)
85
+ end
86
+
87
+ def testGff3Parsing
88
+ assert_equal(InterMine::PathQuery::Query.new.gff3, File.new(File.dirname(__FILE__) + "/data/test.gff3", "r").read)
89
+
90
+ c = 0
91
+ header = InterMine::PathQuery::Query.new.gff3 {|g| c += 1}
92
+ assert_equal(4, c)
93
+ assert_match(/gff-version 3/, header)
94
+ end
95
+
96
+ def testFastaParsing
97
+ assert_equal(InterMine::PathQuery::Query.new.fasta, File.new(File.dirname(__FILE__) + "/data/test.fasta", "r").read)
98
+
99
+ c = 0
100
+ last = nil
101
+ InterMine::PathQuery::Query.new.fasta {|f| c += 1; last = f}
102
+ assert_equal(4, c)
103
+ assert_match(/BIB_DROME 362030490/, last)
104
+ end
105
+
106
+ end
107
+
108
+
109
+
110
+
111
+
112
+
metadata ADDED
@@ -0,0 +1,146 @@
1
+ --- !ruby/object:Gem::Specification
2
+ name: intermine-bio
3
+ version: !ruby/object:Gem::Version
4
+ hash: 405
5
+ prerelease: false
6
+ segments:
7
+ - 0
8
+ - 98
9
+ - 1
10
+ version: 0.98.1
11
+ platform: ruby
12
+ authors:
13
+ - Alex Kalderimis
14
+ autorequire:
15
+ bindir: bin
16
+ cert_chain: []
17
+
18
+ date: 2011-08-02 00:00:00 +01:00
19
+ default_executable:
20
+ dependencies:
21
+ - !ruby/object:Gem::Dependency
22
+ name: intermine
23
+ prerelease: false
24
+ requirement: &id001 !ruby/object:Gem::Requirement
25
+ none: false
26
+ requirements:
27
+ - - ">="
28
+ - !ruby/object:Gem::Version
29
+ hash: 3
30
+ segments:
31
+ - 0
32
+ version: "0"
33
+ type: :runtime
34
+ version_requirements: *id001
35
+ description: |
36
+ = Biological Extensions to the InterMine Webservice Client Library
37
+
38
+ This library is a set of extensions to the InterMine Webservices client,
39
+ providing access for data in biological formats. It directly extends the
40
+ InterMine classes, providing extra methods to the Query class.
41
+
42
+ == Example
43
+
44
+ Get the sequences of all proteins for the genes "h", "r", "eve", "bib" and "zen":
45
+
46
+ require "rubygems"
47
+ require "intermine/service"
48
+ require "intermine/bio"
49
+
50
+ s = Service.new("www.flymine.org/query")
51
+
52
+ puts s.query("Gene").select("proteins").where(:symbol => %w{h r eve bib zen}).fasta
53
+
54
+ Process the locations of these genes one at a time:
55
+
56
+ s.query.select("Gene").where(:symbol => %w{h r eve bib zen}).bed do |line|
57
+ process line
58
+ end
59
+
60
+ == Who is this for?
61
+
62
+ InterMine data warehouses are typically constructed to hold
63
+ biological data, and as this library facilitates programmatic
64
+ access to these data, this library is primarily aimed at
65
+ bioinformaticians. In particular, users of the following services
66
+ may find it especially useful:
67
+ * FlyMine (http://www.flymine.org/query)
68
+ * YeastMine (http://yeastmine.yeastgenome.org/yeastmine)
69
+ * RatMine (http://ratmine.mcw.edu/ratmine)
70
+ * modMine (http://intermine.modencode.org/release-23)
71
+ * metabolicMine (http://www.metabolicmine.org/beta)
72
+
73
+ These extensions are aimed at bioinformaticians looking to integrate
74
+ these sources of data into other workflows.
75
+
76
+ For details on constructing queries, see the InterMine documentation.
77
+
78
+ == Support
79
+
80
+ Support is available on our development mailing list: dev@intermine.org
81
+
82
+ == License
83
+
84
+ This code is Open Source under the LGPL. Source code for all InterMine code
85
+ can be checked out from svn://subversion.flymine.org/flymine
86
+
87
+ email:
88
+ - dev@intermine.org
89
+ executables: []
90
+
91
+ extensions: []
92
+
93
+ extra_rdoc_files: []
94
+
95
+ files:
96
+ - lib/intermine/bio.rb
97
+ - test/data/test.bed
98
+ - test/data/test.fasta
99
+ - test/data/test.gff3
100
+ - test/test_helper.rb
101
+ - test/unit_tests.rb
102
+ - LICENCE
103
+ - Rakefile
104
+ - README.rdoc
105
+ - Gemfile
106
+ - contact_header.rdoc
107
+ has_rdoc: true
108
+ homepage: http://www.intermine.org
109
+ licenses:
110
+ - LGPL
111
+ post_install_message:
112
+ rdoc_options:
113
+ - --title
114
+ - Biological Extensions to the InterMine Webservice Client
115
+ - --main
116
+ - README.rdoc
117
+ - --line-numbers
118
+ require_paths:
119
+ - lib
120
+ required_ruby_version: !ruby/object:Gem::Requirement
121
+ none: false
122
+ requirements:
123
+ - - ">="
124
+ - !ruby/object:Gem::Version
125
+ hash: 3
126
+ segments:
127
+ - 0
128
+ version: "0"
129
+ required_rubygems_version: !ruby/object:Gem::Requirement
130
+ none: false
131
+ requirements:
132
+ - - ">="
133
+ - !ruby/object:Gem::Version
134
+ hash: 3
135
+ segments:
136
+ - 0
137
+ version: "0"
138
+ requirements: []
139
+
140
+ rubyforge_project: intermine-bio
141
+ rubygems_version: 1.3.7
142
+ signing_key:
143
+ specification_version: 3
144
+ summary: Biological Extensions for the InterMine Webservice Client Library
145
+ test_files:
146
+ - test/unit_tests.rb