bio-gag 0.0.1
- data/.document +5 -0
- data/.travis.yml +12 -0
- data/Gemfile +17 -0
- data/LICENSE.txt +20 -0
- data/README.rdoc +69 -0
- data/Rakefile +45 -0
- data/VERSION +1 -0
- data/bin/gag +288 -0
- data/lib/bio-gag.rb +8 -0
- data/lib/bio/db/gag.rb +215 -0
- data/test/helper.rb +18 -0
- data/test/test_bio-gag.rb +345 -0
- metadata +154 -0
data/.document
ADDED
data/.travis.yml
ADDED
@@ -0,0 +1,12 @@
|
|
1
|
+
language: ruby
|
2
|
+
rvm:
|
3
|
+
- 1.9.2
|
4
|
+
- 1.9.3
|
5
|
+
- jruby-19mode # JRuby in 1.9 mode
|
6
|
+
- rbx-19mode
|
7
|
+
# - 1.8.7
|
8
|
+
# - jruby-18mode # JRuby in 1.8 mode
|
9
|
+
# - rbx-18mode
|
10
|
+
|
11
|
+
# uncomment this line if your project needs to run something other than `rake`:
|
12
|
+
# script: bundle exec rspec spec
|
data/Gemfile
ADDED
@@ -0,0 +1,17 @@
|
|
1
|
+
source "http://rubygems.org"
|
2
|
+
# Add dependencies required to use your gem here.
|
3
|
+
# Example:
|
4
|
+
# gem "activesupport", ">= 2.3.5"
|
5
|
+
gem 'bio-pileup_iterator', '>=0.0.1'
|
6
|
+
gem 'bio-logger', '>=1.0.0'
|
7
|
+
|
8
|
+
# Add dependencies to develop your gem here.
|
9
|
+
# Include everything needed to run rake, tests, features, etc.
|
10
|
+
group :development do
|
11
|
+
gem "shoulda", ">= 0"
|
12
|
+
gem "rdoc", "~> 3.12"
|
13
|
+
gem "bundler", ">= 1.0.0"
|
14
|
+
gem "jeweler", "~> 1.8.3"
|
15
|
+
gem "bio", ">= 1.4.2"
|
16
|
+
gem "rdoc", "~> 3.12"
|
17
|
+
end
|
data/LICENSE.txt
ADDED
@@ -0,0 +1,20 @@
|
|
1
|
+
Copyright (c) 2012 Ben J Woodcroft
|
2
|
+
|
3
|
+
Permission is hereby granted, free of charge, to any person obtaining
|
4
|
+
a copy of this software and associated documentation files (the
|
5
|
+
"Software"), to deal in the Software without restriction, including
|
6
|
+
without limitation the rights to use, copy, modify, merge, publish,
|
7
|
+
distribute, sublicense, and/or sell copies of the Software, and to
|
8
|
+
permit persons to whom the Software is furnished to do so, subject to
|
9
|
+
the following conditions:
|
10
|
+
|
11
|
+
The above copyright notice and this permission notice shall be
|
12
|
+
included in all copies or substantial portions of the Software.
|
13
|
+
|
14
|
+
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
|
15
|
+
EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
|
16
|
+
MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
|
17
|
+
NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE
|
18
|
+
LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION
|
19
|
+
OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION
|
20
|
+
WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
|
data/README.rdoc
ADDED
@@ -0,0 +1,69 @@
|
|
1
|
+
= bio-gag
|
2
|
+
|
3
|
+
bio-gag is a biogem for detecting and correcting a particular type of error that occurs (or occurred) in particular versions of the Ion Torrent sequencing kits:
|
4
|
+
|
5
|
+
* Ion Xpress Template 100 Kit
|
6
|
+
* Ion Xpress Template 200 Kit
|
7
|
+
* Ion Sequencing 100 Kit
|
8
|
+
* Ion Sequencing 200 Kit
|
9
|
+
|
10
|
+
Newer kits, starting with the "Ion PGM 200 Sequencing Kit", do not appear to be affected by this error. There are discussions about this on the (closed access) Ion Torrent forum:
|
11
|
+
|
12
|
+
* http://lifetech-it.hosted.jivesoftware.com/message/7893
|
13
|
+
* http://lifetech-it.hosted.jivesoftware.com/message/7792
|
14
|
+
* http://lifetech-it.hosted.jivesoftware.com/message/6233
|
15
|
+
|
16
|
+
To search for these errors, a pileup format file of aligned sequences is required. This can be generated either from an assembly or by aligning reads to a reference, although bio-gag has only been tested on de novo assemblies produced with newbler. Note that the code is probably not optimised, owing to time constraints and the fact that the errors appear to have been fixed in newer kits.
|
17
|
+
|
18
|
+
== Installation
|
19
|
+
|
20
|
+
gem install bio-gag
|
21
|
+
|
22
|
+
== Usage
|
23
|
+
|
24
|
+
To use the script, the important options are these:
|
25
|
+
|
26
|
+
gag [options] <pileup_output>
|
27
|
+
|
28
|
+
At first, you probably want to just run it without any options. The output is a list of predicted sites at which the error occurs.
|
29
|
+
|
30
|
+
--lookahead Work out if gag predictions are supported by orf predictions being extended [default is just to print out found gag errors]. There is a modified usage too; it is probably best to look at the code if you are using this operation
|
31
|
+
--fix CONSENSUS_FASTA_FILE Find gag errors in the pileup file, correct them in CONSENSUS_FASTA_FILE, and print to STDOUT the fixed up consensus
|
32
|
+
-g, --gags GAG_FILE Specify a list of GAG errors to be fixed in tab-separated form (use with --fix, the tab-separated output is from regular output or --lookahead)
|
33
|
+
|
34
|
+
And some options for logging:
|
35
|
+
|
36
|
+
--logger filename Log to file (default STDERR)
|
37
|
+
--trace options Set log level (default INFO, see bio-logger documentation at https://github.com/pjotrp/bioruby-logger-plugin)
|
38
|
+
-q, --quiet Run quietly
|
39
|
+
-v, --verbose Run verbosely
|
40
|
+
|
41
|
+
|
42
|
+
|
43
|
+
== Developers
|
44
|
+
|
45
|
+
To use the library
|
46
|
+
|
47
|
+
require 'bio-gag'
|
48
|
+
|
49
|
+
The API doc is online. For more code examples see also the test files in
|
50
|
+
the source tree.
|
51
|
+
|
52
|
+
== Project home page
|
53
|
+
|
54
|
+
For information on the source tree, documentation, issues and how to contribute, see
|
55
|
+
|
56
|
+
http://github.com/wwood/bioruby-gag
|
57
|
+
|
58
|
+
== Cite
|
59
|
+
|
60
|
+
If you use this software, please cite http://dx.doi.org/10.1093/bioinformatics/btq475
|
61
|
+
|
62
|
+
== Biogems.info
|
63
|
+
|
64
|
+
This Biogem is published at http://biogems.info/index.html#bio-gag
|
65
|
+
|
66
|
+
== Copyright
|
67
|
+
|
68
|
+
Copyright (c) 2012 Ben J Woodcroft. See LICENSE.txt for further details.
|
69
|
+
|
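The tab-separated output described in the Usage section can be read back from Ruby with the standard csv library. The following is a minimal sketch, assuming the four columns that bin/gag prints (ref_name, position, inserted_base, context). The sample row and the helper name `positions_by_contig` are illustrative and not part of the gem.

```ruby
require 'csv'

# Hypothetical sample of gag's default output: a header line plus one row,
# modelled on the "contig00125 11130 A GAG" example in bin/gag.
SAMPLE_TSV = "ref_name\tposition\tinserted_base\tcontext\n" \
             "contig00125\t11130\tA\tGAG\n"

# Group the predicted error positions by contig name.
def positions_by_contig(tsv_text)
  grouped = Hash.new { |hash, key| hash[key] = [] }
  CSV.parse(tsv_text, headers: true, col_sep: "\t") do |row|
    grouped[row['ref_name']] << row['position'].to_i
  end
  grouped
end

positions_by_contig(SAMPLE_TSV).each do |contig, positions|
  puts "#{contig}: #{positions.join(', ')}"
end
```

A file written by `gag <pileup_output>` could be passed through the same helper with `File.read`.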
data/Rakefile
ADDED
@@ -0,0 +1,45 @@
|
|
1
|
+
# encoding: utf-8
|
2
|
+
|
3
|
+
require 'rubygems'
|
4
|
+
require 'bundler'
|
5
|
+
begin
|
6
|
+
Bundler.setup(:default, :development)
|
7
|
+
rescue Bundler::BundlerError => e
|
8
|
+
$stderr.puts e.message
|
9
|
+
$stderr.puts "Run `bundle install` to install missing gems"
|
10
|
+
exit e.status_code
|
11
|
+
end
|
12
|
+
require 'rake'
|
13
|
+
|
14
|
+
require 'jeweler'
|
15
|
+
Jeweler::Tasks.new do |gem|
|
16
|
+
# gem is a Gem::Specification... see http://docs.rubygems.org/read/chapter/20 for more options
|
17
|
+
gem.name = "bio-gag"
|
18
|
+
gem.homepage = "http://github.com/wwood/bioruby-gag"
|
19
|
+
gem.license = "MIT"
|
20
|
+
gem.summary = %Q{bio-gag is a biogem for detecting and correcting a particular type of error that occurs/occurred in particular versions of the IonTorrent DNA sequencing kit}
|
21
|
+
gem.description = %Q{bio-gag is a biogem for detecting and correcting a particular type of error that occurs/occurred in particular versions of the IonTorrent DNA sequencing kit. Recent versions of the system don't appear to suffer the same problem}
|
22
|
+
gem.email = "gmail.com after donttrustben"
|
23
|
+
gem.authors = ["Ben J Woodcroft"]
|
24
|
+
# dependencies defined in Gemfile
|
25
|
+
end
|
26
|
+
Jeweler::RubygemsDotOrgTasks.new
|
27
|
+
|
28
|
+
require 'rake/testtask'
|
29
|
+
Rake::TestTask.new(:test) do |test|
|
30
|
+
test.libs << 'lib' << 'test'
|
31
|
+
test.pattern = 'test/**/test_*.rb'
|
32
|
+
test.verbose = true
|
33
|
+
end
|
34
|
+
|
35
|
+
task :default => :test
|
36
|
+
|
37
|
+
require 'rdoc/task'
|
38
|
+
Rake::RDocTask.new do |rdoc|
|
39
|
+
version = File.exist?('VERSION') ? File.read('VERSION') : ""
|
40
|
+
|
41
|
+
rdoc.rdoc_dir = 'rdoc'
|
42
|
+
rdoc.title = "bio-gag #{version}"
|
43
|
+
rdoc.rdoc_files.include('README*')
|
44
|
+
rdoc.rdoc_files.include('lib/**/*.rb')
|
45
|
+
end
|
data/VERSION
ADDED
@@ -0,0 +1 @@
|
|
1
|
+
0.0.1
|
data/bin/gag
ADDED
@@ -0,0 +1,288 @@
|
|
1
|
+
#!/usr/bin/env ruby
|
2
|
+
|
3
|
+
require 'bio'
|
4
|
+
|
5
|
+
$:.unshift File.join(File.dirname(__FILE__),'..','lib')
|
6
|
+
require 'bio-gag'
|
7
|
+
|
8
|
+
|
9
|
+
require 'optparse'
|
10
|
+
require 'csv'
|
11
|
+
require 'pp'
|
12
|
+
|
13
|
+
|
14
|
+
# Possible operations
|
15
|
+
FIND = 'find'
|
16
|
+
FIX = 'fix'
|
17
|
+
LOOKAHEAD = 'lookahead'
|
18
|
+
options = {
|
19
|
+
:operation => FIND,
|
20
|
+
:logger => 'stderr',
|
21
|
+
:trace => 'info',
|
22
|
+
}
|
23
|
+
o = OptionParser.new do |opts|
|
24
|
+
opts.banner = "\ngag [options] <pileup_output>\n\n"
|
25
|
+
|
26
|
+
|
27
|
+
opts.on('--lookahead', 'Work out if gag predictions are supported by orf predictions being extended [default is just to print out found gag errors]. There\'s modified usage too - probably best for you to look at the code if you are using this operation') do |v|
|
28
|
+
options[:operation] = LOOKAHEAD
|
29
|
+
end
|
30
|
+
|
31
|
+
opts.on('--fix CONSENSUS_FASTA_FILE', 'Find gag errors in the pileup file, correct them in CONSENSUS_FASTA_FILE, and print to STDOUT the fixed up consensus') do |v|
|
32
|
+
options[:operation] = FIX
|
33
|
+
options[:fix_file] = v
|
34
|
+
end
|
35
|
+
|
36
|
+
opts.on('-g','--gags GAG_FILE', 'Specify a list of GAG errors to be fixed in tab-separated form (use with --fix, the tab-separated output is from regular output or --lookahead)') do |v|
|
37
|
+
options[:gags_file] = v
|
38
|
+
end
|
39
|
+
|
40
|
+
|
41
|
+
opts.on("--logger filename",String,"Log to file (default STDERR)") do | name |
|
42
|
+
options[:logger] = name
|
43
|
+
end
|
44
|
+
|
45
|
+
opts.on("--trace options",String,"Set log level (default INFO, see bio-logger documentation at https://github.com/pjotrp/bioruby-logger-plugin)") do | s |
|
46
|
+
options[:trace] = s
|
47
|
+
end
|
48
|
+
|
49
|
+
opts.on("-q", "--quiet", "Run quietly") do |q|
|
50
|
+
options[:trace] = 'error'
|
51
|
+
end
|
52
|
+
|
53
|
+
opts.on("-v", "--verbose", "Run verbosely") do |v|
|
54
|
+
options[:trace] = 'info'
|
55
|
+
end
|
56
|
+
end.parse!
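The option handling above can be tried in isolation. This is a reduced sketch using only the standard optparse library. It covers the three operation-related options and leaves out logging, and the method name `parse_gag_args` and the file names are hypothetical.

```ruby
require 'optparse'

# Parse a gag-style argument list without touching the real ARGV.
# Returns the options hash and the remaining positional arguments.
def parse_gag_args(argv)
  options = { operation: 'find' }
  parser = OptionParser.new do |opts|
    opts.on('--lookahead') { options[:operation] = 'lookahead' }
    opts.on('--fix CONSENSUS_FASTA_FILE') do |path|
      options[:operation] = 'fix'
      options[:fix_file] = path
    end
    opts.on('-g', '--gags GAG_FILE') { |path| options[:gags_file] = path }
  end
  remaining = parser.parse(argv) # non-destructive, unlike parse!
  [options, remaining]
end

options, rest = parse_gag_args(%w[--fix consensus.fa -g gags.tsv reads.pileup])
puts options[:operation]
puts rest.first
```

Using `parse` rather than `parse!` keeps the caller's array intact, which makes the helper easier to exercise from tests.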
|
57
|
+
|
58
|
+
# Realize settings
|
59
|
+
Bio::Log::CLI.trace(options[:trace])
|
60
|
+
Bio::Log::CLI.logger(options[:logger]) #defaults to STDERR not STDOUT
|
61
|
+
Bio::Log::CLI.configure('bio-gag')
|
62
|
+
log = Bio::Log::LoggerPlus.new 'gag'
|
63
|
+
Bio::Log::CLI.configure('gag')
|
64
|
+
|
65
|
+
piles = Bio::DB::PileupIterator.new(ARGF)
|
66
|
+
|
67
|
+
if options[:operation] == FIX
|
68
|
+
# Cache the fasta sequences
|
69
|
+
sequences = {} # Hash of sequence_id to sequences
|
70
|
+
|
71
|
+
# Read in the gags if they have already been specified
|
72
|
+
# e.g. contig00125 11130 A GAG
|
73
|
+
gags = {}
|
74
|
+
if options[:gags_file]
|
75
|
+
log.info "Using pre-specified GAG errors from #{options[:gags_file]}"
|
76
|
+
CSV.foreach(options[:gags_file], :headers => true, :col_sep => "\t") do |row|
|
77
|
+
contig = row[0]
|
78
|
+
gag = Bio::Gag.new(row[1].to_i, nil, contig)
|
79
|
+
gags[contig] ||= []
|
80
|
+
gags[contig].push gag
|
81
|
+
end
|
82
|
+
end
|
83
|
+
|
84
|
+
Bio::FlatFile.foreach(options[:fix_file]) do |s|
|
85
|
+
if sequences[s.entry_id]
|
86
|
+
raise Exception, "Unexpectedly found 2 sequences with the same sequence identifier '#{s.entry_id}', giving up"
|
87
|
+
end
|
88
|
+
sequences[s.entry_id] = s.seq
|
89
|
+
end
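The caching step above relies on BioRuby's Bio::FlatFile. The same idea, a hash of identifier to sequence that refuses duplicate identifiers, can be sketched in plain Ruby. The parser below is a simplified stand-in written for illustration: `cache_fasta` is a hypothetical name, and the sketch assumes each identifier is the first whitespace-delimited word of its header line.

```ruby
# Build a hash of sequence identifier => sequence string from FASTA text,
# raising if the same identifier appears twice.
def cache_fasta(text)
  sequences = {}
  current_id = nil
  text.each_line do |line|
    line = line.chomp
    next if line.empty?

    if line.start_with?('>')
      current_id = line[1..-1].split(/\s+/).first
      if sequences.key?(current_id)
        raise ArgumentError, "found 2 sequences with the same identifier '#{current_id}'"
      end
      sequences[current_id] = +''
    elsif current_id
      # Sequence lines are concatenated onto the most recent header.
      sequences[current_id] << line
    end
  end
  sequences
end

cached = cache_fasta(">contigA first\nACGT\nAC\n>contigB\nTTGG\n")
puts cached.length
```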
|
90
|
+
log.info "Cached #{sequences.length} sequences from the consensus fasta file"
|
91
|
+
log.debug "Sequences being fixed hash: #{sequences.inspect}"
|
92
|
+
|
93
|
+
#$stderr.puts gags
|
94
|
+
piles.fix_gags(sequences, gags).sort{|a,b| a[0]<=>b[0]}.each do |name, fixed_seq|
|
95
|
+
puts ">#{name}"
|
96
|
+
puts fixed_seq
|
97
|
+
end
|
98
|
+
|
99
|
+
elsif options[:operation] == LOOKAHEAD
|
100
|
+
# Given a list of gag errors and gene predictions before and after, second-guess whether they are really true gag errors
|
101
|
+
# * Where there is only 1 gene predicted, go with that
|
102
|
+
# * Where both sets predict the same thing, go with either
|
103
|
+
# * Where the sets disagree and there are more than 2 genes in total, give up and go manual
|
104
|
+
# * Where the sets disagree and there is one from each, starting from the gag error and working in the direction of the gene in the 2 frames
|
105
|
+
# ** Where there are two gag errors predicted in the same gene, give up and go manual.
|
106
|
+
|
107
|
+
genes1_file = ARGV[0]
|
108
|
+
genes2_file = ARGV[1]
|
109
|
+
gag_predictions_file = ARGV[2]
|
110
|
+
|
111
|
+
class GenePrediction
|
112
|
+
attr_accessor :start, :stop, :direction, :name
|
113
|
+
end
|
114
|
+
|
115
|
+
class Gag
|
116
|
+
attr_accessor :ref_name, :position, :inserted_base, :context, :adjusted_position
|
117
|
+
end
|
118
|
+
|
119
|
+
# Read in all the gene predictions
|
120
|
+
add_genes = lambda do |file|
|
121
|
+
hash = {} #hash of contig to array of GenePrediction objects
|
122
|
+
Bio::FlatFile.foreach(file) do |s|
|
123
|
+
# ["contig00001_1_1", "#", "412", "#", "624", "#", "1", "#", "ID=1_1;partial=00;start_type=ATG;rbs_motif=None;rbs_spacer=None"]
|
124
|
+
splits = s.definition.split(' ')
|
125
|
+
gene = GenePrediction.new
|
126
|
+
contig = splits[0].match(/(.+)_\d+_\d+$/)[1]
|
127
|
+
gene.start = splits[2].to_i
|
128
|
+
gene.stop = splits[4].to_i
|
129
|
+
gene.direction = splits[6]
|
130
|
+
gene.name = splits[0]
|
131
|
+
|
132
|
+
raise Exception, "Unexpected format for gene start (#{splits[3]}) or stop (#{splits[5]}) in fasta header #{s.definition}" if gene.start == 0 or gene.stop == 0
|
133
|
+
raise unless %w(1 -1).include? gene.direction
|
134
|
+
|
135
|
+
hash[contig] ||= []
|
136
|
+
hash[contig].push gene
|
137
|
+
end
|
138
|
+
hash
|
139
|
+
end
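The header parsing inside the lambda above can be looked at on its own. This sketch assumes the '#'-separated header layout shown in the comment and replaces Bio::FlatFile with a plain string argument. The `ParsedGene` struct and the method name `parse_gene_header` are hypothetical.

```ruby
ParsedGene = Struct.new(:contig, :name, :start, :stop, :direction)

# Turn a gene prediction FASTA definition line into a ParsedGene.
def parse_gene_header(definition)
  fields = definition.split(' ')
  contig_match = fields[0].to_s.match(/(.+)_\d+_\d+$/)
  raise ArgumentError, "cannot find contig name in #{definition.inspect}" if contig_match.nil?

  gene = ParsedGene.new(contig_match[1], fields[0],
                        fields[2].to_i, fields[4].to_i, fields[6])
  # String#to_i returns 0 for non-numeric text, so 0 signals a malformed field.
  if gene.start.zero? || gene.stop.zero?
    raise ArgumentError, "unexpected gene start or stop in #{definition.inspect}"
  end
  unless %w[1 -1].include?(gene.direction)
    raise ArgumentError, "unexpected direction in #{definition.inspect}"
  end

  gene
end

gene = parse_gene_header('contig00001_1_1 # 412 # 624 # 1 # ID=1_1;partial=00')
puts "#{gene.contig} #{gene.start}-#{gene.stop} (#{gene.direction})"
```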
|
140
|
+
genes_before_unchanged = add_genes.call(genes1_file)
|
141
|
+
genes_after = add_genes.call(genes2_file)
|
142
|
+
|
143
|
+
# Read in the gag output file
|
144
|
+
gags = {} #hash of contigs to gag predictions (positions along the genome)
|
145
|
+
CSV.foreach(gag_predictions_file, :col_sep => "\t", :headers => true) do |row|
|
146
|
+
contig = row[0]
|
147
|
+
|
148
|
+
gag = Gag.new
|
149
|
+
gag.ref_name = row[0]
|
150
|
+
gag.position = row[1].to_i
|
151
|
+
gag.inserted_base = row[2]
|
152
|
+
gag.context = row[3]
|
153
|
+
|
154
|
+
gags[contig] ||= []
|
155
|
+
gags[contig].push gag
|
156
|
+
end
|
157
|
+
|
158
|
+
# Shift the base coordinates of the 'before' gene predictions so that both sets of gene predictions line up
|
159
|
+
genes_before = {}
|
160
|
+
genes_before_unchanged.each do |contig, preds|
|
161
|
+
preds.each do |gene|
|
162
|
+
unless gags[contig].nil?
|
163
|
+
gags_before_start = gags[contig].count do |pos|
|
164
|
+
pos.position < gene.start
|
165
|
+
end
|
166
|
+
gene.start = gene.start+gags_before_start
|
167
|
+
|
168
|
+
gags_before_stop = gags[contig].count do |pos|
|
169
|
+
pos.position < gene.stop
|
170
|
+
end
|
171
|
+
gene.stop = gene.stop+gags_before_stop
|
172
|
+
end
|
173
|
+
|
174
|
+
genes_before[contig] ||= []
|
175
|
+
genes_before[contig].push gene
|
176
|
+
end
|
177
|
+
end
|
178
|
+
|
179
|
+
# Change the base numbers of the gag errors
|
180
|
+
gags.each do |contig, pregagged|
|
181
|
+
count = 0
|
182
|
+
pregagged.each do |g|
|
183
|
+
g.adjusted_position = g.position+count
|
184
|
+
count += 1
|
185
|
+
end
|
186
|
+
end
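The two coordinate adjustments above share one idea: each fixed gag error inserts one base, so later coordinates move right by the number of errors before them. A minimal sketch with plain integers follows. Both helper names are hypothetical, and gag positions are taken in the order given, as in the loop above.

```ruby
# Shift a gene coordinate by the number of gag errors strictly before it.
def shift_coordinate(coordinate, gag_positions)
  coordinate + gag_positions.count { |pos| pos < coordinate }
end

# Shift each gag position by the number of gags listed before it.
def adjusted_gag_positions(gag_positions)
  gag_positions.each_with_index.map { |pos, index| pos + index }
end

gags = [10, 50, 200]
puts shift_coordinate(100, gags)          # two gags lie before base 100
puts adjusted_gag_positions(gags).inspect
```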
|
187
|
+
|
188
|
+
print_gag = lambda do |gag_object|
|
189
|
+
puts [
|
190
|
+
gag_object.ref_name,
|
191
|
+
gag_object.position,
|
192
|
+
gag_object.inserted_base,
|
193
|
+
gag_object.context
|
194
|
+
].join("\t")
|
195
|
+
end
|
196
|
+
|
197
|
+
# print headers
|
198
|
+
puts %w(ref_name position inserted_base context).join("\t")
|
199
|
+
|
200
|
+
# Iterate through the gag errors
|
201
|
+
gags.each do |contig, gags|
|
202
|
+
gags.each do |gag_object|
|
203
|
+
gag = gag_object.adjusted_position
|
204
|
+
# Find overlapping genes from both sets of predictions at this site
|
205
|
+
genes1 = []
|
206
|
+
unless genes_before[contig].nil?
|
207
|
+
genes1 = genes_before[contig].select{|gene| gene.start < gag and gene.stop > gag}
|
208
|
+
end
|
209
|
+
genes2 = []
|
210
|
+
unless genes_after[contig].nil?
|
211
|
+
genes2 = genes_after[contig].select{|gene| gene.start < gag and gene.stop > gag}
|
212
|
+
end
|
213
|
+
|
214
|
+
# If there are no predictions, then do nothing
|
215
|
+
if genes1.empty? and genes2.empty?
|
216
|
+
log.debug "Gag doesn't fall within any ORFs called on contig #{contig} position #{gag}, ignoring"
|
217
|
+
next
|
218
|
+
end
|
219
|
+
|
220
|
+
all_genes = [genes1,genes2].flatten
|
221
|
+
manual_message = lambda do
|
222
|
+
log.info "before: #{genes1.inspect}"
|
223
|
+
log.info "after #{genes2.inspect}"
|
224
|
+
end
|
225
|
+
|
226
|
+
if all_genes.length == 3 and all_genes.collect{|g| g.direction}.uniq.length == 1
|
227
|
+
if genes1.length == 2
|
228
|
+
# 2 genes from before, 1 from after
|
229
|
+
if genes1[0].start == genes2[0].start and genes1[1].stop == genes2[0].stop
|
230
|
+
log.debug "Gag correctly called at #{gag}, I reckon, because there was 1 gene afterwards, 2 from before"
|
231
|
+
print_gag.call gag_object
|
232
|
+
else
|
233
|
+
log.info "2 genes from before, 1 from after, but they don't line up, giving up at #{contig}/#{gag}"
|
234
|
+
manual_message.call
|
235
|
+
end
|
236
|
+
elsif genes2.length == 2
|
237
|
+
# 2 genes from after, 1 from before
|
238
|
+
if genes1[0].start == genes2[0].start and genes1[0].stop == genes2[1].stop
|
239
|
+
log.debug "Gag incorrectly called at #{contig}/#{gag}, I reckon, because there were 2 genes from afterwards, 1 from before"
|
240
|
+
else
|
241
|
+
log.info "1 gene from before, 2 from after, but they don't line up, giving up at #{contig}/#{gag}"
|
242
|
+
manual_message.call
|
243
|
+
end
|
244
|
+
else
|
245
|
+
# 3 genes all from the same set of predictions
|
246
|
+
log.info "3 genes all in the same direction.. whacko.. giving up - gag was at #{contig}/#{gag}"
|
247
|
+
manual_message.call
|
248
|
+
end
|
249
|
+
elsif genes1.length == 1 and genes2.length == 1
|
250
|
+
if genes1[0].stop - genes1[0].start > genes2[0].stop - genes2[0].start
|
251
|
+
log.debug "Gag incorrectly called at contig #{contig}, gag #{gag}"
|
252
|
+
else
|
253
|
+
log.debug "Gag correctly called at contig #{contig}, gag #{gag}"
|
254
|
+
print_gag.call gag_object
|
255
|
+
end
|
256
|
+
elsif all_genes.length == 1
|
257
|
+
if genes1.length == 1
|
258
|
+
log.debug "Gag incorrectly called at contig #{contig}, gag #{gag} because only 1 gene was found"
|
259
|
+
else
|
260
|
+
log.debug "Gag correctly called at contig #{contig}, gag #{gag} because only 1 gene was found"
|
261
|
+
print_gag.call gag_object
|
262
|
+
end
|
263
|
+
else
|
264
|
+
log.info "Not 3 genes or something is strange with the direction with gag, at #{contig}/#{gag}"
|
265
|
+
manual_message.call
|
266
|
+
end
|
267
|
+
end
|
268
|
+
end
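The two simplest branches of the decision logic above can be restated as a small pure function. This sketch covers only the "one gene in each set" and "one gene in total" cases. The three-gene rules are deliberately left out and reported as `:needs_other_rules`. `SpanGene` and `simple_lookahead_verdict` are hypothetical names.

```ruby
SpanGene = Struct.new(:start, :stop) do
  # Length of the predicted gene in bases, as compared in the script.
  def span
    stop - start
  end
end

# before: genes overlapping the site prior to fixing; after: genes after fixing.
def simple_lookahead_verdict(before, after)
  return :ignored if before.empty? && after.empty?

  if before.length == 1 && after.length == 1
    # Accept the gag unless fixing it made the gene shorter.
    before[0].span > after[0].span ? :rejected : :accepted
  elsif before.length + after.length == 1
    # Only one prediction covers the site: trust whichever set it came from.
    before.length == 1 ? :rejected : :accepted
  else
    :needs_other_rules
  end
end

puts simple_lookahead_verdict([SpanGene.new(100, 400)], [SpanGene.new(100, 700)])
```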
|
269
|
+
|
270
|
+
else
|
271
|
+
# Don't do anything, just predict them
|
272
|
+
|
273
|
+
puts %w(
|
274
|
+
ref_name
|
275
|
+
position
|
276
|
+
inserted_base
|
277
|
+
context
|
278
|
+
).join("\t")
|
279
|
+
|
280
|
+
piles.gags do |gag|
|
281
|
+
puts [
|
282
|
+
gag.ref_name,
|
283
|
+
gag.position,
|
284
|
+
gag.inserted_base,
|
285
|
+
gag.gagging_pileups.collect{|g| g.ref_base}.join('')
|
286
|
+
].join("\t")
|
287
|
+
end
|
288
|
+
end
|
data/lib/bio-gag.rb
ADDED
data/lib/bio/db/gag.rb
ADDED
@@ -0,0 +1,215 @@
|
|
1
|
+
|
2
|
+
|
3
|
+
class Bio::DB::PileupIterator
|
4
|
+
# Find places in this pileup that correspond to GAG errors
|
5
|
+
# * Only certain sequences are considered to be possible errors. Can change this with options[:acceptable_gag_errors]
|
6
|
+
# ** GAAG/CTTC (namesake of GAG errors. So GAG is looked for, to see if it is probably GAAG instead)
|
7
|
+
# ** AGGC/GCCT
|
8
|
+
# ** GCCG/CGGC
|
9
|
+
# ** GCCA/TGGC
|
10
|
+
# * There are at least 3 reads that have an insertion of base Y next to Y, and are all in the one direction. Can change this with options[:min_disagreeing_absolute]
|
11
|
+
# * The 3 or more reads form at least a proportion of 0.1 (i.e. 10%) of all the reads at that position. Can change this with options[:min_disagreeing_proportion]
|
12
|
+
#
|
13
|
+
# Returns an array of Bio::Gag objects
|
14
|
+
#
|
15
|
+
# When a block is given, each gag is yielded
|
16
|
+
def gags(options={})
|
17
|
+
min_disagreeing_proportion = options[:min_disagreeing_proportion]
|
18
|
+
min_disagreeing_proportion ||= 0.1
|
19
|
+
min_disagreeing_absolute = options[:min_disagreeing_absolute]
|
20
|
+
min_disagreeing_absolute ||= 3
|
21
|
+
|
22
|
+
options[:acceptable_gag_errors] ||= %w(GAG CTC AGC GCT GCG CGC GCA TGC)
|
23
|
+
|
24
|
+
log = Bio::Log::LoggerPlus['bio-gag']
|
25
|
+
|
26
|
+
piles = []
|
27
|
+
gags = []
|
28
|
+
|
29
|
+
each do |pile|
|
30
|
+
if piles.length < 2
|
31
|
+
#log.debug "Piles cache for this reference sequence less than length 2"
|
32
|
+
piles = [piles, pile].flatten
|
33
|
+
next
|
34
|
+
elsif piles.length < 3
|
35
|
+
#log.debug "Piles cache for this reference sequence becoming full"
|
36
|
+
piles = [piles, pile].flatten
|
37
|
+
elsif piles[1].ref_name != pile.ref_name
|
38
|
+
#log.debug "Piles cache removed - moving to new contig"
|
39
|
+
piles = [pile]
|
40
|
+
next
|
41
|
+
else
|
42
|
+
#log.debug "Piles cache regular push through"
|
43
|
+
piles = [piles[1], piles[2], pile].flatten
|
44
|
+
end
|
45
|
+
#log.debug "Current piles now at #{piles[0].ref_name}, #{piles.collect{|pile| "#{pile.pos}/#{pile.ref_base}"}.join(', ')}"
|
46
|
+
|
47
|
+
# if not at the start/end of the contig
|
48
|
+
first = piles[0]
|
49
|
+
second = piles[1]
|
50
|
+
third = piles[2]
|
51
|
+
|
52
|
+
# Require particular sequences in the reference sequence
|
53
|
+
ref_bases = "#{first.ref_base}#{second.ref_base}#{third.ref_base}"
|
54
|
+
index = options[:acceptable_gag_errors].index(ref_bases)
|
55
|
+
if index.nil?
|
56
|
+
#log.debug "Sequence #{ref_bases} does not match whitelist, so not calling a gag"
|
57
|
+
next
|
58
|
+
end
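The three-position window and whitelist check above can be illustrated on a plain reference string, without any pileup objects. This is a sketch only: `candidate_gag_positions` is a hypothetical helper, and it reports the 1-based position of the middle base, which is the position the method later records for a gag.

```ruby
# Default whitelist copied from the gags method above.
ACCEPTABLE_TRIPLETS = %w[GAG CTC AGC GCT GCG CGC GCA TGC].freeze

# Return the 1-based positions of the middle base of every whitelisted
# triplet found in the reference string.
def candidate_gag_positions(reference)
  positions = []
  reference.chars.each_cons(3).with_index(1) do |triplet, first_pos|
    positions << first_pos + 1 if ACCEPTABLE_TRIPLETS.include?(triplet.join)
  end
  positions
end

puts candidate_gag_positions('CGAGG').inspect
```

A whitelisted triplet is only a candidate. The read-level filters that follow in the method decide whether a gag is actually called.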
|
59
|
+
gag_sequence = options[:acceptable_gag_errors][index]
|
60
|
+
|
61
|
+
# all reads that have a single insertion after the first or second position, but not both
|
62
|
+
inserting_reads = [first.reads, second.reads].flatten.uniq.select do |read|
|
63
|
+
!(read.insertions[first.pos] and read.insertions[second.pos]) and
|
64
|
+
(read.insertions[first.pos] or read.insertions[second.pos])
|
65
|
+
end
|
66
|
+
#log.debug "Inserting reads after filtering: #{inserting_reads.inspect}"
|
67
|
+
|
68
|
+
# ignore regions that aren't ever going to make it past the next filter
|
69
|
+
if inserting_reads.length < min_disagreeing_absolute or inserting_reads.length.to_f/first.coverage < min_disagreeing_proportion
|
70
|
+
#log.debug "Insufficient disagreement at step 1, so not calling a gag"
|
71
|
+
next
|
72
|
+
end
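The early filter above combines an absolute and a proportional threshold. A minimal sketch with plain numbers, using the same defaults of 3 reads and 0.1, is given here. `enough_support?` is a hypothetical name, and coverage is assumed to be greater than zero.

```ruby
# True when the inserting reads pass both the absolute and the
# proportional cutoff used before calling a gag.
def enough_support?(inserting_reads, coverage, min_absolute: 3, min_proportion: 0.1)
  return false if inserting_reads < min_absolute

  inserting_reads.to_f / coverage >= min_proportion
end

puts enough_support?(16, 32)  # plenty of reads, half of the coverage
puts enough_support?(2, 10)   # too few reads in absolute terms
puts enough_support?(5, 100)  # enough reads, but only 5% of coverage
```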
|
73
|
+
|
74
|
+
# Work out the most commonly inserted base and the most common read direction
|
75
|
+
direction_counts = {'+' => 0, '-' => 0}
|
76
|
+
base_counts = {}
|
77
|
+
inserting_reads.each do |read|
|
78
|
+
insert = read.insertions[first.pos]
|
79
|
+
insert ||= read.insertions[second.pos]
|
80
|
+
insert.upcase!
|
81
|
+
direction_counts[read.direction] += 1
|
82
|
+
base_counts[insert] ||= 0
|
83
|
+
base_counts[insert] += 1
|
84
|
+
end
|
85
|
+
#log.debug "Direction counts of insertions: #{direction_counts.inspect}"
|
86
|
+
#log.debug "Base counts of insertions: #{base_counts.inspect}"
|
87
|
+
max_direction = direction_counts['+']>direction_counts['-'] ? '+' : '-'
|
88
|
+
max_base = base_counts.max do |a,b|
|
89
|
+
a[1] <=> b[1]
|
90
|
+
end[0]
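The tallying step above can be sketched with reads reduced to [direction, inserted_bases] pairs. As in the code above, a tie between directions resolves to '-'. `dominant_insertion` is a hypothetical helper and expects at least one read.

```ruby
# reads: array of [direction, inserted_bases] pairs, e.g. ['+', 'a'].
# Returns the most common direction and the most common inserted string.
def dominant_insertion(reads)
  raise ArgumentError, 'at least one read is required' if reads.empty?

  direction_counts = Hash.new(0)
  base_counts = Hash.new(0)
  reads.each do |direction, inserted|
    direction_counts[direction] += 1
    base_counts[inserted.upcase] += 1 # insertions are compared case-insensitively
  end
  max_direction = direction_counts['+'] > direction_counts['-'] ? '+' : '-'
  max_base = base_counts.max_by { |_base, count| count }.first
  [max_direction, max_base]
end

direction, base = dominant_insertion([['+', 'a'], ['+', 'A'], ['-', 'A'], ['+', 'G']])
puts "#{direction} #{base}"
```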
|
91
|
+
#log.debug "Picking max direction #{max_direction} and max base #{max_base}"
|
92
|
+
|
93
|
+
# Only accept positions that are inserting a single base
|
94
|
+
if max_base.length > 1
|
95
|
+
#log.debug "Maximal insertion is too long, so not calling a gag"
|
96
|
+
next
|
97
|
+
end
|
98
|
+
|
99
|
+
counted_inserts = inserting_reads.select do |read|
|
100
|
+
insert = read.insertions[first.pos]
|
101
|
+
insert ||= read.insertions[second.pos]
|
102
|
+
insert.upcase!
|
103
|
+
if read.direction == max_direction and insert == max_base
|
104
|
+
# Remove reads that don't match the consensus sequence at the first and third bases
|
105
|
+
read.sequence[read.sequence.length-1] == third.ref_base and
|
106
|
+
read.sequence[read.sequence.length-3] == first.ref_base
|
107
|
+
else
|
108
|
+
false
|
109
|
+
end
|
110
|
+
end
|
111
|
+
#log.debug "Reads counting after final filtering: #{counted_inserts.inspect}"
|
112
|
+
|
113
|
+
coverage = (first.coverage+second.coverage+third.coverage).to_f / 3.0
|
114
|
+
coverage_percent = counted_inserts.length.to_f / coverage
|
115
|
+
#log.debug "Final abundance calculations: max base #{max_base} (comparison base #{second.ref_base.upcase}) occurs #{counted_inserts.length} times compared to coverage #{coverage} (#{coverage_percent*100}%)"
|
116
|
+
if max_base != second.ref_base.upcase or # first and second bases must be the same
|
117
|
+
counted_inserts.length < min_disagreeing_absolute or # require 3 bases in that maximal direction
|
118
|
+
coverage_percent < min_disagreeing_proportion # at least 10% of reads disagree with the consensus and agree with the gag
|
119
|
+
#log.debug "Failed final abundance cutoffs, so not calling a gag"
|
120
|
+
next
|
121
|
+
end
|
122
|
+
|
123
|
+
# alright, gamut navigated. We have a match, record it
|
124
|
+
gag = Bio::Gag.new(second.pos, piles, first.ref_name)
|
125
|
+
gags.push gag
|
126
|
+
log.debug "Yielding gag #{gag.inspect}"
|
127
|
+
yield gag if block_given?
|
128
|
+
end
|
129
|
+
|
130
|
+
return gags
|
131
|
+
end
|
132
|
+
|
133
|
+
# Given a hash containing sequence identifier => sequences, where both key and value are plain old Ruby strings, return the hash with any GAG errors in the sequences fixed.
|
134
|
+
# If the sequence_id_to_gags argument is specified, gags are not predicted from the pileups. If specified, it should be a hash of reference sequence IDs to an array of Bio::Gag objects
|
135
|
+
def fix_gags(hash_of_sequence_ids_to_sequence_strings, sequence_id_to_gags={})
|
136
|
+
log = Bio::Log::LoggerPlus['bio-gag']
|
137
|
+
|
138
|
+
# Get the gags
|
139
|
+
if sequence_id_to_gags == {}
|
140
|
+
log.info "Predicting gags from the pileup"
|
141
|
+
gags do |gag|
|
142
|
+
sequence_id_to_gags[gag.ref_name] ||= []
|
143
|
+
sequence_id_to_gags[gag.ref_name].push gag
|
144
|
+
end
|
145
|
+
else
|
146
|
+
log.info "Using pre-specified GAG errors"
|
147
|
+
end
|
148
|
+
log.info "Found #{sequence_id_to_gags.values.flatten.length} gag errors to fix"
|
149
|
+
|
150
|
+
# Make sure all gag errors in the pileup map to a sequence in the input fasta file by keeping a tally
|
151
|
+
accounted_for_seq_ids = []
|
152
|
+
fixed_sequences = {} #Hash of sequence ids to sequences without gag errors
|
153
|
+
hash_of_sequence_ids_to_sequence_strings.each do |seq_id, seq|
|
154
|
+
log.debug "Now attempting to fix sequence #{seq_id}, sequence #{seq}"
|
155
|
+
toilet = sequence_id_to_gags[seq_id]
|
156
|
+
if toilet.nil?
|
157
|
+
# No gag errors found in this sequence (or pessimistically the sequence wasn't in the pileup -leaving that issue to the user though)
|
158
|
+
fixed_sequences[seq_id] = seq
|
159
|
+
else
|
160
|
+
# Gag error found at least once somewhere in this sequence
|
161
|
+
# Record that this was touched in the pileup
|
162
|
+
accounted_for_seq_ids.push seq_id
|
163
|
+
|
164
|
+
# Output the fixed-up sequence
|
165
|
+
last_gag = 0
|
166
|
+
fixed = ''
|
167
|
+
toilet.sort{|a,b| a.position<=>b.position}.each do |gag|
|
168
|
+
#log.debug "Attempting to fix gag at position #{gag.position} in sequence #{seq_id}, which is #{seq.length} bases long"
|
169
|
+
fixed = fixed+seq[last_gag..(gag.position-1)]
|
170
|
+
fixed = fixed+seq[(gag.position-1)..(gag.position-1)]
|
171
|
+
last_gag = gag.position
|
172
|
+
#log.debug "After fixing gag at position #{gag.position}, fixed sequence is now #{fixed}"
|
173
|
+
end
|
174
|
+
fixed = fixed+seq[last_gag..(seq.length-1)]
|
175
|
+
fixed_sequences[seq_id] = fixed
|
176
|
+
end
|
177
|
+
end
|
178
|
+
|
179
|
+
unless accounted_for_seq_ids.length == sequence_id_to_gags.length
|
180
|
+
log.warn "Unexpectedly found GAG errors in sequences that weren't among the sequences to be fixed: Found gags in #{sequence_id_to_gags.length}, but only fixed #{accounted_for_seq_ids.length}"
|
181
|
+
end
|
182
|
+
return fixed_sequences
|
183
|
+
end
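The string surgery at the heart of fix_gags duplicates the base at each gag position. It can be restated as a standalone function for a single sequence. This is a sketch that assumes 1-based positions lying within the sequence, and `insert_gag_bases` is a hypothetical name.

```ruby
# Duplicate the base at each 1-based position, so that a collapsed
# sequence such as GAG becomes GAAG again.
def insert_gag_bases(sequence, positions)
  fixed = +''
  last = 0
  positions.sort.each do |pos|
    fixed << sequence[last..(pos - 1)] # everything up to and including the gag base
    fixed << sequence[pos - 1]         # the base that was lost
    last = pos
  end
  fixed << sequence[last..-1]          # remainder after the final gag
end

puts insert_gag_bases('CGAG', [3])
```

Sorting the positions first mirrors the method above, so the caller may supply them in any order.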
|
184
|
+
end
|
185
|
+
|
186
|
+
class Bio::Gag
|
187
|
+
# The name of the reference sequence where the error was called
|
188
|
+
attr_accessor :ref_name
|
189
|
+
|
190
|
+
# Position in the reference sequence where the error was called
|
191
|
+
attr_accessor :position
|
192
|
+
|
193
|
+
# Bio::DB::Pileup objects around the GAG error
|
194
|
+
attr_accessor :gagging_pileups
|
195
|
+
|
196
|
+
# The base to be inserted. May be derived from @gagging_pileups if they have been specified
|
197
|
+
attr_writer :inserted_base
|
198
|
+
|
199
|
+
def initialize(position, gagging_pileups, ref_name)
|
200
|
+
@position = position
|
201
|
+
@gagging_pileups = gagging_pileups
|
202
|
+
@ref_name = ref_name
|
203
|
+
end
|
204
|
+
|
205
|
+
# The base to be inserted. May be manually specified in @inserted_base, otherwise it is the ref_base derived from @gagging_pileups at the inserted position
|
206
|
+
def inserted_base
|
207
|
+
if @inserted_base.nil?
|
208
|
+
@gagging_pileups[1].ref_base
|
209
|
+
else
|
210
|
+
@inserted_base
|
211
|
+
end
|
212
|
+
end
|
213
|
+
end
|
214
|
+
|
215
|
+
|
data/test/helper.rb
ADDED
@@ -0,0 +1,18 @@
|
|
1
|
+
require 'rubygems'
|
2
|
+
require 'bundler'
|
3
|
+
begin
|
4
|
+
Bundler.setup(:default, :development)
|
5
|
+
rescue Bundler::BundlerError => e
|
6
|
+
$stderr.puts e.message
|
7
|
+
$stderr.puts "Run `bundle install` to install missing gems"
|
8
|
+
exit e.status_code
|
9
|
+
end
|
10
|
+
require 'test/unit'
|
11
|
+
require 'shoulda'
|
12
|
+
|
13
|
+
$LOAD_PATH.unshift(File.join(File.dirname(__FILE__), '..', 'lib'))
|
14
|
+
$LOAD_PATH.unshift(File.dirname(__FILE__))
|
15
|
+
require 'bio-gag'
|
16
|
+
|
17
|
+
class Test::Unit::TestCase
|
18
|
+
end
|
data/test/test_bio-gag.rb
ADDED
@@ -0,0 +1,345 @@
|
|
1
|
+
require 'helper'
|
2
|
+
require 'tempfile'
|
3
|
+
require 'open3'
|
4
|
+
|
5
|
+
class TestBioGag < Test::Unit::TestCase
|
6
|
+
should "find_gag" do
|
7
|
+
test = "contig00091 4 C 32 ,,..,,......,,,.....,,.,,,,,,,., ~~I~~~u~u~t~~~~~~~~~~~~~~~~~~~~~
|
8
|
+
contig00091 5 G 32 ,,..,,......,,,.....,,.,,,,,,,., {{Ii{{iiii@i{{{iiiii{{i{{{{{{{i{
|
9
|
+
contig00091 6 A 33 ,,.$.+1A,,.+1A.+1A.+1A.+1A.+1A.+1A,,,.+1A.+1A.+1A.+1A.+1A,,.+1A,,,,,,,.+1A,^]. z{D${{$$$$!${{{$$$$${{${{{{{{{${E
|
10
|
+
contig00091 7 G 32 ,,.,,.....-1G.,,,.....,,.,,,,,,,.,. aaRaaRRRR&RaaaRRRRRaaRaaaaaaaRaU
|
11
|
+
contig00091 8 G 32 ,,.,,....*.,,,.....,,.,,,,,,,.,. aaRaaRRRRZRaaaRRRRRaaRaaaaaaaRaa".gsub(/ +/,"\t")
|
12
|
+
gags = Bio::DB::PileupIterator.new(test).gags
|
13
|
+
assert_equal [6], gags.collect{|g| g.position}
|
14
|
+
end
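The test data above encodes insertions in the read-bases column using the pileup notation `+<length><bases>`, for example `.+1A`. This sketch pulls those insertions out of such a column. It is a simplified reading: it does not treat a `+` that follows `^` as a mapping-quality character, and `insertions_in` is a hypothetical helper, not part of bio-pileup_iterator.

```ruby
# Return the inserted sequences, upper-cased, found in a pileup
# read-bases string.
def insertions_in(read_bases)
  found = []
  index = 0
  while (match = read_bases.match(/\+(\d+)/, index))
    length = match[1].to_i
    start = match.end(0)
    # Take exactly `length` characters, so a following mismatch base is not swallowed.
    found << read_bases[start, length].upcase
    index = start + length
  end
  found
end

puts insertions_in(',.+1A,.+2agT').inspect
```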
|
15
|
+
|
16
|
+
should "find_gag with first and third bases different, but whitelisted" do
|
17
|
+
test = "contig00091 4 C 32 ,,..,,......,,,.....,,.,,,,,,,., ~~I~~~u~u~t~~~~~~~~~~~~~~~~~~~~~
|
18
|
+
contig00091 5 G 32 ,,..,,......,,,.....,,.,,,,,,,., {{Ii{{iiii@i{{{iiiii{{i{{{{{{{i{
|
19
|
+
contig00091 6 C 33 ,,.$.+1C,,.+1C.+1C.+1C.+1C.+1C.+1C,,,.+1C.+1C.+1C.+1C.+1C,,.+1C,,,,,,,.+1C,^]. z{D${{$$$$!${{{$$$$${{${{{{{{{${E
|
20
|
+
contig00091 7 A 32 ,,.,,.....-1G.,,,.....,,.,,,,,,,.,. aaRaaRRRR&RaaaRRRRRaaRaaaaaaaRaU
|
21
|
+
contig00091 8 G 32 ,,.,,....*.,,,.....,,.,,,,,,,.,. aaRaaRRRRZRaaaRRRRRaaRaaaaaaaRaa".gsub(/ +/,"\t")
|
22
|
+
gags = Bio::DB::PileupIterator.new(test).gags
|
23
|
+
assert_equal [6], gags.collect{|g| g.position}
|
24
|
+
end
|
25
|
+
+  should "find no gag when XXX" do
+    test = "contig00091 4 C 32 ,,..,,......,,,.....,,.,,,,,,,., ~~I~~~u~u~t~~~~~~~~~~~~~~~~~~~~~
+contig00091 5 G 32 ,,..,,......,,,.....,,.,,,,,,,., {{Ii{{iiii@i{{{iiiii{{i{{{{{{{i{
+contig00091 6 G 33 ,,.$.+1A,,.+1A.+1A.+1A.+1A.+1A.+1A,,,.+1A.+1A.+1A.+1A.+1A,,.+1A,,,,,,,.+1A,^]. z{D${{$$$$!${{{$$$$${{${{{{{{{${E
+contig00091 7 G 32 ,,.,,.....-1G.,,,.....,,.,,,,,,,.,. aaRaaRRRR&RaaaRRRRRaaRaaaaaaaRaU
+contig00091 8 G 32 ,,.,,....*.,,,.....,,.,,,,,,,.,. aaRaaRRRRZRaaaRRRRRaaRaaaaaaaRaa".gsub(/ +/,"\t")
+    gags = Bio::DB::PileupIterator.new(test).gags
+    assert_equal [], gags.collect{|g| g.position}
+  end
+
+  should "find no gag when first and third bases are the same but aren't in the whitelist" do
+    test = "contig00091 4 C 32 ,,..,,......,,,.....,,.,,,,,,,., ~~I~~~u~u~t~~~~~~~~~~~~~~~~~~~~~
+contig00091 5 C 32 ,,..,,......,,,.....,,.,,,,,,,., {{Ii{{iiii@i{{{iiiii{{i{{{{{{{i{
+contig00091 6 A 33 ,,.$.+1A,,.+1A.+1A.+1A.+1A.+1A.+1A,,,.+1A.+1A.+1A.+1A.+1A,,.+1A,,,,,,,.+1A,^]. z{D${{$$$$!${{{$$$$${{${{{{{{{${E
+contig00091 7 C 32 ,,.,,.....-1G.,,,.....,,.,,,,,,,.,. aaRaaRRRR&RaaaRRRRRaaRaaaaaaaRaU
+contig00091 8 G 32 ,,.,,....*.,,,.....,,.,,,,,,,.,. aaRaaRRRRZRaaaRRRRRaaRaaaaaaaRaa".gsub(/ +/,"\t")
+    gags = Bio::DB::PileupIterator.new(test).gags
+    assert_equal [], gags.collect{|g| g.position}
+  end
+
+  should "fix gag" do
+    test = "contig00091 1 G 32 ,,..,,......,,,.....,,.,,,,,,,., {;c{{{l{l{l{{{{{{{{{{{{{{{{{{{{U
+contig00091 2 T 32 ,,.-1T.,,.-1T..-1T..-1T.,,,.....,,.,,,,,,,., a`$aaa!a!a!aaaaaaaaaaaaaaaaaaaaa
+contig00091 3 T 32 ,,*.,,*.*.*.,,,.....,,.,,,,,,,., a`Iaaauauataaaaaaaaaaaaaaaaaaaaa
+contig00091 4 C 32 ,,..,,......,,,.....,,.,,,,,,,., ~~I~~~u~u~t~~~~~~~~~~~~~~~~~~~~~
+contig00091 5 G 32 ,,..,,......,,,.....,,.,,,,,,,., {{Ii{{iiii@i{{{iiiii{{i{{{{{{{i{
+contig00091 6 A 33 ,,.$.+1A,,.+1A.+1A.+1A.+1A.+1A.+1A,,,.+1A.+1A.+1A.+1A.+1A,,.+1A,,,,,,,.+1A,^]. z{D${{$$$$!${{{$$$$${{${{{{{{{${E
+contig00091 7 G 32 ,,.,,.....-1G.,,,.....,,.,,,,,,,.,. aaRaaRRRR&RaaaRRRRRaaRaaaaaaaRaU
+contig00091 8 G 32 ,,.,,....*.,,,.....,,.,,,,,,,.,. aaRaaRRRRZRaaaRRRRRaaRaaaaaaaRaa
+contig00091 9 C 32 ,,.,,......,,,.....,,.,,,,,,,.,. ~~i~~~~~~Z~~~~~~~~~~~~~~~~~~~~~r
+contig00091 10 A 33 ,,.,,......,,,.....,,.,,,,,,,.,.^]. aaPaa^aaaYaaaaaaaaaaaaaaaaaaaaaaB".gsub(/ +/,"\t")
+    hash = {'contig00091' => 'GTTCGAGGC'}
+    expe = {'contig00091' => 'GTTCGAAGGC'}
+    assert_equal expe, gags = Bio::DB::PileupIterator.new(test).fix_gags(hash)
+  end
+
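The expected values in the `fix_gags` tests are consistent with a simple rule: the base at each 1-based GAG position is duplicated, so `GTTCGAGGC` with a gag at position 6 becomes `GTTCGAAGGC`. The sketch below reproduces that rule on plain strings. `duplicate_bases` is a hypothetical helper written for illustration; it is inferred from the test expectations and is not the gem's `fix_gags` implementation.

```ruby
# Hypothetical helper for illustration only (not the gem's fix_gags).
# Duplicates the base at each 1-based position of the sequence.
def duplicate_bases(sequence, positions)
  fixed = sequence.dup
  # Work from right to left so earlier insertions do not shift
  # the indices of positions that are still to be processed.
  positions.sort.reverse_each do |pos|
    unless pos.between?(1, sequence.length)
      raise ArgumentError, "position #{pos} is outside the sequence"
    end
    # String#insert places the text before the given 0-based index,
    # which is directly after the base at 1-based position pos.
    fixed.insert(pos, fixed[pos - 1])
  end
  fixed
end

single = duplicate_bases('GTTCGAGGC', [6])
double = duplicate_bases('GTTCGAGGAGGCA', [6, 9])
```

Processing positions from the right is a design choice that keeps the arithmetic simple when one sequence carries several gags.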
+  should "fix gag prespecified" do
+    test = "contig00091 1 G 32 ,,..,,......,,,.....,,.,,,,,,,., {;c{{{l{l{l{{{{{{{{{{{{{{{{{{{{U
+contig00091 2 T 32 ,,.-1T.,,.-1T..-1T..-1T.,,,.....,,.,,,,,,,., a`$aaa!a!a!aaaaaaaaaaaaaaaaaaaaa
+contig00091 3 T 32 ,,*.,,*.*.*.,,,.....,,.,,,,,,,., a`Iaaauauataaaaaaaaaaaaaaaaaaaaa
+contig00091 4 C 32 ,,..,,......,,,.....,,.,,,,,,,., ~~I~~~u~u~t~~~~~~~~~~~~~~~~~~~~~
+contig00091 5 G 32 ,,..,,......,,,.....,,.,,,,,,,., {{Ii{{iiii@i{{{iiiii{{i{{{{{{{i{
+contig00091 6 A 33 ,,.$.+1A,,.+1A.+1A.+1A.+1A.+1A.+1A,,,.+1A.+1A.+1A.+1A.+1A,,.+1A,,,,,,,.+1A,^]. z{D${{$$$$!${{{$$$$${{${{{{{{{${E
+contig00091 7 G 32 ,,.,,.....-1G.,,,.....,,.,,,,,,,.,. aaRaaRRRR&RaaaRRRRRaaRaaaaaaaRaU
+contig00091 8 G 32 ,,.,,....*.,,,.....,,.,,,,,,,.,. aaRaaRRRRZRaaaRRRRRaaRaaaaaaaRaa
+contig00091 9 C 32 ,,.,,......,,,.....,,.,,,,,,,.,. ~~i~~~~~~Z~~~~~~~~~~~~~~~~~~~~~r
+contig00091 10 A 33 ,,.,,......,,,.....,,.,,,,,,,.,.^]. aaPaa^aaaYaaaaaaaaaaaaaaaaaaaaaaB".gsub(/ +/,"\t")
+    hash = {'contig00091' => 'GTTCGAGGC'}
+    expe = {'contig00091' => 'GTTTCGAGGC'}
+    gag1 = Bio::Gag.new(2,nil,'contig00091')
+    gags = {'contig00091' => [gag1]}
+    assert_equal expe, gags = Bio::DB::PileupIterator.new(test).fix_gags(hash, gags)
+  end
+
+  should "fix gag prespecified in 2 seqs" do
+    hash = {'contig00091' => 'GTTCGAGGC',
+      'contig00092' => 'GAGTTCGAGGC'}
+    expe = {'contig00091' => 'GTTTCGAGGC',
+      'contig00092' => 'GAGTTCGAGGC'}
+
+    gag1 = Bio::Gag.new(2,nil,'contig00091')
+    gags = {'contig00091' => [gag1]}
+    assert_equal expe, gags = Bio::DB::PileupIterator.new('').fix_gags(hash, gags)
+
+    gag2 = Bio::Gag.new(8,nil,'contig00092')
+    gags = {'contig00091' => [gag1], 'contig00092' => [gag2]}
+    expe = {'contig00091' => 'GTTTCGAGGC',
+      'contig00092' => 'GAGTTCGAAGGC'}
+    assert_equal expe, gags = Bio::DB::PileupIterator.new('').fix_gags(hash, gags)
+  end
+
+  should "fix 2 gags" do
+    test = "contig00091 1 G 32 ,,..,,......,,,.....,,.,,,,,,,., {;c{{{l{l{l{{{{{{{{{{{{{{{{{{{{U
+contig00091 2 T 32 ,,.-1T.,,.-1T..-1T..-1T.,,,.....,,.,,,,,,,., a`$aaa!a!a!aaaaaaaaaaaaaaaaaaaaa
+contig00091 3 T 32 ,,*.,,*.*.*.,,,.....,,.,,,,,,,., a`Iaaauauataaaaaaaaaaaaaaaaaaaaa
+contig00091 4 C 32 ,,..,,......,,,.....,,.,,,,,,,., ~~I~~~u~u~t~~~~~~~~~~~~~~~~~~~~~
+contig00091 5 G 32 ,,..,,......,,,.....,,.,,,,,,,., {{Ii{{iiii@i{{{iiiii{{i{{{{{{{i{
+contig00091 6 A 33 ,,..+1A,,.+1A.+1A.+1A.+1A.+1A.+1A,,,.+1A.+1A.+1A.+1A.+1A,,.+1A,,,,,,,.+1A,^]. z{D${{$$$$!${{{$$$$${{${{{{{{{${E
+contig00091 7 G 32 ,,..,,.....-1G.,,,.....,,.,,,,,,,.,. aaRaaRRRR&RaaaRRRRRaaRaaaaaaaRaU
+contig00091 8 G 32 ,,..,,......,,,.....,,.,,,,,,,.,. {{Ii{{iiii@i{{{iiiii{{i{{{{{{{i{
+contig00091 9 A 33 ,,.$.+1A,,.+1A.+1A.+1A.+1A.+1A.+1A,,,.+1A.+1A.+1A.+1A.+1A,,.+1A,,,,,,,.+1A,. z{D${{$$$$!${{{$$$$${{${{{{{{{${E
+contig00091 10 G 32 ,,.,,.....-1G.,,,.....,,.,,,,,,,.,. aaRaaRRRR&RaaaRRRRRaaRaaaaaaaRaU
+contig00091 11 G 32 ,,.,,....*.,,,.....,,.,,,,,,,.,. aaRaaRRRRZRaaaRRRRRaaRaaaaaaaRaa
+contig00091 12 C 32 ,,.,,......,,,.....,,.,,,,,,,.,. ~~i~~~~~~Z~~~~~~~~~~~~~~~~~~~~~r
+contig00091 13 A 33 ,,.,,......,,,.....,,.,,,,,,,.,.^]. aaPaa^aaaYaaaaaaaaaaaaaaaaaaaaaaB".gsub(/ +/,"\t")
+
+    hash = {'contig00091' => 'GTTCGAGGAGGCA'}
+    expe = {'contig00091' => 'GTTCGAAGGAAGGCA'}
+    assert_equal expe, gags = Bio::DB::PileupIterator.new(test).fix_gags(hash)
+  end
+
+  should "run gagger predict ok" do
+    test = "contig00091 4 C 32 ,,..,,......,,,.....,,.,,,,,,,., ~~I~~~u~u~t~~~~~~~~~~~~~~~~~~~~~
+contig00091 5 G 32 ,,..,,......,,,.....,,.,,,,,,,., {{Ii{{iiii@i{{{iiiii{{i{{{{{{{i{
+contig00091 6 A 33 ,,.$.+1A,,.+1A.+1A.+1A.+1A.+1A.+1A,,,.+1A.+1A.+1A.+1A.+1A,,.+1A,,,,,,,.+1A,^]. z{D${{$$$$!${{{$$$$${{${{{{{{{${E
+contig00091 7 G 32 ,,.,,.....-1G.,,,.....,,.,,,,,,,.,. aaRaaRRRR&RaaaRRRRRaaRaaaaaaaRaU
+contig00091 8 G 32 ,,.,,....*.,,,.....,,.,,,,,,,.,. aaRaaRRRRZRaaaRRRRRaaRaaaaaaaRaa".gsub(/ +/,"\t")
+    command = File.join([File.dirname(__FILE__),%w(.. bin gag)].flatten)+ ' --trace info'
+    out = nil
+    err = nil
+    Open3.popen3(command) do |stdin, stdout, stderr|
+      stdin.puts test
+      stdin.close
+      out = stdout.readlines
+      err = stderr.readlines
+    end
+    assert_equal [], err
+    assert_equal [
+      "ref_name\tposition\tinserted_base\tcontext\n",
+      "contig00091\t6\tA\tGAG\n"
+    ], out
+  end
+
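The command-line tests above feed pileup text to the script's stdin with `Open3.popen3`, then read stdout to the end before reading stderr. With that ordering a child process that writes a lot to stderr can fill its pipe and stall. A minimal alternative sketch uses `Open3.capture3`, which takes the input through its `stdin_data` option and collects both streams for the caller. `run_with_stdin` is a hypothetical helper name, and the child command here is a throwaway Ruby one-liner rather than `bin/gag`.

```ruby
require 'open3'
require 'rbconfig'

# Hypothetical helper for illustration only: run a command, send it
# the given text on stdin, and return stdout lines, stderr lines and
# the exit status.
def run_with_stdin(command, input)
  stdout, stderr, status = Open3.capture3(*command, stdin_data: input)
  [stdout.lines.to_a, stderr.lines.to_a, status]
end

# Stand-in child process: upper-case whatever arrives on stdin.
child = [RbConfig.ruby, '-e', 'print STDIN.read.upcase']
out, err, status = run_with_stdin(child, "acgt\n")
```

The returned line arrays have the same shape as the `out` and `err` values the tests compare against, so the assertions would not need to change.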
+  should "run gagger fix ok without gags pre-specified" do
+    test = "contig00091 1 G 32 ,,..,,......,,,.....,,.,,,,,,,., {;c{{{l{l{l{{{{{{{{{{{{{{{{{{{{U
+contig00091 2 T 32 ,,.-1T.,,.-1T..-1T..-1T.,,,.....,,.,,,,,,,., a`$aaa!a!a!aaaaaaaaaaaaaaaaaaaaa
+contig00091 3 T 32 ,,*.,,*.*.*.,,,.....,,.,,,,,,,., a`Iaaauauataaaaaaaaaaaaaaaaaaaaa
+contig00091 4 C 32 ,,..,,......,,,.....,,.,,,,,,,., ~~I~~~u~u~t~~~~~~~~~~~~~~~~~~~~~
+contig00091 5 G 32 ,,..,,......,,,.....,,.,,,,,,,., {{Ii{{iiii@i{{{iiiii{{i{{{{{{{i{
+contig00091 6 A 33 ,,..+1A,,.+1A.+1A.+1A.+1A.+1A.+1A,,,.+1A.+1A.+1A.+1A.+1A,,.+1A,,,,,,,.+1A,^]. z{D${{$$$$!${{{$$$$${{${{{{{{{${E
+contig00091 7 G 32 ,,..,,.....-1G.,,,.....,,.,,,,,,,.,. aaRaaRRRR&RaaaRRRRRaaRaaaaaaaRaU
+contig00091 8 G 32 ,,..,,......,,,.....,,.,,,,,,,.,. {{Ii{{iiii@i{{{iiiii{{i{{{{{{{i{
+contig00091 9 A 33 ,,.$.+1A,,.+1A.+1A.+1A.+1A.+1A.+1A,,,.+1A.+1A.+1A.+1A.+1A,,.+1A,,,,,,,.+1A,. z{D${{$$$$!${{{$$$$${{${{{{{{{${E
+contig00091 10 G 32 ,,.,,.....-1G.,,,.....,,.,,,,,,,.,. aaRaaRRRR&RaaaRRRRRaaRaaaaaaaRaU
+contig00091 11 G 32 ,,.,,....*.,,,.....,,.,,,,,,,.,. aaRaaRRRRZRaaaRRRRRaaRaaaaaaaRaa
+contig00091 12 C 32 ,,.,,......,,,.....,,.,,,,,,,.,. ~~i~~~~~~Z~~~~~~~~~~~~~~~~~~~~~r
+contig00091 13 A 33 ,,.,,......,,,.....,,.,,,,,,,.,.^]. aaPaa^aaaYaaaaaaaaaaaaaaaaaaaaaaB".gsub(/ +/,"\t")
+    Tempfile.open('test_gag_fix') do |tempfile|
+      tempfile.puts '>contig00091'
+      tempfile.puts 'GTTCGAGGAGGCA'
+      tempfile.close
+
+      command = File.join([File.dirname(__FILE__),%w(.. bin gag)].flatten)+' --trace error --fix '+tempfile.path
+      out = nil
+      err = nil
+      Open3.popen3(command) do |stdin, stdout, stderr|
+        stdin.puts test
+        stdin.close
+        out = stdout.readlines
+        err = stderr.readlines
+      end
+      assert_equal [], err
+      assert_equal [
+        ">contig00091\n",
+        "GTTCGAAGGAAGGCA\n"
+      ], out
+    end
+  end
+
+  should "run gagger fix ok with fasta comments" do
+    test = "contig00091 1 G 32 ,,..,,......,,,.....,,.,,,,,,,., {;c{{{l{l{l{{{{{{{{{{{{{{{{{{{{U
+contig00091 2 T 32 ,,.-1T.,,.-1T..-1T..-1T.,,,.....,,.,,,,,,,., a`$aaa!a!a!aaaaaaaaaaaaaaaaaaaaa
+contig00091 3 T 32 ,,*.,,*.*.*.,,,.....,,.,,,,,,,., a`Iaaauauataaaaaaaaaaaaaaaaaaaaa
+contig00091 4 C 32 ,,..,,......,,,.....,,.,,,,,,,., ~~I~~~u~u~t~~~~~~~~~~~~~~~~~~~~~
+contig00091 5 G 32 ,,..,,......,,,.....,,.,,,,,,,., {{Ii{{iiii@i{{{iiiii{{i{{{{{{{i{
+contig00091 6 A 33 ,,..+1A,,.+1A.+1A.+1A.+1A.+1A.+1A,,,.+1A.+1A.+1A.+1A.+1A,,.+1A,,,,,,,.+1A,^]. z{D${{$$$$!${{{$$$$${{${{{{{{{${E
+contig00091 7 G 32 ,,..,,.....-1G.,,,.....,,.,,,,,,,.,. aaRaaRRRR&RaaaRRRRRaaRaaaaaaaRaU
+contig00091 8 G 32 ,,..,,......,,,.....,,.,,,,,,,.,. {{Ii{{iiii@i{{{iiiii{{i{{{{{{{i{
+contig00091 9 A 33 ,,.$.+1A,,.+1A.+1A.+1A.+1A.+1A.+1A,,,.+1A.+1A.+1A.+1A.+1A,,.+1A,,,,,,,.+1A,. z{D${{$$$$!${{{$$$$${{${{{{{{{${E
+contig00091 10 G 32 ,,.,,.....-1G.,,,.....,,.,,,,,,,.,. aaRaaRRRR&RaaaRRRRRaaRaaaaaaaRaU
+contig00091 11 G 32 ,,.,,....*.,,,.....,,.,,,,,,,.,. aaRaaRRRRZRaaaRRRRRaaRaaaaaaaRaa
+contig00091 12 C 32 ,,.,,......,,,.....,,.,,,,,,,.,. ~~i~~~~~~Z~~~~~~~~~~~~~~~~~~~~~r
+contig00091 13 A 33 ,,.,,......,,,.....,,.,,,,,,,.,.^]. aaPaa^aaaYaaaaaaaaaaaaaaaaaaaaaaB".gsub(/ +/,"\t")
+    Tempfile.open('test_gag_fix') do |tempfile|
+      tempfile.puts '>contig00091 with comment'
+      tempfile.puts 'GTTCGAGGAGGCA'
+      tempfile.close
+
+      command = File.join([File.dirname(__FILE__),%w(.. bin gag)].flatten)+' --trace error --fix '+tempfile.path
+      out = nil
+      err = nil
+      Open3.popen3(command) do |stdin, stdout, stderr|
+        stdin.puts test
+        stdin.close
+        out = stdout.readlines
+        err = stderr.readlines
+      end
+      assert_equal [], err
+      assert_equal [
+        ">contig00091\n",
+        "GTTCGAAGGAAGGCA\n"
+      ], out
+    end
+  end
+
+  should "run gagger fix when some sequences don't have gag errors" do
+    test = "contig00091 1 C 32 ,,..,,......,,,.....,,.,,,,,,,., ~~I~~~u~u~t~~~~~~~~~~~~~~~~~~~~~
+contig00091 2 G 32 ,,..,,......,,,.....,,.,,,,,,,., {{Ii{{iiii@i{{{iiiii{{i{{{{{{{i{
+contig00091 3 A 33 ,,.$.+1A,,.+1A.+1A.+1A.+1A.+1A.+1A,,,.+1A.+1A.+1A.+1A.+1A,,.+1A,,,,,,,.+1A,^]. z{D${{$$$$!${{{$$$$${{${{{{{{{${E
+contig00091 4 G 32 ,,.,,.....-1G.,,,.....,,.,,,,,,,.,. aaRaaRRRR&RaaaRRRRRaaRaaaaaaaRaU
+contig00091 5 G 32 ,,.,,....*.,,,.....,,.,,,,,,,.,. aaRaaRRRRZRaaaRRRRRaaRaaaaaaaRaa".gsub(/ +/,"\t")
+
+    Tempfile.open('test_gag_fix') do |tempfile|
+      tempfile.puts '>contig00091 with comment'
+      tempfile.puts 'CGAGG'
+      tempfile.puts '>contig00092'
+      tempfile.puts 'ATGC'
+      tempfile.close
+
+      command = File.join([File.dirname(__FILE__),%w(.. bin gag)].flatten)+ ' --trace error --fix '+tempfile.path
+      out = nil
+      err = nil
+      Open3.popen3(command) do |stdin, stdout, stderr|
+        stdin.puts test
+        stdin.close
+        out = stdout.readlines
+        err = stderr.readlines
+      end
+      assert_equal [], err
+      assert_equal [
+        ">contig00091\n",
+        "CGAAGG\n",
+        ">contig00092\n",
+        "ATGC\n"
+      ], out
+    end
+  end
+
+
+  should "run gagger fix ok, but warn, when there are fewer sequences than gag errors" do
+    test = "contig00091 1 C 32 ,,..,,......,,,.....,,.,,,,,,,., ~~I~~~u~u~t~~~~~~~~~~~~~~~~~~~~~
+contig00091 2 G 32 ,,..,,......,,,.....,,.,,,,,,,., {{Ii{{iiii@i{{{iiiii{{i{{{{{{{i{
+contig00091 3 A 33 ,,.$.+1A,,.+1A.+1A.+1A.+1A.+1A.+1A,,,.+1A.+1A.+1A.+1A.+1A,,.+1A,,,,,,,.+1A,^]. z{D${{$$$$!${{{$$$$${{${{{{{{{${E
+contig00091 4 G 32 ,,.,,.....-1G.,,,.....,,.,,,,,,,.,. aaRaaRRRR&RaaaRRRRRaaRaaaaaaaRaU
+contig00091 5 G 32 ,$,$.$,$,$.$.$.$.$*$.$,$,$,$.$.$.$.$.$,$,$.$,$,$,$,$,$,$,$.$,$.$ aaRaaRRRRZRaaaRRRRRaaRaaaaaaaRaa
+contig00090 1 C 32 ,,..,,......,,,.....,,.,,,,,,,., ~~I~~~u~u~t~~~~~~~~~~~~~~~~~~~~~
+contig00090 2 G 32 ,,..,,......,,,.....,,.,,,,,,,., {{Ii{{iiii@i{{{iiiii{{i{{{{{{{i{
+contig00090 3 A 33 ,,.$.+1A,,.+1A.+1A.+1A.+1A.+1A.+1A,,,.+1A.+1A.+1A.+1A.+1A,,.+1A,,,,,,,.+1A,^]. z{D${{$$$$!${{{$$$$${{${{{{{{{${E
+contig00090 4 G 32 ,,.,,.....-1G.,,,.....,,.,,,,,,,.,. aaRaaRRRR&RaaaRRRRRaaRaaaaaaaRaU
+contig00090 5 G 32 ,,.,,....*.,,,.....,,.,,,,,,,.,. aaRaaRRRRZRaaaRRRRRaaRaaaaaaaRaa".gsub(/ +/,"\t")
+
+    Tempfile.open('test_gag_fix') do |tempfile|
+      tempfile.puts '>contig00091 with comment'
+      tempfile.puts 'CGAGG'
+      tempfile.close
+
+      command = File.join([File.dirname(__FILE__),%w(.. bin gag)].flatten)+ ' --trace warn --fix '+tempfile.path
+      out = nil
+      err = nil
+      Open3.popen3(command) do |stdin, stdout, stderr|
+        stdin.puts test
+        stdin.close
+        out = stdout.readlines
+        err = stderr.readlines
+      end
+      assert_equal [" WARN bio-gag: Unexpectedly found GAG errors in sequences that weren't in the sequence that are to be fixed: Found gags in 2, but only fixed 1\n"], err
+      assert_equal [
+        ">contig00091\n",
+        "CGAAGG\n",
+      ], out
+    end
+  end
+
+  should "run gagger with --debug without any big problems" do
+    test = "contig00091 1 C 32 ,,..,,......,,,.....,,.,,,,,,,., ~~I~~~u~u~t~~~~~~~~~~~~~~~~~~~~~
+contig00091 2 G 32 ,,..,,......,,,.....,,.,,,,,,,., {{Ii{{iiii@i{{{iiiii{{i{{{{{{{i{
+contig00091 3 A 33 ,,.$.+1A,,.+1A.+1A.+1A.+1A.+1A.+1A,,,.+1A.+1A.+1A.+1A.+1A,,.+1A,,,,,,,.+1A,^]. z{D${{$$$$!${{{$$$$${{${{{{{{{${E
+contig00091 4 G 32 ,,.,,.....-1G.,,,.....,,.,,,,,,,.,. aaRaaRRRR&RaaaRRRRRaaRaaaaaaaRaU
+contig00091 5 G 32 ,$,$.$,$,$.$.$.$.$*$.$,$,$,$.$.$.$.$.$,$,$.$,$,$,$,$,$,$,$.$,$.$ aaRaaRRRRZRaaaRRRRRaaRaaaaaaaRaa
+contig00090 1 C 32 ,,..,,......,,,.....,,.,,,,,,,., ~~I~~~u~u~t~~~~~~~~~~~~~~~~~~~~~
+contig00090 2 G 32 ,,..,,......,,,.....,,.,,,,,,,., {{Ii{{iiii@i{{{iiiii{{i{{{{{{{i{
+contig00090 3 A 33 ,,.$.+1A,,.+1A.+1A.+1A.+1A.+1A.+1A,,,.+1A.+1A.+1A.+1A.+1A,,.+1A,,,,,,,.+1A,^]. z{D${{$$$$!${{{$$$$${{${{{{{{{${E
+contig00090 4 G 32 ,,.,,.....-1G.,,,.....,,.,,,,,,,.,. aaRaaRRRR&RaaaRRRRRaaRaaaaaaaRaU
+contig00090 5 G 32 ,,.,,....*.,,,.....,,.,,,,,,,.,. aaRaaRRRRZRaaaRRRRRaaRaaaaaaaRaa".gsub(/ +/,"\t")
+
+    Tempfile.open('test_gag_fix') do |tempfile|
+      tempfile.puts '>contig00091 with comment'
+      tempfile.puts 'CGAGG'
+      tempfile.close
+
+      command = File.join([File.dirname(__FILE__),%w(.. bin gag)].flatten)+ ' --trace debug --fix '+tempfile.path
+      out = nil
+      err = nil
+      Open3.popen3(command) do |stdin, stdout, stderr|
+        stdin.puts test
+        stdin.close
+        out = stdout.readlines
+        err = stderr.readlines
+      end
+      assert err.length > 1, "expected more errors"
+      assert_equal [
+        ">contig00091\n",
+        "CGAAGG\n",
+      ], out
+    end
+  end
+
+  should "run gagger fix ok with prespecified gags" do
+    test = ""
+    Tempfile.open('test_gag_fix') do |tempfile|
+      tempfile.puts '>contig00091'
+      tempfile.puts 'GTTCGAGGAGGCA'
+      tempfile.close
+
+      Tempfile.open('gags_prespecified') do |gags_file|
+        gags_file.puts %w(ref_name position inserted_base context).join("\t")
+        gags_file.puts %w(contig00091 2 G CTC).join("\t")
+        gags_file.puts %w(contig00091 4 G CTC).join("\t")
+        gags_file.close
+
+        command = File.join([File.dirname(__FILE__),%w(.. bin gag)].flatten)+" --trace error --fix #{tempfile.path} --gags #{gags_file.path}"
+        out = nil
+        err = nil
+        Open3.popen3(command) do |stdin, stdout, stderr|
+          stdin.puts test
+          stdin.close
+          out = stdout.readlines
+          err = stderr.readlines
+        end
+        assert_equal [], err
+        assert_equal [
+          ">contig00091\n",
+          "GTTTCCGAGGAGGCA\n"
+        ], out
+
+      end
+    end
+  end
+
+end
metadata
ADDED
@@ -0,0 +1,154 @@
+--- !ruby/object:Gem::Specification
+name: bio-gag
+version: !ruby/object:Gem::Version
+  version: 0.0.1
+prerelease:
+platform: ruby
+authors:
+- Ben J Woodcroft
+autorequire:
+bindir: bin
+cert_chain: []
+date: 2012-05-17 00:00:00.000000000 Z
+dependencies:
+- !ruby/object:Gem::Dependency
+  name: bio-pileup_iterator
+  requirement: &86055840 !ruby/object:Gem::Requirement
+    none: false
+    requirements:
+    - - ! '>='
+      - !ruby/object:Gem::Version
+        version: 0.0.1
+  type: :runtime
+  prerelease: false
+  version_requirements: *86055840
+- !ruby/object:Gem::Dependency
+  name: bio-logger
+  requirement: &86055510 !ruby/object:Gem::Requirement
+    none: false
+    requirements:
+    - - ! '>='
+      - !ruby/object:Gem::Version
+        version: 1.0.0
+  type: :runtime
+  prerelease: false
+  version_requirements: *86055510
+- !ruby/object:Gem::Dependency
+  name: shoulda
+  requirement: &86055220 !ruby/object:Gem::Requirement
+    none: false
+    requirements:
+    - - ! '>='
+      - !ruby/object:Gem::Version
+        version: '0'
+  type: :development
+  prerelease: false
+  version_requirements: *86055220
+- !ruby/object:Gem::Dependency
+  name: rdoc
+  requirement: &86054930 !ruby/object:Gem::Requirement
+    none: false
+    requirements:
+    - - ~>
+      - !ruby/object:Gem::Version
+        version: '3.12'
+  type: :development
+  prerelease: false
+  version_requirements: *86054930
+- !ruby/object:Gem::Dependency
+  name: bundler
+  requirement: &86054570 !ruby/object:Gem::Requirement
+    none: false
+    requirements:
+    - - ! '>='
+      - !ruby/object:Gem::Version
+        version: 1.0.0
+  type: :development
+  prerelease: false
+  version_requirements: *86054570
+- !ruby/object:Gem::Dependency
+  name: jeweler
+  requirement: &86054260 !ruby/object:Gem::Requirement
+    none: false
+    requirements:
+    - - ~>
+      - !ruby/object:Gem::Version
+        version: 1.8.3
+  type: :development
+  prerelease: false
+  version_requirements: *86054260
+- !ruby/object:Gem::Dependency
+  name: bio
+  requirement: &86053790 !ruby/object:Gem::Requirement
+    none: false
+    requirements:
+    - - ! '>='
+      - !ruby/object:Gem::Version
+        version: 1.4.2
+  type: :development
+  prerelease: false
+  version_requirements: *86053790
+- !ruby/object:Gem::Dependency
+  name: rdoc
+  requirement: &86053100 !ruby/object:Gem::Requirement
+    none: false
+    requirements:
+    - - ~>
+      - !ruby/object:Gem::Version
+        version: '3.12'
+  type: :development
+  prerelease: false
+  version_requirements: *86053100
+description: bio-gag is a biogem for detecting and correcting a particular type of
+  error that occurs/occurred in particular versions of the IonTorrent DNA sequencing
+  kit. Recent versions of the system don't appear to suffer the same problem
+email: gmail.com after donttrustben
+executables:
+- gag
+extensions: []
+extra_rdoc_files:
+- LICENSE.txt
+- README.rdoc
+files:
+- .document
+- .travis.yml
+- Gemfile
+- LICENSE.txt
+- README.rdoc
+- Rakefile
+- VERSION
+- bin/gag
+- lib/bio-gag.rb
+- lib/bio/db/gag.rb
+- test/helper.rb
+- test/test_bio-gag.rb
+homepage: http://github.com/wwood/bioruby-gag
+licenses:
+- MIT
+post_install_message:
+rdoc_options: []
+require_paths:
+- lib
+required_ruby_version: !ruby/object:Gem::Requirement
+  none: false
+  requirements:
+  - - ! '>='
+    - !ruby/object:Gem::Version
+      version: '0'
+  segments:
+  - 0
+  hash: 567820925
+required_rubygems_version: !ruby/object:Gem::Requirement
+  none: false
+  requirements:
+  - - ! '>='
+    - !ruby/object:Gem::Version
+      version: '0'
+requirements: []
+rubyforge_project:
+rubygems_version: 1.8.17
+signing_key:
+specification_version: 3
+summary: bio-gag is a biogem for detecting and correcting a particular type of error
+  that occurs/occurred in particular versions of the IonTorrent DNA sequencing kit
+test_files: []
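The dependency entries in the metadata use two kinds of version constraint: `>=` (any version at or above the one given) and `~>` (the pessimistic operator, which also caps the version below the next increment of the second-to-last digit given). A small sketch with RubyGems' own `Gem::Requirement` and `Gem::Version` classes shows how the constraints listed above behave. `satisfies?` is a hypothetical wrapper name used only here.

```ruby
require 'rubygems'

# Hypothetical wrapper for illustration: does `version` meet `requirement`?
def satisfies?(requirement, version)
  Gem::Requirement.new(requirement).satisfied_by?(Gem::Version.new(version))
end

# rdoc "~> 3.12" accepts 3.12 and later 3.x releases, but not 4.0.
rdoc_minor_ok = satisfies?('~> 3.12', '3.99')
rdoc_major_ok = satisfies?('~> 3.12', '4.0')

# jeweler "~> 1.8.3" accepts later 1.8.x releases, but not 1.9.0.
jeweler_patch_ok = satisfies?('~> 1.8.3', '1.8.9')
jeweler_minor_ok = satisfies?('~> 1.8.3', '1.9.0')

# bio-logger ">= 1.0.0" has no upper bound.
logger_ok = satisfies?('>= 1.0.0', '2.5.0')
```

The version numbers passed in are made up for the demonstration; only the requirement strings come from the metadata.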