bio-pileup_iterator 0.0.1 → 0.0.3
- data/README.md +69 -0
- data/VERSION +1 -1
- data/lib/bio/db/pileup_iterator.rb +49 -157
- data/test/test_bio-pileup_iterator.rb +45 -31
- metadata +23 -23
- data/README.rdoc +0 -39
data/README.md
ADDED
@@ -0,0 +1,69 @@
+# bio-pileup_iterator
+
+[Pileup format](http://samtools.sourceforge.net/pileup.shtml) files are a representation of an alignment/mapping of reads to a reference. This biogem builds on the [bio-samtools biogem](https://github.com/helios/bioruby-samtools) to enable developers to iterate through the columns of a pileup format file and interrogate possible polymorphisms, e.g. for SNP detection. Say we have pileup lines like so:
+
+    contig00001 199 A 4 .$...$ >a^>
+    contig00001 200 T 2 ..+1A aR
+
+i.e.
+
+    line = "contig00001\t199\tA\t4\t.$...$\t>a^>\ncontig00001\t200\tT\t2\t..+1A\taR"
+
+Then
+
+    piles = Bio::DB::PileupIterator.new(line).to_a
+    piles[0].reads #=> An array of 4 pileup reads (Bio::DB::PileupIterator::PileupRead objects)
+
+The first read ends at the first position:
+
+    piles[0].reads[0].sequence #=> 'A'
+
+The second read covers both positions:
+
+    piles[0].reads[1].sequence #=> 'AT'
+
+Note that when you don't use ```to_a```, instead using ```Bio::DB::PileupIterator#each```, there is no "lookahead" (yet), so it doesn't find the T before it has iterated over it:
+
+    Bio::DB::PileupIterator.new(line).each{|pile| puts pile.reads[1].sequence if pile.pos==199} #=> "A"
+
+Directions
+
+    piles[0].reads[1].direction #=> '+'
+
+Insertions
+
+    piles[1].reads[1].insertions #=> {200=>"A"}
+
+Apologies in advance for any missing features (e.g. currently it does not handle deletions) and slowness (it wasn't really written with speed in mind).
+
+## Installation
+
+    gem install bio-pileup_iterator
+
+## Developers
+
+To use the library
+
+    require 'bio-pileup_iterator'
+
+The API doc is online. For more code examples see also the test files in
+the source tree.
+
+## Project home page
+
+For information on the source tree, documentation, issues and how to contribute, see
+
+    http://github.com/wwood/bioruby-pileup_iterator
+
+## Cite
+
+If you use this software, please cite http://dx.doi.org/10.1093/bioinformatics/btq475
+
+## Biogems.info
+
+This Biogem is published at http://biogems.info/index.html#bio-pileup_iterator
+
+## Copyright
+
+Copyright (c) 2012 Ben J. Woodcroft. See LICENSE.txt for further details.
+
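The "no lookahead" note in the README can be illustrated without the gem. The following is a minimal sketch, assuming (as the README's outputs suggest) that one read object is shared by every column it covers and is extended as each column is parsed. `toy_columns` is a hypothetical helper, not part of bio-pileup_iterator, and reads are modelled as plain mutable strings.

```ruby
# Hypothetical model: each read is a mutable String shared by every column it covers.
# Assumes all reads span all columns, which keeps the sketch short.
def toy_columns(ref_bases, start_pos, read_count)
  reads = Array.new(read_count) { String.new }
  ref_bases.each_char.with_index do |base, offset|
    reads.each { |read| read << base } # extend every read in place
    yield start_pos + offset, reads
  end
end

# Streaming: the read is inspected before the second column has been parsed.
seen_while_streaming = nil
toy_columns('AT', 199, 2) do |pos, reads|
  seen_while_streaming = reads[1].dup if pos == 199
end

# Collecting every column first (the role to_a plays in the README),
# then inspecting the read stored with the first column.
collected = []
toy_columns('AT', 199, 2) { |pos, reads| collected << [pos, reads] }
seen_after_collecting = collected[0][1][1]

puts seen_while_streaming  # A
puts seen_after_collecting # AT
```

Because the strings are shared rather than copied, the value seen depends only on how many columns have been parsed at the moment of inspection.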
data/VERSION
CHANGED
@@ -1 +1 @@
-0.0.
+0.0.3
data/lib/bio/db/pileup_iterator.rb
CHANGED
@@ -21,45 +21,42 @@ class Bio::DB::PileupIterator
   # Known problems:
   # * Doesn't record start or ends of each read
   # * Doesn't lookahead to determine the sequence of each read (though it does give the preceding bases)
-  # * Gives no information with mismatches
   def each
     current_ordered_reads = []
     log = Bio::Log::LoggerPlus['bio-pileup_iterator']
+    logging = true
 
     @io.each_line do |line|
-      #log.debug "new current_line: #{line.inspect}"
       pileup = Bio::DB::Pileup.new(line.strip)
       current_read_index = 0
       reads_ending = []
 
       bases = pileup.read_bases
-
-
+      log.debug "new column's read_bases: #{bases.inspect}" if log.debug?
+      log.debug "pileup entry parsed: #{pileup.inspect}" if log.debug?
       while bases.length > 0
-        #log.debug "bases remaining: #{bases} ------------------------"
 
         # Firstly, what is the current read we are working with
         current_read = current_ordered_reads[current_read_index]
         # if adding a new read
         if current_read.nil?
-
+          log.debug 'adding a new read: '+bases if log.debug?
           current_read = PileupRead.new
           current_ordered_reads.push current_read
-        else
-          #log.debug 'reusing a read'
         end
         matches = nil
-
-        #
-
-
-
-
-
-
-
-
-
+
+        # if starting, remove it
+        matched_string = ''
+        if bases[0..1]=='^]'
+          matched_string += bases[0]
+          bases = bases[2...bases.length]
+        end
+        log.debug "after read start removal, pileup is #{bases}" if log.debug?
+
+        # next expect the actual base bit
+        if matches = bases.match(/^([ACGTNacgtn\.\,\*])/)
+          matched_string += bases[0]
           if matches[1] == '.'
             raise if !current_read.direction.nil? and current_read.direction != PileupRead::FORWARD_DIRECTION
             current_read.direction = PileupRead::FORWARD_DIRECTION
@@ -72,152 +69,47 @@ class Bio::DB::PileupIterator
             # Could sanity check the direction here by detecting case, but eh
             current_read.sequence = "#{current_read.sequence}#{matches[1]}"
           end
-
-
-
-
-
-
-          if matches[5].length > 0
-            #log.debug "Ending this read"
-            # end this read
-            reads_ending.push current_read_index
-          end
-          # currently I don't care about indels, except for the direction, so I'll leave it at that for now
-
-        # end of the read
-        elsif matches = bases.match(/^([\.\,])\$/)
-          #log.debug "matched #{matches.to_s} as end of read"
-          # regular match in some direction, end of read
-          if matches[1]=='.' # if forwards
-            raise if current_read.direction and current_read.direction != PileupRead::FORWARD_DIRECTION
-            current_read.direction = PileupRead::FORWARD_DIRECTION
-            current_read.sequence = "#{current_read.sequence}#{pileup.ref_base}"
-          else # else must be backwards, since it can only be , or .
-            raise if current_read.direction and current_read.direction != PileupRead::REVERSE_DIRECTION
-            current_read.direction = PileupRead::REVERSE_DIRECTION
-            current_read.sequence = "#{current_read.sequence}#{pileup.ref_base}"
-          end
-          #log.debug "current read after deletion: #{current_read.inspect}"
-          reads_ending.push current_read_index
-
-        # regular match continuuing onwards
-        elsif matches = bases.match(/^\./)
-          #log.debug "matched #{matches.to_s} as forward regular match"
-          # regular match in the forward direction
-          raise if !current_read.direction.nil? and current_read.direction != PileupRead::FORWARD_DIRECTION
-          current_read.direction = PileupRead::FORWARD_DIRECTION
-          #log.debug "before adding this base, current sequence is '#{current_read.sequence}'"
-          current_read.sequence = "#{current_read.sequence}#{pileup.ref_base}"
-          #log.debug "after adding this base, current sequence is '#{current_read.sequence}', ref_base: #{pileup.ref_base}"
-        elsif matches = bases.match(/^\,/)
-          #log.debug "matched #{matches.to_s} as reverse regular match"
-          # regular match in the reverse direction
-          if !current_read.direction.nil? and current_read.direction != PileupRead::REVERSE_DIRECTION
-            error_msg = "Unexpectedly found read a #{current_read.direction} direction read when expecting a positive direction one. This suggests there is a problem with either the pileup file or this pileup parser. Current pileup column #{pileup.inspect}, read #{current_read.inspect}, chomped until #{bases}"
-            log.error error_msg
-            raise Exception, error_msg
-          end
-          current_read.direction = PileupRead::REVERSE_DIRECTION
-          current_read.sequence = "#{current_read.sequence}#{pileup.ref_base}"
+          # remove the matched base
+          bases = bases[1...bases.length]
+        else
+          raise Exception, "Expected a character corresponding to a base, one of '[ACGTNacgtn.,]'. Starting here: #{bases}, from #{pileup.inspect}"
+        end
+        log.debug "after regular position removal, pileup is #{bases}" if log.debug?
 
-        #
-
-
-
-
-
-          elsif matches[1] == ','
-            #log.debug 'reverse match starting a read'
-            current_read.direction = PileupRead::REVERSE_DIRECTION
-            current_read.sequence = "#{current_read.sequence}#{pileup.ref_base}"
-          elsif matches[1] == '*'
-            #log.debug 'starting a read with a gap'
-            # leave direction unknown at this point
-            current_read.sequence = "#{current_read.sequence}#{matches[1]}"
-          elsif matches[1] == matches[1].upcase
-            #log.debug 'forward match starting a read, warning of insertion next'
-            current_read.direction = PileupRead::FORWARD_DIRECTION
-            current_read.sequence = "#{current_read.sequence}#{matches[1]}"
-          else
-            #log.debug 'forward match starting a read, warning of insertion next'
-            current_read.direction = PileupRead::REVERSE_DIRECTION
-            current_read.sequence = "#{current_read.sequence}#{matches[1]}"
-          end
+        # then read insertion or deletion in the coming position(s)
+        if matches = bases.match(/^([\+\-])([0-9]+)/)
+          matched_length = matches[1].length+matches[2].length
+          bases = bases[matched_length...bases.length]
+          matched_string += matches[1]+matches[2]
+          log.debug "after removal of bases leading up to an insertion/deletion, pileup is #{bases}" if log.debug?
 
-
-
-
-          end
-
-          if matches[5].length > 0
-            #log.debug "Ending this read"
-            # end this read
-            reads_ending.push current_read_index
-          end
-
+          regex = /^([ACGTNacgtn=]{#{matches[2].to_i}})/
+          log.debug "insertion/deletion secondary regex: #{regex.inspect}" if log.debug?
+          last_matched = bases.match(regex)
 
-
-
-          if matches[1] == '.'
-            #log.debug 'forward match starting a read'
-            current_read.direction = PileupRead::FORWARD_DIRECTION
-            current_read.sequence = "#{current_read.sequence}#{pileup.ref_base}"
-          elsif matches[1] == ','
-            #log.debug 'reverse match starting a read'
-            current_read.direction = PileupRead::REVERSE_DIRECTION
-            current_read.sequence = "#{current_read.sequence}#{pileup.ref_base}"
-          elsif matches[1] == '*'
-            #log.debug 'gap starting a read'
-            current_read.sequence = "#{current_read.sequence}#{matches[1]}"
-          elsif matches[1] == matches[1].upcase
-            #log.debug 'forward match starting a read, warning of insertion next'
-            current_read.direction = PileupRead::FORWARD_DIRECTION
-            current_read.sequence = "#{current_read.sequence}#{matches[1]}"
+          if last_matched.nil?
+            raise Exception, "Failed to parse insertion. Starting here: #{bases}, from #{pileup.inspect}"
           else
-
-
-
-
-
-
-
-            reads_ending.push current_read_index
-          end
-
-
-        elsif matches = bases.match(/^\*([\+\-])([0-9]+)([ACGTNacgtn=]+)(\${0,1})/)
-          #log.debug 'gap then insert/delete found'
-          # gap - should already be known from the last position
-          current_read.sequence = "#{current_read.sequence}*"
-          if matches[4].length > 0
-            #log.debug "Ending this read"
-            # end this read
-            reads_ending.push current_read_index
-          end
-
-          # record the insertion
-          if matches[1] == '+'
-            current_read.add_insertion pileup.pos, matches[2], matches[3]
-          end
-
-        elsif matches = bases.match(/(^[ACGTNacgtn\*])(\${0,1})/)
-          #log.debug 'mismatch found (or deletion)'
-          # simple mismatch
-          current_read.sequence = "#{current_read.sequence}#{matches[1]}"
-          if matches[2].length > 0
-            #log.debug "Ending this read"
-            reads_ending.push current_read_index
+            bases = bases[last_matched[1].length...bases.length]
+            if matches[1]=='+'
+              # record the insertion
+              current_read.add_insertion pileup.pos, matches[2], last_matched[1]
+            elsif matches[1]=='-'
+              #currently deletions are not recorded, slipped to future
+            end
           end
         end
-
+        log.debug "after indel removal, pileup is now #{bases}" if log.debug?
 
-        #
-
+        # Then read an ending read
+        if bases[0]=='$'
+          reads_ending.push current_read_index
+          matched_string += '$'
+          bases = bases[1...bases.length]
+        end
+
+        log.debug "Matched '#{matched_string}', now the bases are '#{bases}'" if log.debug?
 
-        #remove the matched part from the base string for next time
-        bases = bases[matches.to_s.length..bases.length-1]
-
         current_read_index += 1
       end
 
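The rewritten loop in the diff above consumes each read's entry in a fixed order: an optional read-start marker, exactly one base symbol, an optional indel, then an optional `$` end marker. The stand-alone sketch below follows the same order. `split_read_bases` is a hypothetical helper and does not use the gem's classes. One deliberate difference, stated as an assumption: it accepts `^` followed by any single mapping-quality character, as the pileup format allows, whereas the code in the diff only tests for the literal `^]`.

```ruby
# Hypothetical helper: split a pileup read_bases string into one entry per read.
# Stages follow the order used in the diff: start marker, base, indel, end marker.
def split_read_bases(read_bases)
  entries = []
  rest = read_bases.dup
  until rest.empty?
    entry = { starts: false, base: nil, indel: nil, ends: false }

    # 1. Optional read start: '^' plus one mapping-quality character.
    if rest.start_with?('^')
      entry[:starts] = true
      rest = rest[2..-1] || ''
    end

    # 2. Exactly one base symbol is required.
    m = rest.match(/\A[ACGTNacgtn.,*]/)
    raise ArgumentError, "expected a base symbol at #{rest.inspect}" unless m
    entry[:base] = m[0]
    rest = m.post_match

    # 3. Optional indel: sign, length, then exactly that many characters.
    if (m = rest.match(/\A([+-])(\d+)/))
      length = m[2].to_i
      indel_bases = m.post_match[0, length]
      unless indel_bases.length == length && indel_bases.match?(/\A[ACGTNacgtn=]*\z/)
        raise ArgumentError, "malformed indel at #{rest.inspect}"
      end
      entry[:indel] = [m[1], indel_bases]
      rest = m.post_match[length..-1]
    end

    # 4. Optional read end marker.
    if rest.start_with?('$')
      entry[:ends] = true
      rest = rest[1..-1]
    end

    entries << entry
  end
  entries
end

p split_read_bases('..+1A').map { |entry| entry[:indel] } # => [nil, ["+", "A"]]
```

Reading the indel length first and then taking exactly that many characters is what lets an entry such as `,+1t` be followed directly by a mismatch base like `t` without the two being merged.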
data/test/test_bio-pileup_iterator.rb
CHANGED
@@ -1,80 +1,87 @@
 require 'helper'
 
 class TestBioPileupIterator < Test::Unit::TestCase
+  #enable debug logging for pileup_iterator
+  def setup
+    log_name = 'bio-pileup_iterator'
+    Bio::Log::CLI.logger('stderr')
+    #Bio::Log::CLI.configure(log_name) # when commented out no debug is printed out
+  end
+
   def test_pileup_parsing
     line = "contig00001\t199\tA\t4\t.$...$\t>a^>"
     #contig00001\t200\tT\t2\t..\taR"
     piles = Bio::DB::PileupIterator.new(line).to_a
     pileup = piles[0]
     reads = piles[0].reads
-
+
     assert_equal 'A', reads[0].sequence
     assert_equal 4, reads.length
     assert_kind_of Bio::DB::Pileup, pileup
   end
-
+
   def test_2_pileup_columns
     line = "contig00001\t199\tA\t4\t.$...$\t>a^>\ncontig00001\t200\tT\t2\t..\taR"
     piles = Bio::DB::PileupIterator.new(line).to_a
-
+
     pileup = piles[0]
     reads = piles[0].reads
     reads2 = piles[1].reads
-
+
     assert_equal 'A', piles[0].ref_base
     assert_equal 'T', piles[1].ref_base
     assert_equal 4, reads.length
     assert_equal 2, reads2.length
     assert_equal 'AT', reads2[0].sequence
   end
-
+
   def test_fwd_rev
     line = "contig00001\t199\tA\t4\t.$,..$\t>a^>\ncontig00001\t200\tT\t2\t,.\taR"
     piles = Bio::DB::PileupIterator.new(line).to_a
-
+
     pileup = piles[0]
     reads = piles[0].reads
     reads2 = piles[1].reads
-
+
     assert_equal 4, reads.length
     assert_equal 2, reads2.length
     assert_equal 'AT', reads2[0].sequence
     assert_equal '-', reads2[0].direction
     assert_equal '+', reads2[1].direction
   end
-
+
   def test_deletion
     line = "contig00001\t199\tA\t4\t.-1T...$\t>a^>\ncontig00001\t200\tT\t2\t*..\taR"
     piles = Bio::DB::PileupIterator.new(line).to_a
-
+
     pileup = piles[0]
     reads = piles[0].reads
     reads2 = piles[1].reads
-
+
     assert_equal 'A*', reads[0].sequence
     assert_equal Hash.new, reads[0].insertions
   end
-
+
   def test_substitution
     line = "contig00001\t199\tA\t4\t.G..$\t>a^>"
     piles = Bio::DB::PileupIterator.new(line).to_a
-
+
     pileup = piles[0]
     reads = piles[0].reads
-
+
     assert_equal 'A', reads[0].sequence
     assert_equal 'G', reads[1].sequence
     assert_equal 'A', reads[0].sequence
   end
-
+
   def test_substitution_with_insertion
     line = "contig00001\t199\tA\t4\tG-1T..$.\t>a^>\ncontig00001\t200\tT\t2\t*..\taR"
     piles = Bio::DB::PileupIterator.new(line).to_a
-
+
     pileup = piles[0]
     reads = piles[0].reads
     reads2 = piles[1].reads
-
+
     assert_equal 2, piles.length
     assert_equal 4, reads.length
     assert_equal 3, reads2.length
@@ -83,12 +90,12 @@ class TestBioPileupIterator < Test::Unit::TestCase
     assert_equal 'A', reads[2].sequence
     assert_equal 'AT', reads[3].sequence
   end
-
+
   def test_start_read_warning_of_deletion_next
     line = "contig00001\t8\tG\t4\t..,^],-1g\ta!U!\n"+
       "contig00001\t9\tg\t4\t..,*\ta!aU"
     piles = Bio::DB::PileupIterator.new(line).to_a
-
+
     pileup = piles[0]
     reads = piles[0].reads
     reads2 = piles[1].reads
@@ -122,26 +129,23 @@ class TestBioPileupIterator < Test::Unit::TestCase
   def test_start_with_a_gap
     line = "contig00075\t503\tT\t24\t,^]*\tU\n"
     piles = Bio::DB::PileupIterator.new(line)
-    # piles.log.level = Bio::Log::DEBUG
     piles = piles.to_a
     assert_equal 'T', piles[0].reads[0].sequence
     assert_equal '*', piles[0].reads[1].sequence
   end
-
+
   def test_start_then_insert_then_end
     line = "contig00075\t503\tG\t24\t^].+1T$^].\t~~\n"
     piles = Bio::DB::PileupIterator.new(line)
-    # piles.log.level = Bio::Log::DEBUG
     piles = piles.to_a
     assert_equal 'G', piles[0].reads[0].sequence
     assert_equal({503 => 'T'}, piles[0].reads[0].insertions)
     assert_equal 'G', piles[0].reads[1].sequence
   end
-
+
   def test_star_then_insert2
     line = "contig00075\t503\tG\t24\t,*+1g.\t~~\n"
     piles = Bio::DB::PileupIterator.new(line)
-    # piles.log.level = Bio::Log::DEBUG
     piles = piles.to_a
     assert_equal 'G', piles[0].reads[0].sequence
     assert_equal '*', piles[0].reads[1].sequence
@@ -153,7 +157,6 @@ class TestBioPileupIterator < Test::Unit::TestCase
       "contig00075\t504\tA\t24\t,,.,\tE~\n"
 
     piles = Bio::DB::PileupIterator.new(line)
-    # piles.log.level = Bio::Log::DEBUG
     piles = piles.to_a
     assert_equal 'GA', piles[0].reads[0].sequence
     assert_equal 'GA', piles[0].reads[1].sequence
@@ -163,19 +166,17 @@ class TestBioPileupIterator < Test::Unit::TestCase
   end
 
   def test_double_insertion
-    line = "contig00075\t503\tG\t24\t*+
+    line = "contig00075\t503\tG\t24\t*+2gg\tE\n"
 
     piles = Bio::DB::PileupIterator.new(line)
-    # piles.log.level = Bio::Log::DEBUG
     piles = piles.to_a
     assert_equal({503 => 'gg'}, piles[0].reads[0].insertions)
   end
 
   def test_non_perfect_starting_read
-    line = "contig00075\t503\tG\t24\t^
+    line = "contig00075\t503\tG\t24\t^].*+2gg\tE\n"
 
     piles = Bio::DB::PileupIterator.new(line)
-    # piles.log.level = Bio::Log::DEBUG
     piles = piles.to_a
     assert_equal '+', piles[0].reads[0].direction
     assert_equal 'G', piles[0].reads[0].sequence
@@ -185,10 +186,9 @@ class TestBioPileupIterator < Test::Unit::TestCase
   def test_non_matching_finish
     line = "contig00002\t6317\tC\t2\ta$.\t!B\n"+
       "contig00002\t6318\tT\t1\t.\tA\n"
-
+
 
     piles = Bio::DB::PileupIterator.new(line)
-    # piles.log.level = Bio::Log::DEBUG
     piles = piles.to_a
     assert_equal 2, piles[0].reads.length
     assert_equal 'a', piles[0].reads[0].sequence
@@ -198,7 +198,7 @@ class TestBioPileupIterator < Test::Unit::TestCase
   def test_insertion_then_mismatch
     line = "contig00044\t867\tC\t6\t,,,,,.\t!:!!:=\n"+
       "contig00044\t868\tG\t6\tt,+1ttt,.\t!A!!C9\n"
-
+
     piles = Bio::DB::PileupIterator.new(line)
 
     piles = piles.to_a
@@ -209,4 +209,18 @@ class TestBioPileupIterator < Test::Unit::TestCase
     assert_equal hash, piles[0].reads[1].insertions
     assert_equal 'Ct', piles[0].reads[2].sequence
   end
+
+  def test_some_beta_testing_bug
+    #<Bio::DB::Pileup:0x00000009d5e8d8 @ref_name="contig03007", @pos=658, @ref_base="a", @coverage=14.0, @read_bases="gg+1cG-1Ag+1c*+1aG-1A*+1a****g**+1A", @read_quals="!!!!~!~!!!!!!~", @ref_count=nil, @non_ref_count_hash=nil, @non_ref_count=nil> (Exception)
+    line = "contig03007\t658\ta\t14\tgg+1cG-1Ag+1c*+1aG-1A*+1a****g**+1A\t!!!!~!~!!!!!!~\n"
+    piles = Bio::DB::PileupIterator.new(line).to_a #parse, it should fail otherwise
+    assert_equal 14, piles[0].coverage
+  end
+
+  def test_unexpected_equals
+    line = "contig00032\t264\tg\t10\t*+1=$.*+1c,.*+1c...,\t~^~^^~^^^W\n"
+    piles = Bio::DB::PileupIterator.new(line).to_a #parse, it should fail otherwise
+    assert_equal 10, piles[0].coverage
+    assert_equal 10, piles[0].reads.length
+  end
 end
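The assertions in the tests above (`'AT'` with direction `'-'`, `'A*'`, `'Ct'`) follow from how a single base symbol is interpreted against the reference base of its column. The sketch below is a stand-alone approximation of that rule, with hypothetical helpers `interpret_base` and `build_read` that are not part of the gem. It assumes, in line with the "Could sanity check the direction here by detecting case, but eh" comment in the library diff, that only `.` and `,` settle a read's direction and that mismatches and `*` gaps are appended verbatim.

```ruby
# Hypothetical helper: what one base symbol contributes to a read.
# Returns [direction, base]; direction is nil when the symbol does not settle it.
def interpret_base(symbol, ref_base)
  case symbol
  when '.' then ['+', ref_base] # match on the forward strand
  when ',' then ['-', ref_base] # match on the reverse strand
  else          [nil, symbol]   # mismatch or '*' gap: keep the symbol itself
  end
end

# Apply the symbols seen for one read across successive columns.
# symbols_with_ref is an array of [symbol, ref_base] pairs, one per column.
def build_read(symbols_with_ref)
  direction = nil
  sequence = String.new
  symbols_with_ref.each do |symbol, ref_base|
    dir, base = interpret_base(symbol, ref_base)
    if dir && direction && dir != direction
      raise ArgumentError, "read switched from #{direction} to #{dir}"
    end
    direction ||= dir
    sequence << base
  end
  [direction, sequence]
end

p build_read([[',', 'A'], [',', 'T']]) # => ["-", "AT"]
p build_read([['.', 'A'], ['*', 'T']]) # => ["+", "A*"]
```

Raising on a direction change corresponds to the consistency check the library performs when a read that was forward is later seen as reverse.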
metadata
CHANGED
@@ -1,7 +1,7 @@
 --- !ruby/object:Gem::Specification
 name: bio-pileup_iterator
 version: !ruby/object:Gem::Version
-  version: 0.0.
+  version: 0.0.3
   prerelease:
 platform: ruby
 authors:
@@ -9,11 +9,11 @@ authors:
 autorequire:
 bindir: bin
 cert_chain: []
-date: 2012-
+date: 2012-08-15 00:00:00.000000000 Z
 dependencies:
 - !ruby/object:Gem::Dependency
   name: bio
-  requirement: &
+  requirement: &79398350 !ruby/object:Gem::Requirement
     none: false
     requirements:
     - - ! '>='
@@ -21,10 +21,10 @@ dependencies:
         version: 1.4.2
   type: :runtime
   prerelease: false
-  version_requirements: *
+  version_requirements: *79398350
 - !ruby/object:Gem::Dependency
   name: bio-samtools
-  requirement: &
+  requirement: &79397770 !ruby/object:Gem::Requirement
     none: false
     requirements:
     - - ! '>='
@@ -32,10 +32,10 @@ dependencies:
         version: 0.5.3
   type: :runtime
   prerelease: false
-  version_requirements: *
+  version_requirements: *79397770
 - !ruby/object:Gem::Dependency
   name: bio-logger
-  requirement: &
+  requirement: &79397240 !ruby/object:Gem::Requirement
     none: false
     requirements:
     - - ! '>='
@@ -43,10 +43,10 @@ dependencies:
         version: 1.0.0
   type: :runtime
   prerelease: false
-  version_requirements: *
+  version_requirements: *79397240
 - !ruby/object:Gem::Dependency
   name: shoulda
-  requirement: &
+  requirement: &79396870 !ruby/object:Gem::Requirement
     none: false
     requirements:
     - - ! '>='
@@ -54,10 +54,10 @@ dependencies:
         version: '0'
   type: :development
   prerelease: false
-  version_requirements: *
+  version_requirements: *79396870
 - !ruby/object:Gem::Dependency
   name: rdoc
-  requirement: &
+  requirement: &79396550 !ruby/object:Gem::Requirement
     none: false
     requirements:
     - - ~>
@@ -65,10 +65,10 @@ dependencies:
         version: '3.12'
   type: :development
   prerelease: false
-  version_requirements: *
+  version_requirements: *79396550
 - !ruby/object:Gem::Dependency
   name: bundler
-  requirement: &
+  requirement: &79396200 !ruby/object:Gem::Requirement
     none: false
     requirements:
     - - ! '>='
@@ -76,10 +76,10 @@ dependencies:
         version: 1.0.0
   type: :development
   prerelease: false
-  version_requirements: *
+  version_requirements: *79396200
 - !ruby/object:Gem::Dependency
   name: jeweler
-  requirement: &
+  requirement: &79395820 !ruby/object:Gem::Requirement
     none: false
     requirements:
     - - ~>
@@ -87,10 +87,10 @@ dependencies:
         version: 1.8.3
   type: :development
   prerelease: false
-  version_requirements: *
+  version_requirements: *79395820
 - !ruby/object:Gem::Dependency
   name: bio
-  requirement: &
+  requirement: &79481120 !ruby/object:Gem::Requirement
     none: false
     requirements:
     - - ! '>='
@@ -98,10 +98,10 @@ dependencies:
         version: 1.4.2
   type: :development
   prerelease: false
-  version_requirements: *
+  version_requirements: *79481120
 - !ruby/object:Gem::Dependency
   name: rdoc
-  requirement: &
+  requirement: &79480630 !ruby/object:Gem::Requirement
     none: false
     requirements:
     - - ~>
@@ -109,20 +109,20 @@ dependencies:
         version: '3.12'
   type: :development
   prerelease: false
-  version_requirements: *
+  version_requirements: *79480630
 description: Iterate through a samtools pileup file
 email: donttrustben near gmail.com
 executables: []
 extensions: []
 extra_rdoc_files:
 - LICENSE.txt
-- README.
+- README.md
 files:
 - .document
 - .travis.yml
 - Gemfile
 - LICENSE.txt
-- README.
+- README.md
 - Rakefile
 - VERSION
 - lib/bio-pileup_iterator.rb
@@ -144,7 +144,7 @@ required_ruby_version: !ruby/object:Gem::Requirement
       version: '0'
       segments:
       - 0
-      hash:
+      hash: 421311475
 required_rubygems_version: !ruby/object:Gem::Requirement
   none: false
   requirements:
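The dependency entries in the metadata above use two kinds of version constraint, `>=` and `~>`. As a small illustration of what they admit, the sketch below checks a few versions with RubyGems' own `Gem::Requirement` and `Gem::Version` classes. The candidate version numbers are arbitrary examples chosen for the illustration, not versions taken from the gem.

```ruby
require 'rubygems'

# Constraints copied from the metadata: rdoc '~> 3.12', jeweler '~> 1.8.3', bio '>= 1.4.2'.
rdoc_requirement    = Gem::Requirement.new('~> 3.12')
jeweler_requirement = Gem::Requirement.new('~> 1.8.3')
bio_requirement     = Gem::Requirement.new('>= 1.4.2')

# '~> 3.12' allows the last listed component to grow: >= 3.12 and < 4.0.
puts rdoc_requirement.satisfied_by?(Gem::Version.new('3.12.1'))    # true
puts rdoc_requirement.satisfied_by?(Gem::Version.new('4.0'))       # false

# '~> 1.8.3' is tighter because three components are given: >= 1.8.3 and < 1.9.
puts jeweler_requirement.satisfied_by?(Gem::Version.new('1.8.9'))  # true
puts jeweler_requirement.satisfied_by?(Gem::Version.new('1.9.0'))  # false

# '>=' has no upper bound.
puts bio_requirement.satisfied_by?(Gem::Version.new('2.0.0'))      # true
```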
data/README.rdoc
DELETED
@@ -1,39 +0,0 @@
-= bio-pileup_iterator
-
-Full description goes here
-
-Note: this software is under active development!
-
-== Installation
-
-    gem install bio-pileup_iterator
-
-== Usage
-
-== Developers
-
-To use the library
-
-    require 'bio-pileup_iterator
-
-The API doc is online. For more code examples see also the test files in
-the source tree.
-
-== Project home page
-
-Information on the source tree, documentation, issues and how to contribute, see
-
-    http://github.com/wwood/bioruby-pileup_iterator
-
-== Cite
-
-If you use this software, please cite http://dx.doi.org/10.1093/bioinformatics/btq475
-
-== Biogems.info
-
-This Biogem is published at http://biogems.info/index.html#bio-pileup_iterator
-
-== Copyright
-
-Copyright (c) 2012 Ben J. Woodcroft. See LICENSE.txt for further details.
-