bio-pileup_iterator 0.0.1 → 0.0.3
- data/README.md +69 -0
- data/VERSION +1 -1
- data/lib/bio/db/pileup_iterator.rb +49 -157
- data/test/test_bio-pileup_iterator.rb +45 -31
- metadata +23 -23
- data/README.rdoc +0 -39
data/README.md
ADDED
@@ -0,0 +1,69 @@
|
|
1
|
+
# bio-pileup_iterator
|
2
|
+
|
3
|
+
[Pileup format](http://samtools.sourceforge.net/pileup.shtml) files are a representation of an alignment/mapping of reads to a reference. This biogem builds on the [bio-samtools biogem](https://github.com/helios/bioruby-samtools) to enable developers to iterate through columns of a pileup format file, and interrogate possible polymorphisms for e.g. SNP detection. Say we have pileup lines like so:
|
4
|
+
|
5
|
+
contig00001 199 A 4 .$...$ >a^>
|
6
|
+
contig00001 200 T 2 ..+1A aR
|
7
|
+
|
8
|
+
i.e.
|
9
|
+
|
10
|
+
line = "contig00001\t199\tA\t4\t.$...$\t>a^>\ncontig00001\t200\tT\t2\t..+1A\taR"
|
11
|
+
|
12
|
+
Then
|
13
|
+
|
14
|
+
piles = Bio::DB::PileupIterator.new(line).to_a
|
15
|
+
piles[0].reads #=> An array of 4 pileup reads (Bio::DB::PileupIterator::PileupRead objects)
|
16
|
+
|
17
|
+
The first read ends at the first position:
|
18
|
+
|
19
|
+
piles[0].reads[0].sequence #=> 'A'
|
20
|
+
|
21
|
+
The second read covers both positions:
|
22
|
+
|
23
|
+
piles[0].reads[1].sequence #=> 'AT'
|
24
|
+
|
25
|
+
Note that when you don't use ```to_a```, instead using ```Bio::DB::PileupIterator#each```, there is no "lookahead" (yet), so it doesn't find the T before it has iterated over it:
|
26
|
+
|
27
|
+
Bio::DB::PileupIterator.new(line).each{|pile| puts pile.reads[1].sequence if pile.pos==199} #=> "A"
|
28
|
+
|
29
|
+
Directions
|
30
|
+
|
31
|
+
piles[0].reads[1].direction #=> '+'
|
32
|
+
|
33
|
+
Insertions
|
34
|
+
|
35
|
+
piles[1].reads[1].insertions #=> {200=>"A"}
|
36
|
+
|
37
|
+
Apologies in advance for any missing features (e.g. currently it does not record deletions) and slowness (it wasn't really written with speed in mind).
|
38
|
+
|
39
|
+
## Installation
|
40
|
+
|
41
|
+
gem install bio-pileup_iterator
|
42
|
+
|
43
|
+
## Developers
|
44
|
+
|
45
|
+
To use the library
|
46
|
+
|
47
|
+
require 'bio-pileup_iterator'
|
48
|
+
|
49
|
+
The API doc is online. For more code examples see also the test files in
|
50
|
+
the source tree.
|
51
|
+
|
52
|
+
## Project home page
|
53
|
+
|
54
|
+
For information on the source tree, documentation, issues and how to contribute, see
|
55
|
+
|
56
|
+
http://github.com/wwood/bioruby-pileup_iterator
|
57
|
+
|
58
|
+
## Cite
|
59
|
+
|
60
|
+
If you use this software, please cite http://dx.doi.org/10.1093/bioinformatics/btq475
|
61
|
+
|
62
|
+
## Biogems.info
|
63
|
+
|
64
|
+
This Biogem is published at http://biogems.info/index.html#bio-pileup_iterator
|
65
|
+
|
66
|
+
## Copyright
|
67
|
+
|
68
|
+
Copyright (c) 2012 Ben J. Woodcroft. See LICENSE.txt for further details.
|
69
|
+
|
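The README's note about `to_a` versus `each` can be illustrated with a small standalone model. This is only a sketch under an assumption: that the read objects yielded for one column are the same objects that later columns keep extending, which the persistent `current_ordered_reads` array in the library diff suggests but does not prove. `Read` and `each_column` are hypothetical names, not part of the gem, and the model is deliberately simplified (every read spans every column and always matches the reference).

```ruby
# Standalone sketch; does not require bio-pileup_iterator.
# Read and each_column are hypothetical names used only for illustration.
Read = Struct.new(:sequence)

# Yields the same array of read objects once per reference base,
# extending each read's sequence before yielding.
def each_column(ref_bases)
  reads = [Read.new(''), Read.new('')] # read objects persist across columns
  ref_bases.each_char do |base|
    reads.each { |read| read.sequence += base }
    yield reads
  end
end

# Inspecting during iteration sees only the bases consumed so far.
seen_during = []
each_column('AT') { |reads| seen_during << reads[0].sequence }

# Collecting first and inspecting afterwards sees the finished reads,
# because every collected column points at the same read objects.
collected = []
each_column('AT') { |reads| collected << reads }
seen_after = collected.map { |reads| reads[0].sequence }

puts seen_during.inspect
puts seen_after.inspect
```

Under this model, looking at a read while iterating shows a partial sequence, while looking after the whole input has been consumed shows the full one, which matches the `'A'` versus `'AT'` behaviour the README describes.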
data/VERSION
CHANGED
@@ -1 +1 @@
|
|
1
|
-
0.0.
|
1
|
+
0.0.3
|
data/lib/bio/db/pileup_iterator.rb
CHANGED
@@ -21,45 +21,42 @@ class Bio::DB::PileupIterator
|
|
21
21
|
# Known problems:
|
22
22
|
# * Doesn't record start or ends of each read
|
23
23
|
# * Doesn't lookahead to determine the sequence of each read (though it does give the preceding bases)
|
24
|
-
# * Gives no information with mismatches
|
25
24
|
def each
|
26
25
|
current_ordered_reads = []
|
27
26
|
log = Bio::Log::LoggerPlus['bio-pileup_iterator']
|
27
|
+
logging = true
|
28
28
|
|
29
29
|
@io.each_line do |line|
|
30
|
-
#log.debug "new current_line: #{line.inspect}"
|
31
30
|
pileup = Bio::DB::Pileup.new(line.strip)
|
32
31
|
current_read_index = 0
|
33
32
|
reads_ending = []
|
34
33
|
|
35
34
|
bases = pileup.read_bases
|
36
|
-
|
37
|
-
|
35
|
+
log.debug "new column's read_bases: #{bases.inspect}" if log.debug?
|
36
|
+
log.debug "pileup entry parsed: #{pileup.inspect}" if log.debug?
|
38
37
|
while bases.length > 0
|
39
|
-
#log.debug "bases remaining: #{bases} ------------------------"
|
40
38
|
|
41
39
|
# Firstly, what is the current read we are working with
|
42
40
|
current_read = current_ordered_reads[current_read_index]
|
43
41
|
# if adding a new read
|
44
42
|
if current_read.nil?
|
45
|
-
|
43
|
+
log.debug 'adding a new read: '+bases if log.debug?
|
46
44
|
current_read = PileupRead.new
|
47
45
|
current_ordered_reads.push current_read
|
48
|
-
else
|
49
|
-
#log.debug 'reusing a read'
|
50
46
|
end
|
51
47
|
matches = nil
|
52
|
-
|
53
|
-
#
|
54
|
-
|
55
|
-
|
56
|
-
|
57
|
-
|
58
|
-
|
59
|
-
|
60
|
-
|
61
|
-
|
62
|
-
|
48
|
+
|
49
|
+
# if starting, remove it
|
50
|
+
matched_string = ''
|
51
|
+
if bases[0..1]=='^]'
|
52
|
+
matched_string += bases[0]
|
53
|
+
bases = bases[2...bases.length]
|
54
|
+
end
|
55
|
+
log.debug "after read start removal, pileup is #{bases}" if log.debug?
|
56
|
+
|
57
|
+
# next expect the actual base bit
|
58
|
+
if matches = bases.match(/^([ACGTNacgtn\.\,\*])/)
|
59
|
+
matched_string += bases[0]
|
63
60
|
if matches[1] == '.'
|
64
61
|
raise if !current_read.direction.nil? and current_read.direction != PileupRead::FORWARD_DIRECTION
|
65
62
|
current_read.direction = PileupRead::FORWARD_DIRECTION
|
@@ -72,152 +69,47 @@ class Bio::DB::PileupIterator
|
|
72
69
|
# Could sanity check the direction here by detecting case, but eh
|
73
70
|
current_read.sequence = "#{current_read.sequence}#{matches[1]}"
|
74
71
|
end
|
75
|
-
|
76
|
-
|
77
|
-
|
78
|
-
|
79
|
-
|
80
|
-
|
81
|
-
if matches[5].length > 0
|
82
|
-
#log.debug "Ending this read"
|
83
|
-
# end this read
|
84
|
-
reads_ending.push current_read_index
|
85
|
-
end
|
86
|
-
# currently I don't care about indels, except for the direction, so I'll leave it at that for now
|
87
|
-
|
88
|
-
# end of the read
|
89
|
-
elsif matches = bases.match(/^([\.\,])\$/)
|
90
|
-
#log.debug "matched #{matches.to_s} as end of read"
|
91
|
-
# regular match in some direction, end of read
|
92
|
-
if matches[1]=='.' # if forwards
|
93
|
-
raise if current_read.direction and current_read.direction != PileupRead::FORWARD_DIRECTION
|
94
|
-
current_read.direction = PileupRead::FORWARD_DIRECTION
|
95
|
-
current_read.sequence = "#{current_read.sequence}#{pileup.ref_base}"
|
96
|
-
else # else must be backwards, since it can only be , or .
|
97
|
-
raise if current_read.direction and current_read.direction != PileupRead::REVERSE_DIRECTION
|
98
|
-
current_read.direction = PileupRead::REVERSE_DIRECTION
|
99
|
-
current_read.sequence = "#{current_read.sequence}#{pileup.ref_base}"
|
100
|
-
end
|
101
|
-
#log.debug "current read after deletion: #{current_read.inspect}"
|
102
|
-
reads_ending.push current_read_index
|
103
|
-
|
104
|
-
# regular match continuuing onwards
|
105
|
-
elsif matches = bases.match(/^\./)
|
106
|
-
#log.debug "matched #{matches.to_s} as forward regular match"
|
107
|
-
# regular match in the forward direction
|
108
|
-
raise if !current_read.direction.nil? and current_read.direction != PileupRead::FORWARD_DIRECTION
|
109
|
-
current_read.direction = PileupRead::FORWARD_DIRECTION
|
110
|
-
#log.debug "before adding this base, current sequence is '#{current_read.sequence}'"
|
111
|
-
current_read.sequence = "#{current_read.sequence}#{pileup.ref_base}"
|
112
|
-
#log.debug "after adding this base, current sequence is '#{current_read.sequence}', ref_base: #{pileup.ref_base}"
|
113
|
-
elsif matches = bases.match(/^\,/)
|
114
|
-
#log.debug "matched #{matches.to_s} as reverse regular match"
|
115
|
-
# regular match in the reverse direction
|
116
|
-
if !current_read.direction.nil? and current_read.direction != PileupRead::REVERSE_DIRECTION
|
117
|
-
error_msg = "Unexpectedly found read a #{current_read.direction} direction read when expecting a positive direction one. This suggests there is a problem with either the pileup file or this pileup parser. Current pileup column #{pileup.inspect}, read #{current_read.inspect}, chomped until #{bases}"
|
118
|
-
log.error error_msg
|
119
|
-
raise Exception, error_msg
|
120
|
-
end
|
121
|
-
current_read.direction = PileupRead::REVERSE_DIRECTION
|
122
|
-
current_read.sequence = "#{current_read.sequence}#{pileup.ref_base}"
|
72
|
+
# remove the matched base
|
73
|
+
bases = bases[1...bases.length]
|
74
|
+
else
|
75
|
+
raise Exception, "Expected a character corresponding to a base, one of '[ACGTNacgtn.,]'. Starting here: #{bases}, from #{pileup.inspect}"
|
76
|
+
end
|
77
|
+
log.debug "after regular position removal, pileup is #{bases}" if log.debug?
|
123
78
|
|
124
|
-
#
|
125
|
-
|
126
|
-
|
127
|
-
|
128
|
-
|
129
|
-
|
130
|
-
elsif matches[1] == ','
|
131
|
-
#log.debug 'reverse match starting a read'
|
132
|
-
current_read.direction = PileupRead::REVERSE_DIRECTION
|
133
|
-
current_read.sequence = "#{current_read.sequence}#{pileup.ref_base}"
|
134
|
-
elsif matches[1] == '*'
|
135
|
-
#log.debug 'starting a read with a gap'
|
136
|
-
# leave direction unknown at this point
|
137
|
-
current_read.sequence = "#{current_read.sequence}#{matches[1]}"
|
138
|
-
elsif matches[1] == matches[1].upcase
|
139
|
-
#log.debug 'forward match starting a read, warning of insertion next'
|
140
|
-
current_read.direction = PileupRead::FORWARD_DIRECTION
|
141
|
-
current_read.sequence = "#{current_read.sequence}#{matches[1]}"
|
142
|
-
else
|
143
|
-
#log.debug 'forward match starting a read, warning of insertion next'
|
144
|
-
current_read.direction = PileupRead::REVERSE_DIRECTION
|
145
|
-
current_read.sequence = "#{current_read.sequence}#{matches[1]}"
|
146
|
-
end
|
79
|
+
# then read insertion or deletion in the coming position(s)
|
80
|
+
if matches = bases.match(/^([\+\-])([0-9]+)/)
|
81
|
+
matched_length = matches[1].length+matches[2].length
|
82
|
+
bases = bases[matched_length...bases.length]
|
83
|
+
matched_string += matches[1]+matches[2]
|
84
|
+
log.debug "after removal of bases leading up to an insertion/deletion, pileup is #{bases}" if log.debug?
|
147
85
|
|
148
|
-
|
149
|
-
|
150
|
-
|
151
|
-
end
|
152
|
-
|
153
|
-
if matches[5].length > 0
|
154
|
-
#log.debug "Ending this read"
|
155
|
-
# end this read
|
156
|
-
reads_ending.push current_read_index
|
157
|
-
end
|
158
|
-
|
86
|
+
regex = /^([ACGTNacgtn=]{#{matches[2].to_i}})/
|
87
|
+
log.debug "insertion/deletion secondary regex: #{regex.inspect}" if log.debug?
|
88
|
+
last_matched = bases.match(regex)
|
159
89
|
|
160
|
-
|
161
|
-
|
162
|
-
if matches[1] == '.'
|
163
|
-
#log.debug 'forward match starting a read'
|
164
|
-
current_read.direction = PileupRead::FORWARD_DIRECTION
|
165
|
-
current_read.sequence = "#{current_read.sequence}#{pileup.ref_base}"
|
166
|
-
elsif matches[1] == ','
|
167
|
-
#log.debug 'reverse match starting a read'
|
168
|
-
current_read.direction = PileupRead::REVERSE_DIRECTION
|
169
|
-
current_read.sequence = "#{current_read.sequence}#{pileup.ref_base}"
|
170
|
-
elsif matches[1] == '*'
|
171
|
-
#log.debug 'gap starting a read'
|
172
|
-
current_read.sequence = "#{current_read.sequence}#{matches[1]}"
|
173
|
-
elsif matches[1] == matches[1].upcase
|
174
|
-
#log.debug 'forward match starting a read, warning of insertion next'
|
175
|
-
current_read.direction = PileupRead::FORWARD_DIRECTION
|
176
|
-
current_read.sequence = "#{current_read.sequence}#{matches[1]}"
|
90
|
+
if last_matched.nil?
|
91
|
+
raise Exception, "Failed to parse insertion. Starting here: #{bases}, from #{pileup.inspect}"
|
177
92
|
else
|
178
|
-
|
179
|
-
|
180
|
-
|
181
|
-
|
182
|
-
|
183
|
-
|
184
|
-
|
185
|
-
reads_ending.push current_read_index
|
186
|
-
end
|
187
|
-
|
188
|
-
|
189
|
-
elsif matches = bases.match(/^\*([\+\-])([0-9]+)([ACGTNacgtn=]+)(\${0,1})/)
|
190
|
-
#log.debug 'gap then insert/delete found'
|
191
|
-
# gap - should already be known from the last position
|
192
|
-
current_read.sequence = "#{current_read.sequence}*"
|
193
|
-
if matches[4].length > 0
|
194
|
-
#log.debug "Ending this read"
|
195
|
-
# end this read
|
196
|
-
reads_ending.push current_read_index
|
197
|
-
end
|
198
|
-
|
199
|
-
# record the insertion
|
200
|
-
if matches[1] == '+'
|
201
|
-
current_read.add_insertion pileup.pos, matches[2], matches[3]
|
202
|
-
end
|
203
|
-
|
204
|
-
elsif matches = bases.match(/(^[ACGTNacgtn\*])(\${0,1})/)
|
205
|
-
#log.debug 'mismatch found (or deletion)'
|
206
|
-
# simple mismatch
|
207
|
-
current_read.sequence = "#{current_read.sequence}#{matches[1]}"
|
208
|
-
if matches[2].length > 0
|
209
|
-
#log.debug "Ending this read"
|
210
|
-
reads_ending.push current_read_index
|
93
|
+
bases = bases[last_matched[1].length...bases.length]
|
94
|
+
if matches[1]=='+'
|
95
|
+
# record the insertion
|
96
|
+
current_read.add_insertion pileup.pos, matches[2], last_matched[1]
|
97
|
+
elsif matches[1]=='-'
|
98
|
+
#currently deletions are not recorded, slipped to future
|
99
|
+
end
|
211
100
|
end
|
212
101
|
end
|
213
|
-
|
102
|
+
log.debug "after indel removal, pileup is now #{bases}" if log.debug?
|
214
103
|
|
215
|
-
#
|
216
|
-
|
104
|
+
# Then read an ending read
|
105
|
+
if bases[0]=='$'
|
106
|
+
reads_ending.push current_read_index
|
107
|
+
matched_string += '$'
|
108
|
+
bases = bases[1...bases.length]
|
109
|
+
end
|
110
|
+
|
111
|
+
log.debug "Matched '#{matched_string}', now the bases are '#{bases}'" if log.debug?
|
217
112
|
|
218
|
-
#remove the matched part from the base string for next time
|
219
|
-
bases = bases[matches.to_s.length..bases.length-1]
|
220
|
-
|
221
113
|
current_read_index += 1
|
222
114
|
end
|
223
115
|
|
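The rewritten loop above replaces one large branching regex with four sequential stages per read: an optional `^]` read start, one required base character, an optional insertion or deletion, and an optional `$` read end. The following is a standalone sketch of that staging, not the gem's code. `tokenize_read_bases` and the token hash layout are hypothetical, and like the diff it only recognises `^]` as a read-start marker.

```ruby
# Standalone sketch; does not require bio-pileup_iterator or bio-samtools.
# tokenize_read_bases is a hypothetical helper that splits a pileup
# read_bases string into one token per read, stage by stage.
def tokenize_read_bases(bases)
  tokens = []
  rest = bases.dup
  until rest.empty?
    token = { start: false, base: nil, indel: nil, ends: false }

    # Stage 1: optional read-start marker (only '^]' is handled, as in the diff).
    if rest.start_with?('^]')
      token[:start] = true
      rest = rest[2..-1]
    end

    # Stage 2: exactly one base character is required.
    m = rest.match(/\A([ACGTNacgtn.,*])/)
    raise ArgumentError, "expected a base at #{rest.inspect}" if m.nil?
    token[:base] = m[1]
    rest = rest[1..-1]

    # Stage 3: optional indel such as '+2gg' or '-1T'.
    if (m = rest.match(/\A([+-])([0-9]+)/))
      count = m[2].to_i
      rest = rest[m[0].length..-1]
      seq = rest[0, count]
      unless seq =~ /\A[ACGTNacgtn=]{#{count}}\z/
        raise ArgumentError, "failed to parse indel at #{rest.inspect}"
      end
      token[:indel] = [m[1], seq]
      rest = rest[count..-1]
    end

    # Stage 4: optional read-end marker.
    if rest.start_with?('$')
      token[:ends] = true
      rest = rest[1..-1]
    end

    tokens << token
  end
  tokens
end

tokens = tokenize_read_bases('^].+1T$^].')
puts tokens.length
```

Because each stage consumes its own prefix and hands the remainder to the next, combinations such as "start, base, insertion, end" need no dedicated branch, which appears to be the motivation for the rewrite.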
data/test/test_bio-pileup_iterator.rb
CHANGED
@@ -1,80 +1,87 @@
|
|
1
1
|
require 'helper'
|
2
2
|
|
3
3
|
class TestBioPileupIterator < Test::Unit::TestCase
|
4
|
+
#enable debug logging for pileup_iterator
|
5
|
+
def setup
|
6
|
+
log_name = 'bio-pileup_iterator'
|
7
|
+
Bio::Log::CLI.logger('stderr')
|
8
|
+
#Bio::Log::CLI.configure(log_name) # when commented out no debug is printed out
|
9
|
+
end
|
10
|
+
|
4
11
|
def test_pileup_parsing
|
5
12
|
line = "contig00001\t199\tA\t4\t.$...$\t>a^>"
|
6
13
|
#contig00001\t200\tT\t2\t..\taR"
|
7
14
|
piles = Bio::DB::PileupIterator.new(line).to_a
|
8
15
|
pileup = piles[0]
|
9
16
|
reads = piles[0].reads
|
10
|
-
|
17
|
+
|
11
18
|
assert_equal 'A', reads[0].sequence
|
12
19
|
assert_equal 4, reads.length
|
13
20
|
assert_kind_of Bio::DB::Pileup, pileup
|
14
21
|
end
|
15
|
-
|
22
|
+
|
16
23
|
def test_2_pileup_columns
|
17
24
|
line = "contig00001\t199\tA\t4\t.$...$\t>a^>\ncontig00001\t200\tT\t2\t..\taR"
|
18
25
|
piles = Bio::DB::PileupIterator.new(line).to_a
|
19
|
-
|
26
|
+
|
20
27
|
pileup = piles[0]
|
21
28
|
reads = piles[0].reads
|
22
29
|
reads2 = piles[1].reads
|
23
|
-
|
30
|
+
|
24
31
|
assert_equal 'A', piles[0].ref_base
|
25
32
|
assert_equal 'T', piles[1].ref_base
|
26
33
|
assert_equal 4, reads.length
|
27
34
|
assert_equal 2, reads2.length
|
28
35
|
assert_equal 'AT', reads2[0].sequence
|
29
36
|
end
|
30
|
-
|
37
|
+
|
31
38
|
def test_fwd_rev
|
32
39
|
line = "contig00001\t199\tA\t4\t.$,..$\t>a^>\ncontig00001\t200\tT\t2\t,.\taR"
|
33
40
|
piles = Bio::DB::PileupIterator.new(line).to_a
|
34
|
-
|
41
|
+
|
35
42
|
pileup = piles[0]
|
36
43
|
reads = piles[0].reads
|
37
44
|
reads2 = piles[1].reads
|
38
|
-
|
45
|
+
|
39
46
|
assert_equal 4, reads.length
|
40
47
|
assert_equal 2, reads2.length
|
41
48
|
assert_equal 'AT', reads2[0].sequence
|
42
49
|
assert_equal '-', reads2[0].direction
|
43
50
|
assert_equal '+', reads2[1].direction
|
44
51
|
end
|
45
|
-
|
52
|
+
|
46
53
|
def test_deletion
|
47
54
|
line = "contig00001\t199\tA\t4\t.-1T...$\t>a^>\ncontig00001\t200\tT\t2\t*..\taR"
|
48
55
|
piles = Bio::DB::PileupIterator.new(line).to_a
|
49
|
-
|
56
|
+
|
50
57
|
pileup = piles[0]
|
51
58
|
reads = piles[0].reads
|
52
59
|
reads2 = piles[1].reads
|
53
|
-
|
60
|
+
|
54
61
|
assert_equal 'A*', reads[0].sequence
|
55
62
|
assert_equal Hash.new, reads[0].insertions
|
56
63
|
end
|
57
|
-
|
64
|
+
|
58
65
|
def test_substitution
|
59
66
|
line = "contig00001\t199\tA\t4\t.G..$\t>a^>"
|
60
67
|
piles = Bio::DB::PileupIterator.new(line).to_a
|
61
|
-
|
68
|
+
|
62
69
|
pileup = piles[0]
|
63
70
|
reads = piles[0].reads
|
64
|
-
|
71
|
+
|
65
72
|
assert_equal 'A', reads[0].sequence
|
66
73
|
assert_equal 'G', reads[1].sequence
|
67
74
|
assert_equal 'A', reads[0].sequence
|
68
75
|
end
|
69
|
-
|
76
|
+
|
70
77
|
def test_substitution_with_insertion
|
71
78
|
line = "contig00001\t199\tA\t4\tG-1T..$.\t>a^>\ncontig00001\t200\tT\t2\t*..\taR"
|
72
79
|
piles = Bio::DB::PileupIterator.new(line).to_a
|
73
|
-
|
80
|
+
|
74
81
|
pileup = piles[0]
|
75
82
|
reads = piles[0].reads
|
76
83
|
reads2 = piles[1].reads
|
77
|
-
|
84
|
+
|
78
85
|
assert_equal 2, piles.length
|
79
86
|
assert_equal 4, reads.length
|
80
87
|
assert_equal 3, reads2.length
|
@@ -83,12 +90,12 @@ class TestBioPileupIterator < Test::Unit::TestCase
|
|
83
90
|
assert_equal 'A', reads[2].sequence
|
84
91
|
assert_equal 'AT', reads[3].sequence
|
85
92
|
end
|
86
|
-
|
93
|
+
|
87
94
|
def test_start_read_warning_of_deletion_next
|
88
95
|
line = "contig00001\t8\tG\t4\t..,^],-1g\ta!U!\n"+
|
89
96
|
"contig00001\t9\tg\t4\t..,*\ta!aU"
|
90
97
|
piles = Bio::DB::PileupIterator.new(line).to_a
|
91
|
-
|
98
|
+
|
92
99
|
pileup = piles[0]
|
93
100
|
reads = piles[0].reads
|
94
101
|
reads2 = piles[1].reads
|
@@ -122,26 +129,23 @@ class TestBioPileupIterator < Test::Unit::TestCase
|
|
122
129
|
def test_start_with_a_gap
|
123
130
|
line = "contig00075\t503\tT\t24\t,^]*\tU\n"
|
124
131
|
piles = Bio::DB::PileupIterator.new(line)
|
125
|
-
# piles.log.level = Bio::Log::DEBUG
|
126
132
|
piles = piles.to_a
|
127
133
|
assert_equal 'T', piles[0].reads[0].sequence
|
128
134
|
assert_equal '*', piles[0].reads[1].sequence
|
129
135
|
end
|
130
|
-
|
136
|
+
|
131
137
|
def test_start_then_insert_then_end
|
132
138
|
line = "contig00075\t503\tG\t24\t^].+1T$^].\t~~\n"
|
133
139
|
piles = Bio::DB::PileupIterator.new(line)
|
134
|
-
# piles.log.level = Bio::Log::DEBUG
|
135
140
|
piles = piles.to_a
|
136
141
|
assert_equal 'G', piles[0].reads[0].sequence
|
137
142
|
assert_equal({503 => 'T'}, piles[0].reads[0].insertions)
|
138
143
|
assert_equal 'G', piles[0].reads[1].sequence
|
139
144
|
end
|
140
|
-
|
145
|
+
|
141
146
|
def test_star_then_insert2
|
142
147
|
line = "contig00075\t503\tG\t24\t,*+1g.\t~~\n"
|
143
148
|
piles = Bio::DB::PileupIterator.new(line)
|
144
|
-
# piles.log.level = Bio::Log::DEBUG
|
145
149
|
piles = piles.to_a
|
146
150
|
assert_equal 'G', piles[0].reads[0].sequence
|
147
151
|
assert_equal '*', piles[0].reads[1].sequence
|
@@ -153,7 +157,6 @@ class TestBioPileupIterator < Test::Unit::TestCase
|
|
153
157
|
"contig00075\t504\tA\t24\t,,.,\tE~\n"
|
154
158
|
|
155
159
|
piles = Bio::DB::PileupIterator.new(line)
|
156
|
-
# piles.log.level = Bio::Log::DEBUG
|
157
160
|
piles = piles.to_a
|
158
161
|
assert_equal 'GA', piles[0].reads[0].sequence
|
159
162
|
assert_equal 'GA', piles[0].reads[1].sequence
|
@@ -163,19 +166,17 @@ class TestBioPileupIterator < Test::Unit::TestCase
|
|
163
166
|
end
|
164
167
|
|
165
168
|
def test_double_insertion
|
166
|
-
line = "contig00075\t503\tG\t24\t*+
|
169
|
+
line = "contig00075\t503\tG\t24\t*+2gg\tE\n"
|
167
170
|
|
168
171
|
piles = Bio::DB::PileupIterator.new(line)
|
169
|
-
# piles.log.level = Bio::Log::DEBUG
|
170
172
|
piles = piles.to_a
|
171
173
|
assert_equal({503 => 'gg'}, piles[0].reads[0].insertions)
|
172
174
|
end
|
173
175
|
|
174
176
|
def test_non_perfect_starting_read
|
175
|
-
line = "contig00075\t503\tG\t24\t^
|
177
|
+
line = "contig00075\t503\tG\t24\t^].*+2gg\tE\n"
|
176
178
|
|
177
179
|
piles = Bio::DB::PileupIterator.new(line)
|
178
|
-
# piles.log.level = Bio::Log::DEBUG
|
179
180
|
piles = piles.to_a
|
180
181
|
assert_equal '+', piles[0].reads[0].direction
|
181
182
|
assert_equal 'G', piles[0].reads[0].sequence
|
@@ -185,10 +186,9 @@ class TestBioPileupIterator < Test::Unit::TestCase
|
|
185
186
|
def test_non_matching_finish
|
186
187
|
line = "contig00002\t6317\tC\t2\ta$.\t!B\n"+
|
187
188
|
"contig00002\t6318\tT\t1\t.\tA\n"
|
188
|
-
|
189
|
+
|
189
190
|
|
190
191
|
piles = Bio::DB::PileupIterator.new(line)
|
191
|
-
# piles.log.level = Bio::Log::DEBUG
|
192
192
|
piles = piles.to_a
|
193
193
|
assert_equal 2, piles[0].reads.length
|
194
194
|
assert_equal 'a', piles[0].reads[0].sequence
|
@@ -198,7 +198,7 @@ class TestBioPileupIterator < Test::Unit::TestCase
|
|
198
198
|
def test_insertion_then_mismatch
|
199
199
|
line = "contig00044\t867\tC\t6\t,,,,,.\t!:!!:=\n"+
|
200
200
|
"contig00044\t868\tG\t6\tt,+1ttt,.\t!A!!C9\n"
|
201
|
-
|
201
|
+
|
202
202
|
piles = Bio::DB::PileupIterator.new(line)
|
203
203
|
|
204
204
|
piles = piles.to_a
|
@@ -209,4 +209,18 @@ class TestBioPileupIterator < Test::Unit::TestCase
|
|
209
209
|
assert_equal hash, piles[0].reads[1].insertions
|
210
210
|
assert_equal 'Ct', piles[0].reads[2].sequence
|
211
211
|
end
|
212
|
+
|
213
|
+
def test_some_beta_testing_bug
|
214
|
+
#<Bio::DB::Pileup:0x00000009d5e8d8 @ref_name="contig03007", @pos=658, @ref_base="a", @coverage=14.0, @read_bases="gg+1cG-1Ag+1c*+1aG-1A*+1a****g**+1A", @read_quals="!!!!~!~!!!!!!~", @ref_count=nil, @non_ref_count_hash=nil, @non_ref_count=nil> (Exception)
|
215
|
+
line = "contig03007\t658\ta\t14\tgg+1cG-1Ag+1c*+1aG-1A*+1a****g**+1A\t!!!!~!~!!!!!!~\n"
|
216
|
+
piles = Bio::DB::PileupIterator.new(line).to_a #parse, it should fail otherwise
|
217
|
+
assert_equal 14, piles[0].coverage
|
218
|
+
end
|
219
|
+
|
220
|
+
def test_unexpected_equals
|
221
|
+
line = "contig00032\t264\tg\t10\t*+1=$.*+1c,.*+1c...,\t~^~^^~^^^W\n"
|
222
|
+
piles = Bio::DB::PileupIterator.new(line).to_a #parse, it should fail otherwise
|
223
|
+
assert_equal 10, piles[0].coverage
|
224
|
+
assert_equal 10, piles[0].reads.length
|
225
|
+
end
|
212
226
|
end
|
metadata
CHANGED
@@ -1,7 +1,7 @@
|
|
1
1
|
--- !ruby/object:Gem::Specification
|
2
2
|
name: bio-pileup_iterator
|
3
3
|
version: !ruby/object:Gem::Version
|
4
|
-
version: 0.0.
|
4
|
+
version: 0.0.3
|
5
5
|
prerelease:
|
6
6
|
platform: ruby
|
7
7
|
authors:
|
@@ -9,11 +9,11 @@ authors:
|
|
9
9
|
autorequire:
|
10
10
|
bindir: bin
|
11
11
|
cert_chain: []
|
12
|
-
date: 2012-
|
12
|
+
date: 2012-08-15 00:00:00.000000000 Z
|
13
13
|
dependencies:
|
14
14
|
- !ruby/object:Gem::Dependency
|
15
15
|
name: bio
|
16
|
-
requirement: &
|
16
|
+
requirement: &79398350 !ruby/object:Gem::Requirement
|
17
17
|
none: false
|
18
18
|
requirements:
|
19
19
|
- - ! '>='
|
@@ -21,10 +21,10 @@ dependencies:
|
|
21
21
|
version: 1.4.2
|
22
22
|
type: :runtime
|
23
23
|
prerelease: false
|
24
|
-
version_requirements: *
|
24
|
+
version_requirements: *79398350
|
25
25
|
- !ruby/object:Gem::Dependency
|
26
26
|
name: bio-samtools
|
27
|
-
requirement: &
|
27
|
+
requirement: &79397770 !ruby/object:Gem::Requirement
|
28
28
|
none: false
|
29
29
|
requirements:
|
30
30
|
- - ! '>='
|
@@ -32,10 +32,10 @@ dependencies:
|
|
32
32
|
version: 0.5.3
|
33
33
|
type: :runtime
|
34
34
|
prerelease: false
|
35
|
-
version_requirements: *
|
35
|
+
version_requirements: *79397770
|
36
36
|
- !ruby/object:Gem::Dependency
|
37
37
|
name: bio-logger
|
38
|
-
requirement: &
|
38
|
+
requirement: &79397240 !ruby/object:Gem::Requirement
|
39
39
|
none: false
|
40
40
|
requirements:
|
41
41
|
- - ! '>='
|
@@ -43,10 +43,10 @@ dependencies:
|
|
43
43
|
version: 1.0.0
|
44
44
|
type: :runtime
|
45
45
|
prerelease: false
|
46
|
-
version_requirements: *
|
46
|
+
version_requirements: *79397240
|
47
47
|
- !ruby/object:Gem::Dependency
|
48
48
|
name: shoulda
|
49
|
-
requirement: &
|
49
|
+
requirement: &79396870 !ruby/object:Gem::Requirement
|
50
50
|
none: false
|
51
51
|
requirements:
|
52
52
|
- - ! '>='
|
@@ -54,10 +54,10 @@ dependencies:
|
|
54
54
|
version: '0'
|
55
55
|
type: :development
|
56
56
|
prerelease: false
|
57
|
-
version_requirements: *
|
57
|
+
version_requirements: *79396870
|
58
58
|
- !ruby/object:Gem::Dependency
|
59
59
|
name: rdoc
|
60
|
-
requirement: &
|
60
|
+
requirement: &79396550 !ruby/object:Gem::Requirement
|
61
61
|
none: false
|
62
62
|
requirements:
|
63
63
|
- - ~>
|
@@ -65,10 +65,10 @@ dependencies:
|
|
65
65
|
version: '3.12'
|
66
66
|
type: :development
|
67
67
|
prerelease: false
|
68
|
-
version_requirements: *
|
68
|
+
version_requirements: *79396550
|
69
69
|
- !ruby/object:Gem::Dependency
|
70
70
|
name: bundler
|
71
|
-
requirement: &
|
71
|
+
requirement: &79396200 !ruby/object:Gem::Requirement
|
72
72
|
none: false
|
73
73
|
requirements:
|
74
74
|
- - ! '>='
|
@@ -76,10 +76,10 @@ dependencies:
|
|
76
76
|
version: 1.0.0
|
77
77
|
type: :development
|
78
78
|
prerelease: false
|
79
|
-
version_requirements: *
|
79
|
+
version_requirements: *79396200
|
80
80
|
- !ruby/object:Gem::Dependency
|
81
81
|
name: jeweler
|
82
|
-
requirement: &
|
82
|
+
requirement: &79395820 !ruby/object:Gem::Requirement
|
83
83
|
none: false
|
84
84
|
requirements:
|
85
85
|
- - ~>
|
@@ -87,10 +87,10 @@ dependencies:
|
|
87
87
|
version: 1.8.3
|
88
88
|
type: :development
|
89
89
|
prerelease: false
|
90
|
-
version_requirements: *
|
90
|
+
version_requirements: *79395820
|
91
91
|
- !ruby/object:Gem::Dependency
|
92
92
|
name: bio
|
93
|
-
requirement: &
|
93
|
+
requirement: &79481120 !ruby/object:Gem::Requirement
|
94
94
|
none: false
|
95
95
|
requirements:
|
96
96
|
- - ! '>='
|
@@ -98,10 +98,10 @@ dependencies:
|
|
98
98
|
version: 1.4.2
|
99
99
|
type: :development
|
100
100
|
prerelease: false
|
101
|
-
version_requirements: *
|
101
|
+
version_requirements: *79481120
|
102
102
|
- !ruby/object:Gem::Dependency
|
103
103
|
name: rdoc
|
104
|
-
requirement: &
|
104
|
+
requirement: &79480630 !ruby/object:Gem::Requirement
|
105
105
|
none: false
|
106
106
|
requirements:
|
107
107
|
- - ~>
|
@@ -109,20 +109,20 @@ dependencies:
|
|
109
109
|
version: '3.12'
|
110
110
|
type: :development
|
111
111
|
prerelease: false
|
112
|
-
version_requirements: *
|
112
|
+
version_requirements: *79480630
|
113
113
|
description: Iterate through a samtools pileup file
|
114
114
|
email: donttrustben near gmail.com
|
115
115
|
executables: []
|
116
116
|
extensions: []
|
117
117
|
extra_rdoc_files:
|
118
118
|
- LICENSE.txt
|
119
|
-
- README.
|
119
|
+
- README.md
|
120
120
|
files:
|
121
121
|
- .document
|
122
122
|
- .travis.yml
|
123
123
|
- Gemfile
|
124
124
|
- LICENSE.txt
|
125
|
-
- README.
|
125
|
+
- README.md
|
126
126
|
- Rakefile
|
127
127
|
- VERSION
|
128
128
|
- lib/bio-pileup_iterator.rb
|
@@ -144,7 +144,7 @@ required_ruby_version: !ruby/object:Gem::Requirement
|
|
144
144
|
version: '0'
|
145
145
|
segments:
|
146
146
|
- 0
|
147
|
-
hash:
|
147
|
+
hash: 421311475
|
148
148
|
required_rubygems_version: !ruby/object:Gem::Requirement
|
149
149
|
none: false
|
150
150
|
requirements:
|
data/README.rdoc
DELETED
@@ -1,39 +0,0 @@
|
|
1
|
-
= bio-pileup_iterator
|
2
|
-
|
3
|
-
Full description goes here
|
4
|
-
|
5
|
-
Note: this software is under active development!
|
6
|
-
|
7
|
-
== Installation
|
8
|
-
|
9
|
-
gem install bio-pileup_iterator
|
10
|
-
|
11
|
-
== Usage
|
12
|
-
|
13
|
-
== Developers
|
14
|
-
|
15
|
-
To use the library
|
16
|
-
|
17
|
-
require 'bio-pileup_iterator
|
18
|
-
|
19
|
-
The API doc is online. For more code examples see also the test files in
|
20
|
-
the source tree.
|
21
|
-
|
22
|
-
== Project home page
|
23
|
-
|
24
|
-
Information on the source tree, documentation, issues and how to contribute, see
|
25
|
-
|
26
|
-
http://github.com/wwood/bioruby-pileup_iterator
|
27
|
-
|
28
|
-
== Cite
|
29
|
-
|
30
|
-
If you use this software, please cite http://dx.doi.org/10.1093/bioinformatics/btq475
|
31
|
-
|
32
|
-
== Biogems.info
|
33
|
-
|
34
|
-
This Biogem is published at http://biogems.info/index.html#bio-pileup_iterator
|
35
|
-
|
36
|
-
== Copyright
|
37
|
-
|
38
|
-
Copyright (c) 2012 Ben J. Woodcroft. See LICENSE.txt for further details.
|
39
|
-
|