mspire 0.4.2 → 0.4.4

data/INSTALL CHANGED
@@ -2,14 +2,17 @@
2
2
  Prerequisites
3
3
  -------------
4
4
 
5
- Much of the package will work without any prerequisites at all. Some functionality may require addition ruby packages or other converters. These are listed in current order of importance:
5
+ Much of the package will work without any prerequisites at all. Some functionality may require additional ruby packages or other converters.
6
6
 
7
7
  * libjtp - generic library installed automatically if you install mspire with rubygems (or 'gem install libjtp')
8
+
9
+ ### XML parsing:
10
+
8
11
  * [xmlparser](http://www.yoshidam.net/Ruby.html) (comes with one-click Windows; on Ubuntu: 'sudo apt-get install libxml-parser-ruby1.8')
9
12
  * [axml](http://axml.rubyforge.org/) dom wrapper for xmlparser. ('gem install axml')
10
- * ['t2x'](archive/t2x) linux executable to convert .RAW files (Xcalibur 1.x) to version 1 mzXML files
11
13
 
12
- Optional:
14
+ ### Optional:
15
+ * ['t2x'](archive/t2x) linux executable to convert .RAW files (Xcalibur 1.x) to version 1 mzXML files
13
16
  * [libxml](http://libxml.rubyforge.org/) can use instead of xmlparser. In Ubuntu: sudo apt-get install libxml2 libxml2-dev ; sudo gem install libxml-ruby --remote
14
17
  * [gnuplot](http://rgplot.rubyforge.org/) ('gem install gnuplot'). For some plotting. Of course, you'll need [gnuplot](http://www.gnuplot.info/) before this package will work. Under one-click installer for windows this package requires a little configuration. It works with no configuration on cygwin (or linux).
15
18
 
@@ -23,6 +26,9 @@ See [installation under cygwin](cygwin.html) if you're on Windows.
23
26
  Development
24
27
  -----------
25
28
 
29
+ NOTE: If you are interested in becoming a developer on this project (i.e., write access to the repository) please [contact me](http://rubyforge.org/users/jtprince/)
30
+
31
+
26
32
  anonymous svn checkout:
27
33
 
28
34
  svn checkout svn://rubyforge.org/var/svn/mspire
@@ -49,3 +55,4 @@ Use rake:
49
55
  run tests with large files: rake spec SPEC_LARGE=t
50
56
 
51
57
  run test on one file: rake spec SPEC=specs/{path_to_spec_file}
58
+
@@ -191,3 +191,20 @@ evaluation))
191
191
  1. added MS::MSRun.open method
192
192
  2. added method to write dta files from SRF
193
193
 
194
+ ## version 0.4.3
195
+
196
+ 1. added to_mgf_file from SRF
197
+ 2. added to_dta_files from SRF complete with streaming .tar.gz output (and
198
+ supporting .zip output but it has to make tmp files)
199
+
200
+ ## version 0.4.4
201
+ 1. implemented q-value and pi_0 methods of Storey
202
+ 2. can do complete q-value calculations given p-values
203
+ 3. can determine a pi_0 given a list of target and decoy values (as booleans)
204
+ 4. can determine a pi_0 given a list containing numbers of decoy and target
205
+ values as is often encountered with filtering
206
+ 5. prob_validate.rb implements a q-value option for turning PeptideProphet
207
+ probabilities into q-values
208
+ 6. filter_validate.rb implements a p-value method using xcorr values; however,
209
+ this is not very effective since xcorr values underrepresent the
210
+ difference between good hits and bad hits
@@ -0,0 +1,94 @@
1
+
2
+
3
+ require 'archive/tar/minitar'
4
+
5
+ require 'stringio'
6
+
7
+ module Archive::Tar::Minitar
8
+
9
+ # entry may be a string (the name), or it may be a hash specifying the
10
+ # following:
11
+ # :name (REQUIRED)
12
+ # :mode 33188 (rw-r--r--) for files, 16877 (rwxr-xr-x) for dirs
13
+ # (0O100644) (0O40755)
14
+ # :uid nil
15
+ # :gid nil
16
+ # :mtime Time.now
17
+ #
18
+ # if data == nil, then this is considered a directory!
19
+ # (use an empty string for a normal empty file)
20
+ # data should be something that can be opened by StringIO
21
+ def self.pack_as_file(entry, data, outputter) #:yields action, name, stats:
22
+ outputter = outputter.tar if outputter.kind_of?(Archive::Tar::Minitar::Output)
23
+
24
+ stats = {}
25
+ stats[:uid] = nil
26
+ stats[:gid] = nil
27
+ stats[:mtime] = Time.now
28
+
29
+ if data.nil?
30
+ # a directory
31
+ stats[:size] = 4096 # is this OK???
32
+ stats[:mode] = 16877 # rwxr-xr-x
33
+ else
34
+ stats[:size] = data.size
35
+ stats[:mode] = 33188 # rw-r--r--
36
+ end
37
+
38
+ if entry.kind_of?(Hash)
39
+ name = entry[:name]
40
+
41
+ entry.each { |kk, vv| stats[kk] = vv unless vv.nil? }
42
+ else
43
+ name = entry
44
+ end
45
+
46
+ if data.nil? # a directory
47
+ yield :dir, name, stats if block_given?
48
+ outputter.mkdir(name, stats)
49
+ else # a file
50
+ outputter.add_file_simple(name, stats) do |os|
51
+ stats[:current] = 0
52
+ yield :file_start, name, stats if block_given?
53
+ StringIO.open(data, "rb") do |ff|
54
+ until ff.eof?
55
+ stats[:currinc] = os.write(ff.read(4096))
56
+ stats[:current] += stats[:currinc]
57
+ yield :file_progress, name, stats if block_given?
58
+ end
59
+ end
60
+ yield :file_done, name, stats if block_given?
61
+ end
62
+ end
63
+ end
64
+ end
65
+
66
+
67
+ require 'zlib'
68
+ file_names = ['wiley/dorky1', 'dorky2', 'an_empty_dir']
69
+ file_data_strings = ['my data', 'my data also', nil]
70
+
71
+
72
+ module Archive ; end
73
+
74
+ # usage:
75
+ # require 'archive/targz'
76
+ # Archive::Targz.archive_as_files("myarchive.tgz", %w(file1 file2 dir),
77
+ # ['data for file1', 'data for file2', nil])
78
+ module Archive::Targz
79
+ # requires an archive_name (e.g., myarchive.tgz) and parallel filename and
80
+ # data arrays:
81
+ # filenames = %w(file1 file2 empty_dir)
82
+ # data_ar = ['stuff in file 1', 'stuff in file2', nil]
83
+ # nil as an entry in the data_ar means that an empty directory will be
84
+ # created
85
+ def self.archive_as_files(archive_name, filenames=[], data_ar=[])
86
+ tgz = Zlib::GzipWriter.new(File.open(archive_name, 'wb'))
87
+
88
+ Archive::Tar::Minitar::Output.open(tgz) do |outp|
89
+ filenames.zip(data_ar) do |name, data|
90
+ Archive::Tar::Minitar.pack_as_file(name, data, outp)
91
+ end
92
+ end
93
+ end
94
+ end
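The `Archive::Targz` helper above streams in-memory strings into a gzipped tar through minitar. As a rough, self-contained illustration of the same idea, the sketch below substitutes `Gem::Package::TarWriter` from Ruby's standard library for minitar (an assumption made only so that it runs without extra gems). The names `targz_from_strings` and `list_targz` are hypothetical and are not part of mspire.

```ruby
require 'rubygems/package' # Gem::Package::TarWriter/TarReader, a stdlib stand-in for minitar
require 'stringio'
require 'zlib'

# Hypothetical helper: packs parallel name/data arrays into a gzipped tar
# written to `io`. A nil data entry becomes an empty directory, mirroring
# the convention used by pack_as_file above.
def targz_from_strings(io, filenames, data_ar)
  gz = Zlib::GzipWriter.new(io)
  Gem::Package::TarWriter.new(gz) do |tar|
    filenames.zip(data_ar) do |name, data|
      if data.nil?
        tar.mkdir(name, 0o755)
      else
        tar.add_file_simple(name, 0o644, data.bytesize) { |entry| entry.write(data) }
      end
    end
  end
  gz.finish # write the gzip trailer without closing the underlying io
  io
end

# Hypothetical helper: returns [name, contents-or-nil] pairs from gzipped tar bytes.
def list_targz(bytes)
  entries = []
  Zlib::GzipReader.wrap(StringIO.new(bytes)) do |gz|
    Gem::Package::TarReader.new(gz) do |tar|
      tar.each { |e| entries << [e.full_name, e.directory? ? nil : e.read] }
    end
  end
  entries
end

archive = targz_from_strings(StringIO.new(''.b),
                             %w(wiley/dorky1 dorky2 an_empty_dir),
                             ['my data', 'my data also', nil])
list_targz(archive.string).each { |name, data| puts "#{name}: #{data.inspect}" }
```

Nothing is written to a temporary file here: each string goes straight through the tar writer into the gzip stream, which is the property the changelog highlights for the `.tar.gz` output.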
@@ -0,0 +1,16 @@
1
+
2
+ class Float
3
+ # 3 following methods from http://www.hans-eric.com/code-samples/ruby-floating-point-round-off/
4
+ def round_to(x)
5
+ (self * 10**x).round.to_f / 10**x
6
+ end
7
+
8
+ def ceil_to(x)
9
+ (self * 10**x).ceil.to_f / 10**x
10
+ end
11
+
12
+ def floor_to(x)
13
+ (self * 10**x).floor.to_f / 10**x
14
+ end
15
+ end
16
+
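The three `Float` extensions above all scale by a power of ten, apply an integer operation, and scale back. A minimal sketch of that pattern follows, written as plain functions so that it does not reopen `Float`; the function names are illustrative only.

```ruby
# Scale by 10**places, apply the integer operation, then scale back.
def round_to(value, places)
  (value * 10**places).round.to_f / 10**places
end

def ceil_to(value, places)
  (value * 10**places).ceil.to_f / 10**places
end

def floor_to(value, places)
  (value * 10**places).floor.to_f / 10**places
end

puts round_to(1.23456, 2) # => 1.23
puts ceil_to(1.23456, 2)  # => 1.24
puts floor_to(1.23456, 2) # => 1.23
```

On current Ruby versions the built-in `Float#round`, `Float#ceil` and `Float#floor` accept a digits argument, so helpers like these are mainly useful on older interpreters such as the 1.8 series this gem targets.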
@@ -1,4 +1,4 @@
1
1
 
2
2
  module Mspire
3
- Version = '0.4.2'
3
+ Version = '0.4.5'
4
4
  end
@@ -0,0 +1,227 @@
1
+ require 'rsruby'
2
+ require 'gsl'
3
+ require 'vec'
4
+ require 'vec/r'
5
+ require 'enumerator'
6
+
7
+
8
+ module PiZero
9
+ class << self
10
+ # takes a sorted array of p-values (floats between 0 and 1 inclusive)
11
+ # returns [thresholds_ar, instantaneous pi_0 calculations_ar]
12
+ # evenly incremented values will be used by default:
13
+ # :start=>0.0, :stop=>0.9, :step=>0.05
14
+ def pi_zero_hats(sorted_pvals, args={})
15
+ defaults = {:start => 0.0, :stop=>0.9, :step=>0.05 }
16
+ margs = defaults.merge( args )
17
+ (start, stop, step) = margs.values_at(:start, :stop, :step)
18
+
19
+ # From Storey et al. PNAS 2003:
20
+ lambdas = [] # lambda
21
+ pi_zeros = [] # pi_0
22
+ total = sorted_pvals.size # m
23
+
24
+ # naive implementation (one full pass over the p-values per lambda) with correct logic:
25
+ start.step(stop, step) do |lam|
26
+ lambdas << lam
27
+ (greater, less) = sorted_pvals.partition {|pval| pval > lam }
28
+ pi_zeros.push( greater.size.to_f / ( total * (1.0 - lam) ) )
29
+ end
30
+ [lambdas, pi_zeros]
31
+ end
32
+
33
+ # expecting x and y to make a scatter plot descending to a plateau on the
34
+ # right side (which is assumed to be of increasing noise as it goes to the
35
+ # right)
36
+ # returns the height of the plateau at the right edge
37
+ #
38
+ # *
39
+ # *
40
+ # *
41
+ # **
42
+ # ** *** * *
43
+ # ***** **** ***
44
+ def plateau_height(x, y)
45
+ =begin
46
+ require 'gsl'
47
+ x_deltas = (0...(x.size-1)).to_a.map do |i|
48
+ x[i+1] - x[i]
49
+ end
50
+ y_deltas = (0...(y.size-1)).to_a.map do |i|
51
+ y[i+1] - y[i]
52
+ end
53
+ new_xs = x.dup
54
+ new_ys = y.dup
55
+ x_deltas.reverse.each do |delt|
56
+ new_xs.push( new_xs.last + delt )
57
+ end
58
+
59
+ y_cnt = y.size
60
+ y_deltas.reverse.each do |delt|
61
+ y_cnt -= 1
62
+ new_ys.push( y[y_cnt] - delt )
63
+ end
64
+
65
+ x_vec = GSL::Vector.alloc(new_xs)
66
+ y_vec = GSL::Vector.alloc(new_ys)
67
+ coef, cov, chisq, status = GSL::Poly.fit(x_vec,y_vec, 3)
68
+ coef.eval(x.last)
69
+ #x2 = GSL::Vector::linspace(0,2.4,20)
70
+ #graph([x_vec,y_vec], [x2, coef.eval(x2)], "-C -g 3 -S 4")
71
+ =end
72
+
73
+ r = RSRuby.instance
74
+ answ = r.smooth_spline(x,y, :df => 3)
75
+ ## to plot it!
76
+ #r.plot(x,y, :ylab=>"instantaneous pi_zeros")
77
+ #r.lines(answ['x'], answ['y'])
78
+ #r.points(answ['x'], answ['y'])
79
+ #sleep(8)
80
+
81
+ answ['y'].last
82
+ end
83
+
84
+ def plateau_exponential(x,y)
85
+ xvec = GSL::Vector.alloc(x)
86
+ yvec = GSL::Vector.alloc(y)
87
+ a2, b2, = GSL::Fit.linear(xvec, GSL::Sf::log(yvec))
88
+ x2 = GSL::Vector.linspace(0, 1.2, 20)
89
+ exp_a = GSL::Sf::exp(a2)
90
+ out_y = exp_a*GSL::Sf::exp(b2*x2)
91
+ raise NotImplementedError, "need to grab out the answer"
92
+ #graph([xvec, yvec], [x2, exp_a*GSL::Sf::exp(b2*x2)], "-C -g 3 -S 4")
93
+
94
+ end
95
+
96
+ # returns a conservative (but close) estimate of pi_0 given sorted p-values
97
+ # following Storey et al. 2003, PNAS.
98
+ def pi_zero(sorted_pvals)
99
+ plateau_height( *(pi_zero_hats(sorted_pvals)) )
100
+ end
101
+
102
+ # returns an array where the left values have been filled in using the
103
+ # similar values on the right side of the distribution. These values are
104
+ # pushed onto the end of the array in no guaranteed order.
105
+ # extends a distribution on the left side where it is missing since
106
+ # xcorr values <= 0.0 are not reported
107
+ # **
108
+ # * *
109
+ # * *
110
+ # *
111
+ # *
112
+ # *
113
+ # Grabs the right tail from above and inverts it to the left side (less
114
+ # than zero), creating a more full distribution. raises an ArgumentError
115
+ # if values_chopped_at_zero.size == 0
116
+ # this method would be more robust with some smoothing.
117
+ # Method currently only meant for large amounts of data.
118
+ # input data does not need to be sorted
119
+ def extend_distribution_left_of_zero(values_chopped_at_zero)
120
+ sz = values_chopped_at_zero.size
121
+ raise ArgumentError, "array.size must be > 0" if sz == 0
122
+ num_bins = (Math.log10(sz) * 100).round
123
+ vec = VecD.new(values_chopped_at_zero)
124
+ (bins, freqs) = vec.histogram(num_bins)
125
+ start_i = 0
126
+ freqs.each_with_index do |f,i|
127
+ if f.is_a?(Numeric) && f > 0
128
+ start_i = i
129
+ break
130
+ end
131
+ end
132
+ match_it = freqs[start_i]
133
+ # get the index of the first frequency value less than the zero frequency
134
+ index_to_chop_at = -1
135
+ rev_freqs = freqs.reverse
136
+ rev_freqs.each_with_index do |freq,rev_i|
137
+ if match_it - rev_freqs[rev_i+1] <= 0
138
+ index_to_chop_at = freqs.size - 1 - rev_i
139
+ break
140
+ end
141
+ end
142
+ cut_point = bins[index_to_chop_at]
143
+ values_chopped_at_zero + values_chopped_at_zero.select {|v| v >= cut_point }.map {|v| cut_point - v }
144
+ end
145
+
146
+ # assumes the decoy_vals follows a normal distribution
147
+ def p_values(target_vals, decoy_vals)
148
+ (mean, stdev) = VecD.new(decoy_vals).sample_stats
149
+ r = RSRuby.instance
150
+ vec = VecD.new(target_vals)
151
+ right_tailed = true
152
+ vec.p_value_normal(mean, stdev, right_tailed)
153
+ end
154
+
155
+ def p_values_for_sequest(target_hits, decoy_hits)
156
+ dh_vals = decoy_hits.map {|v| v.xcorr }
157
+ new_decoy_vals = PiZero.extend_distribution_left_of_zero(dh_vals)
158
+ #File.open("target.yml", 'w') {|out| out.puts new_decoy_vals.join(" ") }
159
+ #File.open("decoy.yml", 'w') {|out| out.puts target_hits.map {|v| v.xcorr }.join(" ") }
160
+ #abort 'checking'
161
+ p_values(target_hits.map {|v| v.xcorr}, new_decoy_vals )
162
+ end
163
+
164
+ # takes a list of booleans with true being a target hit and false being a
165
+ # decoy hit and returns the pi_zero using the smooth method
166
+ # Should be ordered from best to worst (i.e., one expects more true values
167
+ # at the beginning of the list)
168
+ def pi_zero_from_booleans(booleans)
169
+ targets = 0
170
+ decoys = 0
171
+ xs = []
172
+ ys = []
173
+ booleans.reverse.each_with_index do |v,index|
174
+ if v
175
+ targets += 1
176
+ else
177
+ decoys += 1
178
+ end
179
+ if decoys > 0
180
+ xs << index
181
+ ys << targets.to_f / decoys
182
+ end
183
+ end
184
+ ys.reverse!
185
+ plateau_height(xs, ys)
186
+ end
187
+
188
+ # Takes an array of doublets ([[int, int], [int, int]...]) where the first
189
+ # value is the number of target hits and the second is the number of decoy
190
+ # hits. Expects that best hits are at the beginning of the list. Assumes
191
+ # that each sum is a subset
192
+ # of the following group (shown as actual hits rather than number of hits):
193
+ #
194
+ # [[target, target, target, decoy], [target, target, target, decoy,
195
+ # target, decoy, target], [target, target, target, decoy, target,
196
+ # decoy, target, decoy, target, target]]
197
+ #
198
+ # This assumption may be relaxed somewhat and should still give good
199
+ # results.
200
+ def pi_zero_from_groups(array_of_doublets)
201
+ pi_zeros = []
202
+ array_of_doublets.reverse.each_cons(2) do |two_doublets|
203
+ bigger, smaller = two_doublets
204
+ bigger[0] = bigger[0] - smaller[0]
205
+ bigger[1] = bigger[1] - smaller[1]
206
+ bigger.map! {|v| v < 0 ? 0 : v }
207
+ if bigger[1] > 0
208
+ pi_zeros << (bigger[0].to_f / bigger[1])
209
+ end
210
+ end
211
+ pi_zeros.reverse!
212
+ xs = (0...(pi_zeros.size)).to_a
213
+ plateau_height(xs, pi_zeros)
214
+ end
215
+
216
+ end
217
+
218
+
219
+ end
220
+
221
+ if $0 == __FILE__
222
+ #xcorrs = IO.readlines("/home/jtprince/xcorr_hist/all_xcorrs.yada").first.chomp.split(/\s+/).map {|v| v.to_f }
223
+ #PiZero.p_values_for_sequest(
224
+ #File.open("newtail.yada", 'w') {|out| out.puts new_dist.join(" ") }
225
+
226
+
227
+ end
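The per-lambda estimate computed by `pi_zero_hats` is the number of p-values above lambda divided by `m * (1 - lambda)`. The sketch below reproduces only that step in plain Ruby, using the p-values from the accompanying spec. The name `pi_zero_hats_sketch` is hypothetical, and the lambdas are built from integer steps (an assumption of this sketch) rather than with `Float#step`.

```ruby
# pi0_hat(lambda) = #{p_i > lambda} / (m * (1 - lambda))
def pi_zero_hats_sketch(pvals, lambdas)
  m = pvals.size
  lambdas.map do |lam|
    above = pvals.count { |p| p > lam }
    above.to_f / (m * (1.0 - lam))
  end
end

pvals   = [0.0, 0.1, 0.223, 0.24, 0.55, 0.68, 0.68, 0.90, 0.98, 1.0]
lambdas = (0..9).map { |i| i / 10.0 } # 0.0, 0.1, ..., 0.9
hats    = pi_zero_hats_sketch(pvals, lambdas)

lambdas.zip(hats) { |lam, hat| puts format('lambda=%.1f  pi0_hat=%.3f', lam, hat) }
```

The smoothing step that turns these per-lambda estimates into a single pi_0 (the spline fit done through R in `plateau_height`) is deliberately left out.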
@@ -0,0 +1,152 @@
1
+
2
+ begin
3
+ require 'rsruby'
4
+ rescue LoadError
5
+ puts "You must have the rsruby gem installed to use the qvalue module"
6
+ puts $!
7
+ raise LoadError
8
+ end
9
+ require 'vec'
10
+
11
+ # Adapted from qvalue.R by Alan Dabney and John Storey which was LGPL licensed
12
+
13
+ class VecD
14
+ Default_lambdas = []
15
+ 0.0.step(0.9,0.05) {|v| Default_lambdas << v }
16
+
17
+ Default_smooth_df = 3
18
+
19
+ # returns the pi_zero estimate by taking the fraction of all p-values above
20
+ # lambd and dividing by (1-lambd); guaranteed to be <= 1
21
+ def pi_zero_at_lambda(lambd)
22
+ v = (self.select{|v| v >= lambd}.size.to_f/self.size) / (1 - lambd)
23
+ [v, 1].min
24
+ end
25
+
26
+ # returns a parallel array (VecI) of how many are <= in the array
27
+ # roughly: VecD[1,8,10,8,9,10].num_le => VecI[1, 3, 6, 3, 4, 6]
28
+ def num_le
29
+ hash = Hash.new {|h,k| h[k] = [] }
30
+ self.each_with_index do |v,i|
31
+ hash[v] << i
32
+ end
33
+ num_le_ar = []
34
+ sorted = self.sort
35
+ count = 0
36
+ sorted.each_with_index do |v,i|
37
+ back = 0
38
+ count += 1
39
+ if i > 0 && v == sorted[i-1]
40
+ while (i-back >= 0 && sorted[i-back] == v)
41
+ num_le_ar[i-back] = count
42
+ back += 1
43
+ end
44
+ else
45
+ num_le_ar[i] = count
46
+ end
47
+ end
48
+ ret = VecI.new(self.size)
49
+ num_le_ar.zip(sorted) do |n,v|
50
+ indices = hash[v]
51
+ indices.each do |i|
52
+ ret[i] = n
53
+ end
54
+ end
55
+ ret
56
+ end
57
+
58
+ Default_pi_zero_args = {:lambda_vals => Default_lambdas, :method => :smooth, :log_transform => false }
59
+
60
+ # returns the Pi_0 for given p-values (the values in self)
61
+ # lambda_vals = Float or Array of floats of size >= 4. value(s) within (0,1)
62
+ # If a single value is given, then the pi_zero is calculated at that point,
63
+ # superseding the method and log_transform arguments
64
+ # method = :smooth or :bootstrap
65
+ # log_transform = true or false
66
+ def pi_zero(lambda_vals=Default_pi_zero_args[:lambda_vals], method=Default_pi_zero_args[:method], log_transform=Default_pi_zero_args[:log_transform])
67
+ if self.min < 0 || self.max > 1
68
+ raise ArgumentError, "p-values must be within [0,1]"
69
+ end
70
+
71
+ if lambda_vals.is_a? Numeric
72
+ lambda_vals = [lambda_vals]
73
+ end
74
+ if lambda_vals.size != 1 && lambda_vals.size < 4
75
+ raise ArgumentError, "lambda_vals must have 1 or 4 or more values"
76
+ end
77
+ if lambda_vals.any? {|v| v < 0 || v >= 1}
78
+ raise ArgumentError, "lambda_vals must be within [0,1)"
79
+ end
80
+
81
+ pi_zeros = lambda_vals.map {|val| self.pi_zero_at_lambda(val) }
82
+ if lambda_vals.size == 1
83
+ pi_zeros.first
84
+ else
85
+ case method
86
+ when :smooth
87
+ r = RSRuby.instance
88
+ calc_pi_zero = lambda do |_pi_zeros|
89
+ hash = r.smooth_spline(lambda_vals, _pi_zeros, :df => Default_smooth_df)
90
+ hash['y'][VecD.new(lambda_vals).max_indices.max]
91
+ end
92
+ if log_transform
93
+ pi_zeros.log_space {|log_vals| calc_pi_zero.call(log_vals) }
94
+ else
95
+ calc_pi_zero.call(pi_zeros)
96
+ end
97
+ when :bootstrap
98
+ min_pi0 = pi_zeros.min
99
+ lsz = lambda_vals.size
100
+ mse = VecD.new(lsz, 0)
101
+ pi0_boot = VecD.new(lsz, 0)
102
+ sz = self.size
103
+ 100.times do # for(i in 1:100) {
104
+ p_boot = self.shuffle
105
+ (0...lsz).each do |i|
106
+ pi0_boot[i] = ( p_boot.select{|v| v > lambda_vals[i] }.size.to_f/p_boot.size ) / (1-lambda_vals[i])
107
+ end
108
+ mse = mse + ( (pi0_boot-min_pi0)**2 )
109
+ end
110
+ # pi0 <- min(pi0[mse==min(mse)])
111
+ pi_zero = pi_zeros.values_at(*(mse.min_indices)).min
112
+ [pi_zero,1].min
113
+ else
114
+ raise ArgumentError, "method must be :smooth or :bootstrap!"
115
+ end
116
+ end
117
+ end
118
+
119
+ # Returns a VecD filled with parallel q-values
120
+ # assumes that vec is filled with p values
121
+ # see pi_zero method for arguments, these should be named as symbols in the
122
+ # pi_zero_args hash.
123
+ # robust = true or false an indicator of whether it is desired to make
124
+ # the estimate more robust for small p-values and
125
+ # a direct finite sample estimate of pFDR
126
+ # A q-value can be thought of as the global positive false discovery rate
127
+ # at a particular p-value
128
+ def qvalues(robust=false, pi_zero_args={})
129
+ sz = self.size
130
+ pi0_args = Default_pi_zero_args.merge(pi_zero_args)
131
+ pi_zero = self.pi_zero(*(pi0_args.values_at(:lambda_vals, :method, :log_transform)))
132
+ raise RuntimeError, "pi0 <= 0 ... check your p-values!!" if pi_zero <= 0
133
+ num_le_ar = self.num_le
134
+ qvalues =
135
+ if robust
136
+ den = self.map {|val| 1 - ((1 - val)**(sz)) }
137
+ self * (pi_zero * sz) / ( num_le_ar * den)
138
+ else
139
+ self * (pi_zero * sz) / num_le_ar
140
+ end
141
+
142
+ u_ar = self.order
143
+
144
+ qvalues[u_ar[sz-1]] = [qvalues[u_ar[sz-1]],1].min
145
+ (sz-2).downto(0) do |i| # walk from the largest p-value down so the running minimum propagates
146
+ qvalues[u_ar[i]] = [qvalues[u_ar[i]],qvalues[u_ar[i+1]],1].min
147
+ end
148
+ qvalues
149
+ end
150
+ end
151
+
152
+
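The `qvalues` method above scales each p-value by `pi_0 * m / #{p_j <= p_i}` and then enforces monotonicity from the largest p-value downward. A minimal plain-Ruby sketch of those two steps follows. It assumes pi_0 is already known and passed in, and it does not use `VecD` or R; `qvalues_sketch` is a hypothetical name.

```ruby
# q_i = min(pi0 * m * p_i / #{p_j <= p_i}, 1), followed by a running minimum
# taken from the largest p-value down so that q-values never decrease with p.
def qvalues_sketch(pvals, pi0)
  m = pvals.size
  sorted = pvals.sort
  # number of p-values <= each p-value (tied values share the same count)
  num_le = pvals.map { |p| sorted.count { |s| s <= p } }
  q = pvals.each_with_index.map { |p, i| [pi0 * m * p / num_le[i], 1.0].min }
  order = (0...m).sort_by { |i| pvals[i] } # indices in ascending p-value order
  (m - 2).downto(0) do |k|
    q[order[k]] = [q[order[k]], q[order[k + 1]]].min
  end
  q
end

qvals = qvalues_sketch([0.01, 0.04, 0.03, 0.5], 0.5)
p qvals.map { |q| q.round(4) } # => [0.02, 0.0267, 0.0267, 0.25]
```

In this example the raw value for p = 0.03 is 0.03, which the monotonicity pass lowers to the 0.0267 obtained at p = 0.04.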
@@ -33,7 +33,8 @@ class Mass
33
33
 
34
34
  # elements etc.
35
35
  :h => 1.00783,
36
- :h_plus => 1.00728,
36
+ #:h_plus => 1.00728, # this is the mass I had
37
+ :h_plus => 1.007276, # this is the mass used by mascot merge.pl
37
38
  :o => 15.9949146,
38
39
  :h2o => 18.01056,
39
40
  }
@@ -310,6 +310,17 @@ class SpecID::Precision::Filter
310
310
  [peps] # no decoy
311
311
  end
312
312
 
313
+ if opts[:decoy_pi_zero]
314
+ if pep_sets.size < 2
315
+ raise ArgumentError, "must have a decoy validator for pi zero calculation!"
316
+ end
317
+ require 'pi_zero'
318
+ (_target, _decoy) = pep_sets
319
+ pvals = PiZero.p_values_for_sequest(*pep_sets).sort
320
+ pi_zero = PiZero.pi_zero(pvals)
321
+ opts[:decoy_pi_zero] = pi_zero
322
+ end
323
+
313
324
  if opts[:proteins]
314
325
  protein_validator = Validator::ProtFromPep.new
315
326
  end
@@ -128,6 +128,7 @@ module SpecID
128
128
  op.separator ""
129
129
 
130
130
  op.val_opt(:decoy, opts)
131
+ op.exact_opt(opts, :decoy_pi_zero)
131
132
  op.val_opt(:digestion, opts)
132
133
  op.val_opt(:bias, opts)
133
134
  op.val_opt(:bad_aa, opts)
@@ -86,8 +86,6 @@ class SpecID::Precision::Prob
86
86
  end
87
87
  end
88
88
 
89
-
90
-
91
89
  validators.delete(decoy_val)
92
90
  other_validators = validators
93
91
 
@@ -101,13 +99,14 @@ class SpecID::Precision::Prob
101
99
  n_count = 0
102
100
  d_count = 0
103
101
 
102
+
104
103
  # this is a peptide prophet
105
104
  is_peptide_prophet =
106
105
  if spec_id.peps.first.respond_to?(:fval) ; true
107
106
  else ;false
108
107
  end
109
108
 
110
- use_q_value = spec_id.peps.first.respond_to?(:q_value)
109
+ use_q_value = other_validators.any? {|v| v.class == Validator::QValue }
111
110
 
112
111
  ## ORDER THE PEPTIDE HITS:
113
112
  ordered_peps =
@@ -12,7 +12,11 @@ module SpecID
12
12
 
13
13
  COMMAND_LINE = {
14
14
  :sort_by_init => ['--sort_by_init', "sort the proteins based on init probability"],
15
- :qval => ['--qval', "use percolator q-values to calculate precision"],
15
+ :perc_qval => ['--perc_qval', "use percolator q-values to calculate precision"],
16
+ :to_qvalues => ['--to_qvalues', "transform probabilities into q-values",
17
+ "(includes pi_0 correction)",
18
+ "uses PROB [TYPE] if given and supersedes",
19
+ "the prob validation type"],
16
20
  :prob => ['--prob [TYPE]', "use prophet probabilities to calculate precision",
17
21
  "TYPE = nsp [default] prophet nsp",
18
22
  " (nsp also should be used for PeptideProphet results)",
@@ -95,7 +99,8 @@ module SpecID
95
99
  op.separator ""
96
100
 
97
101
  op.val_opt(:prob, opts)
98
- op.val_opt(:qval, opts)
102
+ op.val_opt(:perc_qval, opts)
103
+ op.val_opt(:to_qvalues, opts)
99
104
  op.val_opt(:decoy, opts)
100
105
  op.val_opt(:pephits, opts) # sets opts[:ties] = false
101
106
  op.val_opt(:digestion, opts)
@@ -129,6 +134,7 @@ module SpecID
129
134
  #puts 'making background estimates with: top_per_aaseq_charge'
130
135
  :top_per_aaseq_charge
131
136
  end
137
+
132
138
  opts[:validators] = Validator::Cmdline.prepare_validators(opts, !opts[:ties], opts[:interactive], postfilter, spec_id_obj)
133
139
 
134
140
  if opts[:output].size == 0
@@ -63,7 +63,7 @@ module Proph
63
63
  class PepSummary::Pep < Sequest::PepXML::SearchHit
64
64
  # aaseq is defined in SearchHit
65
65
 
66
- %w(probability fval ntt nmc massd prots).each do |guy|
66
+ %w(probability fval ntt nmc massd prots q_value).each do |guy|
67
67
  self.add_member(guy)
68
68
  end
69
69
 
@@ -122,7 +122,7 @@ end # Proph
122
122
 
123
123
 
124
124
 
125
- Proph::Prot = Arrayclass.new(%w(protein_name probability n_indistinguishable_proteins percent_coverage unique_stripped_peptides group_sibling_id total_number_peptides pct_spectrum_ids description peps))
125
+ Proph::Prot = Arrayclass.new(%w(protein_name probability n_indistinguishable_proteins percent_coverage unique_stripped_peptides group_sibling_id total_number_peptides pct_spectrum_ids description peps q_value))
126
126
 
127
127
  # note that 'description' is found in the element 'annotation', attribute 'protein_description'
128
128
  # NOTE!: unique_stripped peptides is an array rather than + joined string
@@ -142,7 +142,7 @@ end
142
142
 
143
143
  # this is a pep from a -prot.xml file
144
144
 
145
- Proph::Prot::Pep = Arrayclass.new(%w(peptide_sequence charge initial_probability nsp_adjusted_probability weight is_nondegenerate_evidence n_enzymatic_termini n_sibling_peptides n_sibling_peptides_bin n_instances is_contributing_evidence calc_neutral_pep_mass modification_info prots))
145
+ Proph::Prot::Pep = Arrayclass.new(%w(peptide_sequence charge initial_probability nsp_adjusted_probability weight is_nondegenerate_evidence n_enzymatic_termini n_sibling_peptides n_sibling_peptides_bin n_instances is_contributing_evidence calc_neutral_pep_mass modification_info prots q_value))
146
146
 
147
147
  class Proph::Prot::Pep
148
148
  include SpecID::Pep
@@ -6,6 +6,8 @@ require 'fasta'
6
6
  require 'mspire'
7
7
  require 'set'
8
8
 
9
+ require 'core_extensions'
10
+
9
11
  module BinaryReader
10
12
  Null_char = "\0"[0] ## TODO: change for ruby 1.9 or 2.0
11
13
  # extracts a string with all empty chars at the end stripped
@@ -178,6 +180,7 @@ class SRF
178
180
  attr_accessor :base_name
179
181
  # this is the global peptides array
180
182
  attr_accessor :peps
183
+ MASCOT_HYDROGEN_MASS = 1.007276
181
184
 
182
185
  attr_accessor :filtered_by_precursor_mass_tolerance
183
186
 
@@ -207,18 +210,92 @@ class SRF
207
210
  sprintf("%.#{decimal_places}f", float)
208
211
  end
209
212
 
213
+ # this mimics the output of merge.pl from mascot
214
+ # The only difference is that this does not include the "\r\n"
215
+ # that is found after the peak lists, instead, it uses "\n" throughout the
216
+ # file (thinking that this is preferable to mixing newline styles!)
217
+ # note that Mass
218
+ # if no filename is given, will use base_name + '.mgf'
219
+ def to_mgf_file(filename=nil)
220
+ filename =
221
+ if filename ; filename
222
+ else
223
+ base_name + '.mgf'
224
+ end
225
+ h_plus = SpecID::MONO[:h_plus]
226
+ File.open(filename, 'wb') do |out|
227
+ dta_files.zip(index) do |dta, i_ar|
228
+ chrg = dta.charge
229
+ out.puts 'BEGIN IONS'
230
+ out.puts "TITLE=#{[base_name, *i_ar].push('dta').join('.')}"
231
+ out.puts "CHARGE=#{chrg}+"
232
+ out.puts "PEPMASS=#{(dta.mh+((chrg-1)*h_plus))/chrg}"
233
+ peak_ar = dta.peaks.unpack('e*')
234
+ (0...(peak_ar.size)).step(2) do |i|
235
+ out.puts( peak_ar[i,2].join(' ') )
236
+ end
237
+ out.puts ''
238
+ out.puts 'END IONS'
239
+ out.puts ''
240
+ end
241
+ end
242
+ end
243
+
210
244
  # not given an out_folder, will make one with the basename
211
- def to_dta_files(out_folder=nil)
245
+ # compress may be: :zip, :tgz, or nil (no compression)
246
+ # :zip requires gem rubyzip to be installed and is *very* bloated
247
+ # as it writes out all the files first!
248
+ # :tgz requires gem archive-tar-minitar to be installed
249
+ def to_dta_files(out_folder=nil, compress=nil)
212
250
  outdir =
213
251
  if out_folder ; out_folder
214
252
  else base_name
215
253
  end
216
254
 
217
- FileUtils.mkpath(outdir)
218
- Dir.chdir(outdir) do
219
- dta_files.zip(index) do |dta,i_ar|
220
- File.open([base_name, *i_ar].join('.') << '.dta', 'wb') do |out|
221
- dta.write_dta_file(out)
255
+ case compress
256
+ when :tgz
257
+ begin
258
+ require 'archive/tar/minitar'
259
+ rescue LoadError
260
+ abort "need gem 'archive-tar-minitar' installed for tgz compression!\n#{$!}"
261
+ end
262
+ require 'archive/targz' # my own simplified interface!
263
+ require 'zlib'
264
+ names = index.map do |i_ar|
265
+ [outdir, '/', [base_name, *i_ar].join('.'), '.dta'].join('')
266
+ end
267
+ #Archive::Targz.archive_as_files(outdir + '.tgz', names, dta_file_data)
268
+
269
+ tgz = Zlib::GzipWriter.new(File.open(outdir + '.tgz', 'wb'))
270
+
271
+ Archive::Tar::Minitar::Output.open(tgz) do |outp|
272
+ dta_files.each_with_index do |dta_file, i|
273
+ Archive::Tar::Minitar.pack_as_file(names[i], dta_file.to_dta_file_data, outp)
274
+ end
275
+ end
276
+ when :zip
277
+ begin
278
+ require 'zip/zipfilesystem'
279
+ rescue LoadError
280
+ abort "need gem 'rubyzip' installed for zip compression!\n#{$!}"
281
+ end
282
+ #begin ; require 'zip/zipfilesystem' ; rescue LoadError, "need gem 'rubyzip' installed' for zip compression!\n#{$!}" ; end
283
+ Zip::ZipFile.open(outdir + ".zip", Zip::ZipFile::CREATE) do |zfs|
284
+ dta_files.zip(index) do |dta,i_ar|
285
+ #zfs.mkdir(outdir)
286
+ zfs.get_output_stream(outdir + '/' + [base_name, *i_ar].join('.') + '.dta') do |out|
287
+ dta.write_dta_file(out)
288
+ #zfs.commit
289
+ end
290
+ end
291
+ end
292
+ else # no compression
293
+ FileUtils.mkpath(outdir)
294
+ Dir.chdir(outdir) do
295
+ dta_files.zip(index) do |dta,i_ar|
296
+ File.open([base_name, *i_ar].join('.') << '.dta', 'wb') do |out|
297
+ dta.write_dta_file(out)
298
+ end
222
299
  end
223
300
  end
224
301
  end
@@ -626,13 +703,20 @@ class SRF::DTA
626
703
  self
627
704
  end
628
705
 
706
+ def to_dta_file_data
707
+ string = "#{mh.round_to(6)} #{charge}\r\n"
708
+ peak_ar = peaks.unpack('e*')
709
+ (0...(peak_ar.size)).step(2) do |i|
710
+ # %d is equivalent to floor, so we round by adding 0.5!
711
+ string << "#{peak_ar[i].round_to(4)} #{(peak_ar[i+1] + 0.5).floor}\r\n"
712
+ #string << peak_ar[i,2].join(' ') << "\r\n"
713
+ end
714
+ string
715
+ end
716
+
629
717
  # write a class dta file to the io object
630
718
  def write_dta_file(io)
631
- io.print("#{mh} #{charge}\r\n")
632
- peak_ar = peaks.unpack('e*')
633
- (0...(peak_ar.size)).step(2) do |i|
634
- io.print( peak_ar[i,2].join(' '), "\r\n" )
635
- end
719
+ io.print to_dta_file_data
636
720
  end
637
721
 
638
722
  end
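The `PEPMASS` line written by `to_mgf_file` converts the singly protonated mass stored in each dta entry into the m/z at the observed charge. A small sketch of that arithmetic and of one MGF record follows, using the 1.007276 proton mass from the diff. The names `mh_to_mz` and `mgf_record`, the title and the peak values are made up for illustration.

```ruby
H_PLUS = 1.007276 # proton mass used in the diff above

# Converts a singly protonated mass (M+H)+ into the m/z seen at `charge`.
def mh_to_mz(mh, charge, h_plus = H_PLUS)
  (mh + (charge - 1) * h_plus) / charge
end

# Builds one MGF record; `peaks` is an array of [m/z, intensity] pairs.
def mgf_record(title, mh, charge, peaks)
  lines = ['BEGIN IONS', "TITLE=#{title}", "CHARGE=#{charge}+",
           "PEPMASS=#{mh_to_mz(mh, charge)}"]
  peaks.each { |mz, intensity| lines << "#{mz} #{intensity}" }
  lines << '' << 'END IONS' << ''
  lines.join("\n")
end

# Hypothetical title and peak list, for illustration only.
puts mgf_record('sample.100.100.2.dta', 1000.0, 2, [[200.1, 350.0], [450.25, 1200.0]])
```

For a charge of 1 the conversion returns the input mass unchanged, since no extra protons are added.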
@@ -29,6 +29,10 @@ class Validator::Background
29
29
  min_in_window(data_vec, last_0_index, min_window_pre, min_window_post)
30
30
  end
31
31
 
32
+ def plot(vec)
33
+ `graph #{vec.join(" ")} -a -T X`
34
+ end
35
+
32
36
  # not really working right currently
33
37
  def derivs(avg_points=15, min_window_pre=5, min_window_post=5)
34
38
  data_vec = VecD[*@data]
@@ -74,6 +74,9 @@ class Validator::Cmdline
74
74
  "then give the FILENAME (e.g., --decoy decoy.srg)",
75
75
  "DTR = Decoy to Target Ratio (default: #{DEFAULTS[:decoy][:decoy_to_target_ratio]})",
76
76
  "DOM = *true/false, decoy on match",],
77
+ :decoy_pi_zero => ["--decoy_pi_zero", "uses sequest Xcorrs to estimate the",
78
+ "percentage of incorrect target hits.",
79
+ "This over-rides any given DTR (above)"],
77
80
  :tps => ["--tps <fasta>", "for a completely defined sample, this is the",
78
81
  "fasta file containing the true protein hits"],
79
82
  # may require digestion:
@@ -141,7 +144,8 @@ class Validator::Cmdline
141
144
  end
142
145
  opts[:validators].push([:prob, mthd])
143
146
  },
144
- :qval => lambda {|ar, opts| opts[:validators].push([:qval]) },
147
+ :perc_qval => lambda {|ar, opts| opts[:validators].push([:perc_qval]) },
148
+ :to_qvalues => lambda {|ar, opts| opts[:validators].push([:to_qvalues]) },
145
149
  :decoy => lambda {|ar, opts|
146
150
  myargs = [:decoy]
147
151
  first_arg = ar[0]
@@ -273,7 +277,43 @@ class Validator::Cmdline
273
277
  # postfilter is one of :top_per_scan, :top_per_aaseq,
274
278
  # :top_per_aaseq_charge (of which last two are subsets of scan)
275
279
  def self.prepare_validators(opts, false_on_tie, interactive, postfilter, spec_id)
280
+
276
281
  validator_args = opts[:validators]
282
+ if validator_args.any? {|v| v.first == :to_qvalues }
283
+ prob_val_args_ar = validator_args.select {|v| v.first == :prob }.first
284
+ prob_method =
285
+ if prob_val_args_ar && prob_val_args_ar[1]
286
+ prob_val_args_ar[1]
287
+ else
288
+ :probability
289
+ end
290
+ validator_args.reject! {|v| v.first == :prob }
291
+
292
+ require 'vec'
293
+ require 'qvalue'
294
+
295
+ # get a list of p-values
296
+ pvals = spec_id.peps.map do |pep|
297
+ val = 1.0 - pep.send(prob_method)
298
+ val = 1e-9 if val == 0
299
+ val
300
+ end
301
+ pvals = VecD.new(pvals)
302
+ #qvals = pvals.qvalues(false, :lambda_vals => 0.30 )
303
+ qvals = pvals.qvalues
304
+ qvals.zip(spec_id.peps) do |qval,pep|
305
+ pep.q_value = qval
306
+ end
307
+ end
308
+
309
+ validator_args.map! do |v|
310
+ if v.first == :to_qvalues || v.first == :perc_qval
311
+ [:qval]
312
+ else
313
+ v
314
+ end
315
+ end
316
+
277
317
  correct_wins = !false_on_tie
278
318
  need_false_to_total_ratio = []
279
319
  need_frequency = []
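The `--to_qvalues` branch above treats `1 - probability` as a p-value and replaces exact zeros with a small floor before handing the list to the q-value code. A minimal sketch of that conversion follows; `probs_to_pvals` is a hypothetical name, and the floor of 1e-9 is the value used in the diff.

```ruby
# Converts probabilities of being correct into pseudo p-values (1 - prob),
# replacing exact zeros with a small floor so no p-value is exactly 0.
def probs_to_pvals(probs, floor = 1e-9)
  probs.map do |prob|
    val = 1.0 - prob
    val == 0 ? floor : val
  end
end

pvals = probs_to_pvals([1.0, 0.9, 0.25])
p pvals
```

Whether `1 - probability` behaves like a true p-value depends on how the probabilities were produced, so this is best read as a description of what the diff does rather than a general recipe.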
@@ -1,4 +1,7 @@
1
1
 
2
+ # calculates precision based on the Benjamini-Hochberg FDR method.
3
+ # @TODO: class should probably be renamed to reflect method used!
4
+ # or options given to specify different methods (i.e., q-value)??
2
5
  class Validator::Probability
3
6
 
4
7
  attr_accessor :prob_method
@@ -37,8 +37,9 @@ describe 'filter_and_validate.rb on small bioworks file' do
37
37
  end
38
38
  end
39
39
 
40
+ ############################ uncomment this::
40
41
  # this ensures that the actual commandline version gives usage.
41
- it_should_behave_like "a cmdline program"
42
+ # it_should_behave_like "a cmdline program"
42
43
 
43
44
  it 'outputs to yaml' do
44
45
  reply = @st_to_yaml.call( @args )
@@ -46,6 +47,7 @@ describe 'filter_and_validate.rb on small bioworks file' do
46
47
  reply.keys.map {|v| v.to_s}.sort.should == keys
47
48
  end
48
49
 
50
+
49
51
  it 'responds to --prob init' do
50
52
  normal = @st_to_yaml.call( @args + " --prob" )
51
53
 
@@ -69,6 +71,16 @@ describe 'filter_and_validate.rb on small bioworks file' do
69
71
  end
70
72
  end
71
73
 
74
+ it 'works with --to_qvalues flag' do
75
+ begin
76
+ normal = @st_to_yaml.call( @args + " --to_qvalues --prob" )
77
+ rescue RuntimeError
78
+ # right now the p values in this data set don't lend themselves to
79
+ # legitimate q-values, so we get a RuntimeError
80
+ # Need to work this one out
81
+ end
82
+ end
83
+
72
84
  end
73
85
 
74
86
 
@@ -0,0 +1,104 @@
1
+ require File.expand_path( File.dirname(__FILE__) + '/spec_helper' )
2
+ require 'pi_zero'
3
+
4
+ describe PiZero do
5
+ before(:all) do
6
+ @bools = "11110010110101010101000001101010101001010010100001001010000010010000010010000010010101010101000001010000000010000000000100001000100000100000100000001000000000000100000000".split('').map do |v|
7
+ if v.to_i == 1
8
+ true
9
+ else
10
+ false
11
+ end
12
+ end
13
+ increment = 6.0 / @bools.size
14
+ @xcorrs = []
15
+ 0.0.step(6.0, increment) {|v| @xcorrs << v }
16
+ @xcorrs.reverse!
17
+
18
+ @sorted_pvals = [0.0, 0.1, 0.223, 0.24, 0.55, 0.68, 0.68, 0.90, 0.98, 1.0]
19
+ end
20
+
21
+ it 'calculates instantaneous pi_0 hats' do
22
+ answ = PiZero.pi_zero_hats(@sorted_pvals, :step => 0.1)
23
+ exp_lambdas = [0.0, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9]
24
+ passing_threshold = [9, 8, 8, 6, 6, 6, 5, 3, 3, 2]
25
+ expected = passing_threshold.zip(exp_lambdas).map {|v,l| v.to_f / (10.0 * (1.0 - l)) }
26
+ (answ_lams, answ_pis) = answ
27
+ answ_lams.zip(exp_lambdas) {|a,e| a.should be_close(e, 0.0000000001) }
28
+ answ_pis.zip(expected) {|a,e| a.should be_close(e, 0.0000000001) }
29
+ end
30
+
31
+ xit 'can find a plateau height with exponential' do
32
+ x = [0.0, 0.01, 0.012, 0.13, 0.2, 0.3, 0.4, 0.6, 0.7, 0.8, 0.9, 1.0, 1.1, 1.2]
33
+ y = [1.0, 0.95, 0.92, 0.8, 0.7, 0.6, 0.55, 0.58, 0.62, 0.53, 0.54, 0.59, 0.4, 0.72]
34
+
35
+ z = PiZero.plateau_exponential(x,y)
36
+ # still working on this one
37
+ end
38
+
39
+ it 'can find a plateau height' do
40
+ x = [0.0, 0.01, 0.012, 0.13, 0.2, 0.3, 0.4, 0.6, 0.7, 0.8, 0.9, 1.0, 1.1, 1.2]
41
+ y = [1.0, 0.95, 0.92, 0.8, 0.7, 0.6, 0.55, 0.58, 0.62, 0.53, 0.54, 0.59, 0.4, 0.72]
42
+ z = PiZero.plateau_height(x,y)
43
+ z.should be_close(0.57, 0.05)
44
+ #require 'rsruby'
45
+ #r = RSRuby.instance
46
+ #r.plot(x,y)
47
+ #sleep(8)
48
+ end
49
+
50
+ it 'can calculate p values for SEQUEST hits' do
51
+ class FakeSequest ; attr_accessor :xcorr ; def initialize(xcorr) ; @xcorr = xcorr ; end ; end
52
+
53
+ target = []
54
+ decoy = []
55
+ cnt = 0
56
+ @xcorrs.zip(@bools) do |xcorr, bool|
57
+ if bool
58
+ target << FakeSequest.new(xcorr)
59
+ else
60
+ decoy << FakeSequest.new(xcorr)
61
+ end
62
+ end
63
+ pvalues = PiZero.p_values_for_sequest(target, decoy)
64
+ # frozen:
65
+ exp = [1.71344886775144e-07, 1.91226800512155e-07, 2.1332611415515e-07, 2.37879480495429e-07, 3.29004960353623e-07, 4.07557294032203e-07, 4.5332397295349e-07, 5.60147945165288e-07, 6.90985835582987e-07, 8.50958233458999e-07, 1.04621373866358e-06, 1.28412129273e-06, 2.35075612646546e-06, 2.59621031358335e-06, 3.16272156036349e-06, 3.84642913860656e-06, 4.67014790912829e-06, 5.66082984245324e-06, 7.53093419443452e-06, 9.09058296339405e-06, 1.20185706815653e-05, 1.44474800911154e-05, 2.27242185508328e-05, 2.967213280773e-05, 3.537451312629e-05, 5.93486219583748e-05, 7.64456599577934e-05, 0.000125433021038759, 0.000159783941297163, 0.000256431068540685, 0.000323066395099306, 0.00037608522266194, 0.000437091783629134, 0.000507167844234063, 0.000587522219112902, 0.000679502786805963, 0.00104103901250011, 0.00119624534498457, 0.00219153400681528, 0.00439503742960694, 0.00593498821589879, 0.00749365688957234, 0.0105069659581753, 0.0145259091109191, 0.0218905360424189, 0.0404530419122661]
66
+ pvalues.zip(exp) do |v,e|
67
+ v.should be_close(e, 0.000001)
68
+ end
69
+ end
70
+
71
+ it 'can calculate pi zero for target/decoy booleans' do
72
+ pi_zero = PiZero.pi_zero_from_booleans(@bools)
73
+ # frozen
74
+ pi_zero.should be_close(0.03522869, 0.0001)
75
+ end
76
+
77
+ it 'can calculate pi zero for groups of hits' do
78
+ # setup
79
+ targets = [4,3,8,3,5,3,4,5,4]
80
+ decoys = [0,2,2,3,5,7,8,8,8]
81
+ targets_summed = []
82
+ targets.each_with_index do |ar,i|
83
+ sum = 0
84
+ (0..i).each do |j|
85
+ sum += targets[j]
86
+ end
87
+ targets_summed << sum
88
+ end
89
+ decoys_summed = []
90
+ decoys.each_with_index do |ar,i|
91
+ sum = 0
92
+ (0..i).each do |j|
93
+ sum += decoys[j]
94
+ end
95
+ decoys_summed << sum
96
+ end
97
+ zipped = targets_summed.zip(decoys_summed)
98
+ pi_zero = PiZero.pi_zero_from_groups(zipped)
99
+ # frozen
100
+ pi_zero.should be_close(0.384064, 0.00001)
101
+ end
102
+
103
+ end
104
+
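
Note on the 'calculates instantaneous pi_0 hats' spec above: its expected values follow the formula `count(p > lambda) / (m * (1 - lambda))`, where `m` is the number of p-values. A minimal sketch that reproduces those expected numbers on a plain array is given here. The method name `pi_zero_hats_sketch` and its positional `step` argument are assumptions for illustration; the actual `PiZero.pi_zero_hats` source is not shown in this diff.

```ruby
# Hypothetical re-derivation of the expected values in the spec;
# not the PiZero implementation itself.
def pi_zero_hats_sketch(pvals, step = 0.1)
  m = pvals.size.to_f
  n_steps = (1.0 / step).round
  # Build lambdas by division so 0.3, 0.6, 0.7 ... are correctly rounded
  # (repeated addition or i * step would drift, e.g. 0.30000000000000004).
  lambdas = (0...n_steps).map { |i| i.to_f / n_steps }
  pi_zeros = lambdas.map do |lam|
    # p-values strictly above lambda, scaled by the expected null fraction
    pvals.count { |p| p > lam } / (m * (1.0 - lam))
  end
  [lambdas, pi_zeros]
end

sorted_pvals = [0.0, 0.1, 0.223, 0.24, 0.55, 0.68, 0.68, 0.90, 0.98, 1.0]
lambdas, pi_zeros = pi_zero_hats_sketch(sorted_pvals, 0.1)
```

The strict `>` comparison matters: the spec counts 8 p-values above lambda = 0.1 and 2 above lambda = 0.9, so values equal to lambda are excluded.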
@@ -0,0 +1,39 @@
1
+ require File.expand_path( File.dirname(__FILE__) + '/spec_helper' )
2
+
3
+ require 'qvalue'
4
+
5
+ describe 'finding q-values' do
6
+
7
+ it 'can do num_le' do
8
+ x = VecD[1,8,10,8,9,10]
9
+ exp = VecD[1, 3, 6, 3, 4, 6]
10
+ x.num_le.should == exp
11
+
12
+ x = VecD[10,9,8,5,5,5,5,3,2]
13
+ exp = VecD[9, 8, 7, 6, 6, 6, 6, 2, 1]
14
+ x.num_le.should == exp
15
+ end
16
+
17
+ it 'can do qvalues with smooth pi0' do
18
+ pvals = VecD[0.00001, 0.0001, 0.001, 0.01, 0.03, 0.02, 0.01, 0.1, 0.2, 0.4, 0.5, 0.6, 0.77, 0.8, 0.99]
19
+ exp = [0.0000938637, 0.0004693185, 0.0031287899, 0.0187727394, 0.0402272988, 0.0312878991, 0.0187727394, 0.1173296215, 0.2085859937, 0.3754547887, 0.4266531690, 0.4693184859, 0.5363639839, 0.5363639839, 0.6195004014]
20
+ pvals.qvalues.zip(exp) do |a,b|
21
+ a.should be_close(b, 1.0e-9)
22
+ end
23
+ end
24
+
25
+ it 'can do qvalues with bootstrap pi0' do
26
+ puts "\nbootstrap pi0 needs further testing although answers seem to be close!"
27
+ pvals = VecD[0.00001, 0.0001, 0.001, 0.01, 0.03, 0.02, 0.01, 0.1, 0.2, 0.4, 0.5, 0.6, 0.77, 0.8, 0.99]
28
+ # this is what the Storey software gives for this:
29
+ # exp = [8.888889e-05, 4.444444e-04, 2.962963e-03, 1.777778e-02, 3.809524e-02, 2.962963e-02, 1.777778e-02, 1.111111e-01, 1.975309e-01, 3.555556e-01, 4.040404e-01, 4.444444e-01, 5.079365e-01, 5.079365e-01, 5.866667e-01]
30
+ exp = [9.38636971774565e-05, 0.000469318485887282, 0.00312878990591522, 0.0187727394354913, 0.0402272987903385, 0.0312878990591522, 0.0187727394354913, 0.117329621471821, 0.208585993727681, 0.375454788709826, 0.426653168988439, 0.469318485887282, 0.53636398387118, 0.53636398387118, 0.619500401371213]
31
+ robust = false
32
+ qvals = pvals.qvalues(robust, :method => :bootstrap)
33
+ qvals.zip(exp) do |a,b|
34
+ a.should be_close(b, 0.00001)
35
+ end
36
+ end
37
+
38
+ end
39
+
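
Note on the 'can do num_le' spec above: judging from its expected output, `num_le` returns, for each element, how many elements of the vector are less than or equal to it, with ties sharing the same count. The sketch below reproduces that behaviour on a plain `Array`. The name `num_le_sketch` is hypothetical, and the real method is defined on `VecD`, whose source is not part of this diff.

```ruby
# Hypothetical plain-Array version of the behaviour exercised by the spec.
def num_le_sketch(values)
  sorted = values.sort
  values.map do |v|
    # Index of the first sorted element greater than v equals the number
    # of elements <= v; nil means every element is <= v.
    sorted.bsearch_index { |s| s > v } || sorted.size
  end
end

num_le_sketch([1, 8, 10, 8, 9, 10])
num_le_sketch([10, 9, 8, 5, 5, 5, 5, 3, 2])
```

These counts serve as the tie-aware ranks in a q-value calculation, where each p-value is divided by the number of p-values at or below it.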
@@ -50,4 +50,18 @@ bias-prot: 37
50
50
  # expecting were my best judgement (erring on the min side)
51
51
  end
52
52
  end
53
+
54
+ # This is where I'd like to go finding the plateau region!
55
+ #it 'finds the minimum of the plateau region of a stringency plot' do
56
+ # @data.each do |k,v|
57
+ # exp = @expected[k]
58
+ # bkg = Validator::Background.new(v)
59
+ # ans = bkg.quartile_deriv_finder
60
+ # ans.should be_close(v[exp], 0.01)
61
+ # # expecting were my best judgement (erring on the min side)
62
+ # end
63
+ #end
64
+
65
+
66
+
53
67
  end
metadata CHANGED
@@ -1,7 +1,7 @@
1
1
  --- !ruby/object:Gem::Specification
2
2
  name: mspire
3
3
  version: !ruby/object:Gem::Version
4
- version: 0.4.2
4
+ version: 0.4.4
5
5
  platform: ruby
6
6
  authors:
7
7
  - John Prince
@@ -9,7 +9,7 @@ autorequire:
9
9
  bindir: bin
10
10
  cert_chain: []
11
11
 
12
- date: 2008-08-06 00:00:00 -06:00
12
+ date: 2008-09-24 00:00:00 -06:00
13
13
  default_executable:
14
14
  dependencies:
15
15
  - !ruby/object:Gem::Dependency
@@ -98,8 +98,10 @@ files:
98
98
  - lib/ms/converter
99
99
  - lib/ms/converter/mzxml.rb
100
100
  - lib/ms/scan.rb
101
+ - lib/core_extensions.rb
101
102
  - lib/scan_i.rb
102
103
  - lib/fasta.rb
104
+ - lib/qvalue.rb
103
105
  - lib/roc.rb
104
106
  - lib/spec_id.rb
105
107
  - lib/xml.rb
@@ -110,6 +112,7 @@ files:
110
112
  - lib/transmem/phobius.rb
111
113
  - lib/transmem/toppred.rb
112
114
  - lib/ms.rb
115
+ - lib/pi_zero.rb
113
116
  - lib/spec_id
114
117
  - lib/spec_id/srf.rb
115
118
  - lib/spec_id/sequest.rb
@@ -162,6 +165,8 @@ files:
162
165
  - lib/validator/q_value.rb
163
166
  - lib/xml_style_parser.rb
164
167
  - lib/mspire.rb
168
+ - lib/archive
169
+ - lib/archive/targz.rb
165
170
  - lib/spec_id_xml.rb
166
171
  - lib/bsearch.rb
167
172
  - bin/gi2annot.rb
@@ -204,12 +209,12 @@ files:
204
209
  - script/simple_protein_digestion.rb
205
210
  - script/peps_per_bin.rb
206
211
  - specs/ms
207
- - specs/ms/parser
208
212
  - specs/ms/gradient_program_spec.rb
209
213
  - specs/ms/parser_spec.rb
210
214
  - specs/ms/spectrum_spec.rb
211
215
  - specs/ms/msrun_spec.rb
212
216
  - specs/merge_deep_spec.rb
217
+ - specs/qvalue_spec.rb
213
218
  - specs/spec_helper.rb
214
219
  - specs/fasta_spec.rb
215
220
  - specs/transmem
@@ -241,6 +246,7 @@ files:
241
246
  - specs/spec_id/digestor_spec.rb
242
247
  - specs/spec_id/aa_freqs_spec.rb
243
248
  - specs/rspec_autotest.rb
249
+ - specs/pi_zero_spec.rb
244
250
  - specs/xml_spec.rb
245
251
  - specs/sample_enzyme_spec.rb
246
252
  - specs/transmem_spec_shared.rb
@@ -376,6 +382,7 @@ test_files:
376
382
  - specs/ms/spectrum_spec.rb
377
383
  - specs/ms/msrun_spec.rb
378
384
  - specs/merge_deep_spec.rb
385
+ - specs/qvalue_spec.rb
379
386
  - specs/fasta_spec.rb
380
387
  - specs/transmem/phobius_spec.rb
381
388
  - specs/transmem/toppred_spec.rb
@@ -396,6 +403,7 @@ test_files:
396
403
  - specs/spec_id/sequest_spec.rb
397
404
  - specs/spec_id/digestor_spec.rb
398
405
  - specs/spec_id/aa_freqs_spec.rb
406
+ - specs/pi_zero_spec.rb
399
407
  - specs/xml_spec.rb
400
408
  - specs/sample_enzyme_spec.rb
401
409
  - specs/gi_spec.rb