distribution 0.5.0 → 0.6.0

data.tar.gz.sig CHANGED
Binary file
File without changes
@@ -1,3 +1,15 @@
1
+ === 0.6.0 / 2011-09-23
2
+ * Incomplete Beta functions on math renamed to Regularized beta, because MathExtension::IncompleteBeta calculates regularized beta function, not Incomplete Beta.
3
+ * Corrected documentation on F distribution and added comments on gamma and beta[Claudio Bustos]
4
+ * Moved ported methods from GSL to lib/math_extension. Updated spec for gamma and beta distributions[Claudio Bustos]
5
+ * Added beta distribution functions. p_value does not seem to work yet.[John Woods]
6
+ * Added most incomplete beta function GSL translations, also log_beta from GSL.[John Woods]
7
+ * Added header information to the incomplete gamma files translated from GSL[John Woods]
8
+ * Removed left-over invgammp function.[John Woods]
9
+ * Fixed lots of bugs, translated most GSL error tests into rspec.[John Woods]
10
+ * Added Gamma distribution, spec. No statistics2 functions for Gamma, so only implemented pure Ruby and GSL.[John Woods]
11
+ * Added console task to rakefile[John Woods]
12
+
1
13
  === 0.5.0 / 2011-05-03
2
14
 
3
15
  * Exception raises on calculation of T's cdf with ruby engine. For now, stick to gsl implementation
@@ -35,4 +47,4 @@
35
47
 
36
48
  === 0.1.0 / 2011-01-26
37
49
 
38
- * Basic set (pdf, cdf, p_value) for Normal, Chi Square, F and T distributions
50
+ * Basic set (pdf, cdf, p_value) for Normal, Chi Square, F and T distributions
@@ -17,6 +17,10 @@ data/template/distribution/gsl.erb
17
17
  data/template/distribution/ruby.erb
18
18
  data/template/spec.erb
19
19
  lib/distribution.rb
20
+ lib/distribution/beta.rb
21
+ lib/distribution/beta/gsl.rb
22
+ lib/distribution/beta/java.rb
23
+ lib/distribution/beta/ruby.rb
20
24
  lib/distribution/binomial.rb
21
25
  lib/distribution/binomial/gsl.rb
22
26
  lib/distribution/binomial/java.rb
@@ -39,6 +43,10 @@ lib/distribution/f/gsl.rb
39
43
  lib/distribution/f/java.rb
40
44
  lib/distribution/f/ruby.rb
41
45
  lib/distribution/f/statistics2.rb
46
+ lib/distribution/gamma.rb
47
+ lib/distribution/gamma/gsl.rb
48
+ lib/distribution/gamma/java.rb
49
+ lib/distribution/gamma/ruby.rb
42
50
  lib/distribution/hypergeometric.rb
43
51
  lib/distribution/hypergeometric/gsl.rb
44
52
  lib/distribution/hypergeometric/java.rb
@@ -46,6 +54,14 @@ lib/distribution/hypergeometric/ruby.rb
46
54
  lib/distribution/logistic.rb
47
55
  lib/distribution/logistic/ruby.rb
48
56
  lib/distribution/math_extension.rb
57
+ lib/distribution/math_extension/chebyshev_series.rb
58
+ lib/distribution/math_extension/erfc.rb
59
+ lib/distribution/math_extension/exponential_integral.rb
60
+ lib/distribution/math_extension/gammastar.rb
61
+ lib/distribution/math_extension/gsl_utilities.rb
62
+ lib/distribution/math_extension/incomplete_beta.rb
63
+ lib/distribution/math_extension/incomplete_gamma.rb
64
+ lib/distribution/math_extension/log_utilities.rb
49
65
  lib/distribution/normal.rb
50
66
  lib/distribution/normal/gsl.rb
51
67
  lib/distribution/normal/java.rb
@@ -60,12 +76,14 @@ lib/distribution/t/gsl.rb
60
76
  lib/distribution/t/java.rb
61
77
  lib/distribution/t/ruby.rb
62
78
  lib/distribution/t/statistics2.rb
79
+ spec/beta_spec.rb
63
80
  spec/binomial_spec.rb
64
81
  spec/bivariatenormal_spec.rb
65
82
  spec/chisquare_spec.rb
66
83
  spec/distribution_spec.rb
67
84
  spec/exponential_spec.rb
68
85
  spec/f_spec.rb
86
+ spec/gamma_spec.rb
69
87
  spec/hypergeometric_spec.rb
70
88
  spec/logistic_spec.rb
71
89
  spec/math_extension_spec.rb
data/README.txt CHANGED
@@ -4,7 +4,7 @@
4
4
 
5
5
  == DESCRIPTION:
6
6
 
7
- Statistical Distributions library. Includes Normal univariate and bivariate, T, F, Chi Square, Binomial, Hypergeometric, Exponential and Poisson.
7
+ Statistical Distributions library. Includes Normal univariate and bivariate, T, F, Chi Square, Binomial, Hypergeometric, Exponential, Poisson, Beta and Gamma.
8
8
 
9
9
  Uses Ruby by default and C (statistics2/GSL) or Java extensions where available.
10
10
 
@@ -44,6 +44,8 @@ Shortnames for distributions:
44
44
  * Hypergeometric: hypg
45
45
  * Exponential: expo
46
46
  * Poisson: pois
47
+ * Beta: beta
48
+ * Gamma: gamma
47
49
 
48
50
  For example
49
51
 
data/Rakefile CHANGED
@@ -17,5 +17,10 @@ Hoe.spec 'distribution' do
17
17
  self.extra_dev_deps << ["rspec",">=2.0"] << ["rubyforge",">=0"]
18
18
 
19
19
  end
20
+ # git log --pretty=format:"*%s[%cn]" v0.5.0..HEAD >> History.txt
21
+ desc "Open an irb session preloaded with distribution"
22
+ task :console do
23
+ sh "irb -rubygems -I lib -r distribution.rb"
24
+ end
20
25
 
21
26
  # vim: syntax=ruby
@@ -22,7 +22,8 @@
22
22
  # * Code of several Ruby engines came from statistics2.rb,
23
23
  # created by Shin-ichiro HARA(sinara@blade.nagaokaut.ac.jp).
24
24
  # Retrieve from http://blade.nagaokaut.ac.jp/~sinara/ruby/math/statistics2/
25
- #
25
+ # * Code of Beta and Gamma distribution came from GSL project.
26
+ # Ported by John O. Woods
26
27
  # Specific notices will be placed where appropriate
27
28
  #
28
29
  if !respond_to? :define_singleton_method
@@ -49,7 +50,7 @@ require 'distribution/math_extension'
49
50
  # Distribution::Normal.p_value(0.95)
50
51
  # => 1.64485364660836
51
52
  module Distribution
52
- VERSION="0.5.0"
53
+ VERSION="0.6.0"
53
54
 
54
55
  module Shorthand
55
56
  EQUIVALENCES={:p_value=>:p, :cdf=>:cdf, :pdf=>:pdf, :rng=>:r, :exact_pdf=>:epdf, :exact_cdf=>:ecdf, :exact_p_value=>:ep}
@@ -133,13 +134,15 @@ module Distribution
133
134
 
134
135
  end
135
136
  # create alias for common methods
136
- alias_method :inverse_cdf,:p_value if singleton_methods.include? :p_value
137
+ alias_method :inverse_cdf, :p_value if singleton_methods.include? :p_value
137
138
  end
138
139
 
139
140
  end
140
141
 
141
142
  autoload(:Normal, 'distribution/normal')
142
143
  autoload(:ChiSquare, 'distribution/chisquare')
144
+ autoload(:Gamma, 'distribution/gamma')
145
+ autoload(:Beta, 'distribution/beta')
143
146
  autoload(:T, 'distribution/t')
144
147
  autoload(:F, 'distribution/f')
145
148
  autoload(:BivariateNormal, 'distribution/bivariatenormal')
@@ -0,0 +1,34 @@
1
+ require 'distribution/beta/ruby'
2
+ require 'distribution/beta/gsl'
3
+ # no statistics2 functions for beta.
4
+ require 'distribution/beta/java'
5
+
6
+ module Distribution
7
+ # From Wikipedia:
8
+ # In probability theory and statistics, the beta distribution
9
+ # is a family of continuous probability distributions defined
10
+ # on the interval (0, 1) parameterized by two positive shape
11
+ # parameters, typically denoted by alpha and beta.
12
+ # This module calculates the pdf, cdf and inverse cdf for the Beta distribution.
13
+ #
14
+ module Beta
15
+ extend Distributable
16
+ SHORTHAND='beta'
17
+ create_distribution_methods
18
+
19
+ ##
20
+ # :singleton-method: pdf(x,a,b)
21
+ # Returns the PDF of the Beta distribution with parameters +a+ and +b+
22
+
23
+
24
+ ##
25
+ # :singleton-method: cdf(x,a,b)
26
+ # Returns the integral of Beta distribution with parameters a and b
27
+
28
+ ##
29
+ # :singleton-method: p_value(qn,a,b)
30
+ # Return the quantile of the corresponding integral +qn+
31
+ # on a beta distribution's cdf with parameters a and b
32
+
33
+ end
34
+ end
@@ -0,0 +1,24 @@
1
+ module Distribution
2
+ module Beta
3
+ module GSL_
4
+ class << self
5
+ def pdf(x,a,b)
6
+ GSL::Ran::beta_pdf(x.to_f, a.to_f, b.to_f)
7
+ end
8
+ # Return the quantile of the corresponding integral +pr+
9
+ # on a Beta distribution's cdf with parameters +a+ and +b+
10
+ def p_value(pr,a,b)
11
+ GSL::Cdf::beta_Pinv(pr.to_f, a.to_f, b.to_f)
12
+ end
13
+ # Beta cumulative distribution function (cdf).
14
+ #
15
+ # Returns the integral of Beta distribution
16
+ # with parameters +a+ and +b+ over [0, x]
17
+ #
18
+ def cdf(x,a,b)
19
+ GSL::Cdf::beta_P(x.to_f, a.to_f, b.to_f)
20
+ end
21
+ end
22
+ end
23
+ end
24
+ end
@@ -0,0 +1,9 @@
1
+ module Distribution
2
+ module Beta
3
+ # TODO
4
+ module Java_
5
+ class << self
6
+ end
7
+ end
8
+ end
9
+ end
@@ -0,0 +1,42 @@
1
+ # Added by John O. Woods, SciRuby project.
2
+ module Distribution
3
+ module Beta
4
+ module Ruby_
5
+ class << self
6
+
7
+ include Math
8
+ # Beta distribution probability density function
9
+ #
10
+ # Adapted from GSL-1.9 (apparently by Knuth originally), found in randist/beta.c
11
+ #
12
+ # Form: p(x) dx = (Gamma(a + b)/(Gamma(a) Gamma(b))) x^(a-1) (1-x)^(b-1) dx
13
+ #
14
+ # == References
15
+ # * http://www.gnu.org/s/gsl/manual/html_node/The-Beta-Distribution.html
16
+ def pdf(x,a,b)
17
+ return 0 if x < 0 || x > 1
18
+
19
+ gab = Math.lgamma(a+b).first
20
+ ga = Math.lgamma(a).first
21
+ gb = Math.lgamma(b).first
22
+
23
+ if x == 0.0 || x == 1.0
24
+ Math.exp(gab - ga - gb) * x**(a-1) * (1-x)**(b-1)
25
+ else
26
+ Math.exp(gab - ga - gb + Math.log(x)*(a-1) + Math::Log.log1p(-x)*(b-1))
27
+ end
28
+ end
29
+
30
+ # Beta cumulative distribution function
31
+ # Translated from GSL-1.9: cdf/beta.c gsl_cdf_beta_P
32
+ def cdf(x,a,b)
33
+ return 0.0 if x <= 0.0
34
+ return 1.0 if x >= 1.0
35
+ Math::IncompleteBeta.axpy(1.0, 0.0, a,b,x)
36
+ end
37
+
38
+
39
+ end
40
+ end
41
+ end
42
+ end
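The log-space density used by the pure-Ruby Beta pdf above can be tried outside the gem. What follows is a minimal standalone sketch, not the gem's code: the name `beta_pdf_sketch` is hypothetical, and plain `Math.log(1 - x)` stands in for the gem's `Math::Log.log1p` helper, which is not shown in this diff.

```ruby
# Hypothetical helper, not part of the gem: Beta(a, b) density at x,
# evaluated in log space so that Gamma(a + b) cannot overflow.
def beta_pdf_sketch(x, a, b)
  return 0.0 if x < 0 || x > 1

  # Math.lgamma returns [log|Gamma(v)|, sign]; a and b are assumed positive,
  # so only the first element is needed.
  log_norm = Math.lgamma(a + b).first - Math.lgamma(a).first - Math.lgamma(b).first

  if x == 0.0 || x == 1.0
    # log(0) is not finite, so the endpoints use the direct power form.
    Math.exp(log_norm) * x**(a - 1) * (1 - x)**(b - 1)
  else
    Math.exp(log_norm + (a - 1) * Math.log(x) + (b - 1) * Math.log(1 - x))
  end
end
```

With a = b = 1 the density reduces to the uniform distribution on [0, 1], which makes a convenient sanity check.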
@@ -3,7 +3,7 @@ require 'distribution/binomial/gsl'
3
3
  require 'distribution/binomial/java'
4
4
  module Distribution
5
5
 
6
- # Calculate statisticals for T Distribution.
6
+ # Calculate statisticals for Binomial Distribution.
7
7
  module Binomial
8
8
  SHORTHAND = 'bino'
9
9
 
@@ -6,9 +6,11 @@ module Distribution
6
6
  raise "k>n" if k>n
7
7
  Math.binomial_coefficient(n,k)*(pr**k)*(1-pr)**(n-k)
8
8
  end
9
+ # TODO: Use exact_regularized_beta for
10
+ # small values and regularized_beta for bigger ones.
9
11
  def cdf(k,n,pr)
10
12
  #(0..x.floor).inject(0) {|ac,i| ac+pdf(i,n,pr)}
11
- Math.regularized_beta_function(1-pr,n - k,k+1)
13
+ Math.regularized_beta(1-pr,n - k,k+1)
12
14
  end
13
15
  def exact_cdf(k,n,pr)
14
16
  (0..k).inject(0) {|ac,i| ac+pdf(i,n,pr)}
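The cdf above relies on the identity P(X <= k) = I_{1-pr}(n-k, k+1), where I is the regularized beta function. A small standalone sketch can check that identity against the direct sum used by exact_cdf. All method names below are hypothetical, and the finite binomial sum for I_x(a, b) is assumed to be used only with integer arguments, as in the exact version elsewhere in this diff.

```ruby
# Hypothetical helpers for illustration only; the gem's own versions live in
# Distribution::MathExtension.

# Exact integer binomial coefficient C(n, k).
def binomial_coefficient_sketch(n, k)
  (1..k).inject(1) { |acc, i| acc * (n - k + i) / i }
end

# I_x(a, b) for integer a and b, written as a finite binomial sum.
def exact_regularized_beta_sketch(x, a, b)
  n = a + b - 1
  (a..n).inject(0.0) do |sum, j|
    sum + binomial_coefficient_sketch(n, j) * x**j * (1 - x)**(n - j)
  end
end

# Probability of exactly k successes in n trials with success probability pr.
def binomial_pdf_sketch(k, n, pr)
  binomial_coefficient_sketch(n, k) * pr**k * (1 - pr)**(n - k)
end

# Direct sum of the pdf from 0 to k, as exact_cdf does.
def binomial_cdf_direct(k, n, pr)
  (0..k).inject(0.0) { |acc, i| acc + binomial_pdf_sketch(i, n, pr) }
end

# The same probability through the regularized beta identity.
def binomial_cdf_via_beta(k, n, pr)
  exact_regularized_beta_sketch(1 - pr, n - k, k + 1)
end
```

Both routes count the same event: at most k successes is the same as at least n-k failures, and the beta sum runs over the failure count.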
@@ -3,7 +3,7 @@ require 'distribution/f/gsl'
3
3
  require 'distribution/f/statistics2'
4
4
  require 'distribution/f/java'
5
5
  module Distribution
6
- # Calculate cdf and inverse cdf for Chi Square Distribution.
6
+ # Calculate cdf and inverse cdf for F Distribution.
7
7
  #
8
8
  module F
9
9
  SHORTHAND='fdist'
@@ -0,0 +1,37 @@
1
+ require 'distribution/gamma/ruby'
2
+ require 'distribution/gamma/gsl'
3
+ # no statistics2 functions for gamma.
4
+ require 'distribution/gamma/java'
5
+
6
+ module Distribution
7
+ # From Wikipedia:
8
+ # The gamma distribution is a two-parameter family of
9
+ # continuous probability distributions. It has a shape parameter a
10
+ # and a scale parameter b.
11
+ #
12
+ # Calculate pdf, cdf and inverse cdf for Gamma Distribution.
13
+ #
14
+ module Gamma
15
+ extend Distributable
16
+ SHORTHAND='gamma'
17
+ create_distribution_methods
18
+
19
+ ##
20
+ # :singleton-method: pdf(x,a,b)
21
+ # Returns the PDF of the Gamma distribution with +a+ as shape
22
+ # parameter and +b+ as scale parameter
23
+
24
+
25
+ ##
26
+ # :singleton-method: cdf(x,a,b)
27
+ # Returns the integral of the Gamma distribution with +a+ as shape
28
+ # parameter and +b+ as scale parameter
29
+
30
+ ##
31
+ # :singleton-method: p_value(qn,a,b)
32
+ # Return the upper limit for the integral of a
33
+ # gamma distribution which returns +qn+
34
+ # with shape +a+ and scale +b+
35
+
36
+ end
37
+ end
@@ -0,0 +1,24 @@
1
+ module Distribution
2
+ module Gamma
3
+ module GSL_
4
+ class << self
5
+ def pdf(x,a,b)
6
+ GSL::Ran::gamma_pdf(x.to_f, a.to_f, b.to_f)
7
+ end
8
+ # Return the upper limit for the integral of a gamma distribution
9
+ # with shape +a+ and scale +b+ which returns +pr+
10
+ def p_value(pr,a,b)
11
+ GSL::Cdf::gamma_Pinv(pr.to_f, a.to_f, b.to_f)
12
+ end
13
+ # Gamma cumulative distribution function (cdf).
14
+ #
15
+ # Returns the integral of the Gamma distribution
16
+ # with shape +a+ and scale +b+ over [0, x]
17
+ #
18
+ def cdf(x,a,b)
19
+ GSL::Cdf::gamma_P(x.to_f, a.to_f, b.to_f)
20
+ end
21
+ end
22
+ end
23
+ end
24
+ end
@@ -0,0 +1,9 @@
1
+ module Distribution
2
+ module Gamma
3
+ # TODO
4
+ module Java_
5
+ class << self
6
+ end
7
+ end
8
+ end
9
+ end
@@ -0,0 +1,53 @@
1
+ # Added by John O. Woods, SciRuby project.
2
+ module Distribution
3
+ module Gamma
4
+ module Ruby_
5
+ class << self
6
+
7
+ include Math
8
+ # Gamma distribution probability density function
9
+ #
10
+ # If you're looking at Wikipedia's Gamma distribution page, the arguments for this pdf function correspond
11
+ # as follows:
12
+ #
13
+ # * +x+: same
14
+ # * +a+: alpha or k
15
+ # * +b+: theta or 1/beta
16
+ #
17
+ # This is confusing! But we're trying to most closely mirror the GSL function for the gamma distribution
18
+ # (see references).
19
+ #
20
+ # Adapted the function itself from GSL-1.9 in rng/gamma.c: gsl_ran_gamma_pdf
21
+ #
22
+ # ==References
23
+ # * http://www.gnu.org/software/gsl/manual/html_node/The-Gamma-Distribution.html
24
+ # * http://en.wikipedia.org/wiki/Gamma_distribution
25
+ def pdf(x,a,b)
26
+ return 0 if x < 0
27
+ if x == 0
28
+ return 1.quo(b) if a == 1
29
+ return 0
30
+ elsif a == 1
31
+ Math.exp(-x.quo(b)).quo(b)
32
+ else
33
+ Math.exp((a-1)*Math.log(x.quo(b)) - x.quo(b) - Math.lgamma(a).first).quo(b)
34
+ end
35
+ end
36
+
37
+ # Gamma cumulative distribution function
38
+ def cdf(x,a,b)
39
+ return 0.0 if x <= 0.0
40
+
41
+ y = x.quo(b)
42
+ return (1-Math::IncompleteGamma.q(a, y)) if y > a
43
+ return (Math::IncompleteGamma.p(a, y))
44
+ end
45
+
46
+ #def p_value(pr,a,b)
47
+ # cdf(1.0-pr,a,b)
48
+ #end
49
+
50
+ end
51
+ end
52
+ end
53
+ end
@@ -1,3 +1,15 @@
1
+ # The next few requires eventually probably need to go in their own gem. They're all functions and constants used by
2
+ # GSL-adapted pure Ruby math functions.
3
+ require "distribution/math_extension/chebyshev_series"
4
+ require "distribution/math_extension/erfc"
5
+ require "distribution/math_extension/exponential_integral"
6
+ require "distribution/math_extension/gammastar"
7
+ require "distribution/math_extension/gsl_utilities"
8
+ require "distribution/math_extension/incomplete_gamma"
9
+ require "distribution/math_extension/incomplete_beta"
10
+ require "distribution/math_extension/log_utilities"
11
+
12
+
1
13
  if RUBY_VERSION<"1.9"
2
14
  require 'mathn'
3
15
  def Prime.each(upper,&block)
@@ -30,6 +42,8 @@ module Distribution
30
42
  B12 = -691.0 / 2730.0
31
43
  B14 = 7.0 / 6.0
32
44
  B16 = -3617.0 / 510.0
45
+
46
+
33
47
  # From statistics2
34
48
  def loggamma(x)
35
49
  v = 1.0
@@ -203,28 +217,44 @@ module Distribution
203
217
  def beta(x,y)
204
218
  (gamma(x)*gamma(y)).quo(gamma(x+y))
205
219
  end
220
+
221
+ # Get pure-Ruby logbeta
206
222
  def logbeta(x,y)
207
- (loggamma(x)+loggamma(y))-loggamma(x+y)
223
+ Beta.log_beta(x,y).first
224
+ end
225
+
226
+ # Log beta function conforming to style of lgamma (returns sign in second array index)
227
+ def lbeta(x,y)
228
+ Beta.log_beta(x,y)
229
+ end
230
+
231
+ # I_x(a,b): Regularized incomplete beta function
232
+ # Fast version. For an exact calculation based on factorials,
233
+ # use exact_regularized_beta
234
+ def regularized_beta(x,a,b)
235
+ return 1 if x==1
236
+ IncompleteBeta.evaluate(a,b,x)
208
237
  end
209
238
  # I_x(a,b): Regularized incomplete beta function
210
239
  # TODO: Find a faster version.
211
240
  # Source:
212
241
  # * http://dlmf.nist.gov/8.17
213
- def regularized_beta_function(x,a,b)
242
+ def exact_regularized_beta(x,a,b)
214
243
  return 1 if x==1
215
- #incomplete_beta(x,a,b).quo(beta(a,b))
216
244
  m=a.to_i
217
245
  n=(b+a-1).to_i
218
246
  (m..n).inject(0) {|sum,j|
219
247
  sum+(binomial_coefficient(n,j)* x**j * (1-x)**(n-j))
220
248
  }
221
249
 
222
- end
223
- # B_x(a,b) : Incomplete beta function
224
- # Should be replaced by
225
- # http://lib.stat.cmu.edu/apstat/63
250
+ end
251
+ #
252
+ # Incomplete beta function: B(x;a,b)
253
+ # +a+ and +b+ are parameters and +x+ is
254
+ # integration upper limit.
226
255
  def incomplete_beta(x,a,b)
227
- raise "Doesn't work"
256
+ IncompleteBeta.evaluate(a,b,x)*beta(a,b)
257
+ #Math::IncompleteBeta.axpy(1.0, 0.0, a,b,x)
228
258
  end
229
259
 
230
260
  # Rising factorial
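The renaming in this release rests on the relation B(x; a, b) = I_x(a, b) * B(a, b): the incomplete beta function is the regularized one multiplied by the complete beta function, which is what the new incomplete_beta body computes. The sketch below checks that relation numerically with a crude midpoint rule. It is only an illustration; every method name is hypothetical and the integration is far less accurate than the GSL-derived code in the gem.

```ruby
# Hypothetical helpers for illustration only.

# Complete beta function B(a, b) via the gamma function.
def beta_function_sketch(a, b)
  Math.gamma(a) * Math.gamma(b) / Math.gamma(a + b)
end

# Midpoint-rule estimate of B(x; a, b) = integral from 0 to x of
# t^(a-1) * (1-t)^(b-1) dt. Assumes a >= 1 and b >= 1 so the integrand
# stays bounded on [0, 1].
def incomplete_beta_numeric(x, a, b, steps = 10_000)
  h = x.to_f / steps
  (0...steps).inject(0.0) do |sum, i|
    t = (i + 0.5) * h
    sum + t**(a - 1) * (1 - t)**(b - 1) * h
  end
end

# Regularized version: I_x(a, b) = B(x; a, b) / B(a, b), always in [0, 1].
def regularized_beta_numeric(x, a, b)
  incomplete_beta_numeric(x, a, b) / beta_function_sketch(a, b)
end
```

The regularized function reaches 1 at x = 1 for any valid a and b, while the incomplete function reaches B(a, b), which is the distinction the changelog entry draws.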
@@ -234,10 +264,26 @@ module Distribution
234
264
 
235
265
  # Ln of gamma
236
266
  def loggamma(x)
237
- lg=Math.lgamma(x)
238
- lg[0]*lg[1]
267
+ Math.lgamma(x).first
268
+ end
269
+
270
+ def incomplete_gamma(a, x = 0, with_error = false)
271
+ IncompleteGamma.p(a,x, with_error)
272
+ end
273
+ alias :gammp :incomplete_gamma
274
+
275
+ def gammq(a, x, with_error = false)
276
+ IncompleteGamma.q(a,x,with_error)
277
+ end
278
+
279
+ def unnormalized_incomplete_gamma(a, x, with_error = false)
280
+ IncompleteGamma.unnormalized(a, x, with_error)
281
+ end
282
+
283
+ # Not the same as erfc. This is the GSL version, which may have slightly different results.
284
+ def erfc_e x, with_error = false
285
+ Erfc.evaluate(x, with_error)
239
286
  end
240
-
241
287
 
242
288
  # Sequences without repetition. n^k'
243
289
  # Also called 'falling factorial'
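The gammp and gammq wrappers added in this hunk expose the regularized lower and upper incomplete gamma functions, P(a, x) and Q(a, x) = 1 - P(a, x). As a rough illustration of what they compute, here is a standalone series sketch. It is not the GSL translation shipped in the gem: the names `gammp_sketch` and `gammq_sketch`, the term limit and the tolerance are all assumptions, and the series converges slowly when x is much larger than a.

```ruby
# Hypothetical helper: regularized lower incomplete gamma P(a, x) from the
# series P(a, x) = x^a * e^(-x) * sum_{n>=0} x^n / Gamma(a + n + 1).
def gammp_sketch(a, x, max_terms = 500, eps = 1e-15)
  return 0.0 if x <= 0

  x = x.to_f # avoid integer division in the term recurrence below
  term = 1.0
  sum = 1.0
  (1..max_terms).each do |n|
    term *= x / (a + n) # ratio of consecutive series terms
    sum += term
    break if term < eps * sum
  end

  # The prefactor is computed in log space; Math.lgamma returns
  # [log|Gamma|, sign] and a + 1 is positive here.
  Math.exp(-x + a * Math.log(x) - Math.lgamma(a + 1).first) * sum
end

# Hypothetical helper: upper regularized incomplete gamma as the complement.
def gammq_sketch(a, x)
  1.0 - gammp_sketch(a, x)
end
```

Two closed forms make handy checks: P(1, x) = 1 - e^(-x), and P(1/2, x) = erf(sqrt(x)).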
@@ -306,13 +352,13 @@ end
306
352
 
307
353
  module Math
308
354
  include Distribution::MathExtension
309
- module_function :factorial, :beta, :loggamma, :binomial_coefficient, :binomial_coefficient_gamma, :regularized_beta_function, :incomplete_beta, :permutations, :rising_factorial , :fast_factorial, :combinations, :logbeta
355
+ module_function :factorial, :beta, :loggamma, :erfc_e, :unnormalized_incomplete_gamma, :incomplete_gamma, :gammp, :gammq, :binomial_coefficient, :binomial_coefficient_gamma, :exact_regularized_beta, :incomplete_beta, :regularized_beta, :permutations, :rising_factorial , :fast_factorial, :combinations, :logbeta, :lbeta
310
356
  end
311
357
 
312
358
  # Necessary on Ruby 1.9
313
359
  module CMath # :nodoc:
314
360
  include Distribution::MathExtension
315
- module_function :factorial, :beta, :loggamma, :binomial_coefficient, :binomial_coefficient_gamma, :regularized_beta_function, :incomplete_beta, :permutations, :rising_factorial, :fast_factorial, :combinations, :logbeta
361
+ module_function :factorial, :beta, :loggamma, :unnormalized_incomplete_gamma, :incomplete_gamma, :gammp, :gammq, :erfc_e, :binomial_coefficient, :binomial_coefficient_gamma, :incomplete_beta, :exact_regularized_beta, :regularized_beta, :permutations, :rising_factorial, :fast_factorial, :combinations, :logbeta, :lbeta
316
362
  end
317
363
 
318
364
  if RUBY_VERSION<"1.9"