distribution 0.5.0 → 0.6.0

data.tar.gz.sig CHANGED
Binary file
File without changes
@@ -1,3 +1,15 @@
1
+ === 0.6.0 / 2011-09-23
2
+ * Incomplete Beta functions on math renamed to Regularized beta, because MathExtension::IncompleteBeta calculates regularized beta function, not Incomplete Beta.
3
+ * Corrected documentation on F distribution and added comments on gamma and beta[Claudio Bustos]
4
+ * Moved ported methods from GSL to lib/math_extension. Updated spec for gamma and beta distributions[Claudio Bustos]
5
+ * Added beta distribution functions. p_value does not seem to work yet.[John Woods]
6
+ * Added most incomplete beta function GSL translations, also log_beta from GSL.[John Woods]
7
+ * Added header information to the incomplete gamma files translated from GSL[John Woods]
8
+ * Removed left-over invgammp function.[John Woods]
9
+ * Fixed lots of bugs, translated most GSL error tests into rspec.[John Woods]
10
+ * Added Gamma distribution, spec. No statistics2 functions for Gamma, so only implemented pure Ruby and GSL.[John Woods]
11
+ * Added console task to rakefile[John Woods]
12
+
1
13
  === 0.5.0 / 2011-05-03
2
14
 
3
15
  * Exception raises on calculation of T's cdf with ruby engine. For now, stick to gsl implementation
@@ -35,4 +47,4 @@
35
47
 
36
48
  === 0.1.0 / 2011-01-26
37
49
 
38
- * Basic set (pdf, cdf, p_value) for Normal, Chi Square, F and T distributions
50
+ * Basic set (pdf, cdf, p_value) for Normal, Chi Square, F and T distributions
@@ -17,6 +17,10 @@ data/template/distribution/gsl.erb
17
17
  data/template/distribution/ruby.erb
18
18
  data/template/spec.erb
19
19
  lib/distribution.rb
20
+ lib/distribution/beta.rb
21
+ lib/distribution/beta/gsl.rb
22
+ lib/distribution/beta/java.rb
23
+ lib/distribution/beta/ruby.rb
20
24
  lib/distribution/binomial.rb
21
25
  lib/distribution/binomial/gsl.rb
22
26
  lib/distribution/binomial/java.rb
@@ -39,6 +43,10 @@ lib/distribution/f/gsl.rb
39
43
  lib/distribution/f/java.rb
40
44
  lib/distribution/f/ruby.rb
41
45
  lib/distribution/f/statistics2.rb
46
+ lib/distribution/gamma.rb
47
+ lib/distribution/gamma/gsl.rb
48
+ lib/distribution/gamma/java.rb
49
+ lib/distribution/gamma/ruby.rb
42
50
  lib/distribution/hypergeometric.rb
43
51
  lib/distribution/hypergeometric/gsl.rb
44
52
  lib/distribution/hypergeometric/java.rb
@@ -46,6 +54,14 @@ lib/distribution/hypergeometric/ruby.rb
46
54
  lib/distribution/logistic.rb
47
55
  lib/distribution/logistic/ruby.rb
48
56
  lib/distribution/math_extension.rb
57
+ lib/distribution/math_extension/chebyshev_series.rb
58
+ lib/distribution/math_extension/erfc.rb
59
+ lib/distribution/math_extension/exponential_integral.rb
60
+ lib/distribution/math_extension/gammastar.rb
61
+ lib/distribution/math_extension/gsl_utilities.rb
62
+ lib/distribution/math_extension/incomplete_beta.rb
63
+ lib/distribution/math_extension/incomplete_gamma.rb
64
+ lib/distribution/math_extension/log_utilities.rb
49
65
  lib/distribution/normal.rb
50
66
  lib/distribution/normal/gsl.rb
51
67
  lib/distribution/normal/java.rb
@@ -60,12 +76,14 @@ lib/distribution/t/gsl.rb
60
76
  lib/distribution/t/java.rb
61
77
  lib/distribution/t/ruby.rb
62
78
  lib/distribution/t/statistics2.rb
79
+ spec/beta_spec.rb
63
80
  spec/binomial_spec.rb
64
81
  spec/bivariatenormal_spec.rb
65
82
  spec/chisquare_spec.rb
66
83
  spec/distribution_spec.rb
67
84
  spec/exponential_spec.rb
68
85
  spec/f_spec.rb
86
+ spec/gamma_spec.rb
69
87
  spec/hypergeometric_spec.rb
70
88
  spec/logistic_spec.rb
71
89
  spec/math_extension_spec.rb
data/README.txt CHANGED
@@ -4,7 +4,7 @@
4
4
 
5
5
  == DESCRIPTION:
6
6
 
7
- Statistical Distributions library. Includes Normal univariate and bivariate, T, F, Chi Square, Binomial, Hypergeometric, Exponential and Poisson.
7
+ Statistical Distributions library. Includes Normal univariate and bivariate, T, F, Chi Square, Binomial, Hypergeometric, Exponential, Poisson, Beta and Gamma.
8
8
 
9
9
  Uses Ruby by default and C (statistics2/GSL) or Java extensions where available.
10
10
 
@@ -44,6 +44,8 @@ Shortnames for distributions:
44
44
  * Hypergeometric: hypg
45
45
  * Exponential: expo
46
46
  * Poisson: pois
47
+ * Beta: beta
48
+ * Gamma: gamma
47
49
 
48
50
  For example
49
51
 
data/Rakefile CHANGED
@@ -17,5 +17,10 @@ Hoe.spec 'distribution' do
17
17
  self.extra_dev_deps << ["rspec",">=2.0"] << ["rubyforge",">=0"]
18
18
 
19
19
  end
20
+ # git log --pretty=format:"*%s[%cn]" v0.5.0..HEAD >> History.txt
21
+ desc "Open an irb session preloaded with distribution"
22
+ task :console do
23
+ sh "irb -rubygems -I lib -r distribution.rb"
24
+ end
20
25
 
21
26
  # vim: syntax=ruby
@@ -22,7 +22,8 @@
22
22
  # * Code of several Ruby engines came from statistics2.rb,
23
23
  # created by Shin-ichiro HARA(sinara@blade.nagaokaut.ac.jp).
24
24
  # Retrieve from http://blade.nagaokaut.ac.jp/~sinara/ruby/math/statistics2/
25
- #
25
+ # * Code of Beta and Gamma distribution came from GSL project.
26
+ # Ported by John O. Woods
26
27
  # Specific notices will be placed where appropriate
27
28
  #
28
29
  if !respond_to? :define_singleton_method
@@ -49,7 +50,7 @@ require 'distribution/math_extension'
49
50
  # Distribution::Normal.p_value(0.95)
50
51
  # => 1.64485364660836
51
52
  module Distribution
52
- VERSION="0.5.0"
53
+ VERSION="0.6.0"
53
54
 
54
55
  module Shorthand
55
56
  EQUIVALENCES={:p_value=>:p, :cdf=>:cdf, :pdf=>:pdf, :rng=>:r, :exact_pdf=>:epdf, :exact_cdf=>:ecdf, :exact_p_value=>:ep}
@@ -133,13 +134,15 @@ module Distribution
133
134
 
134
135
  end
135
136
  # create alias for common methods
136
- alias_method :inverse_cdf,:p_value if singleton_methods.include? :p_value
137
+ alias_method :inverse_cdf, :p_value if singleton_methods.include? :p_value
137
138
  end
138
139
 
139
140
  end
140
141
 
141
142
  autoload(:Normal, 'distribution/normal')
142
143
  autoload(:ChiSquare, 'distribution/chisquare')
144
+ autoload(:Gamma, 'distribution/gamma')
145
+ autoload(:Beta, 'distribution/beta')
143
146
  autoload(:T, 'distribution/t')
144
147
  autoload(:F, 'distribution/f')
145
148
  autoload(:BivariateNormal, 'distribution/bivariatenormal')
@@ -0,0 +1,34 @@
1
+ require 'distribution/beta/ruby'
2
+ require 'distribution/beta/gsl'
3
+ # no statistics2 functions for beta.
4
+ require 'distribution/beta/java'
5
+
6
+ module Distribution
7
+ # From Wikipedia:
8
+ # In probability theory and statistics, the beta distribution
9
+ # is a family of continuous probability distributions defined
10
+ # on the interval (0, 1) parameterized by two positive shape
11
+ # parameters, typically denoted by alpha and beta.
12
+ # This module calculates the pdf, cdf and inverse cdf for the Beta distribution.
13
+ #
14
+ module Beta
15
+ extend Distributable
16
+ SHORTHAND='beta'
17
+ create_distribution_methods
18
+
19
+ ##
20
+ # :singleton-method: pdf(x,a,b)
21
+ # Returns the PDF of the Beta distribution with parameters a and b
22
+
23
+
24
+ ##
25
+ # :singleton-method: cdf(x,a,b)
26
+ # Returns the integral of Beta distribution with parameters a and b
27
+
28
+ ##
29
+ # :singleton-method: p_value(qn,a,b)
30
+ # Return the quantile of the corresponding integral +qn+
31
+ # on a beta distribution's cdf with parameters a and b
32
+
33
+ end
34
+ end
@@ -0,0 +1,24 @@
1
+ module Distribution
2
+ module Beta
3
+ module GSL_
4
+ class << self
5
+ def pdf(x,a,b)
6
+ GSL::Ran::beta_pdf(x.to_f, a.to_f, b.to_f)
7
+ end
8
+ # Return the quantile (inverse cdf) of the Beta distribution with
9
+ # parameters +a+ and +b+ at cumulative probability +pr+
10
+ def p_value(pr,a,b)
11
+ GSL::Cdf::beta_Pinv(pr.to_f, a.to_f, b.to_f)
12
+ end
13
+ # Beta cumulative distribution function (cdf).
14
+ #
15
+ # Returns the integral of Beta distribution
16
+ # with parameters +a+ and +b+ over [0, x]
17
+ #
18
+ def cdf(x,a,b)
19
+ GSL::Cdf::beta_P(x.to_f, a.to_f, b.to_f)
20
+ end
21
+ end
22
+ end
23
+ end
24
+ end
@@ -0,0 +1,9 @@
1
+ module Distribution
2
+ module Beta
3
+ # TODO
4
+ module Java_
5
+ class << self
6
+ end
7
+ end
8
+ end
9
+ end
@@ -0,0 +1,42 @@
1
+ # Added by John O. Woods, SciRuby project.
2
+ module Distribution
3
+ module Beta
4
+ module Ruby_
5
+ class << self
6
+
7
+ include Math
8
+ # Beta distribution probability density function
9
+ #
10
+ # Adapted from GSL-1.9 (apparently by Knuth originally), found in randist/beta.c
11
+ #
12
+ # Form: p(x) dx = (Gamma(a + b)/(Gamma(a) Gamma(b))) x^(a-1) (1-x)^(b-1) dx
13
+ #
14
+ # == References
15
+ # * http://www.gnu.org/s/gsl/manual/html_node/The-Gamma-Distribution.html
16
+ def pdf(x,a,b)
17
+ return 0 if x < 0 || x > 1
18
+
19
+ gab = Math.lgamma(a+b).first
20
+ ga = Math.lgamma(a).first
21
+ gb = Math.lgamma(b).first
22
+
23
+ if x == 0.0 || x == 1.0
24
+ Math.exp(gab - ga - gb) * x**(a-1) * (1-x)**(b-1)
25
+ else
26
+ Math.exp(gab - ga - gb + Math.log(x)*(a-1) + Math::Log.log1p(-x)*(b-1))
27
+ end
28
+ end
29
+
30
+ # Beta cumulative distribution function
31
+ # Translated from GSL-1.9: cdf/beta.c gsl_cdf_beta_P
32
+ def cdf(x,a,b)
33
+ return 0.0 if x <= 0.0
34
+ return 1.0 if x >= 1.0
35
+ Math::IncompleteBeta.axpy(1.0, 0.0, a,b,x)
36
+ end
37
+
38
+
39
+ end
40
+ end
41
+ end
42
+ end
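The log-space evaluation used by the pure-Ruby Beta pdf above can be tried without the gem. The following is only a minimal sketch using Ruby's standard library: the name `beta_pdf_sketch` is hypothetical, and `Math.log(1 - x)` stands in for the gem's `Math::Log.log1p(-x)` helper, which is an assumption made to keep the example self-contained (it is less accurate for very small `x`).

```ruby
# Sketch of a Beta(a, b) density evaluated in log space (stdlib only).
# beta_pdf_sketch is a hypothetical name, not part of the gem.
def beta_pdf_sketch(x, a, b)
  x = x.to_f
  a = a.to_f
  b = b.to_f
  return 0.0 if x < 0 || x > 1

  # Math.lgamma returns [log(|Gamma(v)|), sign]; the sign is +1 for positive v.
  log_norm = Math.lgamma(a + b).first - Math.lgamma(a).first - Math.lgamma(b).first

  if x == 0.0 || x == 1.0
    # log(0) is -Infinity, so the endpoints are evaluated with plain powers.
    Math.exp(log_norm) * x**(a - 1) * (1 - x)**(b - 1)
  else
    # Assumption: plain Math.log(1 - x) replaces the gem's log1p helper.
    Math.exp(log_norm + (a - 1) * Math.log(x) + (b - 1) * Math.log(1 - x))
  end
end
```

For Beta(2, 2) the density is 6x(1-x), so `beta_pdf_sketch(0.5, 2, 2)` should come out close to 1.5.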
@@ -3,7 +3,7 @@ require 'distribution/binomial/gsl'
3
3
  require 'distribution/binomial/java'
4
4
  module Distribution
5
5
 
6
- # Calculate statisticals for T Distribution.
6
+ # Calculate statisticals for Binomial Distribution.
7
7
  module Binomial
8
8
  SHORTHAND = 'bino'
9
9
 
@@ -6,9 +6,11 @@ module Distribution
6
6
  raise "k>n" if k>n
7
7
  Math.binomial_coefficient(n,k)*(pr**k)*(1-pr)**(n-k)
8
8
  end
9
+ # TODO: Use exact_regularized_beta for
10
+ # small values and regularized_beta for bigger ones.
9
11
  def cdf(k,n,pr)
10
12
  #(0..x.floor).inject(0) {|ac,i| ac+pdf(i,n,pr)}
11
- Math.regularized_beta_function(1-pr,n - k,k+1)
13
+ Math.regularized_beta(1-pr,n - k,k+1)
12
14
  end
13
15
  def exact_cdf(k,n,pr)
14
16
  (0..k).inject(0) {|ac,i| ac+pdf(i,n,pr)}
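The binomial cdf in this hunk relies on the identity P(X <= k) = I_{1-pr}(n-k, k+1). A small sketch, assuming only the standard library, can check that identity numerically. All method names here are hypothetical, and the regularized beta is computed with the finite binomial sum for integer parameters (the same form as `exact_regularized_beta`, see DLMF section 8.17), not with the gem's GSL-derived routine.

```ruby
# Binomial coefficient C(n, k) with exact integer arithmetic.
def choose(n, k)
  return 0 if k < 0 || k > n
  (1..k).inject(1) { |acc, i| acc * (n - k + i) / i }
end

# Binomial probability mass function.
def binomial_pdf_sketch(k, n, pr)
  choose(n, k) * pr**k * (1 - pr)**(n - k)
end

# Binomial cdf by direct summation of the pmf.
def binomial_cdf_by_sum(k, n, pr)
  (0..k).inject(0.0) { |acc, i| acc + binomial_pdf_sketch(i, n, pr) }
end

# I_x(a, b) for positive integers a and b, as a finite binomial sum.
def regularized_beta_integer(x, a, b)
  n = a + b - 1
  (a..n).inject(0.0) { |acc, j| acc + choose(n, j) * x**j * (1 - x)**(n - j) }
end

k, n, pr = 3, 10, 0.3
by_sum  = binomial_cdf_by_sum(k, n, pr)
by_beta = regularized_beta_integer(1 - pr, n - k, k + 1)
puts (by_sum - by_beta).abs < 1e-12
```

The sketch only covers k < n, since k = n would give a first parameter of zero.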
@@ -3,7 +3,7 @@ require 'distribution/f/gsl'
3
3
  require 'distribution/f/statistics2'
4
4
  require 'distribution/f/java'
5
5
  module Distribution
6
- # Calculate cdf and inverse cdf for Chi Square Distribution.
6
+ # Calculate cdf and inverse cdf for F Distribution.
7
7
  #
8
8
  module F
9
9
  SHORTHAND='fdist'
@@ -0,0 +1,37 @@
1
+ require 'distribution/gamma/ruby'
2
+ require 'distribution/gamma/gsl'
3
+ # no statistics2 functions for gamma.
4
+ require 'distribution/gamma/java'
5
+
6
+ module Distribution
7
+ # From Wikipedia:
8
+ # The gamma distribution is a two-parameter family of
9
+ # continuous probability distributions. It has a shape parameter a
10
+ # and a scale parameter b.
11
+ #
12
+ # Calculate pdf, cdf and inverse cdf for Gamma Distribution.
13
+ #
14
+ module Gamma
15
+ extend Distributable
16
+ SHORTHAND='gamma'
17
+ create_distribution_methods
18
+
19
+ ##
20
+ # :singleton-method: pdf(x,a,b)
21
+ # Returns the PDF of the Gamma distribution with +a+ as shape
22
+ # parameter and +b+ as scale parameter
23
+
24
+
25
+ ##
26
+ # :singleton-method: cdf(x,a,b)
27
+ # Returns the integral of the Gamma distribution with +a+ as shape
28
+ # parameter and +b+ as scale parameter
29
+
30
+ ##
31
+ # :singleton-method: p_value(qn,a,b)
32
+ # Return the upper limit for the integral of a
33
+ # gamma distribution which returns +qn+
34
+ # with shape +a+ and scale +b+
35
+
36
+ end
37
+ end
@@ -0,0 +1,24 @@
1
+ module Distribution
2
+ module Gamma
3
+ module GSL_
4
+ class << self
5
+ def pdf(x,a,b)
6
+ GSL::Ran::gamma_pdf(x.to_f, a.to_f, b.to_f)
7
+ end
8
+ # Return the quantile (inverse cdf) of the Gamma distribution with
9
+ # parameters +a+ and +b+ at cumulative probability +pr+
10
+ def p_value(pr,a,b)
11
+ GSL::Cdf::gamma_Pinv(pr.to_f, a.to_f, b.to_f)
12
+ end
13
+ # Gamma cumulative distribution function (cdf).
14
+ #
15
+ # Returns the integral of the Gamma distribution
16
+ # with parameters +a+ and +b+ over [0, x]
17
+ #
18
+ def cdf(x,a,b)
19
+ GSL::Cdf::gamma_P(x.to_f, a.to_f, b.to_f)
20
+ end
21
+ end
22
+ end
23
+ end
24
+ end
@@ -0,0 +1,9 @@
1
+ module Distribution
2
+ module Gamma
3
+ # TODO
4
+ module Java_
5
+ class << self
6
+ end
7
+ end
8
+ end
9
+ end
@@ -0,0 +1,53 @@
1
+ # Added by John O. Woods, SciRuby project.
2
+ module Distribution
3
+ module Gamma
4
+ module Ruby_
5
+ class << self
6
+
7
+ include Math
8
+ # Gamma distribution probability density function
9
+ #
10
+ # If you're looking at Wikipedia's Gamma distribution page, the arguments for this pdf function correspond
11
+ # as follows:
12
+ #
13
+ # * +x+: same
14
+ # * +a+: alpha or k
15
+ # * +b+: theta or 1/beta
16
+ #
17
+ # This is confusing! But we're trying to most closely mirror the GSL function for the gamma distribution
18
+ # (see references).
19
+ #
20
+ # Adapted the function itself from GSL-1.9 in rng/gamma.c: gsl_ran_gamma_pdf
21
+ #
22
+ # ==References
23
+ # * http://www.gnu.org/software/gsl/manual/html_node/The-Gamma-Distribution.html
24
+ # * http://en.wikipedia.org/wiki/Gamma_distribution
25
+ def pdf(x,a,b)
26
+ return 0 if x < 0
27
+ if x == 0
28
+ return 1.quo(b) if a == 1
29
+ return 0
30
+ elsif a == 1
31
+ Math.exp(-x.quo(b)).quo(b)
32
+ else
33
+ Math.exp((a-1)*Math.log(x.quo(b)) - x.quo(b) - Math.lgamma(a).first).quo(b)
34
+ end
35
+ end
36
+
37
+ # Gamma cumulative distribution function
38
+ def cdf(x,a,b)
39
+ return 0.0 if x <= 0.0
40
+
41
+ y = x.quo(b)
42
+ return (1-Math::IncompleteGamma.q(a, y)) if y > a
43
+ return (Math::IncompleteGamma.p(a, y))
44
+ end
45
+
46
+ #def p_value(pr,a,b)
47
+ # cdf(1.0-pr,a,b)
48
+ #end
49
+
50
+ end
51
+ end
52
+ end
53
+ end
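The parameter convention of the pure-Ruby Gamma methods above (+a+ is the shape, +b+ is the scale, and x is divided by +b+) can be illustrated with a standalone sketch. The method names are hypothetical. The cdf below uses the textbook power series for the regularized lower incomplete gamma function; this is a swapped-in technique chosen for brevity, not the GSL-derived `Math::IncompleteGamma` code the gem calls, and it converges slowly when x/b is much larger than the shape.

```ruby
# Gamma density with shape and scale parameters (stdlib only).
# The x == 0 branch mirrors the behaviour of the pdf method in the diff.
def gamma_pdf_sketch(x, shape, scale)
  x = x.to_f
  return 0.0 if x < 0
  return(shape == 1 ? 1.0 / scale : 0.0) if x == 0.0

  y = x / scale
  Math.exp((shape - 1) * Math.log(y) - y - Math.lgamma(shape).first) / scale
end

# Gamma cdf through the series
#   P(a, y) = y^a e^(-y) / Gamma(a) * sum_{n>=0} y^n / (a (a+1) ... (a+n))
# tol and max_terms are illustrative defaults, not values from the gem.
def gamma_cdf_sketch(x, shape, scale, tol = 1e-15, max_terms = 10_000)
  return 0.0 if x <= 0

  y = x.to_f / scale
  term = 1.0 / shape
  sum = term
  (1..max_terms).each do |n|
    term *= y / (shape + n)
    sum += term
    break if term < tol * sum
  end
  sum * Math.exp(shape * Math.log(y) - y - Math.lgamma(shape).first)
end
```

With shape 1 the distribution is exponential, so `gamma_cdf_sketch(x, 1, b)` should agree with `1 - Math.exp(-x / b)`.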
@@ -1,3 +1,15 @@
1
+ # The next few requires will probably need to move into their own gem eventually. They provide functions and constants used by
2
+ # GSL-adapted pure Ruby math functions.
3
+ require "distribution/math_extension/chebyshev_series"
4
+ require "distribution/math_extension/erfc"
5
+ require "distribution/math_extension/exponential_integral"
6
+ require "distribution/math_extension/gammastar"
7
+ require "distribution/math_extension/gsl_utilities"
8
+ require "distribution/math_extension/incomplete_gamma"
9
+ require "distribution/math_extension/incomplete_beta"
10
+ require "distribution/math_extension/log_utilities"
11
+
12
+
1
13
  if RUBY_VERSION<"1.9"
2
14
  require 'mathn'
3
15
  def Prime.each(upper,&block)
@@ -30,6 +42,8 @@ module Distribution
30
42
  B12 = -691.0 / 2730.0
31
43
  B14 = 7.0 / 6.0
32
44
  B16 = -3617.0 / 510.0
45
+
46
+
33
47
  # From statistics2
34
48
  def loggamma(x)
35
49
  v = 1.0
@@ -203,28 +217,44 @@ module Distribution
203
217
  def beta(x,y)
204
218
  (gamma(x)*gamma(y)).quo(gamma(x+y))
205
219
  end
220
+
221
+ # Get pure-Ruby logbeta
206
222
  def logbeta(x,y)
207
- (loggamma(x)+loggamma(y))-loggamma(x+y)
223
+ Beta.log_beta(x,y).first
224
+ end
225
+
226
+ # Log beta function conforming to style of lgamma (returns sign in second array index)
227
+ def lbeta(x,y)
228
+ Beta.log_beta(x,y)
229
+ end
230
+
231
+ # I_x(a,b): Regularized incomplete beta function
232
+ # Fast version. For an exact calculation, based on factorials,
233
+ # use exact_regularized_beta
234
+ def regularized_beta(x,a,b)
235
+ return 1 if x==1
236
+ IncompleteBeta.evaluate(a,b,x)
208
237
  end
209
238
  # I_x(a,b): Regularized incomplete beta function
210
239
  # TODO: Find a faster version.
211
240
  # Source:
212
241
  # * http://dlmf.nist.gov/8.17
213
- def regularized_beta_function(x,a,b)
242
+ def exact_regularized_beta(x,a,b)
214
243
  return 1 if x==1
215
- #incomplete_beta(x,a,b).quo(beta(a,b))
216
244
  m=a.to_i
217
245
  n=(b+a-1).to_i
218
246
  (m..n).inject(0) {|sum,j|
219
247
  sum+(binomial_coefficient(n,j)* x**j * (1-x)**(n-j))
220
248
  }
221
249
 
222
- end
223
- # B_x(a,b) : Incomplete beta function
224
- # Should be replaced by
225
- # http://lib.stat.cmu.edu/apstat/63
250
+ end
251
+ #
252
+ # Incomplete beta function: B(x;a,b)
253
+ # +a+ and +b+ are parameters and +x+ is
254
+ # integration upper limit.
226
255
  def incomplete_beta(x,a,b)
227
- raise "Doesn't work"
256
+ IncompleteBeta.evaluate(a,b,x)*beta(a,b)
257
+ #Math::IncompleteBeta.axpy(1.0, 0.0, a,b,x)
228
258
  end
229
259
 
230
260
  # Rising factorial
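The `lbeta` method in this hunk is described as following the style of `lgamma`, returning the sign in the second array element. As a rough illustration of that convention only, the sketch below builds a log-beta pair from `Math.lgamma`. The name `log_beta_sketch` is hypothetical, and this is not the GSL translation behind `Beta.log_beta`.

```ruby
# log|B(x, y)| and the sign of B(x, y), from B(x, y) = Gamma(x) Gamma(y) / Gamma(x + y).
# Each Math.lgamma call returns [log(|Gamma(v)|), sign of Gamma(v)].
def log_beta_sketch(x, y)
  lx, sx = Math.lgamma(x)
  ly, sy = Math.lgamma(y)
  lxy, sxy = Math.lgamma(x + y)
  # Signs are +1 or -1, so dividing by a sign equals multiplying by it.
  [lx + ly - lxy, sx * sy * sxy]
end

value, sign = log_beta_sketch(2, 3)
puts sign * Math.exp(value)   # B(2, 3) = 1/12, up to rounding
```

Taking `.first` of the pair, as `logbeta` does, discards the sign, which only matters when one of the arguments is negative.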
@@ -234,10 +264,26 @@ module Distribution
234
264
 
235
265
  # Ln of gamma
236
266
  def loggamma(x)
237
- lg=Math.lgamma(x)
238
- lg[0]*lg[1]
267
+ Math.lgamma(x).first
268
+ end
269
+
270
+ def incomplete_gamma(a, x = 0, with_error = false)
271
+ IncompleteGamma.p(a,x, with_error)
272
+ end
273
+ alias :gammp :incomplete_gamma
274
+
275
+ def gammq(a, x, with_error = false)
276
+ IncompleteGamma.q(a,x,with_error)
277
+ end
278
+
279
+ def unnormalized_incomplete_gamma(a, x, with_error = false)
280
+ IncompleteGamma.unnormalized(a, x, with_error)
281
+ end
282
+
283
+ # Not the same as erfc. This is the GSL version, which may have slightly different results.
284
+ def erfc_e x, with_error = false
285
+ Erfc.evaluate(x, with_error)
239
286
  end
240
-
241
287
 
242
288
  # Sequences without repetition. n^k'
243
289
  # Also called 'falling factorial'
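The `loggamma` change in this hunk replaces `lg[0]*lg[1]` with `Math.lgamma(x).first`. A short sketch shows why that matters: `Math.lgamma` returns the logarithm of the absolute value together with a sign, so multiplying the two flips the sign of the logarithm whenever Gamma(x) is negative. The method names below are hypothetical labels for the two variants.

```ruby
# Variant removed in 0.6.0: multiplies log|Gamma(x)| by the sign of Gamma(x).
def loggamma_old_sketch(x)
  lg = Math.lgamma(x)
  lg[0] * lg[1]
end

# Variant added in 0.6.0: returns log|Gamma(x)| only.
def loggamma_new_sketch(x)
  Math.lgamma(x).first
end

# Gamma(-0.5) = -2 * sqrt(pi), so log|Gamma(-0.5)| = log(2 * sqrt(pi)).
expected = Math.log(2 * Math.sqrt(Math::PI))
puts (loggamma_new_sketch(-0.5) - expected).abs < 1e-12
puts (loggamma_old_sketch(-0.5) + expected).abs < 1e-12
```

For positive arguments the sign is +1 and both variants agree.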
@@ -306,13 +352,13 @@ end
306
352
 
307
353
  module Math
308
354
  include Distribution::MathExtension
309
- module_function :factorial, :beta, :loggamma, :binomial_coefficient, :binomial_coefficient_gamma, :regularized_beta_function, :incomplete_beta, :permutations, :rising_factorial , :fast_factorial, :combinations, :logbeta
355
+ module_function :factorial, :beta, :loggamma, :erfc_e, :unnormalized_incomplete_gamma, :incomplete_gamma, :gammp, :gammq, :binomial_coefficient, :binomial_coefficient_gamma, :exact_regularized_beta, :incomplete_beta, :regularized_beta, :permutations, :rising_factorial , :fast_factorial, :combinations, :logbeta, :lbeta
310
356
  end
311
357
 
312
358
  # Necessary on Ruby 1.9
313
359
  module CMath # :nodoc:
314
360
  include Distribution::MathExtension
315
- module_function :factorial, :beta, :loggamma, :binomial_coefficient, :binomial_coefficient_gamma, :regularized_beta_function, :incomplete_beta, :permutations, :rising_factorial, :fast_factorial, :combinations, :logbeta
361
+ module_function :factorial, :beta, :loggamma, :unnormalized_incomplete_gamma, :incomplete_gamma, :gammp, :gammq, :erfc_e, :binomial_coefficient, :binomial_coefficient_gamma, :incomplete_beta, :exact_regularized_beta, :regularized_beta, :permutations, :rising_factorial, :fast_factorial, :combinations, :logbeta, :lbeta
316
362
  end
317
363
 
318
364
  if RUBY_VERSION<"1.9"