RubyGems - ruby-statistics - Versions diffs - 0.5.0 → 1.0.0 - Mend

ruby-statistics 0.5.0 → 1.0.0

Files changed (6)

checksums.yaml +4 -4
data/README.md +15 -6
data/lib/statistics/statistical_test/f_test.rb +83 -0
data/lib/statistics/statistical_test/t_test.rb +46 -0
data/lib/statistics/version.rb +1 -1
metadata +5 -3

checksums.yaml CHANGED

@@ -1,7 +1,7 @@
 ---
 SHA1:
-  metadata.gz: 1c58bc877eb893a77eacc586ddbc2cfca7e3655d
-  data.tar.gz: e61823ec3df43e872d451668ad8148b29b3749b8
+  metadata.gz: a98a2e5755e14ddcfab200641afa2e3bc931a188
+  data.tar.gz: 8532c77ff003ee31a0ea3989e0e22f382c3272e7
 SHA512:
-  metadata.gz: 7fe62c3c65d455fdc40c3dca386c0bce4672025c56638cadea586720b7c10ea57b6d6cf91379772f5c41c080fa4a1cb3e4b7894fe328bde45f03e616dadf0c8d
-  data.tar.gz: 5de5004f6ff87991dc41d273a05db3ab2dbe1c87c001ec5ef3b6e93d5145ac5f7168b64ec55c6049de715b7aa82ac1cdd3dec0a771c2c9f31794067734ddb125
+  metadata.gz: 3582142dd14dd4076c9972b35d0d00599d55376659990111a7012ed574c098f3cf635274259bc013d3d1d6600d9308920274cc77e9ed8ee587d0f73b9f9834e0
+  data.tar.gz: cae3993eb2452cbcce670c6b1262b74e9b6c516a0e80820c1f3b20ec1ec0c0a65de6f138383814c8aae5aa82aa6e8b1a799a9848ed79564fe609bce5afe9e014

data/README.md CHANGED

@@ -1,8 +1,11 @@
-# Statistics
+# Ruby Statistics
-Welcome to your new gem! In this directory, you'll find the files you need to be able to package up your Ruby library into a gem. Put your Ruby code in the file `lib/statistics`. To experiment with that code, run `bin/console` for an interactive prompt.
+A basic ruby gem that implements some statistical methods, functions and concepts to be used in any ruby environment without depending on any mathematical software like `R`, `Matlab`, `Octave` or similar.
-TODO: Delete this and the text above, and describe your gem
+We got the inspiration from the folks at [JStat](https://github.com/jstat/jstat) and some interesting lectures about [Keystroke dynamics](http://www.biometric-solutions.com/keystroke-dynamics.html).
+Some logic and algorithms are extractions or adaptations from other authors, which are referenced in the comments.
+This software is released under the MIT License.
 ## Installation
@@ -24,7 +27,7 @@ Or install it yourself as:
 Just require the `statistics` gem in order to load it. If you have not defined the `Distribution` namespace, the gem will assign an alias, reducing the number of namespaces needed to use a class.
-Right know you can load:
+Right now you can load:
 * The whole statistics gem. `require 'statistics'`
 * A namespace. `require 'statistics/distribution'`
@@ -48,7 +51,7 @@ To install this gem onto your local machine, run `bundle exec rake install`. To
 ## Contributing
-Bug reports and pull requests are welcome on GitHub at https://github.com/[USERNAME]/statistics. This project is intended to be a safe, welcoming space for collaboration, and contributors are expected to adhere to the [Contributor Covenant](http://contributor-covenant.org) code of conduct.
+Bug reports and pull requests are welcome on GitHub at https://github.com/estebanz01/ruby-statistics. This project is intended to be a safe, welcoming space for collaboration, and contributors are expected to adhere to the [Contributor Covenant](http://contributor-covenant.org) code of conduct.
 ## License
@@ -56,4 +59,10 @@ The gem is available as open source under the terms of the [MIT License](http://
 ## Code of Conduct
-Everyone interacting in the Statistics project’s codebases, issue trackers, chat rooms and mailing lists is expected to follow the [code of conduct](https://github.com/[USERNAME]/statistics/blob/master/CODE_OF_CONDUCT.md).
+Everyone interacting in the Statistics project’s codebases, issue trackers, chat rooms and mailing lists is expected to follow the [code of conduct](https://github.com/estebanz01/ruby-statistics/blob/master/CODE_OF_CONDUCT.md).
+## Contact
+You can contact me via:
+* [Github](https://github.com/estebanz01)
+* [Twitter](https://twitter.com/estebanz01)

data/lib/statistics/statistical_test/f_test.rb ADDED

@@ -0,0 +1,83 @@
+module Statistics
+  module StatisticalTest
+    class FTest
+      # This method calculates the one-way ANOVA F-test statistic.
+      # We assume that all specified arguments are arrays.
+      # It returns an array with three elements:
+      #   [F-statistic or F-score, degrees of freedom numerator, degrees of freedom denominator].
+      #
+      # Formulas extracted from:
+      # https://courses.lumenlearning.com/boundless-statistics/chapter/one-way-anova/
+      # http://sphweb.bumc.bu.edu/otlt/MPH-Modules/BS/BS704_HypothesisTesting-ANOVA/BS704_HypothesisTesting-Anova_print.html
+      def self.anova_f_score(*args)
+        # If only two groups have been specified as arguments, we follow the classic F-Test for
+        # equality of variances, which is the ratio between the variances.
+        f_score = nil
+        df1 = nil
+        df2 = nil
+        if args.size == 2
+          variances = [args[0].variance, args[1].variance]
+          f_score = variances.max/variances.min.to_f
+          df1 = 1 # k-1 (k = 2)
+          df2 = args.flatten.size - 2 # N-k (k = 2)
+        elsif args.size > 2
+          total_groups = args.size
+          total_elements = args.flatten.size
+          overall_mean = args.flatten.mean
+          sample_sizes = args.map(&:size)
+          sample_means = args.map(&:mean)
+          sample_stds = args.map(&:standard_deviation)
+          # Variance between groups
+          iterator = sample_sizes.each_with_index
+          variance_between_groups = iterator.reduce(0) do |summation, (size, index)|
+            inner_calculation = size * ((sample_means[index] - overall_mean) ** 2)
+            summation += (inner_calculation / (total_groups - 1).to_f)
+          end
+          # Variance within groups
+          variance_within_groups = (0...total_groups).reduce(0) do |outer_summation, group_index|
+            outer_summation += args[group_index].reduce(0) do |inner_sumation, observation|
+              inner_calculation = ((observation - sample_means[group_index]) ** 2)
+              inner_sumation += (inner_calculation / (total_elements - total_groups).to_f)
+            end
+          end
+          f_score = variance_between_groups/variance_within_groups.to_f
+          df1 = total_groups - 1
+          df2 = total_elements - total_groups
+        end
+        [f_score, df1, df2]
+      end
+      # This method expects the alpha value and the groups to calculate the one-way ANOVA test.
+      # It returns a hash with several pieces of information and the test result (whether to reject the null hypothesis or not).
+      # Keep in mind that the values for the alternative key (true/false) do not imply that the alternative hypothesis
+      # is TRUE or FALSE. It's a minor notational convenience for deciding whether to reject the null hypothesis or not.
+      def self.one_way_anova(alpha, *args)
+        f_score, df1, df2 = *self.anova_f_score(*args) # Splat array result
+        return if f_score.nil? || df1.nil? || df2.nil?
+        probability = Distribution::F.new(df1, df2).cumulative_function(f_score)
+        p_value = 1 - probability
+        # According to https://stats.stackexchange.com/questions/29158/do-you-reject-the-null-hypothesis-when-p-alpha-or-p-leq-alpha
+        # We can assume that if p_value <= alpha, we can safely reject the null hypothesis, i.e. accept the alternative hypothesis.
+        { probability: probability,
+          p_value: p_value,
+          alpha: alpha,
+          null: alpha < p_value,
+          alternative: p_value <= alpha,
+          confidence_level: 1 - alpha }
+      end
+    end
+  end
+end
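
The `args.size > 2` branch above computes the ratio of the between-group mean square to the within-group mean square. A minimal standalone sketch of that calculation follows. The method name `anova_f_sketch` is hypothetical and not part of the gem, and it uses only core `Array` methods instead of the `mean` and `standard_deviation` helpers the diff relies on, whose definitions are not shown here.

```ruby
# Hypothetical helper, not part of ruby-statistics: one-way ANOVA F-score
# for three or more groups, following the formulas used in the diff above.
def anova_f_sketch(*groups)
  k = groups.size                      # number of groups
  n = groups.sum(&:size)               # total number of observations
  overall_mean = groups.flatten.sum / n.to_f
  means = groups.map { |g| g.sum / g.size.to_f }

  # Between-group mean square: sum of n_i * (mean_i - overall_mean)^2, over k - 1.
  ss_between = groups.each_with_index.sum do |g, i|
    g.size * (means[i] - overall_mean)**2
  end
  ms_between = ss_between / (k - 1).to_f

  # Within-group mean square: squared deviations from each group mean, over n - k.
  ss_within = groups.each_with_index.sum do |g, i|
    g.sum { |x| (x - means[i])**2 }
  end
  ms_within = ss_within / (n - k).to_f

  # Same return shape as anova_f_score: [F, df numerator, df denominator].
  [ms_between / ms_within, k - 1, n - k]
end

f_score, df1, df2 = anova_f_sketch([1, 2, 3], [2, 3, 4], [5, 6, 7])
```

For the three groups in the usage line, the between-group mean square is 13 and the within-group mean square is 1, so the F-score is about 13 with 2 and 6 degrees of freedom.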

data/lib/statistics/statistical_test/t_test.rb ADDED

@@ -0,0 +1,46 @@
+module Statistics
+  module StatisticalTest
+    class TTest
+      # Perform a T-Test for one or two samples.
+      # For the tails param, we need a symbol: :one_tail or :two_tail
+      def self.perform(alpha, tails, *args)
+        return if args.size < 2
+        degrees_of_freedom = 0
+        t_score = if args[0].is_a? Numeric
+                    data_mean = args[1].mean
+                    data_std = args[1].standard_deviation
+                    comparison_mean = args[0]
+                    degrees_of_freedom = args[1].size
+                    (data_mean - comparison_mean)/(data_std / Math.sqrt(args[1].size).to_f).to_f
+                  else
+                    sample_left_mean = args[0].mean
+                    sample_left_variance = args[0].variance
+                    sample_right_variance = args[1].variance
+                    sample_right_mean = args[1].mean
+                    degrees_of_freedom = args.flatten.size - 2
+                    left_root = sample_left_variance/args[0].size.to_f
+                    right_root = sample_right_variance/args[1].size.to_f
+                    standard_error = Math.sqrt(left_root + right_root)
+                    (sample_left_mean - sample_right_mean)/standard_error.to_f
+                  end
+        probability = Distribution::TStudent.new(degrees_of_freedom).cumulative_function(t_score)
+        p_value = 1 - probability
+        p_value *= 2 if tails == :two_tail
+        { probability: probability,
+          p_value: p_value,
+          alpha: alpha,
+          null: alpha < p_value,
+          alternative: p_value <= alpha,
+          confidence_level: 1 - alpha }
+      end
+    end
+  end
+end
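
The two branches of `perform` above can be sketched without the gem as follows. All method names here are hypothetical, and the sketch assumes that `variance` and `standard_deviation` in the diff are the sample (n - 1) versions, which this diff does not show. The last helper mirrors how the result hash is derived from a cumulative probability supplied by the caller, since Ruby's standard library has no Student's t CDF.

```ruby
# Hypothetical helpers, not part of ruby-statistics.

# Assumption: sample variance with an n - 1 denominator.
def sample_variance(xs)
  mean = xs.sum / xs.size.to_f
  xs.sum { |x| (x - mean)**2 } / (xs.size - 1).to_f
end

# One-sample t-score: (sample mean - comparison mean) / (std / sqrt(n)).
def one_sample_t(comparison_mean, xs)
  mean = xs.sum / xs.size.to_f
  (mean - comparison_mean) / Math.sqrt(sample_variance(xs) / xs.size)
end

# Two-sample t-score with the unpooled standard error used in the diff.
def two_sample_t(left, right)
  standard_error = Math.sqrt(sample_variance(left) / left.size +
                             sample_variance(right) / right.size)
  (left.sum / left.size.to_f - right.sum / right.size.to_f) / standard_error
end

# Decision step as in the diff: cdf_at_t is P(T <= t), provided by the caller.
def t_test_decision(alpha, tails, cdf_at_t)
  p_value = 1 - cdf_at_t
  p_value *= 2 if tails == :two_tail
  { p_value: p_value,
    alpha: alpha,
    null: alpha < p_value,
    alternative: p_value <= alpha,
    confidence_level: 1 - alpha }
end

t_score = two_sample_t([1, 2, 3], [4, 5, 6])
result = t_test_decision(0.05, :two_tail, 0.99)
```

Doubling `1 - cdf_at_t` yields a value in the 0..1 range only when the t-score is non-negative. For a negative score such as the one in the usage line, the doubled value exceeds 1, so a caller of this sketch would likely want to pass the cumulative probability of the absolute t-score.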

data/lib/statistics/version.rb CHANGED

@@ -1,3 +1,3 @@
 module Statistics
-  VERSION = "0.5.0"
+  VERSION = "1.0.0"
 end

metadata CHANGED

@@ -1,14 +1,14 @@
 --- !ruby/object:Gem::Specification
 name: ruby-statistics
 version: !ruby/object:Gem::Version
-  version: 0.5.0
+  version: 1.0.0
 platform: ruby
 authors:
 - esteban zapata
 autorequire:
 bindir: exe
 cert_chain: []
-date: 2017-09-11 00:00:00.000000000 Z
+date: 2017-10-16 00:00:00.000000000 Z
 dependencies:
 - !ruby/object:Gem::Dependency
   name: bundler
@@ -129,6 +129,8 @@ files:
 - lib/statistics/distribution/t_student.rb
 - lib/statistics/distribution/uniform.rb
 - lib/statistics/distribution/weibull.rb
+- lib/statistics/statistical_test/f_test.rb
+- lib/statistics/statistical_test/t_test.rb
 - lib/statistics/version.rb
 - ruby-statistics.gemspec
 homepage: https://github.com/estebanz01/ruby-statistics
@@ -151,7 +153,7 @@ required_rubygems_version: !ruby/object:Gem::Requirement
       version: '0'
 requirements: []
 rubyforge_project:
-rubygems_version: 2.6.13
+rubygems_version: 2.6.14
 signing_key:
 specification_version: 4
 summary: A ruby gem for some specific statistics. Inspired by the jStat js library.