ruby-statistics 0.5.0 → 1.0.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (6)

checksums.yaml +4 -4
data/README.md +15 -6
data/lib/statistics/statistical_test/f_test.rb +83 -0
data/lib/statistics/statistical_test/t_test.rb +46 -0
data/lib/statistics/version.rb +1 -1
metadata +5 -3

checksums.yaml CHANGED

@@ -1,7 +1,7 @@
 ---
 SHA1:
-  metadata.gz: 1c58bc877eb893a77eacc586ddbc2cfca7e3655d
-  data.tar.gz: e61823ec3df43e872d451668ad8148b29b3749b8
+  metadata.gz: a98a2e5755e14ddcfab200641afa2e3bc931a188
+  data.tar.gz: 8532c77ff003ee31a0ea3989e0e22f382c3272e7
 SHA512:
-  metadata.gz: 7fe62c3c65d455fdc40c3dca386c0bce4672025c56638cadea586720b7c10ea57b6d6cf91379772f5c41c080fa4a1cb3e4b7894fe328bde45f03e616dadf0c8d
-  data.tar.gz: 5de5004f6ff87991dc41d273a05db3ab2dbe1c87c001ec5ef3b6e93d5145ac5f7168b64ec55c6049de715b7aa82ac1cdd3dec0a771c2c9f31794067734ddb125
+  metadata.gz: 3582142dd14dd4076c9972b35d0d00599d55376659990111a7012ed574c098f3cf635274259bc013d3d1d6600d9308920274cc77e9ed8ee587d0f73b9f9834e0
+  data.tar.gz: cae3993eb2452cbcce670c6b1262b74e9b6c516a0e80820c1f3b20ec1ec0c0a65de6f138383814c8aae5aa82aa6e8b1a799a9848ed79564fe609bce5afe9e014

data/README.md CHANGED

@@ -1,8 +1,11 @@
-# Statistics
+# Ruby Statistics
-Welcome to your new gem! In this directory, you'll find the files you need to be able to package up your Ruby library into a gem. Put your Ruby code in the file `lib/statistics`. To experiment with that code, run `bin/console` for an interactive prompt.
+A basic ruby gem that implements some statistical methods, functions and concepts to be used in any ruby environment without depending on any mathematical software like `R`, `Matlab`, `Octave` or similar.
-TODO: Delete this and the text above, and describe your gem
+We got the inspiration from the folks at [JStat](https://github.com/jstat/jstat) and some interesting lectures about [Keystroke dynamics](http://www.biometric-solutions.com/keystroke-dynamics.html).
+Some logic and algorithms are extractions or adaptations from other authors, which are referenced in the comments.
+This software is released under the MIT License.
 ## Installation
@@ -24,7 +27,7 @@ Or install it yourself as:
 just require the `statistics` gem in order to load it. If you don't have defined the `Distribution` namespace, the gem will assign an alias, reducing the number of namespaces needed to use a class.
-Right know you can load:
+Right now you can load:
 * The whole statistics gem. `require 'statistics'`
 * A namespace. `require 'statistics/distribution'`
@@ -48,7 +51,7 @@ To install this gem onto your local machine, run `bundle exec rake install`. To
 ## Contributing
-Bug reports and pull requests are welcome on GitHub at https://github.com/[USERNAME]/statistics. This project is intended to be a safe, welcoming space for collaboration, and contributors are expected to adhere to the [Contributor Covenant](http://contributor-covenant.org) code of conduct.
+Bug reports and pull requests are welcome on GitHub at https://github.com/estebanz01/ruby-statistics. This project is intended to be a safe, welcoming space for collaboration, and contributors are expected to adhere to the [Contributor Covenant](http://contributor-covenant.org) code of conduct.
 ## License
@@ -56,4 +59,10 @@ The gem is available as open source under the terms of the [MIT License](http://
 ## Code of Conduct
-Everyone interacting in the Statistics project’s codebases, issue trackers, chat rooms and mailing lists is expected to follow the [code of conduct](https://github.com/[USERNAME]/statistics/blob/master/CODE_OF_CONDUCT.md).
+Everyone interacting in the Statistics project’s codebases, issue trackers, chat rooms and mailing lists is expected to follow the [code of conduct](https://github.com/estebanz01/ruby-statistics/blob/master/CODE_OF_CONDUCT.md).
+## Contact
+You can contact me via:
+* [Github](https://github.com/estebanz01)
+* [Twitter](https://twitter.com/estebanz01)

data/lib/statistics/statistical_test/f_test.rb ADDED

@@ -0,0 +1,83 @@
+module Statistics
+  module StatisticalTest
+    class FTest
+      # This method calculates the one-way ANOVA F-test statistic.
+      # We assume that all specified arguments are arrays.
+      # It returns an array with three elements:
+      #   [F-statistic or F-score, degrees of freedom numerator, degrees of freedom denominator].
+      #
+      # Formulas extracted from:
+      # https://courses.lumenlearning.com/boundless-statistics/chapter/one-way-anova/
+      # http://sphweb.bumc.bu.edu/otlt/MPH-Modules/BS/BS704_HypothesisTesting-ANOVA/BS704_HypothesisTesting-Anova_print.html
+      def self.anova_f_score(*args)
+        # If only two groups have been specified as arguments, we follow the classic F-Test for
+        # equality of variances, which is the ratio between the variances.
+        f_score = nil
+        df1 = nil
+        df2 = nil
+        if args.size == 2
+          variances = [args[0].variance, args[1].variance]
+          f_score = variances.max/variances.min.to_f
+          df1 = 1 # k-1 (k = 2)
+          df2 = args.flatten.size - 2 # N-k (k = 2)
+        elsif args.size > 2
+          total_groups = args.size
+          total_elements = args.flatten.size
+          overall_mean = args.flatten.mean
+          sample_sizes = args.map(&:size)
+          sample_means = args.map(&:mean)
+          sample_stds = args.map(&:standard_deviation)
+          # Variance between groups
+          iterator = sample_sizes.each_with_index
+          variance_between_groups = iterator.reduce(0) do |summation, (size, index)|
+            inner_calculation = size * ((sample_means[index] - overall_mean) ** 2)
+            summation += (inner_calculation / (total_groups - 1).to_f)
+          end
+          # Variance within groups
+          variance_within_groups = (0...total_groups).reduce(0) do |outer_summation, group_index|
+            outer_summation += args[group_index].reduce(0) do |inner_sumation, observation|
+              inner_calculation = ((observation - sample_means[group_index]) ** 2)
+              inner_sumation += (inner_calculation / (total_elements - total_groups).to_f)
+            end
+          end
+          f_score = variance_between_groups/variance_within_groups.to_f
+          df1 = total_groups - 1
+          df2 = total_elements - total_groups
+        end
+        [f_score, df1, df2]
+      end
+      # This method expects the alpha value and the groups to calculate the one-way ANOVA test.
+      # It returns a hash with the intermediate values and the test result (whether to reject the null hypothesis or not).
+      # Keep in mind that the values for the alternative key (true/false) do not imply that the alternative hypothesis
+      # is TRUE or FALSE. They are only a notational convenience to decide whether to reject the null hypothesis or not.
+      def self.one_way_anova(alpha, *args)
+        f_score, df1, df2 = *self.anova_f_score(*args) # Splat array result
+        return if f_score.nil? || df1.nil? || df2.nil?
+        probability = Distribution::F.new(df1, df2).cumulative_function(f_score)
+        p_value = 1 - probability
+        # According to https://stats.stackexchange.com/questions/29158/do-you-reject-the-null-hypothesis-when-p-alpha-or-p-leq-alpha
+        # We can assume that if p_value <= alpha, we can safely reject the null hypothesis, ie. accept the alternative hypothesis.
+        { probability: probability,
+          p_value: p_value,
+          alpha: alpha,
+          null: alpha < p_value,
+          alternative: p_value <= alpha,
+          confidence_level: 1 - alpha }
+      end
+    end
+  end
+end
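
The multi-group branch of `anova_f_score` above can be read as the ratio of the mean square between groups to the mean square within groups. The following is a minimal, self-contained sketch of that calculation on plain arrays. It does not use the gem: the `mean` helper and the `decision` method are hypothetical names introduced here for illustration, and the sketch covers only the case of three or more groups.

```ruby
# Hypothetical helper (not part of the gem): arithmetic mean of an array.
def mean(values)
  values.sum / values.size.to_f
end

# One-way ANOVA F-score for three or more groups of numbers.
# Returns [f_score, df_between, df_within], mirroring the return
# shape described in the comments of the diff above.
def anova_f_score(*groups)
  total_groups = groups.size
  total_elements = groups.sum(&:size)
  overall_mean = mean(groups.flatten)

  # Sum of squares between groups: size-weighted squared distance
  # of each group mean from the overall mean.
  ss_between = groups.sum { |g| g.size * (mean(g) - overall_mean)**2 }

  # Sum of squares within groups: squared distance of each
  # observation from the mean of its own group.
  ss_within = groups.sum do |g|
    group_mean = mean(g)
    g.sum { |x| (x - group_mean)**2 }
  end

  df_between = total_groups - 1
  df_within = total_elements - total_groups
  f_score = (ss_between / df_between) / (ss_within / df_within)
  [f_score, df_between, df_within]
end

# Decision rule used in the diff: reject the null hypothesis
# when p_value <= alpha. The p-value itself is taken as given here.
def decision(p_value, alpha)
  { null: alpha < p_value, alternative: p_value <= alpha }
end

f_score, df1, df2 = anova_f_score([1, 2, 3], [2, 3, 4], [5, 6, 7])
# f_score is approximately 13.0, with df1 = 2 and df2 = 6
```

Dividing each sum of squares once by its degrees of freedom gives the same result as dividing every term inside the loop, as the diff does. The p-value is left out because it needs the F distribution's cumulative function, which the gem provides through `Distribution::F`.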

data/lib/statistics/statistical_test/t_test.rb ADDED

@@ -0,0 +1,46 @@
+module Statistics
+  module StatisticalTest
+    class TTest
+      # Perform a T-Test for one or two samples.
+      # For the tails param, we need a symbol: :one_tail or :two_tail
+      def self.perform(alpha, tails, *args)
+        return if args.size < 2
+        degrees_of_freedom = 0
+        t_score = if args[0].is_a? Numeric
+                    data_mean = args[1].mean
+                    data_std = args[1].standard_deviation
+                    comparison_mean = args[0]
+                    degrees_of_freedom = args[1].size
+                    (data_mean - comparison_mean)/(data_std / Math.sqrt(args[1].size).to_f).to_f
+                  else
+                    sample_left_mean = args[0].mean
+                    sample_left_variance = args[0].variance
+                    sample_right_variance = args[1].variance
+                    sample_right_mean = args[1].mean
+                    degrees_of_freedom = args.flatten.size - 2
+                    left_root = sample_left_variance/args[0].size.to_f
+                    right_root = sample_right_variance/args[1].size.to_f
+                    standard_error = Math.sqrt(left_root + right_root)
+                    (sample_left_mean - sample_right_mean)/standard_error.to_f
+                  end
+        probability = Distribution::TStudent.new(degrees_of_freedom).cumulative_function(t_score)
+        p_value = 1 - probability
+        p_value *= 2 if tails == :two_tail
+        { probability: probability,
+          p_value: p_value,
+          alpha: alpha,
+          null: alpha < p_value,
+          alternative: p_value <= alpha,
+          confidence_level: 1 - alpha }
+      end
+    end
+  end
+end
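
The two branches of `TTest.perform` above compute a one-sample and a two-sample t-score. Below is a small sketch of both statistics on plain arrays, independent of the gem. The helper names (`mean`, `sample_variance`, `one_sample_t`, `two_sample_t`) are hypothetical, and the sketch assumes the unbiased sample variance (dividing by n - 1); whether the gem's `variance` does the same is not shown in this diff. Note one deliberate difference: the sketch uses the conventional n - 1 degrees of freedom for the one-sample case, whereas the code in the diff passes the full sample size.

```ruby
# Hypothetical helper (not part of the gem): arithmetic mean.
def mean(values)
  values.sum / values.size.to_f
end

# Assumption: unbiased sample variance, dividing by n - 1.
def sample_variance(values)
  m = mean(values)
  values.sum { |x| (x - m)**2 } / (values.size - 1)
end

# One-sample t-score against a reference mean.
# Returns [t_score, degrees_of_freedom], using n - 1 degrees of freedom.
def one_sample_t(reference_mean, sample)
  standard_error = Math.sqrt(sample_variance(sample) / sample.size)
  [(mean(sample) - reference_mean) / standard_error, sample.size - 1]
end

# Two-sample t-score with the same standard error as the diff:
# sqrt(var_left / n_left + var_right / n_right).
# Returns [t_score, degrees_of_freedom], using n_left + n_right - 2.
def two_sample_t(left, right)
  standard_error = Math.sqrt(
    sample_variance(left) / left.size + sample_variance(right) / right.size
  )
  [(mean(left) - mean(right)) / standard_error, left.size + right.size - 2]
end

t_one, df_one = one_sample_t(2, [1, 2, 3, 4, 5])
t_two, df_two = two_sample_t([1, 2, 3], [4, 5, 6])
```

When the left sample has the smaller mean, the two-sample score is negative. In that case a p-value computed as `1 - cumulative_function(t_score)` and then doubled for `:two_tail` can exceed 1, so callers of the code in the diff may want to order the samples accordingly or compare against the absolute value of the score.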

data/lib/statistics/version.rb CHANGED

@@ -1,3 +1,3 @@
 module Statistics
-  VERSION = "0.5.0"
+  VERSION = "1.0.0"
 end

metadata CHANGED

@@ -1,14 +1,14 @@
 --- !ruby/object:Gem::Specification
 name: ruby-statistics
 version: !ruby/object:Gem::Version
-  version: 0.5.0
+  version: 1.0.0
 platform: ruby
 authors:
 - esteban zapata
 autorequire:
 bindir: exe
 cert_chain: []
-date: 2017-09-11 00:00:00.000000000 Z
+date: 2017-10-16 00:00:00.000000000 Z
 dependencies:
 - !ruby/object:Gem::Dependency
   name: bundler
@@ -129,6 +129,8 @@ files:
 - lib/statistics/distribution/t_student.rb
 - lib/statistics/distribution/uniform.rb
 - lib/statistics/distribution/weibull.rb
+- lib/statistics/statistical_test/f_test.rb
+- lib/statistics/statistical_test/t_test.rb
 - lib/statistics/version.rb
 - ruby-statistics.gemspec
 homepage: https://github.com/estebanz01/ruby-statistics
@@ -151,7 +153,7 @@ required_rubygems_version: !ruby/object:Gem::Requirement
       version: '0'
 requirements: []
 rubyforge_project:
-rubygems_version: 2.6.13
+rubygems_version: 2.6.14
 signing_key:
 specification_version: 4
 summary: A ruby gem for some specific statistics. Inspired by the jStat js library.