
AutoStatLib 0.2.0.tar.gz → 0.2.2.tar.gz



Files changed (17)

{autostatlib-0.2.0/src/AutoStatLib.egg-info → autostatlib-0.2.2}/PKG-INFO RENAMED

@@ -1,6 +1,6 @@
 Metadata-Version: 2.2
 Name: AutoStatLib
-Version: 0.2.0
+Version: 0.2.2
 Summary: AutoStatLib - a simple statistical analysis tool
 Author: Stemonitis, SciWare LLC
 Author-email: konung-yaropolk <yaropolk1995@gmail.com>
@@ -531,6 +531,7 @@ License-File: LICENSE
 Requires-Dist: numpy
 Requires-Dist: scipy
 Requires-Dist: statsmodels
+Requires-Dist: pandas
 # AutoStatLib - python library for automated statistical analysis
@@ -569,7 +570,7 @@ data_uniform = [list(np.random.uniform(i+3, i+1, n)) for i in range(groups)]
 # set the parameters:
-paired = False     # is groups dependend or not
+paired = False     # is groups dependent or not
 tails = 2          # two-tailed or one-tailed result
 popmean = 0        # population mean - only for single-sample tests needed
@@ -585,7 +586,7 @@ analysis.RunAuto()
 or you can choose specific tests:
 ```python
-# 2 groups independend:
+# 2 groups independent:
 analysis.RunTtest()
 analysis.RunMannWhitney()
@@ -594,10 +595,11 @@ analysis.RunTtestPaired()
 analysis.RunWilcoxon()
 # 3 and more independed groups comparison:
-analysis.RunAnova()
+analysis.RunOnewayAnova()
 analysis.RunKruskalWallis()
 # 3 and more depended groups comparison:
+analysis.RunOnewayAnovaRM()
 analysis.RunFriedman()
 # single group tests"
@@ -647,20 +649,40 @@ If errors occured, *GetResult()* returns an empty dictionary
 ---
-## Pre-Alpha dev status.
-### TODO:
---Kruskal-Wallis test - add Dunn's multiple comparisons
---Anova: add 2-way anova and 3-way(?)
-check:
---Wilcoxon signed-rank test and Mann-whitney - check mechanism of one-tailed calc, looks like it works wrong
-checked tests:
---Wilcoxon 2 tail - ok
---Mann-whitney 2 tail - ok
+## Pre-Alpha dev status.
+### TODO:
+-- Kruskal-Wallis test - add Dunn's multiple comparisons
+-- Anova: add 2-way anova and 3-way anova
+-- onevay Anova: add repeated measures (for normal dependent values) with and without Gaisser-Greenhouse correction
+-- onevay Anova: add Brown-Forsithe and Welch (for normal independent values with unequal SDs between groups)
+-- paired T-test: add ratio-paired t-test (ratios of paired values are consistent)
+-- add Welch test (for norm data unequal variances)
+-- add Kolmogorov-smirnov test (unpaired nonparametric 2 sample, compare cumulative distributions)
+-- add independent t-test with Welch correction (do not assume equal SDs in groups)
+-- add correlation test, correlation diagram
+-- add linear regression, regression diagram
+-- add QQ plot
+-- n-sample tests: add onetail option
+✅ done -- detailed normality test results
+checked tests:
+1-sample:
+--Wilcoxon 2,1 tails - ok
+--t-tests 2,1 tails -ok
+2-sample:
+--Wilcoxon 2,1 tails - ok
+--Mann-whitney 2,1 tails - ok
+--t-tests 2,1 tails -ok
+n-sample:
+--Kruskal-Wallis 2 tail - ok
+--Friedman 2 tail - ok
+--one-way ANOWA 2 tail - ok

{autostatlib-0.2.0 → autostatlib-0.2.2}/README.md RENAMED

@@ -35,7 +35,7 @@ data_uniform = [list(np.random.uniform(i+3, i+1, n)) for i in range(groups)]
 # set the parameters:
-paired = False     # is groups dependend or not
+paired = False     # is groups dependent or not
 tails = 2          # two-tailed or one-tailed result
 popmean = 0        # population mean - only for single-sample tests needed
@@ -51,7 +51,7 @@ analysis.RunAuto()
 or you can choose specific tests:
 ```python
-# 2 groups independend:
+# 2 groups independent:
 analysis.RunTtest()
 analysis.RunMannWhitney()
@@ -60,10 +60,11 @@ analysis.RunTtestPaired()
 analysis.RunWilcoxon()
 # 3 and more independed groups comparison:
-analysis.RunAnova()
+analysis.RunOnewayAnova()
 analysis.RunKruskalWallis()
 # 3 and more depended groups comparison:
+analysis.RunOnewayAnovaRM()
 analysis.RunFriedman()
 # single group tests"
@@ -113,20 +114,40 @@ If errors occured, *GetResult()* returns an empty dictionary
 ---
-## Pre-Alpha dev status.
-### TODO:
---Kruskal-Wallis test - add Dunn's multiple comparisons
---Anova: add 2-way anova and 3-way(?)
-check:
---Wilcoxon signed-rank test and Mann-whitney - check mechanism of one-tailed calc, looks like it works wrong
-checked tests:
---Wilcoxon 2 tail - ok
---Mann-whitney 2 tail - ok
+## Pre-Alpha dev status.
+### TODO:
+-- Kruskal-Wallis test - add Dunn's multiple comparisons
+-- Anova: add 2-way anova and 3-way anova
+-- onevay Anova: add repeated measures (for normal dependent values) with and without Gaisser-Greenhouse correction
+-- onevay Anova: add Brown-Forsithe and Welch (for normal independent values with unequal SDs between groups)
+-- paired T-test: add ratio-paired t-test (ratios of paired values are consistent)
+-- add Welch test (for norm data unequal variances)
+-- add Kolmogorov-smirnov test (unpaired nonparametric 2 sample, compare cumulative distributions)
+-- add independent t-test with Welch correction (do not assume equal SDs in groups)
+-- add correlation test, correlation diagram
+-- add linear regression, regression diagram
+-- add QQ plot
+-- n-sample tests: add onetail option
+✅ done -- detailed normality test results
+checked tests:
+1-sample:
+--Wilcoxon 2,1 tails - ok
+--t-tests 2,1 tails -ok
+2-sample:
+--Wilcoxon 2,1 tails - ok
+--Mann-whitney 2,1 tails - ok
+--t-tests 2,1 tails -ok
+n-sample:
+--Kruskal-Wallis 2 tail - ok
+--Friedman 2 tail - ok
+--one-way ANOWA 2 tail - ok

autostatlib-0.2.0/src/AutoStatLib.egg-info/requires.txt → autostatlib-0.2.2/requirements.txt RENAMED

@@ -1,3 +1,4 @@
 numpy
 scipy
 statsmodels
+pandas

{autostatlib-0.2.0 → autostatlib-0.2.2}/src/AutoStatLib/AutoStatLib.py RENAMED

@@ -1,6 +1,8 @@
 import numpy as np
+import pandas as pd
 from statsmodels.stats.diagnostic import lilliefors
-from scipy.stats import ttest_rel, ttest_ind, ttest_1samp, wilcoxon, mannwhitneyu, f_oneway, kruskal, friedmanchisquare, shapiro, kstest, anderson, normaltest
+from statsmodels.stats.anova import AnovaRM
+from scipy.stats import ttest_rel, ttest_ind, ttest_1samp, wilcoxon, mannwhitneyu, f_oneway, kruskal, friedmanchisquare, shapiro, anderson, normaltest
 class __StatisticalTests():
@@ -8,37 +10,113 @@ class __StatisticalTests():
         Statistical tests mixin
     '''
-    def anova(self):
+    def run_test_auto(self):
+        if self.n_groups == 1:
+            if self.parametric:
+                self.run_test_by_id('t_test_single_sample')
+            else:
+                self.run_test_by_id('wilcoxon_single_sample')
+        elif self.n_groups == 2:
+            if self.paired:
+                if self.parametric:
+                    self.run_test_by_id('t_test_paired')
+                else:
+                    self.run_test_by_id('wilcoxon')
+            else:
+                if self.parametric:
+                    self.run_test_by_id('t_test_independent')
+                else:
+                    self.run_test_by_id('mann_whitney')
+        elif self.n_groups >= 3:
+            if self.paired:
+                if self.parametric:
+                    self.run_test_by_id('anova_1w_rm')
+                else:
+                    self.run_test_by_id('friedman')
+            else:
+                if self.parametric:
+                    self.run_test_by_id('anova_1w_ordinary')
+                else:
+                    self.run_test_by_id('kruskal_wallis')
+        else:
+            pass
+    def run_test_by_id(self, test_id):
+        test_names_dict = {
+            'anova_1w_ordinary': 'Ordinary One-Way ANOVA',
+            'anova_1w_rm': 'Repeated Measures One-Way ANOVA',
+            'friedman': 'Friedman test',
+            'kruskal_wallis': 'Kruskal-Wallis test',
+            'mann_whitney': 'Mann-Whitney U test',
+            't_test_independent': 't-test for independent samples',
+            't_test_paired': 't-test for paired samples',
+            't_test_single_sample': 'Single-sample t-test',
+            'wilcoxon': 'Wilcoxon signed-rank test',
+            'wilcoxon_single_sample': 'Wilcoxon signed-rank test for single sample',
+        }
+        match test_id:
+            case 'anova_1w_ordinary': stat, p_value = self.anova_1w_ordinary()
+            case 'anova_1w_rm': stat, p_value = self.anova_1w_rm()
+            case 'friedman': stat, p_value = self.friedman()
+            case 'kruskal_wallis': stat, p_value = self.kruskal_wallis()
+            case 'mann_whitney': stat, p_value = self.mann_whitney()
+            case 't_test_independent': stat, p_value = self.t_test_independent()
+            case 't_test_paired': stat, p_value = self.t_test_paired()
+            case 't_test_single_sample': stat, p_value = self.t_test_single_sample()
+            case 'wilcoxon': stat, p_value = self.wilcoxon()
+            case 'wilcoxon_single_sample': stat, p_value = self.wilcoxon_single_sample()
+        if test_id in self.test_ids_dependent:
+            self.paired = True
+        else:
+            self.paired = False
+        self.test_name = test_names_dict[test_id]
+        self.test_id = test_id
+        self.test_stat = stat
+        self.p_value = p_value
+    def anova_1w_ordinary(self):
         stat, p_value = f_oneway(*self.data)
         self.tails = 2
         # if self.tails == 1 and p_value > 0.5:
         #     p_value /= 2
         # if self.tails == 1:
         #     p_value /= 2
-        self.test_name = 'ANOVA'
-        self.test_id = 'anova'
-        self.paired = False
-        self.test_stat = stat
-        self.p_value = p_value
+        return stat, p_value
+    def anova_1w_rm(self):
+        """
+        Perform repeated measures one-way ANOVA test.
+        Parameters:
+        data: list of lists, where each sublist represents repeated measures for a subject
+        """
+        df = self.matrix_to_dataframe(self.data)
+        res = AnovaRM(df, 'Value', 'Row', within=['Col']).fit()
+        stat = res.anova_table['F Value'][0]
+        p_value = res.anova_table['Pr > F'][0]
+        self.tails = 2
+        return stat, p_value
-    def friedman_test(self):
+    def friedman(self):
         stat, p_value = friedmanchisquare(*self.data)
         self.tails = 2
-        self.test_name = 'Friedman test'
-        self.test_id = 'friedman'
-        self.paired = True
-        self.test_stat = stat
-        self.p_value = p_value
+        return stat, p_value
-    def kruskal_wallis_test(self):
+    def kruskal_wallis(self):
         stat, p_value = kruskal(*self.data)
-        self.test_name = 'Kruskal-Wallis test'
-        self.test_id = 'kruskal_wallis'
-        self.paired = False
-        self.test_stat = stat
-        self.p_value = p_value
+        return stat, p_value
-    def mann_whitney_u_test(self):
+    def mann_whitney(self):
         stat, p_value = mannwhitneyu(
             self.data[0], self.data[1], alternative='two-sided')
         if self.tails == 1:
@@ -49,78 +127,53 @@ class __StatisticalTests():
         #     self.data[0], self.data[1], alternative='two-sided' if self.tails == 2 else 'less')
         # if self.tails == 1 and p_value > 0.5:
         #     p_value = 1-p_value
+        return stat, p_value
-        self.test_name = 'Mann-Whitney U test'
-        self.test_id = 'mann_whitney'
-        self.paired = False
-        self.test_stat = stat
-        self.p_value = p_value
-    def t_test_independend(self):
-        t_stat, t_p_value = ttest_ind(
+    def t_test_independent(self):
+        stat, p_value = ttest_ind(
             self.data[0], self.data[1])
         if self.tails == 1:
-            t_p_value /= 2
-        self.test_name = 't-test for independend samples'
-        self.test_id = 't_test_independend'
-        self.paired = False
-        self.test_stat = t_stat
-        self.p_value = t_p_value
+            p_value /= 2
+        return stat, p_value
     def t_test_paired(self):
-        t_stat, t_p_value = ttest_rel(
+        stat, p_value = ttest_rel(
             self.data[0], self.data[1])
         if self.tails == 1:
-            t_p_value /= 2
-        self.test_name = 't-test for paired samples'
-        self.test_id = 't_test_paired'
-        self.paired = True
-        self.test_stat = t_stat
-        self.p_value = t_p_value
+            p_value /= 2
+        return stat, p_value
     def t_test_single_sample(self):
         if self.popmean == None:
             self.popmean = 0
             self.AddWarning('no_pop_mean_set')
-        t_stat, t_p_value = ttest_1samp(self.data[0], self.popmean)
+        stat, p_value = ttest_1samp(self.data[0], self.popmean)
         if self.tails == 1:
-            t_p_value /= 2
-        self.test_name = 'Single-sample t-test'
-        self.test_id = 't_test_single_sample'
-        self.paired = False
-        self.test_stat = t_stat
-        self.p_value = t_p_value
+            p_value /= 2
+        return stat, p_value
+    def wilcoxon(self):
+        stat, p_value = wilcoxon(self.data[0], self.data[1])
+        if self.tails == 1:
+            p_value /= 2
+        return stat, p_value
     def wilcoxon_single_sample(self):
         if self.popmean == None:
             self.popmean = 0
             self.AddWarning('no_pop_mean_set')
         data = [i - self.popmean for i in self.data[0]]
-        w_stat, p_value = wilcoxon(data)
+        stat, p_value = wilcoxon(data)
         if self.tails == 1:
             p_value /= 2
-        self.test_name = 'Wilcoxon signed-rank test for single sample'
-        self.test_id = 'wilcoxon_single_sample'
-        self.paired = False
-        self.test_stat = w_stat
-        self.p_value = p_value
-    def wilcoxon(self):
-        stat, p_value = wilcoxon(self.data[0], self.data[1])
-        if self.tails == 1:
-            p_value /= 2
-        self.test_name = 'Wilcoxon signed-rank test'
-        self.test_id = 'wilcoxon'
-        self.paired = True
-        self.test_stat = stat
-        self.p_value = p_value
+        return stat, p_value
 class __NormalityTests():
     '''
         Normality tests mixin
-        see the article about minimum sample size for tests:
+        see the article about minimal sample size for tests:
         Power comparisons of Shapiro-Wilk, Kolmogorov-Smirnov,
         Lilliefors and Anderson-Darling tests, Nornadiah Mohd Razali1, Yap Bee Wah1
     '''
@@ -171,7 +224,7 @@ class __NormalityTests():
     def anderson_get_p(self, data, dist='norm'):
         '''
-            calculating p-value for Anderson-Darling test using the method described here:
+            calculating p-value for Anderson-Darling test using the method described here:
             Computation of Probability Associated with Anderson-Darling Statistic
             Lorentz Jantschi and Sorana D. Bolboaca, 2018 - Mathematics
@@ -199,6 +252,65 @@ class __NormalityTests():
         return ad, p
+class __Helpers():
+    def matrix_to_dataframe(self, matrix):
+        data = []
+        cols = []
+        rows = []
+        order_number = 1
+        for i, row in enumerate(matrix):
+            for j, value in enumerate(row):
+                data.append(value)
+                cols.append(i)
+                rows.append(j)
+                order_number += 1
+        df = pd.DataFrame(
+            {'Row': rows, 'Col': cols, 'Value': data})
+        return df
+    def create_results_dict(self) -> dict:
+        self.stars_int = self.make_stars()
+        self.stars_str = '*' * self.stars_int if self.stars_int else 'ns'
+        return {
+            'p-value': self.make_p_value_printed(),
+            'Significance(p<0.05)':  True if self.p_value.item() < 0.05 else False,
+            'Stars_Printed': self.stars_str,
+            'Test_Name': self.test_name,
+            'Groups_Compared': self.n_groups,
+            'Population_Mean': self.popmean if self.n_groups == 1 else 'N/A',
+            'Data_Normaly_Distributed': self.parametric,
+            'Parametric_Test_Applied': True if self.test_id in self.test_ids_parametric else False,
+            'Paired_Test_Applied': self.paired,
+            'Tails': self.tails,
+            'p-value_exact': self.p_value.item(),
+            'Stars':  self.stars_int,
+            # 'Stat_Value': self.test_stat.item(),
+            'Warnings': self.warnings,
+            'Groups_N': [len(self.data[i]) for i in range(len(self.data))],
+            'Groups_Median': [np.median(self.data[i]).item() for i in range(len(self.data))],
+            'Groups_Mean': [np.mean(self.data[i]).item() for i in range(len(self.data))],
+            'Groups_SD': [np.std(self.data[i]).item() for i in range(len(self.data))],
+            'Groups_SE': [np.std(self.data[i]).item() / np.sqrt(len(self.data)).item() for i in range(len(self.data))],
+            # actually returns list of lists of numpy dtypes of float64, next make it return regular floats:
+            'Samples': self.data,
+        }
+    def log(self, *args, **kwargs):
+        message = ' '.join(map(str, args))
+        # print(message, **kwargs)
+        self.summary += '\n' + message
+    def AddWarning(self, warning_id):
+        message = self.warning_ids_all[warning_id]
+        self.log(message)
+        self.warnings.append(message)
 class __TextFormatting():
     '''
         Text formatting mixin
@@ -293,45 +405,6 @@ class __TextFormatting():
             else:
                 self.log(i, ':', ' ' * shift, self.results[i])
-    def create_results_dict(self) -> dict:
-        self.stars_int = self.make_stars()
-        self.stars_str = '*' * self.stars_int if self.stars_int else 'ns'
-        return {
-            'p-value': self.make_p_value_printed(),
-            'Significance(p<0.05)':  True if self.p_value.item() < 0.05 else False,
-            'Stars_Printed': self.stars_str,
-            'Test_Name': self.test_name,
-            'Groups_Compared': self.n_groups,
-            'Population_Mean': self.popmean if self.n_groups == 1 else 'N/A',
-            'Data_Normaly_Distributed': self.parametric,
-            'Parametric_Test_Applied': True if self.test_id in self.test_ids_parametric else False,
-            'Paired_Test_Applied': self.paired,
-            'Tails': self.tails,
-            'p-value_exact': self.p_value.item(),
-            'Stars':  self.stars_int,
-            # 'Stat_Value': self.test_stat.item(),
-            'Warnings': self.warnings,
-            'Groups_N': [len(self.data[i]) for i in range(len(self.data))],
-            'Groups_Median': [np.median(self.data[i]).item() for i in range(len(self.data))],
-            'Groups_Mean': [np.mean(self.data[i]).item() for i in range(len(self.data))],
-            'Groups_SD': [np.std(self.data[i]).item() for i in range(len(self.data))],
-            'Groups_SE': [np.std(self.data[i]).item() / np.sqrt(len(self.data)).item() for i in range(len(self.data))],
-            # actually returns list of lists of numpy dtypes of float64, next make it return regular floats:
-            'Samples': self.data,
-        }
-    def log(self, *args, **kwargs):
-        message = ' '.join(map(str, args))
-        # print(message, **kwargs)
-        self.summary += '\n' + message
-    def AddWarning(self, warning_id):
-        message = self.warning_ids_all[warning_id]
-        self.log(message)
-        self.warnings.append(message)
 class __InputFormatting():
     def floatify_recursive(self, data):
@@ -349,7 +422,7 @@ class __InputFormatting():
                 return None
-class StatisticalAnalysis(__StatisticalTests, __NormalityTests, __TextFormatting, __InputFormatting):
+class StatisticalAnalysis(__StatisticalTests, __NormalityTests, __TextFormatting, __InputFormatting, __Helpers):
     '''
         The main class
         *documentation placeholder*
@@ -372,21 +445,49 @@ class StatisticalAnalysis(__StatisticalTests, __NormalityTests, __TextFormatting
         self.n_groups = len(self.groups_list)
         self.warning_flag_non_numeric_data = False
         self.summary = ''
-        self.test_ids_parametric = ['anova',
-                                    't_test_independend',
-                                    't_test_paired',
-                                    't_test_single_sample',]
+        # test IDs classification:
         self.test_ids_all = [  # in aplhabetical order
-            'anova',
+            'anova_1w_ordinary',
+            'anova_1w_rm',
             'friedman',
             'kruskal_wallis',
             'mann_whitney',
-            't_test_independend',
+            't_test_independent',
             't_test_paired',
             't_test_single_sample',
             'wilcoxon',
             'wilcoxon_single_sample',
         ]
+        self.test_ids_parametric = [
+            'anova_1w_ordinary',
+            'anova_1w_rm'
+            't_test_independent',
+            't_test_paired',
+            't_test_single_sample',
+        ]
+        self.test_ids_dependent = [
+            'anova_1w_rm',
+            'friedman',
+            't_test_paired',
+            'wilcoxon',
+        ]
+        self.test_ids_3sample = [
+            'anova_1w_ordinary',
+            'anova_1w_rm',
+            'friedman',
+            'kruskal_wallis',
+        ]
+        self.test_ids_2sample = [
+            'mann_whitney',
+            't_test_independent',
+            't_test_paired',
+            'wilcoxon',
+        ]
+        self.test_ids_1sample = [
+            't_test_single_sample',
+            'wilcoxon_single_sample',
+        ]
         self.warning_ids_all = {
             # 'not-numeric':                     '\nWarning: Non-numeric data was found in input and ignored.\n         Make sure the input data is correct to get the correct results\n',
             'param_test_with_non-normal_data': '\nWarning: Parametric test was manualy chosen for Not-Normaly distributed data.\n         The results might be skewed. \n         Please, run non-parametric test or preform automatic test selection.\n',
@@ -425,28 +526,18 @@ class StatisticalAnalysis(__StatisticalTests, __NormalityTests, __TextFormatting
             assert self.data, 'There is no input data'
             assert self.tails in [1, 2], 'Tails parameter can be 1 or 2 only'
             assert test in self.test_ids_all or test == 'auto', 'Wrong test id choosen, ensure you called correct function'
-            assert not (self.n_groups > 1
-                        and (test == 't_test_single_sample'
-                             or test == 'wilcoxon_single_sample')), 'Only one group of data must be given for single-group tests'
             assert all(len(
                 group) >= 4 for group in self.data), 'Each group must contain at least four values'
-            assert not (self.paired == True and not all(len(lst) == len(
-                self.data[0]) for lst in self.data)), 'Paired groups must be the same length'
-            assert not (test == 'friedman' and not all(len(lst) == len(
-                self.data[0]) for lst in self.data)), 'Paired groups must be the same length for Friedman Chi Square test'
-            assert not (test == 't_test_paired' and not all(len(lst) == len(
-                self.data[0]) for lst in self.data)), 'Paired groups must be the same length for Paired t-test'
-            assert not (test == 'wilcoxon' and not all(len(lst) == len(
-                self.data[0]) for lst in self.data)), 'Paired groups must be the same length for Wilcoxon signed-rank test'
-            assert not (test == 'friedman' and self.n_groups <
-                        3), 'At least three groups of data must be given for 3-groups tests'
-            assert not ((test == 'anova'
-                         or test == 'kruskal_wallis') and self.n_groups < 2), 'At least two groups of data must be given for ANOVA or Kruskal Wallis tests'
-            assert not ((test == 'wilcoxon'
-                         or test == 't_test_independend'
-                         or test == 't_test_paired'
-                         or test == 'mann_whitney')
-                        and self.n_groups != 2), 'Only two groups of data must be given for 2-groups tests'
+            assert not (self.paired == True
+                        and not all(len(lst) == len(self.data[0]) for lst in self.data)), 'Paired groups must have the same length'
+            assert not (test in self.test_ids_dependent
+                        and not all(len(lst) == len(self.data[0]) for lst in self.data)), 'Groups must have the same length for dependent groups test'
+            assert not (test in self.test_ids_2sample
+                        and self.n_groups != 2), f'Only two groups of data must be given for 2-groups tests, got {self.n_groups}'
+            assert not (test in self.test_ids_1sample
+                        and self.n_groups > 1), f'Only one group of data must be given for single-group tests, got {self.n_groups}'
+            assert not (test in self.test_ids_3sample
+                        and self.n_groups < 3), f'At least three groups of data must be given for multi-groups tests, got {self.n_groups}'
         except AssertionError as error:
             self.log('\nTest  :', test)
             self.log('Error :', error)
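
Note on the hunk above: the input checks are written as `assert` statements, which Python skips entirely when the interpreter runs with `-O`. The following is a minimal sketch, not the package's code, of the same rules expressed with explicit exceptions. The function `validate_groups` and the module-level sets are hypothetical names; the ID values are copied from the lists added in this release.

```python
# Hypothetical standalone helper; not part of AutoStatLib.
TEST_IDS_1SAMPLE = {'t_test_single_sample', 'wilcoxon_single_sample'}
TEST_IDS_2SAMPLE = {'mann_whitney', 't_test_independent',
                    't_test_paired', 'wilcoxon'}
TEST_IDS_3SAMPLE = {'anova_1w_ordinary', 'anova_1w_rm',
                    'friedman', 'kruskal_wallis'}
TEST_IDS_DEPENDENT = {'anova_1w_rm', 'friedman', 't_test_paired', 'wilcoxon'}


def validate_groups(test_id, data, paired=False):
    """Raise ValueError if `data` does not fit the requested test."""
    n_groups = len(data)
    if n_groups == 0:
        raise ValueError('There is no input data')
    if any(len(group) < 4 for group in data):
        raise ValueError('Each group must contain at least four values')

    # Dependent (paired) designs need one value per subject in every group.
    same_length = all(len(group) == len(data[0]) for group in data)
    if (paired or test_id in TEST_IDS_DEPENDENT) and not same_length:
        raise ValueError('Dependent groups must have the same length')

    # Group-count rules, one per test family.
    if test_id in TEST_IDS_1SAMPLE and n_groups != 1:
        raise ValueError(f'Single-group test needs 1 group, got {n_groups}')
    if test_id in TEST_IDS_2SAMPLE and n_groups != 2:
        raise ValueError(f'Two-group test needs 2 groups, got {n_groups}')
    if test_id in TEST_IDS_3SAMPLE and n_groups < 3:
        raise ValueError(f'Multi-group test needs 3+ groups, got {n_groups}')
```

Unlike the assertions in the diff, these checks stay active under `python -O`; the caller can still catch `ValueError` and log it the way the `except AssertionError` branch does.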
@@ -490,27 +581,13 @@ class StatisticalAnalysis(__StatisticalTests, __NormalityTests, __TextFormatting
         if not test == 'auto' and self.parametric and not test in self.test_ids_parametric:
             self.AddWarning('non-param_test_with_normal_data')
-        if test == 'anova':
-            self.anova()
-        elif test == 'friedman':
-            self.friedman_test()
-        elif test == 'kruskal_wallis':
-            self.kruskal_wallis_test()
-        elif test == 'mann_whitney':
-            self.mann_whitney_u_test()
-        elif test == 't_test_independend':
-            self.t_test_independend()
-        elif test == 't_test_paired':
-            self.t_test_paired()
-        elif test == 't_test_single_sample':
-            self.t_test_single_sample()
-        elif test == 'wilcoxon':
-            self.wilcoxon()
-        elif test == 'wilcoxon_single_sample':
-            self.wilcoxon_single_sample()
+        # run the test
+        if test in self.test_ids_all:
+            self.run_test_by_id(test)
         else:
-            self.log('Automatic test selection preformed.')
-            self.__auto()
+            self.run_test_auto()
         # print the results
         self.results = self.create_results_dict()
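
Note on the dispatch change: `run_test_by_id` uses a `match` statement, which requires Python 3.10 or newer, and an ID that matches no `case` falls through silently until the later name lookup fails. A dictionary-based dispatch is one possible alternative that runs on older interpreters and rejects unknown IDs up front. This is only a sketch: `TESTS`, `run_by_id`, and the stub functions are hypothetical, and the stubs return fixed placeholder numbers rather than real statistics.

```python
from typing import Callable, Dict, Tuple


def _stub_t_test_paired() -> Tuple[float, float]:
    # Placeholder values; a real implementation would call scipy.stats.
    return 2.5, 0.03


def _stub_mann_whitney() -> Tuple[float, float]:
    # Placeholder values; a real implementation would call scipy.stats.
    return 11.0, 0.2


# test id -> (display name, is the test paired?, callable returning (stat, p))
TESTS: Dict[str, Tuple[str, bool, Callable[[], Tuple[float, float]]]] = {
    't_test_paired': ('t-test for paired samples', True, _stub_t_test_paired),
    'mann_whitney': ('Mann-Whitney U test', False, _stub_mann_whitney),
}


def run_by_id(test_id: str) -> dict:
    """Look up a test by id, run it, and collect its metadata in one place."""
    try:
        name, paired, func = TESTS[test_id]
    except KeyError:
        raise ValueError(f'Unknown test id: {test_id!r}') from None
    stat, p_value = func()
    return {
        'test_id': test_id,
        'test_name': name,
        'paired': paired,
        'test_stat': stat,
        'p_value': p_value,
    }
```

Keeping the display name and the paired flag next to the callable means a new test is registered in a single table entry instead of in the name dictionary, the `case` list, and `test_ids_dependent` separately.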
@@ -523,32 +600,7 @@ class StatisticalAnalysis(__StatisticalTests, __NormalityTests, __TextFormatting
         if self.verbose == True:
             print(self.summary)
-    def __auto(self):
-        if self.n_groups == 2:
-            if self.paired:
-                if self.parametric:
-                    return self.t_test_paired()
-                else:
-                    return self.wilcoxon()
-            else:
-                if self.parametric:
-                    return self.t_test_independend()
-                else:
-                    return self.mann_whitney_u_test()
-        elif self.n_groups == 1:
-            if self.parametric:
-                return self.t_test_single_sample()
-            else:
-                return self.wilcoxon_single_sample()
-        else:
-            if self.paired:
-                return self.friedman_test()
-            else:
-                if self.parametric:
-                    return self.anova()
-                else:
-                    return self.kruskal_wallis_test()
     # public methods:
     def RunAuto(self):
@@ -557,8 +609,11 @@ class StatisticalAnalysis(__StatisticalTests, __NormalityTests, __TextFormatting
     def RunManual(self, test):
         self.__run_test(test)
-    def RunAnova(self):
-        self.__run_test(test='anova')
+    def RunOnewayAnova(self):
+        self.__run_test(test='anova_1w_ordinary')
+    def RunOnewayAnovaRM(self):
+        self.__run_test(test='anova_1w_rm')
     def RunFriedman(self):
         self.__run_test(test='friedman')
@@ -570,7 +625,7 @@ class StatisticalAnalysis(__StatisticalTests, __NormalityTests, __TextFormatting
         self.__run_test(test='mann_whitney')
     def RunTtest(self):
-        self.__run_test(test='t_test_independend')
+        self.__run_test(test='t_test_independent')
     def RunTtestPaired(self):
         self.__run_test(test='t_test_paired')
@@ -603,6 +658,9 @@ class StatisticalAnalysis(__StatisticalTests, __NormalityTests, __TextFormatting
         else:
             return self.summary
+    def GetTestIDs(self):
+        return self.test_ids_all
     def PrintSummary(self):
         print(self.summary)
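
Note on `test_ids_parametric` as added in this release: the entry `'anova_1w_rm'` is not followed by a comma, so Python joins it with the next string literal. The short sketch below reproduces the effect on membership checks; the variable names `with_bug` and `fixed` are illustrative only.

```python
# Adjacent string literals are concatenated at compile time,
# so a missing comma silently merges two list entries.
with_bug = [
    'anova_1w_ordinary',
    'anova_1w_rm'            # no comma: merged with the next literal
    't_test_independent',
    't_test_paired',
    't_test_single_sample',
]

fixed = [
    'anova_1w_ordinary',
    'anova_1w_rm',
    't_test_independent',
    't_test_paired',
    't_test_single_sample',
]

print(len(with_bug))                       # 4
print(with_bug[1])                         # anova_1w_rmt_test_independent
print('t_test_independent' in with_bug)    # False
print('t_test_independent' in fixed)       # True
```

With the list as released, checks such as `self.test_id in self.test_ids_parametric` would appear to report both the independent t-test and the repeated measures ANOVA as non-parametric, which affects the `Parametric_Test_Applied` field and the `non-param_test_with_normal_data` warning.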

{autostatlib-0.2.0 → autostatlib-0.2.2}/src/AutoStatLib/_version.py RENAMED

@@ -1,2 +1,2 @@
 # AutoStatLib package version:
-__version__ = "0.2.0"
+__version__ = "0.2.2"

{autostatlib-0.2.0 → autostatlib-0.2.2/src/AutoStatLib.egg-info}/PKG-INFO RENAMED

@@ -1,6 +1,6 @@
 Metadata-Version: 2.2
 Name: AutoStatLib
-Version: 0.2.0
+Version: 0.2.2
 Summary: AutoStatLib - a simple statistical analysis tool
 Author: Stemonitis, SciWare LLC
 Author-email: konung-yaropolk <yaropolk1995@gmail.com>
@@ -531,6 +531,7 @@ License-File: LICENSE
 Requires-Dist: numpy
 Requires-Dist: scipy
 Requires-Dist: statsmodels
+Requires-Dist: pandas
 # AutoStatLib - python library for automated statistical analysis
@@ -569,7 +570,7 @@ data_uniform = [list(np.random.uniform(i+3, i+1, n)) for i in range(groups)]
 # set the parameters:
-paired = False     # is groups dependend or not
+paired = False     # is groups dependent or not
 tails = 2          # two-tailed or one-tailed result
 popmean = 0        # population mean - only for single-sample tests needed
@@ -585,7 +586,7 @@ analysis.RunAuto()
 or you can choose specific tests:
 ```python
-# 2 groups independend:
+# 2 groups independent:
 analysis.RunTtest()
 analysis.RunMannWhitney()
@@ -594,10 +595,11 @@ analysis.RunTtestPaired()
 analysis.RunWilcoxon()
 # 3 and more independed groups comparison:
-analysis.RunAnova()
+analysis.RunOnewayAnova()
 analysis.RunKruskalWallis()
 # 3 and more depended groups comparison:
+analysis.RunOnewayAnovaRM()
 analysis.RunFriedman()
 # single group tests"
@@ -647,20 +649,40 @@ If errors occured, *GetResult()* returns an empty dictionary
 ---
-## Pre-Alpha dev status.
-### TODO:
---Kruskal-Wallis test - add Dunn's multiple comparisons
---Anova: add 2-way anova and 3-way(?)
-check:
---Wilcoxon signed-rank test and Mann-whitney - check mechanism of one-tailed calc, looks like it works wrong
-checked tests:
---Wilcoxon 2 tail - ok
---Mann-whitney 2 tail - ok
+## Pre-Alpha dev status.
+### TODO:
+-- Kruskal-Wallis test - add Dunn's multiple comparisons
+-- Anova: add 2-way anova and 3-way anova
+-- onevay Anova: add repeated measures (for normal dependent values) with and without Gaisser-Greenhouse correction
+-- onevay Anova: add Brown-Forsithe and Welch (for normal independent values with unequal SDs between groups)
+-- paired T-test: add ratio-paired t-test (ratios of paired values are consistent)
+-- add Welch test (for norm data unequal variances)
+-- add Kolmogorov-smirnov test (unpaired nonparametric 2 sample, compare cumulative distributions)
+-- add independent t-test with Welch correction (do not assume equal SDs in groups)
+-- add correlation test, correlation diagram
+-- add linear regression, regression diagram
+-- add QQ plot
+-- n-sample tests: add onetail option
+✅ done -- detailed normality test results
+checked tests:
+1-sample:
+--Wilcoxon 2,1 tails - ok
+--t-tests 2,1 tails -ok
+2-sample:
+--Wilcoxon 2,1 tails - ok
+--Mann-whitney 2,1 tails - ok
+--t-tests 2,1 tails -ok
+n-sample:
+--Kruskal-Wallis 2 tail - ok
+--Friedman 2 tail - ok
+--one-way ANOWA 2 tail - ok