PyPI - AutoStatLib - Versions diffs - 0.2.21__tar.gz → 0.2.23__tar.gz - Mend

AutoStatLib 0.2.21tar.gz → 0.2.23tar.gz


Files changed (23)

autostatlib-0.2.23/PKG-INFO ADDED

@@ -0,0 +1,192 @@
+Metadata-Version: 2.4
+Name: AutoStatLib
+Version: 0.2.23
+Summary: AutoStatLib - a simple statistical analysis tool
+Author: Stemonitis, SciWare LLC
+Author-email: konung-yaropolk <yaropolk1995@gmail.com>
+License-Expression: LGPL-2.1-or-later
+Project-URL: Homepage, https://github.com/konung-yaropolk/AutoStatLib
+Project-URL: Repository, https://github.com/konung-yaropolk/AutoStatLib.git
+Project-URL: Issues, https://github.com/konung-yaropolk/AutoStatLib/issues
+Keywords: Science,Statistics
+Classifier: Programming Language :: Python
+Classifier: Programming Language :: Python :: 3
+Classifier: Programming Language :: Python :: 3.10
+Classifier: Operating System :: OS Independent
+Classifier: Development Status :: 4 - Beta
+Classifier: Intended Audience :: Developers
+Classifier: Intended Audience :: Science/Research
+Classifier: Natural Language :: English
+Classifier: Topic :: Software Development :: Libraries
+Classifier: Topic :: Software Development :: Libraries :: Python Modules
+Classifier: Topic :: Scientific/Engineering
+Classifier: Topic :: Scientific/Engineering :: Information Analysis
+Requires-Python: >=3.10
+Description-Content-Type: text/markdown
+License-File: LICENSE
+Requires-Dist: numpy
+Requires-Dist: scipy
+Requires-Dist: statsmodels
+Requires-Dist: matplotlib
+Requires-Dist: seaborn
+Requires-Dist: scikit-posthocs
+Requires-Dist: pandas
+Dynamic: license-file
+# AutoStatLib - python library for automated statistical analysis
+[![pypi_version](https://img.shields.io/pypi/v/AutoStatLib?label=PyPI&color=green)](https://pypi.org/project/AutoStatLib)
+[![GitHub Release](https://img.shields.io/github/v/release/konung-yaropolk/AutoStatLib?label=GitHub&color=green&link=https%3A%2F%2Fgithub.com%2Fkonung-yaropolk%2FAutoStatLib)](https://github.com/konung-yaropolk/AutoStatLib)
+[![PyPI - License](https://img.shields.io/pypi/l/AutoStatLib)](https://pypi.org/project/AutoStatLib)
+[![Python](https://img.shields.io/badge/Python-v3.10%5E-green?logo=python)](https://pypi.org/project/AutoStatLib)
+[![PyPI - Downloads](https://img.shields.io/pypi/dm/AutoStatLib?label=PyPI%20stats&color=blue)](https://pypi.org/project/AutoStatLib)
+### To install run the command:
+```bash
+pip install autostatlib
+```
+### Example use case:
+See the /demo directory in the Git repository, or
+use the following example:
+```python
+import numpy as np
+import AutoStatLib
+# generate random data:
+groups = 2
+n = 30
+# normal data
+data_norm = [list(np.random.normal(.5*i + 4, abs(1-.2*i), n))
+        for i in range(groups)]
+# non-normal data
+data_uniform = [list(np.random.uniform(i+3, i+1, n)) for i in range(groups)]
+# set the parameters:
+paired = False     # whether the groups are dependent (paired)
+tails = 2          # two-tailed or one-tailed result
+popmean = 0        # population mean, needed only for single-sample tests
+# initiate the analysis
+analysis = AutoStatLib.StatisticalAnalysis(
+    data_norm, paired=paired, tails=tails, popmean=popmean)
+```
+Now you can perform automated statistical test selection:
+```python
+analysis.RunAuto()
+```
+or you can choose specific tests:
+```python
+# 2 groups independent:
+analysis.RunTtest()
+analysis.RunMannWhitney()
+# 2 groups paired:
+analysis.RunTtestPaired()
+analysis.RunWilcoxon()
+# 3 or more independent groups comparison:
+analysis.RunOnewayAnova()
+analysis.RunKruskalWallis()
+# 3 or more dependent groups comparison:
+analysis.RunOnewayAnovaRM()
+analysis.RunFriedman()
+# single group tests:
+analysis.RunTtestSingleSample()
+analysis.RunWilcoxonSingleSample()
+```
+The test summary is printed to the console.
+You can also get it as a Python string via the *GetSummary()* method.
+---
+Test results are accessible as a dictionary via the *GetResult()* method:
+```python
+results = analysis.GetResult()
+```
+The results dictionary keys and their value types:
+```
+{
+    'p_value' :                    String
+    'Significance(p<0.05)' :       Boolean
+    'Stars_Printed' :              String
+    'Test_Name' :                  String
+    'Groups_Compared' :            Integer
+    'Population_Mean' :            Float   (taken from the input)
+    'Data_Normaly_Distributed' :   Boolean
+    'Parametric_Test_Applied' :    Boolean
+    'Paired_Test_Applied' :        Boolean
+    'Tails' :                      Integer (taken from the input)
+    'p_value_exact' :              Float
+    'Stars' :                      Integer
+    'Warnings' :                   String
+    'Groups_N' :                   List of integers
+    'Groups_Median' :              List of floats
+    'Groups_Mean' :                List of floats
+    'Groups_SD' :                  List of floats
+    'Groups_SE' :                  List of floats
+    'Samples' :                    List of input values by groups
+                                           (taken from the input)
+    'Posthoc_Matrix' :             2D List of floats
+    'Posthoc_Matrix_bool' :        2D List of Boolean
+    'Posthoc_Matrix_printed':      2D List of String
+    'Posthoc_Matrix_stars':        2D List of String
+}
+```
+If errors occurred, *GetResult()* returns an empty dictionary.
+---
+## Alpha dev status.
+### TODO:
+-- Anova: posthocs
+-- Anova: add 2-way anova and 3-way anova
+-- one-way Anova: add repeated measures (for normal dependent values) with and without Geisser-Greenhouse correction
+-- one-way Anova: add Brown-Forsythe and Welch (for normal independent values with unequal SDs between groups)
+-- paired T-test: add ratio-paired t-test (ratios of paired values are consistent)
+-- add Welch test (for norm data unequal variances)
+-- add Kolmogorov-Smirnov test (unpaired nonparametric 2 sample, compare cumulative distributions)
+-- add independent t-test with Welch correction (do not assume equal SDs in groups)
+-- add correlation test, correlation diagram
+-- add linear regression, regression diagram
+-- add QQ plot
+-- n-sample tests: add onetail option
+✅ done -- detailed normality test results
+✅ done -- added posthoc: Kruskal-Wallis Dunn's multiple comparisons
+tests check:
+1-sample:
+✅ok --Wilcoxon 2,1 tails
+✅ok --t-tests 2,1 tails
+2-sample:
+✅ok --Wilcoxon 2,1 tails
+✅ok --Mann-Whitney 2,1 tails
+✅ok --t-tests 2,1 tails
+n-sample:
+✅ok --Kruskal-Wallis 2 tail
+✅ok --Dunn's multiple comparisons
+✅ok --Friedman 2 tail
+✅ok --one-way ANOVA 2-tailed
+✅ok --Tukey's multiple comparisons
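
The README in this file documents `GetResult()` as returning a dictionary that is empty when errors occur. A minimal sketch of consuming a dictionary of that shape follows. It assumes only the key names listed in the README; the `describe_result` helper and the sample values are hypothetical and are not part of AutoStatLib.

```python
def describe_result(results: dict) -> str:
    """Summarize a results dictionary shaped like the one the README lists."""
    # Per the README, an empty dictionary signals that the analysis failed.
    if not results:
        return 'analysis failed: empty results'
    verdict = 'significant' if results['Significance(p<0.05)'] else 'not significant'
    return f"{results['Test_Name']}: p={results['p_value_exact']:.4f} ({verdict})"


# Hand-made sample values for illustration only (not real library output).
sample = {
    'Test_Name': 'Mann-Whitney U test',
    'p_value_exact': 0.0312,
    'Significance(p<0.05)': True,
}
print(describe_result(sample))
print(describe_result({}))
```

Checking for the empty dictionary first avoids a `KeyError` on the failure path, which is the case the README calls out.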

{autostatlib-0.2.21 → autostatlib-0.2.23}/pyproject.toml RENAMED

@@ -5,7 +5,8 @@ build-backend = "setuptools.build_meta"
 [project]
 name = "AutoStatLib"
 dynamic = ["version", "dependencies"]
-license = {file = "LICENSE"}
+license = "LGPL-2.1-or-later"
+# license = {file = "LICENSE"}
 authors = [
   { name="konung-yaropolk", email="yaropolk1995@gmail.com" },
   { name="Stemonitis"},
@@ -18,8 +19,8 @@ requires-python = ">=3.10"
 classifiers = [
     "Programming Language :: Python",
     "Programming Language :: Python :: 3",
-    "Programming Language :: Python :: 3.12",
-    "License :: OSI Approved :: GNU Lesser General Public License v2 or later (LGPLv2+)",
+    "Programming Language :: Python :: 3.10",
+    # "License :: OSI Approved :: GNU Lesser General Public License v2 or later (LGPLv2+)",
     "Operating System :: OS Independent",
     "Development Status :: 4 - Beta",
     "Intended Audience :: Developers",

{autostatlib-0.2.21 → autostatlib-0.2.23}/src/AutoStatLib/AutoStatLib.py RENAMED

@@ -19,7 +19,9 @@ class StatisticalAnalysis(StatisticalTests, NormalityTests, TextFormatting, Help
                  popmean=None,
                  posthoc=False,
                  verbose=True,
-                 groups_name=[]):
+                 raise_errors=False,
+                 groups_name=[],
+                 subgrouping=[]):
         self.results = None
         self.error = False
         self.groups_list = groups_list
@@ -28,10 +30,11 @@ class StatisticalAnalysis(StatisticalTests, NormalityTests, TextFormatting, Help
         self.popmean = popmean
         self.posthoc = posthoc
         self.verbose = verbose
+        self.raise_errors = raise_errors
         self.n_groups = len(self.groups_list)
         self.groups_name = [groups_name[i % len(groups_name)]
-                             for i in range(self.n_groups)] if groups_name and groups_name != [''] else [f'Group {i+1}' for i in range(self.n_groups)]
+                            for i in range(self.n_groups)] if groups_name and groups_name != [''] else [f'Group {i+1}' for i in range(self.n_groups)]
+        self.subgrouping = subgrouping if subgrouping else [0]
         self.warning_flag_non_numeric_data = False
         self.summary = 'AutoStatLib v{}'.format(__version__)
@@ -99,7 +102,6 @@ class StatisticalAnalysis(StatisticalTests, NormalityTests, TextFormatting, Help
             'no_pop_mean_set':                 '\nWarning: No Population Mean was set up for single-sample test, used default 0 value.\n         The results might be skewed. \n         Please, set the Population Mean and run the test again.\n',
         }
     def run_test(self, test='auto'):
         # reset values from previous tests
@@ -111,6 +113,7 @@ class StatisticalAnalysis(StatisticalTests, NormalityTests, TextFormatting, Help
         self.test_id = None
         self.test_stat = None
         self.p_value = None
+        self.parametric = None
         self.posthoc_matrix_df = None
         self.posthoc_matrix = []
         self.posthoc_name = ''
@@ -128,7 +131,7 @@ class StatisticalAnalysis(StatisticalTests, NormalityTests, TextFormatting, Help
         # delete the empty cols from input
         self.data = [col for col in self.data if any(
             x is not None for x in col)]
         # User input assertion block
         try:
             assert self.data, 'There is no input data'
@@ -137,9 +140,9 @@ class StatisticalAnalysis(StatisticalTests, NormalityTests, TextFormatting, Help
             assert all(len(
                 group) >= 4 for group in self.data), 'Each group must contain at least four values'
             assert not (self.paired is True
-                        and not all(len(lst) == len(self.data[0]) for lst in self.data)), 'Paired groups must have the same length'
+                        and not all(len(lst) == len(self.data[0]) for lst in self.data)), 'Paired samples must have the same length'
             assert not (test in self.test_ids_dependent
-                        and not all(len(lst) == len(self.data[0]) for lst in self.data)), 'Groups must have the same length for dependent groups test'
+                        and not all(len(lst) == len(self.data[0]) for lst in self.data)), 'Samples must have the same length for the dependend statistics test'
             assert not (test in self.test_ids_2sample
                         and self.n_groups != 2), f'Only two groups of data must be given for 2-groups tests, got {self.n_groups}'
             assert not (test in self.test_ids_1sample
@@ -147,11 +150,22 @@ class StatisticalAnalysis(StatisticalTests, NormalityTests, TextFormatting, Help
             assert not (test in self.test_ids_3sample
                         and self.n_groups < 3), f'At least three groups of data must be given for multi-groups tests, got {self.n_groups}'
         except AssertionError as error:
-            self.log('\nTest  :', test)
-            self.log('Error :', error)
-            self.log('-'*67 + '\n')
-            self.error = True
-            print(self.summary)
+            self.run_test_by_id('none')
+            self.results = self.create_results_dict()
+            if self.raise_errors:
+                raise ValueError(error)
+            # Print errmessage:
+            if self.verbose:
+                self.log('\nTest  :', test)
+                self.log('Error :', error)
+                self.log('-'*67 + '\n')
+                self.error = True
+                print(self.summary)
+            else:
+                print('AutoStatLib Error :', error)
             return
         # Print the data
@@ -165,7 +179,7 @@ class StatisticalAnalysis(StatisticalTests, NormalityTests, TextFormatting, Help
         self.log('Shapiro-Wilk, Lilliefors, Anderson-Darling, D\'Agostino-Pearson')
         self.log(
             '[+] -positive, [-] -negative, [ ] -too small group for the test\n')
-        self.log('        Test   :   SW  LF  AD  AP  ')
+        self.log('                   SW  LF  AD  AP  ')
         for i, data in enumerate(self.data):
             poll = self.check_normality(data)
             isnormal = any(poll)
@@ -173,7 +187,7 @@ class StatisticalAnalysis(StatisticalTests, NormalityTests, TextFormatting, Help
                 '+' if x is True else '-' if x is False else ' ' if x is None else 'e' for x in poll)
             self.normals.append(isnormal)
             self.log(
-                f'        {self.groups_name[i].ljust(7, ' ')[:7]}:    {poll_print[0]}   {poll_print[1]}   {poll_print[2]}   {poll_print[3]}   so disrtibution seems {"normal" if isnormal else "not normal"}')
+                f'    {self.groups_name[i].ljust(11, ' ')[:11]}:    {poll_print[0]}   {poll_print[1]}   {poll_print[2]}   {poll_print[3]}   so disrtibution seems {"normal" if isnormal else "not normal"}')
         self.parametric = all(self.normals)
         # print test choosen
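
The hunks above add a `raise_errors` flag: when an input assertion fails, the new code either re-raises it as a `ValueError` or prints a message, depending on `raise_errors` and `verbose`. The control flow can be sketched in isolation as below. This is a simplified stand-in under stated assumptions, not the library's code: `validate_groups` is a hypothetical free function, and it reproduces only two of the assertions shown in the diff.

```python
def validate_groups(data, raise_errors=False, verbose=True):
    """Return True if the input passes the checks, False if an error was reported."""
    try:
        assert data, 'There is no input data'
        assert all(len(group) >= 4 for group in data), \
            'Each group must contain at least four values'
    except AssertionError as error:
        if raise_errors:
            # Surface the problem to the caller instead of printing it.
            raise ValueError(error)
        if verbose:
            print('Error :', error)
        else:
            # Short one-line form, as in the non-verbose branch of the diff.
            print('AutoStatLib Error :', error)
        return False
    return True


print(validate_groups([[1, 2, 3, 4], [5, 6, 7, 8]]))
print(validate_groups([[1, 2]], verbose=False))
try:
    validate_groups([], raise_errors=True)
except ValueError as err:
    print('caught:', err)
```

With `raise_errors=True` a caller can handle bad input in an ordinary `try`/`except` block instead of parsing printed output. Note that `assert` statements are skipped when Python runs with `-O`, so this pattern relies on assertions being enabled.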
