qdesc 1.0.7__tar.gz

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
@@ -0,0 +1,191 @@
1
+ GNU GENERAL PUBLIC LICENSE
2
+ Version 3, 29 June 2007
3
+
4
+ Copyright (C) 2007 Free Software Foundation, Inc. <https://fsf.org/>
5
+ Everyone is permitted to copy and distribute verbatim copies
6
+ of this license document, but changing it is not allowed.
7
+
8
+ Preamble
9
+
10
+ The GNU General Public License is a free, copyleft license for
11
+ software and other kinds of works.
12
+
13
+ The licenses for most software and other practical works are designed
14
+ to take away your freedom to share and change the works. By contrast,
15
+ the GNU General Public License is intended to guarantee your freedom to
16
+ share and change all versions of a program--to make sure it remains free
17
+ software for all its users. We, the Free Software Foundation, use the
18
+ GNU General Public License for most of our software; it applies also to
19
+ any other work released this way by its authors. You can apply it to
20
+ your programs, too.
21
+
22
+ When we speak of free software, we are referring to freedom, not
23
+ price. Our General Public Licenses are designed to make sure that you
24
+ have the freedom to distribute copies of free software (and charge for
25
+ them if you wish), that you receive source code or can get it if you
26
+ want it, that you can change the software or use pieces of it in new
27
+ free programs, and that you know you can do these things.
28
+
29
+ To protect your rights, we need to prevent others from denying you
30
+ these rights or asking you to surrender the rights. Therefore, you have
31
+ certain responsibilities if you distribute copies of the software, or if
32
+ you modify it: responsibilities to respect the freedom of others.
33
+
34
+ For example, if you distribute copies of such a program, whether
35
+ gratis or for a fee, you must pass on to the recipients the same
36
+ freedoms that you received. You must make sure that they, too, receive
37
+ or can get the source code. And you must show them these terms so they
38
+ know their rights.
39
+
40
+ Developers that use the GNU GPL protect your rights with two steps:
41
+ (1) assert copyright on the software, and (2) offer you this License
42
+ giving you legal permission to copy, distribute and/or modify it.
43
+
44
+ For the developers' and authors' protection, the GPL clearly explains
45
+ that there is no warranty for this free software. For both users' and
46
+ authors' sake, the GPL requires that modified versions be marked as
47
+ changed, so that their problems will not be attributed erroneously to
48
+ authors of previous versions.
49
+
50
+ Some devices are designed to deny users access to install or run
51
+ modified versions of the software inside them, although the manufacturer
52
+ can do so. This is fundamentally incompatible with the aim of protecting
53
+ users' freedom to change the software. The systematic pattern of such
54
+ abuse occurs in the area of products for individuals to use, which is
55
+ precisely where it is most unacceptable. Therefore, we have designed
56
+ this version of the GPL to prohibit the practice for those products. If
57
+ such problems arise substantially in other domains, we stand ready to
58
+ extend this provision to those domains in future versions of the GPL, as
59
+ needed to protect the freedom of users.
60
+
61
+ Finally, every program is threatened constantly by software patents.
62
+ States should not allow patents to restrict development and use of
63
+ software on general-purpose computers, but in those that do, we wish to
64
+ avoid the special danger that patents applied to a free program could
65
+ make it effectively proprietary. To prevent this, the GPL assures that
66
+ patents cannot be used to render the program non-free.
67
+
68
+ The precise terms and conditions for copying, distribution and
69
+ modification follow.
70
+
71
+ TERMS AND CONDITIONS
72
+
73
+ 0. Definitions.
74
+
75
+ "This License" refers to version 3 of the GNU General Public License.
76
+
77
+ "Copyright" also means copyright-like laws that apply to other kinds
78
+ of works, such as semiconductor masks.
79
+
80
+ "The Program" refers to any copyrightable work licensed under this
81
+ License. Each licensee is addressed as "you". "Licensees" and
82
+ "recipients" may be individuals or organizations.
83
+
84
+ To "modify" a work means to copy from or adapt all or part of the work
85
+ in a fashion requiring copyright permission, other than the making of an
86
+ exact copy. The resulting work is called a "modified version" of the
87
+ earlier work or a work "based on" the earlier work.
88
+
89
+ A "covered work" means either the unmodified Program or a work based
90
+ on the Program.
91
+
92
+ To "propagate" a work means to do anything with it that, without
93
+ permission, would make you directly or secondarily liable for
94
+ infringement under applicable copyright law, except executing it on a
95
+ computer or modifying a private copy. Propagation includes copying,
96
+ distribution (with or without modification), making available to the
97
+ public, and in some countries other activities as well.
98
+
99
+ To "convey" a work means any kind of propagation that enables other
100
+ parties to make or receive copies. Mere interaction with a user through
101
+ a computer network, with no transfer of a copy, is not conveying.
102
+
103
+ An interactive user interface displays "Appropriate Legal Notices"
104
+ to the extent that it includes a convenient and prominently visible
105
+ feature that (1) displays an appropriate copyright notice, and (2)
106
+ tells the user that there is no warranty for the work (except to the
107
+ extent that warranties are provided), that licensees may convey the
108
+ work under this License, and how to view a copy of this License. If
109
+ the interface presents a list of user commands or options, such as a
110
+ menu, a prominent item in the list meets this criterion.
111
+
112
+ 1. Source Code.
113
+
114
+ The "source code" for a work means the preferred form of the work
115
+ for making modifications to it. "Object code" means any non-source
116
+ form of a work.
117
+
118
+ A "Standard Interface" means an interface that either is an official
119
+ standard defined by a recognized standards body, or, in the case of
120
+ interfaces specified for a particular programming language, one that
121
+ is widely used among developers working in that language.
122
+
123
+ The "System Libraries" of an executable work include anything, other
124
+ than the work as a whole, that (a) is included in the normal form of
125
+ packaging a Major Component, but which is not part of that Major
126
+ Component, and (b) serves only to enable use of the work with that
127
+ Major Component, or to implement a Standard Interface for which an
128
+ implementation is available to the public in source code form. A
129
+ "Major Component", in this context, means a major essential component
130
+ (kernel, window system, and so on) of the specific operating system
131
+ (if any) on which the executable work runs, or a compiler used to
132
+ produce the work, or an object code interpreter used to run it.
133
+
134
+ The "Corresponding Source" for a work in object code form means all
135
+ the source code needed to generate, install, and (for an executable
136
+ work) run the object code and to modify the work, including scripts to
137
+ control those activities. However, it does not include the work's
138
+ System Libraries, or general-purpose tools or generally available free
139
+ programs which are used unmodified in performing those activities but
140
+ which are not part of the work. For example, Corresponding Source
141
+ includes interface definition files associated with source files for
142
+ the work, and the source code for shared libraries and dynamically
143
+ linked subprograms that the work is specifically designed to require,
144
+ such as by intimate data communication or control flow between those
145
+ subprograms and other parts of the work.
146
+
147
+ The Corresponding Source need not include anything that users
148
+ can regenerate automatically from other parts of the Corresponding
149
+ Source.
150
+
151
+ The Corresponding Source for a work in source code form is that
152
+ same work.
153
+
154
+ 2. Basic Permissions.
155
+
156
+ All rights granted under this License are granted for the term of
157
+ copyright on the Program, and are irrevocable provided the stated
158
+ conditions are met. This License explicitly affirms your unlimited
159
+ permission to run the unmodified Program. The output from running a
160
+ covered work is covered by this License only if the output, given its
161
+ content, constitutes a covered work. This License acknowledges your
162
+ rights of fair use or other equivalent, as provided by copyright law.
163
+
164
+ You may make, run and propagate covered works that you do not
165
+ convey, without conditions so long as your license otherwise remains
166
+ in force. You may convey covered works to others for the sole purpose
167
+ of having them make modifications exclusively for you, or provide you
168
+ with facilities for running those works, provided that you comply with
169
+ the terms of this License in conveying all material for which you do
170
+ not control copyright. Those thus making or running the covered works
171
+ for you must do so exclusively on your behalf, under your direction
172
+ and control, on terms that prohibit them from making any copies of
173
+ your copyrighted material outside their relationship with you.
174
+
175
+ Conveying under any other circumstances is permitted solely under
176
+ the conditions stated below. Sublicensing is not allowed; section 10
177
+ makes it unnecessary.
178
+
179
+ 3. Protecting Users' Legal Rights From Anti-Circumvention Law.
180
+
181
+ No covered work shall be deemed part of an effective technological
182
+ measure under any applicable law fulfilling obligations under article
183
+ 11 of the WIPO copyright treaty adopted on 20 December 1996, or
184
+ similar laws prohibiting or restricting circumvention of such
185
+ measures.
186
+
187
+ When you convey a covered work, you waive any legal power to forbid
188
+ circumvention of technological measures to the extent such circumvention
189
+ is effected by exercising rights under this License with respect to
190
+ the covered work, and you disclaim any intention to limit operation or
191
+ modification of the work as a means of enforcing
qdesc-1.0.7/PKG-INFO ADDED
@@ -0,0 +1,170 @@
1
+ Metadata-Version: 2.4
2
+ Name: qdesc
3
+ Version: 1.0.7
4
+ Summary: Quick and Easy way to do descriptive analysis.
5
+ Author: Paolo Hilado
6
+ Author-email: datasciencepgh@proton.me
7
+ Description-Content-Type: text/markdown
8
+ License-File: LICENCE.txt
9
+ Requires-Dist: pandas
10
+ Requires-Dist: numpy
11
+ Requires-Dist: scipy
12
+ Requires-Dist: seaborn
13
+ Requires-Dist: matplotlib
14
+ Requires-Dist: statsmodels
15
+ Dynamic: author
16
+ Dynamic: author-email
17
+ Dynamic: description
18
+ Dynamic: description-content-type
19
+ Dynamic: license-file
20
+ Dynamic: requires-dist
21
+ Dynamic: summary
22
+
23
+ # <font face = 'Impact' color = '#274472' > qdesc : Quick and Easy Descriptive Analysis </font>
24
+ ![QDesc](https://raw.githubusercontent.com/Dcroix/qdesc/refs/heads/main/QDesc%20logo.png)
25
+
26
+ ![Package Version](https://img.shields.io/badge/version-1.0.7-pink)
27
+ ![Downloads](https://pepy.tech/badge/qdesc)
28
+ ![Python Version](https://img.shields.io/badge/python-3.8%2B-blue)
29
+ [![DOI](https://zenodo.org/badge/990715642.svg)](https://doi.org/10.5281/zenodo.15834554)
30
+ ![License: GPL v3.0](https://img.shields.io/badge/license-GPL%20v3.0-blue)
31
+
32
+ ## <font face = 'Calibri' color = '#274472' > Installation </font>
33
+ ```sh
34
+ pip install qdesc
35
+ ```
36
+
37
+ ## <font face = 'Calibri' color = '#274472' > Overview </font>
38
+ qdesc is a Python package for quick and easy descriptive analysis of quantitative data. It reports the mean and standard deviation for normally distributed variables, and the median and raw median absolute deviation (MAD) for skewed ones. Built-in frequency distribution functions summarize categorical variables and can export the results to a spreadsheet. The package also includes a normality check dashboard that reports the Anderson-Darling statistic alongside histograms and Q-Q plots.
39
+
40
+ ## <font face = 'Calibri' color = '#274472' > Creating a sample dataframe</font>
41
+ ```python
42
+ import pandas as pd
43
+ import numpy as np
44
+
45
+ # Create sample data
46
+ data = {
47
+ "Age": np.random.randint(18, 60, size=15), # Continuous variable
48
+ "Salary": np.random.randint(30000, 120000, size=15), # Continuous variable
49
+ "Department": np.random.choice(["HR", "Finance", "IT", "Marketing"], size=15), # Categorical variable
50
+ "Gender": np.random.choice(["Male", "Female"], size=15), # Categorical variable
51
+ }
52
+ # Create DataFrame
53
+ df = pd.DataFrame(data)
54
+ ```
55
+ ## <font face = 'Calibri' color = '#274472' > qd.desc Function</font>
56
+ The function qd.desc(df) generates the following statistics:
57
+ * count - number of observations
58
+ * mean - measure of central tendency for normal distribution
59
+ * std - measure of spread for normal distribution
60
+ * median - measure of central tendency for skewed distributions or those with outliers
61
+ * MAD - measure of spread for skewed distributions or those with outliers; this is the manually computed raw Median Absolute Deviation (MAD), which is more robust than the standard deviation for non-normal distributions.
62
+ * min - lowest observed value
63
+ * max - highest observed value
64
+ * AD_stat - Anderson-Darling statistic
65
+ * 5% crit_value - critical value at the 5% significance level
67
+
68
+ ```python
69
+ import qdesc as qd
70
+ qd.desc(df)
71
+
72
+ | Variable | Count | Mean | Std Dev | Median | MAD | Min | Max | AD Stat | 5% Crit Value |
73
+ |----------|-------|-------|---------|--------|-------|-------|--------|---------|---------------|
74
+ | Age | 15.0 | 37.87 | 13.51 | 38.0 | 12.0 | 20.0 | 59.0 | 0.41 | 0.68 |
75
+ | Salary | 15.0 | 72724 | 29483 | 67660 | 26311 | 34168 | 119590 | 0.40 | 0.68 |
76
+ ```
77
+
78
+ ## <font face = 'Calibri' color = '#274472' > qd.grp_desc Function</font>
79
+ The function qd.grp_desc(df, "Continuous Var", "Group Var") creates a table of descriptive statistics similar to that of qd.desc, but with the measures
80
+ presented for each level of the grouping variable. It lets you check whether the distribution within each group is approximately normal. Combining it
81
+ with qd.normcheck_dashboard helps you choose the appropriate measures of central tendency and spread.
82
+
83
+ ```python
84
+ import qdesc as qd
85
+ qd.grp_desc(df, "Salary", "Gender")
86
+
87
+ | Gender | Count | Mean | Std Dev | Median | MAD | Min | Max | AD Stat | 5% Crit Value |
88
+ |---------|-------|-----------|-----------|----------|----------|--------|---------|---------|---------------|
89
+ | Female | 7 | 84,871.14 | 32,350.37 | 93,971.0 | 25,619.0 | 40,476 | 119,590 | 0.36 | 0.74 |
90
+ | Male | 8 | 62,096.12 | 23,766.82 | 60,347.0 | 14,278.5 | 34,168 | 106,281 | 0.24 | 0.71 |
91
+ ```
92
+
93
+ ## <font face = 'Calibri' color = '#274472' > qd.freqdist Function</font>
94
+ Run the function qd.freqdist(df, "Variable Name") to easily create a frequency distribution for your chosen categorical variable with the following:
95
+ * Variable levels (e.g., for a Sex variable: Male and Female)
96
+ * Counts - the number of observations
97
+ * Percentage - percentage of observations out of the total.
98
+
99
+ ```python
100
+ import qdesc as qd
101
+ qd.freqdist(df, "Department")
102
+
103
+ | Department | Count | Percentage |
104
+ |------------|-------|------------|
105
+ | IT | 5 | 33.33 |
106
+ | HR | 5 | 33.33 |
107
+ | Marketing | 3 | 20.00 |
108
+ | Finance | 2 | 13.33 |
109
+ ```
110
+
111
+ ## <font face = 'Calibri' color = '#274472' > qd.freqdist_a Function</font>
112
+ Run the function qd.freqdist_a(df, ascending=False) to create frequency distribution tables for all the categorical variables in your data frame, sorted in descending order (the default) or in ascending order (ascending=True). The resulting table includes the following columns:
113
+ * Variable levels (e.g., for Satisfaction: Very Low, Low, Moderate, High, Very High)
114
+ * Counts - the number of observations
115
+ * Percentage - percentage of observations out of the total.
116
+
117
+ ```python
118
+ import qdesc as qd
119
+ qd.freqdist_a(df)
120
+
121
+ | Column | Value | Count | Percentage |
122
+ |------------|----------|-------|------------|
123
+ | Department | IT | 5 | 33.33% |
124
+ | Department | HR | 5 | 33.33% |
125
+ | Department | Marketing| 3 | 20.00% |
126
+ | Department | Finance | 2 | 13.33% |
127
+ | Gender | Male | 8 | 53.33% |
128
+ | Gender | Female | 7 | 46.67% |
129
+ ```
130
+
131
+ ## <font face = 'Calibri' color = '#274472' > qd.freqdist_to_excel Function</font>
132
+ Run the function qd.freqdist_to_excel(df, "Filename.xlsx", ascending=False) to create frequency distribution tables for all the categorical variables in your data frame and save each one as a separate sheet in the .xlsx file. Tables are sorted in descending order (the default) or in ascending order (ascending=True). Each table includes the following columns:
133
+ * Variable levels (e.g., for Satisfaction: Very Low, Low, Moderate, High, Very High)
134
+ * Counts - the number of observations
135
+ * Percentage - percentage of observations out of the total.
136
+
137
+ ```python
138
+ import qdesc as qd
139
+ qd.freqdist_to_excel(df, "Results.xlsx")
140
+
141
+ Frequency distributions written to Results.xlsx
142
+ ```
143
+
144
+ ## <font face = 'Calibri' color = '#274472' > qd.normcheck_dashboard Function</font>
145
+ Run the function qd.normcheck_dashboard(df) to check each numeric variable for normality. It computes the Anderson-Darling statistic and creates visualizations (a Q-Q plot, a histogram, and a boxplot) for judging whether the distribution is approximately normal.
146
+
147
+ ```python
148
+ import qdesc as qd
149
+ qd.normcheck_dashboard(df)
150
+ ```
151
+ ![Descriptive Statistics](https://raw.githubusercontent.com/Dcroix/qdesc/refs/heads/main/qd.normcheck_dashboard.png)
152
+
153
+
154
+ ## <font face = 'Calibri' color = '#3D5B59' > License</font>
155
+ This project is licensed under the GPL-3.0 license. See the LICENCE.txt file for more details.
156
+
157
+ ## <font face = 'Calibri' color = '#3D5B59' > Acknowledgements</font>
158
+ Acknowledgement of the libraries used by this package...
159
+
160
+ ### <font face = 'Calibri' color = '#3D5B59' > Pandas</font>
161
+ pandas is distributed under the BSD 3-Clause License and is developed by the pandas contributors. Copyright (c) 2008-2024, the pandas development team. All rights reserved.
162
+ ### <font face = 'Calibri' color = '#3D5B59' > NumPy</font>
163
+ NumPy is distributed under the BSD 3-Clause License and is developed by the NumPy contributors. Copyright (c) 2005-2024, NumPy Developers. All rights reserved.
164
+ ### <font face = 'Calibri' color = '#3D5B59' > SciPy</font>
165
+ SciPy is distributed under the BSD License and is developed by the SciPy contributors. Copyright (c) 2001-2024, SciPy Developers. All rights reserved.
166
+
167
+
168
+
169
+
170
+
qdesc-1.0.7/README.md ADDED
@@ -0,0 +1,148 @@
1
+ # <font face = 'Impact' color = '#274472' > qdesc : Quick and Easy Descriptive Analysis </font>
2
+ ![QDesc](https://raw.githubusercontent.com/Dcroix/qdesc/refs/heads/main/QDesc%20logo.png)
3
+
4
+ ![Package Version](https://img.shields.io/badge/version-1.0.7-pink)
5
+ ![Downloads](https://pepy.tech/badge/qdesc)
6
+ ![Python Version](https://img.shields.io/badge/python-3.8%2B-blue)
7
+ [![DOI](https://zenodo.org/badge/990715642.svg)](https://doi.org/10.5281/zenodo.15834554)
8
+ ![License: GPL v3.0](https://img.shields.io/badge/license-GPL%20v3.0-blue)
9
+
10
+ ## <font face = 'Calibri' color = '#274472' > Installation </font>
11
+ ```sh
12
+ pip install qdesc
13
+ ```
14
+
15
+ ## <font face = 'Calibri' color = '#274472' > Overview </font>
16
+ qdesc is a Python package for quick and easy descriptive analysis of quantitative data. It reports the mean and standard deviation for normally distributed variables, and the median and raw median absolute deviation (MAD) for skewed ones. Built-in frequency distribution functions summarize categorical variables and can export the results to a spreadsheet. The package also includes a normality check dashboard that reports the Anderson-Darling statistic alongside histograms and Q-Q plots.
17
+
18
+ ## <font face = 'Calibri' color = '#274472' > Creating a sample dataframe</font>
19
+ ```python
20
+ import pandas as pd
21
+ import numpy as np
22
+
23
+ # Create sample data
24
+ data = {
25
+ "Age": np.random.randint(18, 60, size=15), # Continuous variable
26
+ "Salary": np.random.randint(30000, 120000, size=15), # Continuous variable
27
+ "Department": np.random.choice(["HR", "Finance", "IT", "Marketing"], size=15), # Categorical variable
28
+ "Gender": np.random.choice(["Male", "Female"], size=15), # Categorical variable
29
+ }
30
+ # Create DataFrame
31
+ df = pd.DataFrame(data)
32
+ ```
33
+ ## <font face = 'Calibri' color = '#274472' > qd.desc Function</font>
34
+ The function qd.desc(df) generates the following statistics:
35
+ * count - number of observations
36
+ * mean - measure of central tendency for normal distribution
37
+ * std - measure of spread for normal distribution
38
+ * median - measure of central tendency for skewed distributions or those with outliers
39
+ * MAD - measure of spread for skewed distributions or those with outliers; this is the manually computed raw Median Absolute Deviation (MAD), which is more robust than the standard deviation for non-normal distributions.
40
+ * min - lowest observed value
41
+ * max - highest observed value
42
+ * AD_stat - Anderson-Darling statistic
42
+ * 5% crit_value - critical value at the 5% significance level
45
+
46
+ ```python
47
+ import qdesc as qd
48
+ qd.desc(df)
49
+
50
+ | Variable | Count | Mean | Std Dev | Median | MAD | Min | Max | AD Stat | 5% Crit Value |
51
+ |----------|-------|-------|---------|--------|-------|-------|--------|---------|---------------|
52
+ | Age | 15.0 | 37.87 | 13.51 | 38.0 | 12.0 | 20.0 | 59.0 | 0.41 | 0.68 |
53
+ | Salary | 15.0 | 72724 | 29483 | 67660 | 26311 | 34168 | 119590 | 0.40 | 0.68 |
54
+ ```
55
+
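The raw MAD reported by qd.desc is the median of the absolute deviations from the median, and the package source also derives a normalized MAD by scaling the raw value by 1.4826. A minimal, dependency-free sketch of that arithmetic follows; the helper name `mad` and the sample values are illustrative assumptions and not part of the qdesc API.

```python
from statistics import median

def mad(values, normalize=False):
    """Median absolute deviation; hypothetical helper, not a qdesc function."""
    center = median(values)
    raw = median(abs(v - center) for v in values)
    # The factor 1.4826 makes the MAD comparable to the standard deviation
    # when the data are normally distributed.
    return 1.4826 * raw if normalize else raw

ages = [20, 25, 31, 38, 44, 52, 59]  # made-up sample
print(mad(ages))
print(mad(ages, normalize=True))
```

Because both the center and the spread are medians, a single extreme value shifts the result far less than it would shift the standard deviation.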
56
+ ## <font face = 'Calibri' color = '#274472' > qd.grp_desc Function</font>
57
+ The function qd.grp_desc(df, "Continuous Var", "Group Var") creates a table of descriptive statistics similar to that of qd.desc, but with the measures
58
+ presented for each level of the grouping variable. It lets you check whether the distribution within each group is approximately normal. Combining it
59
+ with qd.normcheck_dashboard helps you choose the appropriate measures of central tendency and spread.
60
+
61
+ ```python
62
+ import qdesc as qd
63
+ qd.grp_desc(df, "Salary", "Gender")
64
+
65
+ | Gender | Count | Mean | Std Dev | Median | MAD | Min | Max | AD Stat | 5% Crit Value |
66
+ |---------|-------|-----------|-----------|----------|----------|--------|---------|---------|---------------|
67
+ | Female | 7 | 84,871.14 | 32,350.37 | 93,971.0 | 25,619.0 | 40,476 | 119,590 | 0.36 | 0.74 |
68
+ | Male | 8 | 62,096.12 | 23,766.82 | 60,347.0 | 14,278.5 | 34,168 | 106,281 | 0.24 | 0.71 |
69
+ ```
70
+
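One common way to read the AD Stat and 5% Crit Value columns: when the statistic exceeds the critical value, normality is rejected at the 5% level, which points to the median and MAD rather than the mean and standard deviation. The sketch below encodes that rule. `choose_summary` is a hypothetical helper, not a qdesc function, and the numbers are taken from the grouped Salary example.

```python
def choose_summary(ad_stat, crit_5):
    """Pick summary measures from an Anderson-Darling result (hypothetical helper)."""
    # A statistic above the critical value rejects normality at the 5% level.
    if ad_stat > crit_5:
        return ("median", "MAD")
    return ("mean", "std")

# (group, AD statistic, 5% critical value) from the grouped Salary example.
for group, ad_stat, crit_5 in [("Female", 0.36, 0.74), ("Male", 0.24, 0.71)]:
    print(group, choose_summary(ad_stat, crit_5))
```

With groups this small the test has little power to detect non-normality, so the plots from qd.normcheck_dashboard remain a useful second check.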
71
+ ## <font face = 'Calibri' color = '#274472' > qd.freqdist Function</font>
72
+ Run the function qd.freqdist(df, "Variable Name") to easily create a frequency distribution for your chosen categorical variable with the following:
73
+ * Variable levels (e.g., for a Sex variable: Male and Female)
74
+ * Counts - the number of observations
75
+ * Percentage - percentage of observations out of the total.
76
+
77
+ ```python
78
+ import qdesc as qd
79
+ qd.freqdist(df, "Department")
80
+
81
+ | Department | Count | Percentage |
82
+ |------------|-------|------------|
83
+ | IT | 5 | 33.33 |
84
+ | HR | 5 | 33.33 |
85
+ | Marketing | 3 | 20.00 |
86
+ | Finance | 2 | 13.33 |
87
+ ```
88
+
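The count and percentage arithmetic behind a frequency distribution can be sketched without pandas. In the package source, qd.freqdist divides each count by the total number of rows, while qd.freqdist_a uses normalized value counts, so the two can report different percentages when a column has missing values. The helper `freq_table` and its `include_missing` flag are assumptions made for illustration only.

```python
from collections import Counter

def freq_table(values, include_missing=True):
    """Return (level, count, percentage) rows sorted by descending count.

    Hypothetical helper, not part of qdesc. None marks a missing value.
    """
    present = [v for v in values if v is not None]
    # Denominator: every row, or only the rows that actually hold a value.
    total = len(values) if include_missing else len(present)
    counts = Counter(present)
    return [(level, n, round(n / total * 100, 2)) for level, n in counts.most_common()]

departments = ["IT", "HR", "IT", None, "Finance"]  # made-up sample
print(freq_table(departments))
print(freq_table(departments, include_missing=False))
```

With the all-rows denominator the percentages do not sum to 100 when values are missing, which is worth keeping in mind when comparing the two functions' output.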
89
+ ## <font face = 'Calibri' color = '#274472' > qd.freqdist_a Function</font>
90
+ Run the function qd.freqdist_a(df, ascending=False) to create frequency distribution tables for all the categorical variables in your data frame, sorted in descending order (the default) or in ascending order (ascending=True). The resulting table includes the following columns:
91
+ * Variable levels (e.g., for Satisfaction: Very Low, Low, Moderate, High, Very High)
92
+ * Counts - the number of observations
93
+ * Percentage - percentage of observations out of the total.
94
+
95
+ ```python
96
+ import qdesc as qd
97
+ qd.freqdist_a(df)
98
+
99
+ | Column | Value | Count | Percentage |
100
+ |------------|----------|-------|------------|
101
+ | Department | IT | 5 | 33.33% |
102
+ | Department | HR | 5 | 33.33% |
103
+ | Department | Marketing| 3 | 20.00% |
104
+ | Department | Finance | 2 | 13.33% |
105
+ | Gender | Male | 8 | 53.33% |
106
+ | Gender | Female | 7 | 46.67% |
107
+ ```
108
+
109
+ ## <font face = 'Calibri' color = '#274472' > qd.freqdist_to_excel Function</font>
110
+ Run the function qd.freqdist_to_excel(df, "Filename.xlsx", ascending=False) to create frequency distribution tables for all the categorical variables in your data frame and save each one as a separate sheet in the .xlsx file. Tables are sorted in descending order (the default) or in ascending order (ascending=True). Each table includes the following columns:
111
+ * Variable levels (e.g., for Satisfaction: Very Low, Low, Moderate, High, Very High)
112
+ * Counts - the number of observations
113
+ * Percentage - percentage of observations out of the total.
114
+
115
+ ```python
116
+ import qdesc as qd
117
+ qd.freqdist_to_excel(df, "Results.xlsx")
118
+
119
+ Frequency distributions written to Results.xlsx
120
+ ```
121
+
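Because each table goes to its own sheet, column names have to become valid, unique Excel sheet names. The package source handles this by removing the characters : \ / ? * [ ] and truncating to 31 characters, then appending a numeric suffix on collisions. A sketch of that logic follows; `unique_sheet_name` is an illustrative name that does not appear in qdesc, where the deduplication loop is written inline.

```python
import re

def clean_sheet_name(name):
    # Drop characters Excel rejects in sheet names, then cap at 31 characters.
    return re.sub(r'[:\\/?*\[\]]', '', name).strip()[:31]

def unique_sheet_name(column, used_names):
    """Return a cleaned name not yet in used_names (compared case-insensitively)."""
    base = clean_sheet_name(column)
    name, count = base, 1
    while name.lower() in used_names:
        name = f"{base[:28]}_{count}"  # leave room for the suffix within 31 chars
        count += 1
    used_names.add(name.lower())
    return name

used = set()
print(unique_sheet_name("Dept/Unit", used))
print(unique_sheet_name("dept:unit", used))
```

The case-insensitive comparison matters because Excel treats sheet names that differ only in letter case as the same name.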
122
+ ## <font face = 'Calibri' color = '#274472' > qd.normcheck_dashboard Function</font>
123
+ Run the function qd.normcheck_dashboard(df) to check each numeric variable for normality. It computes the Anderson-Darling statistic and creates visualizations (a Q-Q plot, a histogram, and a boxplot) for judging whether the distribution is approximately normal.
124
+
125
+ ```python
126
+ import qdesc as qd
127
+ qd.normcheck_dashboard(df)
128
+ ```
129
+ ![Descriptive Statistics](https://raw.githubusercontent.com/Dcroix/qdesc/refs/heads/main/qd.normcheck_dashboard.png)
130
+
131
+
132
+ ## <font face = 'Calibri' color = '#3D5B59' > License</font>
133
+ This project is licensed under the GPL-3.0 license. See the LICENCE.txt file for more details.
134
+
135
+ ## <font face = 'Calibri' color = '#3D5B59' > Acknowledgements</font>
136
+ Acknowledgement of the libraries used by this package...
137
+
138
+ ### <font face = 'Calibri' color = '#3D5B59' > Pandas</font>
139
+ pandas is distributed under the BSD 3-Clause License and is developed by the pandas contributors. Copyright (c) 2008-2024, the pandas development team. All rights reserved.
140
+ ### <font face = 'Calibri' color = '#3D5B59' > NumPy</font>
141
+ NumPy is distributed under the BSD 3-Clause License and is developed by the NumPy contributors. Copyright (c) 2005-2024, NumPy Developers. All rights reserved.
142
+ ### <font face = 'Calibri' color = '#3D5B59' > SciPy</font>
143
+ SciPy is distributed under the BSD License and is developed by the SciPy contributors. Copyright (c) 2001-2024, SciPy Developers. All rights reserved.
144
+
145
+
146
+
147
+
148
+
@@ -0,0 +1,213 @@
1
+ from .update_checker import check_for_update
2
+ import sys
3
+ import threading
6
+
7
+ # Run update check automatically on import
8
+ threading.Thread(target=lambda: check_for_update("qdesc"), daemon=True).start()
9
+
10
+ # replace with your current version
11
+
12
+ def desc(df):
13
+ import pandas as pd
14
+ import numpy as np
15
+ from scipy.stats import anderson
16
+
17
+ x = np.round(df.describe().T, 2)
18
+ x = x.iloc[:, [0, 1, 2, 5, 3, 7]]
19
+ x.rename(columns={'50%': 'median'}, inplace=True)
20
+
21
+ mad_raw = {}
22
+ mad_norm = {}
23
+
24
+ for column in df.select_dtypes(include=[np.number]):
25
+ clean_col = df[column].dropna()
26
+
27
+ if len(clean_col) == 0:
28
+ mad_raw[column] = np.nan
29
+ mad_norm[column] = np.nan
30
+ continue
31
+ median = np.median(clean_col)
32
+ abs_dev = np.abs(clean_col - median)
33
+ raw = np.median(abs_dev)
34
+ mad_raw[column] = raw
35
+ mad_norm[column] = 1.4826 * raw # normalized MAD
36
+ mad_df = pd.DataFrame({
37
+ 'MAD_raw': mad_raw,
38
+ 'MAD_norm': mad_norm
39
+ })
40
+ results = {}
41
+ for column in df.select_dtypes(include=[np.number]):
42
+ clean_col = df[column].dropna()
43
+
44
+ if len(clean_col) < 5:
45
+ results[column] = {'AD_stat': np.nan, '5% crit_value': np.nan}
46
+ continue
47
+
48
+ result = anderson(clean_col)
49
+ results[column] = {
50
+ 'AD_stat': result.statistic,
51
+ '5% crit_value': result.critical_values[2]
52
+ }
53
+ anderson_df = pd.DataFrame.from_dict(results, orient='index')
54
+ xl = x.iloc[:, :4]
55
+ xr = x.iloc[:, 4:]
56
+ x_df = np.round(pd.concat([xl, mad_df, xr, anderson_df], axis=1), 2)
57
+ return x_df
58
+
59
+ def grp_desc(df, numeric_col, group_col):
60
+ import pandas as pd
61
+ import numpy as np
62
+ from scipy.stats import median_abs_deviation, anderson
63
+ results = []
64
+ for group, group_df in df.groupby(group_col):
65
+ data = group_df[numeric_col].dropna()
66
+ if len(data) < 2:
67
+
68
+ stats = {
69
+ group_col: group,
70
+ 'count': len(data),
71
+ 'mean': np.nan,
72
+ 'std': np.nan,
73
+ 'median': np.nan,
74
+ 'mad_raw': np.nan, 'mad_norm': np.nan, # same keys as the else branch
75
+ 'min': np.nan,
76
+ 'max': np.nan,
77
+ 'AD_stat': np.nan, # same key as the else branch
78
+ 'crit_5%': np.nan
79
+ }
80
+ else:
81
+ ad_result = anderson(data, dist='norm')
82
+ stats = {
83
+ group_col: group,
84
+ 'count': len(data),
85
+ 'mean': data.mean(),
86
+ 'std': data.std(),
87
+ 'median': data.median(),
88
+ 'mad_raw': median_abs_deviation(data),
89
+ 'mad_norm': median_abs_deviation(data)*1.4826,
90
+ 'min': data.min(),
91
+ 'max': data.max(),
92
+ 'AD_stat': ad_result.statistic,
93
+ 'crit_5%': ad_result.critical_values[2], # 5% is the third value
94
+ }
95
+ results.append(stats)
96
+ return np.round(pd.DataFrame(results),2)
97
+
98
+ def freqdist(df, column_name):
99
+ import pandas as pd
100
+ import numpy as np
101
+ if column_name not in df.columns:
102
+ raise ValueError(f"Column '{column_name}' not found in DataFrame.")
103
+
104
+ if df[column_name].dtype not in ['object', 'category']:
105
+ raise ValueError(f"Column '{column_name}' is not a categorical column.")
106
+
107
+ freq_dist = df[column_name].value_counts().reset_index()
108
+ freq_dist.columns = [column_name, 'Count']
109
+ freq_dist['Percentage'] = np.round((freq_dist['Count'] / len(df)) * 100,2)
110
+ return freq_dist
111
+
112
+ def freqdist_a(df, ascending=False):
113
+ import pandas as pd
114
+ import numpy as np
115
+ results = []
116
+ for column in df.select_dtypes(include=['object', 'category']).columns:
117
+ frequency_table = df[column].value_counts()
118
+ percentage_table = np.round(df[column].value_counts(normalize=True) * 100,2)
119
+
120
+ distribution = pd.DataFrame({
121
+ 'Column': column,
122
+ 'Value': frequency_table.index,
123
+ 'Count': frequency_table.values,
124
+ 'Percentage': percentage_table.values
125
+ })
126
+ distribution = distribution.sort_values(by='Percentage', ascending=ascending)
127
+ results.append(distribution)
128
+ final_df = pd.concat(results, ignore_index=True)
129
+ return final_df
130
+
131
+ def clean_sheet_name(name):
132
+ import re
133
+ # Remove invalid characters
134
+ name = re.sub(r'[:\\/?*\[\]]', '', name)
135
+ # Limit to 31 characters
136
+ name = name.strip()[:31]
137
+ return name
138
+
139
+ def freqdist_to_excel(df, output_path, sort_by='Percentage', ascending=False, top_n=None):
140
+ import pandas as pd
141
+ import numpy as np
142
+ used_names = set()
143
+ with pd.ExcelWriter(output_path, engine='xlsxwriter') as writer:
144
+ for column in df.select_dtypes(include=['object', 'category']).columns:
145
+ frequency_table = df[column].value_counts()
146
+ percentage_table = df[column].value_counts(normalize=True) * 100
147
+
148
+ distribution = pd.DataFrame({
149
+ 'Value': frequency_table.index,
150
+ 'Count': frequency_table.values,
151
+ 'Percentage': percentage_table.values
152
+ })
153
+ distribution = distribution.sort_values(by=sort_by, ascending=ascending)
154
+ if top_n is not None:
155
+ distribution = distribution.head(top_n)
156
+ # Generate safe sheet name
157
+ base_name = clean_sheet_name(column)
158
+ sheet_name = base_name
159
+ count = 1
160
+ while sheet_name.lower() in used_names:
161
+ sheet_name = f"{base_name[:28]}_{count}" # stay within 31 char limit
162
+ count += 1
163
+ used_names.add(sheet_name.lower())
164
+ distribution.to_excel(writer, sheet_name=sheet_name, index=False)
165
+ print(f"Frequency distributions written to {output_path}")
166
+
167
+ def normcheck_dashboard(df, significance_level=0.05, figsize=(18, 5)):
168
+ import pandas as pd
169
+ import numpy as np
170
+ import matplotlib.pyplot as plt
171
+ import seaborn as sns
172
+ import statsmodels.api as sm
173
+ from scipy.stats import anderson
174
+ import math
175
+ numeric_cols = df.select_dtypes(include=[np.number]).columns
176
+ if len(numeric_cols) == 0:
177
+ print("No numeric columns to analyze.")
178
+ return
179
+ for col in numeric_cols:
180
+ data = df[col].dropna()
181
+ print(f"\n--- Variable: {col} ---")
182
+ if len(data) < 8:
183
+ print("Not enough data to perform Anderson-Darling test or meaningful plots.")
184
+ continue
185
+ # Anderson-Darling Test
186
+ test_result = anderson(data, dist='norm')
187
+ stat = test_result.statistic
188
+ sig_levels = test_result.significance_level
189
+ crit_values = test_result.critical_values
190
+ level_diff = [abs(sl - (significance_level * 100)) for sl in sig_levels]
191
+ closest_index = level_diff.index(min(level_diff))
192
+ used_sig = sig_levels[closest_index]
193
+ crit_val = crit_values[closest_index]
194
+ decision = "Fail to Reject Null" if stat <= crit_val else "Reject Null"
195
+ # Print Summary
196
+ print(f" Anderson-Darling Statistic : {stat:.4f}")
197
+ print(f" Critical Value (@ {used_sig}%) : {crit_val:.4f}")
198
+ print(f" Decision : {decision}")
199
+ # Plots (QQ, Histogram, Boxplot)
200
+ fig, axes = plt.subplots(1, 3, figsize=figsize)
201
+ # QQ Plot
202
+ sm.qqplot(data, line='s', ax=axes[0])
203
+ axes[0].set_title(f"QQ Plot - {col}")
204
+ # Histogram (No KDE)
205
+ sns.histplot(data, bins=30, kde=False, color='gray', alpha=0.3, ax=axes[1])
206
+ axes[1].set_title(f"Histogram - {col}")
207
+ # Boxplot
208
+ sns.boxplot(x=data, ax=axes[2], color='lightblue')
209
+ axes[2].set_title(f"Boxplot - {col}")
210
+ axes[2].set_xlabel(col)
211
+ plt.suptitle(f"Normality Assessment - {col}", fontsize=14, y=1.05)
212
+ plt.tight_layout()
213
+ plt.show()
@@ -0,0 +1,79 @@
1
+ import os
2
+ import json
3
+ import threading
4
+ import requests
5
+ from packaging import version
6
+ from pathlib import Path
7
+ from pkg_resources import get_distribution
8
+
9
+ # Optional: Markdown rendering when IPython is importable (this alone does not prove a notebook is running)
10
+ try:
11
+ from IPython.display import display, Markdown
12
+ IN_NOTEBOOK = True
13
+ except ImportError:
14
+ IN_NOTEBOOK = False
15
+
16
+ CHECK_INTERVAL_DAYS = 7
17
+
18
+ def _should_check(cache_file: Path) -> bool:
19
+ if not cache_file.exists():
20
+ return True
21
+ try:
22
+ with open(cache_file, "r") as f:
23
+ data = json.load(f)
24
+ last_check = data.get("last_check", 0)
25
+ except Exception:
26
+ return True
27
+
28
+ import time
29
+ days_since = (time.time() - last_check) / (60 * 60 * 24)
30
+ return days_since >= CHECK_INTERVAL_DAYS
31
+
32
+
33
+ def _write_check_timestamp(cache_file: Path):
34
+ try:
35
+ with open(cache_file, "w") as f:
36
+ json.dump({"last_check": __import__("time").time()}, f)
37
+ except Exception:
38
+ pass
39
+
40
+ def _print_update_message(pkg_name, installed, latest):
41
+ message = (
42
+ f"📈 A new version of '{pkg_name}' is available!\n"
43
+ f"Installed version: {installed}\n"
44
+ f"Latest version: {latest}\n"
45
+ f"Update with:\n"
46
+ f" pip install --upgrade {pkg_name}\n"
47
+ )
48
+ if IN_NOTEBOOK:
49
+ # Display nicely in Jupyter Notebook
50
+ display(Markdown(f"**{message}**"))
51
+ else:
52
+ # Regular console print
53
+ print(message)
54
+
55
+ def _check_now(package_name: str):
56
+ if os.getenv("PYLIB_DISABLE_UPDATE_CHECK") == "1": # honor the opt-out before touching the cache file
57
+ return
58
+ cache_file = Path.home() / f".{package_name}_update_check.json"
59
+ if not _should_check(cache_file):
60
+ return
61
+ _write_check_timestamp(cache_file)
62
+ try:
63
+ installed_version = get_distribution(package_name).version
64
+ except Exception:
65
+ return
66
+ try:
67
+ url = f"https://pypi.org/pypi/{package_name}/json"
68
+ response = requests.get(url, timeout=3)
69
+ latest_version = response.json()["info"]["version"]
70
+ except Exception:
71
+ return
72
+
73
+ if version.parse(latest_version) > version.parse(installed_version):
74
+ _print_update_message(package_name, installed_version, latest_version)
75
+
76
+ def check_for_update(package_name: str):
77
+ # Run in background thread
78
+ thread = threading.Thread(target=_check_now, args=(package_name,), daemon=True)
79
+ thread.start()
@@ -0,0 +1,170 @@
1
+ Metadata-Version: 2.4
2
+ Name: qdesc
3
+ Version: 1.0.7
4
+ Summary: Quick and Easy way to do descriptive analysis.
5
+ Author: Paolo Hilado
6
+ Author-email: datasciencepgh@proton.me
7
+ Description-Content-Type: text/markdown
8
+ License-File: LICENCE.txt
9
+ Requires-Dist: pandas
10
+ Requires-Dist: numpy
11
+ Requires-Dist: scipy
12
+ Requires-Dist: seaborn
13
+ Requires-Dist: matplotlib
14
+ Requires-Dist: statsmodels
15
+ Dynamic: author
16
+ Dynamic: author-email
17
+ Dynamic: description
18
+ Dynamic: description-content-type
19
+ Dynamic: license-file
20
+ Dynamic: requires-dist
21
+ Dynamic: summary
22
+
23
+ # <font face = 'Impact' color = '#274472' > qdesc : Quick and Easy Descriptive Analysis </font>
24
+ ![QDesc](https://raw.githubusercontent.com/Dcroix/qdesc/refs/heads/main/QDesc%20logo.png)
25
+
26
+ ![Package Version](https://img.shields.io/badge/version-1.0.7-pink)
27
+ ![Downloads](https://pepy.tech/badge/qdesc)
28
+ ![Python Version](https://img.shields.io/badge/python-3.8%2B-blue)
29
+ [![DOI](https://zenodo.org/badge/990715642.svg)](https://doi.org/10.5281/zenodo.15834554)
30
+ ![License: GPL v3.0](https://img.shields.io/badge/license-GPL%20v3.0-blue)
31
+
32
+ ## <font face = 'Calibri' color = '#274472' > Installation </font>
33
+ ```sh
34
+ pip install qdesc
35
+ ```
36
+
37
+ ## <font face = 'Calibri' color = '#274472' > Overview </font>
38
+ Qdesc is a Python package for quick and easy descriptive analysis of quantitative data. It reports the mean and standard deviation, which suit approximately normal distributions, and the median and raw median absolute deviation (MAD), which suit skewed data. Built-in frequency distribution functions summarize categorical variables and can export the results to a spreadsheet. The package also includes a normality check dashboard that reports the Anderson-Darling statistic alongside a Q-Q plot, a histogram, and a boxplot.
39
+
40
+ ## <font face = 'Calibri' color = '#274472' > Creating a sample dataframe</font>
41
+ ```python
42
+ import pandas as pd
43
+ import numpy as np
44
+
45
+ # Create sample data
46
+ data = {
47
+ "Age": np.random.randint(18, 60, size=15), # Continuous variable
48
+ "Salary": np.random.randint(30000, 120000, size=15), # Continuous variable
49
+ "Department": np.random.choice(["HR", "Finance", "IT", "Marketing"], size=15), # Categorical variable
50
+ "Gender": np.random.choice(["Male", "Female"], size=15), # Categorical variable
51
+ }
52
+ # Create DataFrame
53
+ df = pd.DataFrame(data)
54
+ ```
55
+ ## <font face = 'Calibri' color = '#274472' > qd.desc Function</font>
56
+ The function qd.desc(df) generates the following statistics:
57
+ * count - number of observations
58
+ * mean - measure of central tendency for normal distribution
59
+ * std - measure of spread for normal distribution
60
+ * median - measure of central tendency for skewed distributions or those with outliers
61
+ * MAD - measure of spread for skewed distributions or those with outliers; this is the raw (unscaled) Median Absolute Deviation, which is more robust than the standard deviation for non-normal distributions
62
+ * min - lowest observed value
63
+ * max - highest observed value
64
+ * AD_stat - Anderson-Darling statistic
65
+ * 5% crit_value - critical value for a 5% Significance Level
67
+
68
+ ```python
69
+ import qdesc as qd
70
+ qd.desc(df)
71
+
72
+ | Variable | Count | Mean | Std Dev | Median | MAD | Min | Max | AD Stat | 5% Crit Value |
73
+ |----------|-------|-------|---------|--------|-------|-------|--------|---------|---------------|
74
+ | Age | 15.0 | 37.87 | 13.51 | 38.0 | 12.0 | 20.0 | 59.0 | 0.41 | 0.68 |
75
+ | Salary | 15.0 | 72724 | 29483 | 67660 | 26311 | 34168 | 119590 | 0.40 | 0.68 |
76
+ ```
77
+
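To make the MAD column concrete, here is a minimal standard-library sketch of the raw median absolute deviation and its rescaling by 1.4826, the factor that the `grp_desc` source applies to obtain `mad_norm`. The helper name `mad` and the sample values are illustrative assumptions and are not part of qdesc; `grp_desc` itself calls `scipy.stats.median_abs_deviation`.

```python
from statistics import median


def mad(values, scale=1.0):
    """Median absolute deviation around the median.

    scale=1.0 gives the raw MAD; scale=1.4826 rescales it so that it is
    comparable to the standard deviation for normally distributed data.
    """
    values = list(values)
    center = median(values)
    return scale * median(abs(v - center) for v in values)


# Hypothetical sample with one extreme value.
sample = [1, 2, 3, 4, 100]

raw_mad = mad(sample)                  # absolute deviations: 2, 1, 0, 1, 97
norm_mad = mad(sample, scale=1.4826)   # raw MAD times 1.4826

print(raw_mad, norm_mad)
```

The single extreme value barely moves the MAD, which is why it is reported next to the standard deviation for skewed data.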
78
+ ## <font face = 'Calibri' color = '#274472' > qd.grp_desc Function</font>
79
+ The function qd.grp_desc(df, "Continuous Var", "Group Var") produces a table of descriptive statistics like the one from qd.desc, but with the measures
80
+ reported separately for each level of the grouping variable. It lets you check whether the distribution within each group is approximately normal. Combining it
81
+ with qd.normcheck_dashboard helps you choose the appropriate measure of central tendency and spread.
82
+
83
+ ```python
84
+ import qdesc as qd
85
+ qd.grp_desc(df, "Salary", "Gender")
86
+
87
+ | Gender | Count | Mean | Std Dev | Median | MAD | Min | Max | AD Stat | 5% Crit Value |
88
+ |---------|-------|-----------|-----------|----------|----------|--------|---------|---------|---------------|
89
+ | Female | 7 | 84,871.14 | 32,350.37 | 93,971.0 | 25,619.0 | 40,476 | 119,590 | 0.36 | 0.74 |
90
+ | Male | 8 | 62,096.12 | 23,766.82 | 60,347.0 | 14,278.5 | 34,168 | 106,281 | 0.24 | 0.71 |
91
+ ```
92
+
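In the package source, a group with fewer than two non-missing observations is kept in the output with missing statistics rather than dropped. The sketch below shows that idea with the standard library only; `group_summary` and the sample rows are hypothetical, and the code mirrors the shape of the result, not qdesc's pandas-based implementation.

```python
from collections import defaultdict
from statistics import mean, median, stdev


def group_summary(rows, value_key, group_key):
    """Summarize value_key per group; tiny groups keep the same keys with None."""
    groups = defaultdict(list)
    for row in rows:
        value = row.get(value_key)
        if value is not None:  # skip missing values, similar to dropna()
            groups[row[group_key]].append(value)

    summary = []
    for name, values in sorted(groups.items()):
        enough = len(values) >= 2  # a sample standard deviation needs two points
        summary.append({
            group_key: name,
            "count": len(values),
            "mean": mean(values) if enough else None,
            "std": stdev(values) if enough else None,
            "median": median(values) if enough else None,
        })
    return summary


# Hypothetical data: one group of three and one group of one.
rows = [
    {"Gender": "Female", "Salary": 40},
    {"Gender": "Female", "Salary": 50},
    {"Gender": "Female", "Salary": 60},
    {"Gender": "Male", "Salary": 70},
]
table = group_summary(rows, "Salary", "Gender")
for line in table:
    print(line)
```

Using one set of keys for both branches keeps every row aligned under the same columns once the rows are collected into a table.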
93
+ ## <font face = 'Calibri' color = '#274472' > qd.freqdist Function</font>
94
+ Run the function qd.freqdist(df, "Variable Name") to easily create a frequency distribution for your chosen categorical variable with the following:
95
+ * Variable levels (e.g., for a Gender variable: Male and Female)
96
+ * Counts - the number of observations
97
+ * Percentage - percentage of observations from total.
98
+
99
+ ```python
100
+ import qdesc as qd
101
+ qd.freqdist(df, "Department")
102
+
103
+ | Department | Count | Percentage |
104
+ |------------|-------|------------|
105
+ | IT | 5 | 33.33 |
106
+ | HR | 5 | 33.33 |
107
+ | Marketing | 3 | 20.00 |
108
+ | Finance | 2 | 13.33 |
109
+ ```
110
+
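One detail worth knowing when a column has missing values: in the package source, `freqdist` divides each count by `len(df)` (all rows), while `freqdist_a` uses `value_counts(normalize=True)`, which divides by the number of non-missing values. The standard-library sketch below shows both denominators side by side; `freq_table`, its column names, and the sample data are illustrative assumptions, with `None` standing in for a missing value.

```python
from collections import Counter


def freq_table(values):
    """Count each level and report percentages against two denominators."""
    present = [v for v in values if v is not None]  # drop missing values
    counts = Counter(present)
    rows = []
    for level, count in counts.most_common():  # most frequent level first
        rows.append({
            "level": level,
            "count": count,
            "pct_of_all_rows": round(count / len(values) * 100, 2),
            "pct_of_non_missing": round(count / len(present) * 100, 2),
        })
    return rows


# Hypothetical column with one missing entry.
departments = ["IT", "IT", "HR", None]
result = freq_table(departments)
for row in result:
    print(row)
```

With no missing values the two percentages coincide, so the difference only shows up on incomplete columns.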
111
+ ## <font face = 'Calibri' color = '#274472' > qd.freqdist_a Function</font>
112
+ Run the function qd.freqdist_a(df, ascending=False) to create frequency distribution tables for all the categorical variables in your data frame, sorted by percentage in descending order (the default) or in ascending order (ascending=True). The resulting table includes the following columns:
113
+ * Variable levels (e.g., for Satisfaction: Very Low, Low, Moderate, High, Very High)
114
+ * Counts - the number of observations
115
+ * Percentage - percentage of observations from total.
116
+
117
+ ```python
118
+ import qdesc as qd
119
+ qd.freqdist_a(df)
120
+
121
+ | Column | Value | Count | Percentage |
122
+ |------------|----------|-------|------------|
123
+ | Department | IT | 5 | 33.33 |
123
+ | Department | HR | 5 | 33.33 |
124
+ | Department | Marketing | 3 | 20.00 |
125
+ | Department | Finance | 2 | 13.33 |
126
+ | Gender | Male | 8 | 53.33 |
127
+ | Gender | Female | 7 | 46.67 |
129
+ ```
130
+
131
+ ## <font face = 'Calibri' color = '#274472' > qd.freqdist_to_excel Function</font>
132
+ Run the function qd.freqdist_to_excel(df, "Filename.xlsx", ascending=False) to create frequency distribution tables for all the categorical variables in your data frame and save each one as a separate sheet in the .xlsx file. Tables are sorted in descending order by default; pass ascending=True for ascending order. Each table includes the following columns:
133
+ * Variable levels (e.g., for Satisfaction: Very Low, Low, Moderate, High, Very High)
134
+ * Counts - the number of observations
135
+ * Percentage - percentage of observations from total.
136
+
137
+ ```python
138
+ import qdesc as qd
139
+ qd.freqdist_to_excel(df, "Results.xlsx")
140
+
141
+ Frequency distributions written to Results.xlsx
142
+ ```
143
+
144
+ ## <font face = 'Calibri' color = '#274472' > qd.normcheck_dashboard Function</font>
145
+ Run the function qd.normcheck_dashboard(df) to check each numeric variable for normality. For every variable with at least 8 non-missing values, it computes the Anderson-Darling statistic and draws a Q-Q plot, a histogram, and a boxplot for judging whether the distribution is approximately normal.
146
+
147
+ ```python
148
+ import qdesc as qd
149
+ qd.normcheck_dashboard(df)
150
+ ```
151
+ ![Descriptive Statistics](https://raw.githubusercontent.com/Dcroix/qdesc/refs/heads/main/qd.normcheck_dashboard.png)
152
+
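The dashboard's decision rule, as it appears in the package source, picks the tabulated significance level closest to the requested one and fails to reject normality when the statistic does not exceed the matching critical value. The sketch below reproduces that selection in plain Python. The significance levels are the ones `scipy.stats.anderson` reports for `dist='norm'`; the critical values and the statistic are illustrative placeholders, and the helper names are assumptions rather than part of qdesc.

```python
def pick_critical_value(sig_levels, crit_values, significance_level=0.05):
    """Return the (level, critical value) pair closest to the requested level."""
    target = significance_level * 100  # the tabulated levels are in percent
    index = min(range(len(sig_levels)), key=lambda i: abs(sig_levels[i] - target))
    return sig_levels[index], crit_values[index]


def decide(statistic, critical_value):
    """Normality is not rejected when the statistic is at or below the cutoff."""
    return "Fail to Reject Null" if statistic <= critical_value else "Reject Null"


# Levels reported by scipy.stats.anderson for the normal distribution (percent).
sig_levels = [15.0, 10.0, 5.0, 2.5, 1.0]
# Illustrative critical values only; real ones depend on the sample size.
crit_values = [0.50, 0.57, 0.68, 0.79, 0.94]

level, cutoff = pick_critical_value(sig_levels, crit_values, 0.05)
print(level, cutoff, decide(0.41, cutoff))
```

A requested level that is not tabulated, such as 0.03, falls back to the nearest available one (2.5% here), so the printed level is worth reading before interpreting the decision.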
153
+
154
+ ## <font face = 'Calibri' color = '#3D5B59' > License</font>
155
+ This project is licensed under the GNU General Public License v3.0 (GPL-3.0). See the LICENCE.txt file for more details.
156
+
157
+ ## <font face = 'Calibri' color = '#3D5B59' > Acknowledgements</font>
158
+ This package builds on the following libraries.
159
+
160
+ ### <font face = 'Calibri' color = '#3D5B59' > Pandas</font>
161
+ Pandas is distributed under the BSD 3-Clause License and is developed by pandas contributors. Copyright (c) 2008-2024, the pandas development team. All rights reserved.
162
+ ### <font face = 'Calibri' color = '#3D5B59' > NumPy</font>
163
+ NumPy is distributed under the BSD 3-Clause License and is developed by NumPy contributors. Copyright (c) 2005-2024, NumPy Developers. All rights reserved.
164
+ ### <font face = 'Calibri' color = '#3D5B59' > SciPy</font>
165
+ SciPy is distributed under the BSD License and is developed by SciPy contributors. Copyright (c) 2001-2024, SciPy Developers. All rights reserved.
166
+
167
+
168
+
169
+
170
+
@@ -0,0 +1,10 @@
1
+ LICENCE.txt
2
+ README.md
3
+ setup.py
4
+ qdesc/__init__.py
5
+ qdesc/update_checker.py
6
+ qdesc.egg-info/PKG-INFO
7
+ qdesc.egg-info/SOURCES.txt
8
+ qdesc.egg-info/dependency_links.txt
9
+ qdesc.egg-info/requires.txt
10
+ qdesc.egg-info/top_level.txt
@@ -0,0 +1,6 @@
1
+ pandas
2
+ numpy
3
+ scipy
4
+ seaborn
5
+ matplotlib
6
+ statsmodels
@@ -0,0 +1 @@
1
+ qdesc
qdesc-1.0.7/setup.cfg ADDED
@@ -0,0 +1,4 @@
1
+ [egg_info]
2
+ tag_build =
3
+ tag_date = 0
4
+
qdesc-1.0.7/setup.py ADDED
@@ -0,0 +1,25 @@
1
+ from setuptools import setup, find_packages
2
+ from pathlib import Path
3
+
4
+ # Read the contents of the README file
5
+ this_directory = Path(__file__).parent
6
+ long_description = (this_directory / "README.md").read_text(encoding="utf-8")
7
+
8
+ setup(
9
+ name='qdesc',
10
+ version='1.0.7',
11
+ packages=find_packages(),
12
+ install_requires=[
13
+ 'pandas',
14
+ 'numpy',
15
+ 'scipy',
16
+ 'seaborn',
17
+ 'matplotlib',
18
+ 'statsmodels'
19
+ ],
20
+ author='Paolo Hilado',
21
+ author_email='datasciencepgh@proton.me',
22
+ description= 'Quick and Easy way to do descriptive analysis.',
23
+ long_description=long_description,
24
+ long_description_content_type='text/markdown', # use 'text/x-rst' for a reStructuredText README
25
+ )