clustering-imputation 1.0.0__tar.gz → 1.0.1__tar.gz

Files changed (25)
  1. clustering_imputation-1.0.1/PKG-INFO +63 -0
  2. clustering_imputation-1.0.1/README.md +46 -0
  3. clustering_imputation-1.0.1/clustering_imputation.egg-info/PKG-INFO +63 -0
  4. {clustering_imputation-1.0.0 → clustering_imputation-1.0.1}/setup.py +1 -1
  5. clustering_imputation-1.0.0/PKG-INFO +0 -18
  6. clustering_imputation-1.0.0/README.md +0 -1
  7. clustering_imputation-1.0.0/clustering_imputation.egg-info/PKG-INFO +0 -18
  8. {clustering_imputation-1.0.0 → clustering_imputation-1.0.1}/LICENSE +0 -0
  9. {clustering_imputation-1.0.0 → clustering_imputation-1.0.1}/clustering_imputation/__init__.py +0 -0
  10. {clustering_imputation-1.0.0 → clustering_imputation-1.0.1}/clustering_imputation/basicImputer/__init__.py +0 -0
  11. {clustering_imputation-1.0.0 → clustering_imputation-1.0.1}/clustering_imputation/basicImputer/em.py +0 -0
  12. {clustering_imputation-1.0.0 → clustering_imputation-1.0.1}/clustering_imputation/basicImputer/mice.py +0 -0
  13. {clustering_imputation-1.0.0 → clustering_imputation-1.0.1}/clustering_imputation/basicImputer/sice.py +0 -0
  14. {clustering_imputation-1.0.0 → clustering_imputation-1.0.1}/clustering_imputation/clusterBase/__init__.py +0 -0
  15. {clustering_imputation-1.0.0 → clustering_imputation-1.0.1}/clustering_imputation/clusterBase/clustering.py +0 -0
  16. {clustering_imputation-1.0.0 → clustering_imputation-1.0.1}/clustering_imputation/clusterBase/ohe.py +0 -0
  17. {clustering_imputation-1.0.0 → clustering_imputation-1.0.1}/clustering_imputation/dummyData/__init__.py +0 -0
  18. {clustering_imputation-1.0.0 → clustering_imputation-1.0.1}/clustering_imputation/dummyData/dataCreation.py +0 -0
  19. {clustering_imputation-1.0.0 → clustering_imputation-1.0.1}/clustering_imputation/getClusters.py +0 -0
  20. {clustering_imputation-1.0.0 → clustering_imputation-1.0.1}/clustering_imputation/main.py +0 -0
  21. {clustering_imputation-1.0.0 → clustering_imputation-1.0.1}/clustering_imputation.egg-info/SOURCES.txt +0 -0
  22. {clustering_imputation-1.0.0 → clustering_imputation-1.0.1}/clustering_imputation.egg-info/dependency_links.txt +0 -0
  23. {clustering_imputation-1.0.0 → clustering_imputation-1.0.1}/clustering_imputation.egg-info/requires.txt +0 -0
  24. {clustering_imputation-1.0.0 → clustering_imputation-1.0.1}/clustering_imputation.egg-info/top_level.txt +0 -0
  25. {clustering_imputation-1.0.0 → clustering_imputation-1.0.1}/setup.cfg +0 -0
@@ -0,0 +1,63 @@
+ Metadata-Version: 2.1
+ Name: clustering_imputation
+ Version: 1.0.1
+ Summary: Adding correlation to handle MNAR
+ Author: MRINAL KANGSA BANIK
+ Author-email: <manukbanik30@gmail.com>
+ Keywords: python,imputation,MNAR
+ Classifier: Development Status :: 1 - Planning
+ Classifier: Intended Audience :: Developers
+ Classifier: Programming Language :: Python :: 3
+ Classifier: Operating System :: Unix
+ Classifier: Operating System :: MacOS :: MacOS X
+ Classifier: Operating System :: Microsoft :: Windows
+ Description-Content-Type: text/markdown
+ License-File: LICENSE
+
+
+ # Clustering Imputation
+ ## Installation
+ To install the package, run:
+ ```bash
+ pip install clustering-imputation==1.0.0
+ ```
+ ## Usage
+
+ ```python
+ from clustering_imputation import clusterImputer
+ df = ... # Load your dataset
+ x = clusterImputer(df, "mice", "mean", 0.4, 10)
+ x.impute()
+ ```
+ # About the Package
+
+ ## Problem Statement
+
+ * Traditional imputation techniques face several challenges:
+
+ * High-Dimensional and Sparse Data: Existing methods struggle with large, sparse datasets; efficient techniques for such cases are needed.
+
+ * Temporal Dependencies: Current methods often overlook temporal correlations in data.
+ ## Need to develop a new algo
+ * Non-Random Missingness: Few methods address non-random missing patterns; improvements here could boost real-world application accuracy. We aim to develop an imputation method that considers "Missing Not at Random" (MNAR).
+
+ * Computational Complexity: MICE and EM methods are computationally expensive for high-dimensional data. Our approach aims to reduce time complexity.
+
+ ## Philosophy of Our Solution: Clustered MICE/EM
+
+ We propose a clustering-based approach:
+
+ * Identify correlations between features.
+
+ * Apply MICE/EM within clusters rather than on the entire dataset.
+
+ * Combine results to reconstruct the dataset.
+
+ * This method effectively handles MNAR data by leveraging feature correlations.
+ For further details refer this [ppt](https://docs.google.com/presentation/d/1UZ2uDkleSgB2ZttjG1D6nmQhqk7uz5FQRW5UmSkB0Sg/edit?usp=sharing)
+ ## Contributing
+
+ Pull requests are welcome. For major changes, please open an issue first
+ to discuss what you would like to change.
+
+ Please make sure to update tests as appropriate.
@@ -0,0 +1,46 @@
+ # Clustering Imputation
+ ## Installation
+ To install the package, run:
+ ```bash
+ pip install clustering-imputation==1.0.0
+ ```
+ ## Usage
+
+ ```python
+ from clustering_imputation import clusterImputer
+ df = ... # Load your dataset
+ x = clusterImputer(df, "mice", "mean", 0.4, 10)
+ x.impute()
+ ```
+ # About the Package
+
+ ## Problem Statement
+
+ * Traditional imputation techniques face several challenges:
+
+ * High-Dimensional and Sparse Data: Existing methods struggle with large, sparse datasets; efficient techniques for such cases are needed.
+
+ * Temporal Dependencies: Current methods often overlook temporal correlations in data.
+ ## Need to develop a new algo
+ * Non-Random Missingness: Few methods address non-random missing patterns; improvements here could boost real-world application accuracy. We aim to develop an imputation method that considers "Missing Not at Random" (MNAR).
+
+ * Computational Complexity: MICE and EM methods are computationally expensive for high-dimensional data. Our approach aims to reduce time complexity.
+
+ ## Philosophy of Our Solution: Clustered MICE/EM
+
+ We propose a clustering-based approach:
+
+ * Identify correlations between features.
+
+ * Apply MICE/EM within clusters rather than on the entire dataset.
+
+ * Combine results to reconstruct the dataset.
+
+ * This method effectively handles MNAR data by leveraging feature correlations.
+ For further details refer this [ppt](https://docs.google.com/presentation/d/1UZ2uDkleSgB2ZttjG1D6nmQhqk7uz5FQRW5UmSkB0Sg/edit?usp=sharing)
+ ## Contributing
+
+ Pull requests are welcome. For major changes, please open an issue first
+ to discuss what you would like to change.
+
+ Please make sure to update tests as appropriate.
@@ -0,0 +1,63 @@
+ Metadata-Version: 2.1
+ Name: clustering-imputation
+ Version: 1.0.1
+ Summary: Adding correlation to handle MNAR
+ Author: MRINAL KANGSA BANIK
+ Author-email: <manukbanik30@gmail.com>
+ Keywords: python,imputation,MNAR
+ Classifier: Development Status :: 1 - Planning
+ Classifier: Intended Audience :: Developers
+ Classifier: Programming Language :: Python :: 3
+ Classifier: Operating System :: Unix
+ Classifier: Operating System :: MacOS :: MacOS X
+ Classifier: Operating System :: Microsoft :: Windows
+ Description-Content-Type: text/markdown
+ License-File: LICENSE
+
+
+ # Clustering Imputation
+ ## Installation
+ To install the package, run:
+ ```bash
+ pip install clustering-imputation==1.0.0
+ ```
+ ## Usage
+
+ ```python
+ from clustering_imputation import clusterImputer
+ df = ... # Load your dataset
+ x = clusterImputer(df, "mice", "mean", 0.4, 10)
+ x.impute()
+ ```
+ # About the Package
+
+ ## Problem Statement
+
+ * Traditional imputation techniques face several challenges:
+
+ * High-Dimensional and Sparse Data: Existing methods struggle with large, sparse datasets; efficient techniques for such cases are needed.
+
+ * Temporal Dependencies: Current methods often overlook temporal correlations in data.
+ ## Need to develop a new algo
+ * Non-Random Missingness: Few methods address non-random missing patterns; improvements here could boost real-world application accuracy. We aim to develop an imputation method that considers "Missing Not at Random" (MNAR).
+
+ * Computational Complexity: MICE and EM methods are computationally expensive for high-dimensional data. Our approach aims to reduce time complexity.
+
+ ## Philosophy of Our Solution: Clustered MICE/EM
+
+ We propose a clustering-based approach:
+
+ * Identify correlations between features.
+
+ * Apply MICE/EM within clusters rather than on the entire dataset.
+
+ * Combine results to reconstruct the dataset.
+
+ * This method effectively handles MNAR data by leveraging feature correlations.
+ For further details refer this [ppt](https://docs.google.com/presentation/d/1UZ2uDkleSgB2ZttjG1D6nmQhqk7uz5FQRW5UmSkB0Sg/edit?usp=sharing)
+ ## Contributing
+
+ Pull requests are welcome. For major changes, please open an issue first
+ to discuss what you would like to change.
+
+ Please make sure to update tests as appropriate.
@@ -7,7 +7,7 @@ here = os.path.abspath(os.path.dirname(__file__))
  with codecs.open(os.path.join(here, "README.md"), encoding="utf-8") as fh:
      long_description = "\n" + fh.read()
  
- VERSION = '1.0.0'
+ VERSION = '1.0.1'
  DESCRIPTION = 'Adding correlation to handle MNAR'
  LONG_DESCRIPTION = 'A package that allows us to impute for all types of missingness(MAR , MCAR , MNAR)'
  
@@ -1,18 +0,0 @@
- Metadata-Version: 2.1
- Name: clustering_imputation
- Version: 1.0.0
- Summary: Adding correlation to handle MNAR
- Author: MRINAL KANGSA BANIK
- Author-email: <manukbanik30@gmail.com>
- Keywords: python,imputation,MNAR
- Classifier: Development Status :: 1 - Planning
- Classifier: Intended Audience :: Developers
- Classifier: Programming Language :: Python :: 3
- Classifier: Operating System :: Unix
- Classifier: Operating System :: MacOS :: MacOS X
- Classifier: Operating System :: Microsoft :: Windows
- Description-Content-Type: text/markdown
- License-File: LICENSE
-
-
- Hey we will write it later
@@ -1 +0,0 @@
- Hey we will write it later
@@ -1,18 +0,0 @@
- Metadata-Version: 2.1
- Name: clustering-imputation
- Version: 1.0.0
- Summary: Adding correlation to handle MNAR
- Author: MRINAL KANGSA BANIK
- Author-email: <manukbanik30@gmail.com>
- Keywords: python,imputation,MNAR
- Classifier: Development Status :: 1 - Planning
- Classifier: Intended Audience :: Developers
- Classifier: Programming Language :: Python :: 3
- Classifier: Operating System :: Unix
- Classifier: Operating System :: MacOS :: MacOS X
- Classifier: Operating System :: Microsoft :: Windows
- Description-Content-Type: text/markdown
- License-File: LICENSE
-
-
- Hey we will write it later
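
Note on the README added in 1.0.1: its "Clustered MICE/EM" section lists three steps (identify correlations between features, impute within each cluster, combine the results). The sketch below illustrates that outline only. It is not the package's code and does not use its API: the function names, the dict-of-lists data layout, and the 0.4 threshold are all assumptions made for this example, and a single-predictor least-squares fit stands in for MICE/EM to keep the example standard-library only.

```python
import math
from statistics import mean


def observed_pairs(xs, ys):
    """Rows where both columns have a value (None marks a missing entry)."""
    return [(x, y) for x, y in zip(xs, ys) if x is not None and y is not None]


def pearson(xs, ys):
    """Pearson correlation on pairwise-complete rows; 0.0 if undefined."""
    pairs = observed_pairs(xs, ys)
    if len(pairs) < 3:
        return 0.0
    mx = mean([x for x, _ in pairs])
    my = mean([y for _, y in pairs])
    sxy = sum((x - mx) * (y - my) for x, y in pairs)
    sxx = sum((x - mx) ** 2 for x, _ in pairs)
    syy = sum((y - my) ** 2 for _, y in pairs)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)


def cluster_features(columns, threshold):
    """Step 1: group features whose |correlation| >= threshold (union-find)."""
    names = list(columns)
    parent = {n: n for n in names}

    def find(n):
        while parent[n] != n:
            parent[n] = parent[parent[n]]  # path compression
            n = parent[n]
        return n

    for i, a in enumerate(names):
        for b in names[i + 1:]:
            if abs(pearson(columns[a], columns[b])) >= threshold:
                parent[find(a)] = find(b)
    groups = {}
    for n in names:
        groups.setdefault(find(n), []).append(n)
    return list(groups.values())


def impute_cluster(columns, cluster):
    """Step 2 (stand-in for MICE/EM): regress each column on its most
    correlated cluster-mate; fall back to the column mean otherwise.
    Assumes every column has at least one observed value."""
    filled = {}
    for target in cluster:
        ys = columns[target]
        col_mean = mean([y for y in ys if y is not None])
        best, best_r = None, 0.0
        for other in cluster:
            r = abs(pearson(columns[other], ys)) if other != target else 0.0
            if r > best_r:
                best, best_r = other, r
        slope = intercept = 0.0
        if best is not None:  # best_r > 0 guarantees a non-degenerate fit
            pairs = observed_pairs(columns[best], ys)
            mx = mean([x for x, _ in pairs])
            my = mean([y for _, y in pairs])
            sxx = sum((x - mx) ** 2 for x, _ in pairs)
            slope = sum((x - mx) * (y - my) for x, y in pairs) / sxx
            intercept = my - slope * mx
        out = []
        for i, y in enumerate(ys):
            if y is not None:
                out.append(y)
            elif best is not None and columns[best][i] is not None:
                out.append(intercept + slope * columns[best][i])
            else:
                out.append(col_mean)
        filled[target] = out
    return filled


def clustered_impute(columns, threshold):
    """Step 3: impute each cluster independently, then recombine."""
    result = {}
    for cluster in cluster_features(columns, threshold):
        result.update(impute_cluster(columns, cluster))
    return {name: result[name] for name in columns}  # original column order


# Toy data: "a" and "b" are perfectly correlated, "c" is unrelated to both.
columns = {
    "a": [1.0, 2.0, 3.0, 4.0, None],
    "b": [2.0, 4.0, 6.0, None, 10.0],
    "c": [1.0, 4.0, 1.0, 2.0, None],
}
clusters = cluster_features(columns, 0.4)
imputed = clustered_impute(columns, 0.4)
print(clusters)
print(imputed)
```

In this toy run, `a` and `b` end up in one cluster and fill each other's gaps through the fitted line, while `c` sits alone and falls back to its column mean. Because each cluster is processed independently, the per-cluster work involves fewer features than a whole-dataset pass, which is the cost argument the README makes.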