clustering-imputation 1.0.0__tar.gz → 1.0.1__tar.gz

Files changed (25)
  1. clustering_imputation-1.0.1/PKG-INFO +63 -0
  2. clustering_imputation-1.0.1/README.md +46 -0
  3. clustering_imputation-1.0.1/clustering_imputation.egg-info/PKG-INFO +63 -0
  4. {clustering_imputation-1.0.0 → clustering_imputation-1.0.1}/setup.py +1 -1
  5. clustering_imputation-1.0.0/PKG-INFO +0 -18
  6. clustering_imputation-1.0.0/README.md +0 -1
  7. clustering_imputation-1.0.0/clustering_imputation.egg-info/PKG-INFO +0 -18
  8. {clustering_imputation-1.0.0 → clustering_imputation-1.0.1}/LICENSE +0 -0
  9. {clustering_imputation-1.0.0 → clustering_imputation-1.0.1}/clustering_imputation/__init__.py +0 -0
  10. {clustering_imputation-1.0.0 → clustering_imputation-1.0.1}/clustering_imputation/basicImputer/__init__.py +0 -0
  11. {clustering_imputation-1.0.0 → clustering_imputation-1.0.1}/clustering_imputation/basicImputer/em.py +0 -0
  12. {clustering_imputation-1.0.0 → clustering_imputation-1.0.1}/clustering_imputation/basicImputer/mice.py +0 -0
  13. {clustering_imputation-1.0.0 → clustering_imputation-1.0.1}/clustering_imputation/basicImputer/sice.py +0 -0
  14. {clustering_imputation-1.0.0 → clustering_imputation-1.0.1}/clustering_imputation/clusterBase/__init__.py +0 -0
  15. {clustering_imputation-1.0.0 → clustering_imputation-1.0.1}/clustering_imputation/clusterBase/clustering.py +0 -0
  16. {clustering_imputation-1.0.0 → clustering_imputation-1.0.1}/clustering_imputation/clusterBase/ohe.py +0 -0
  17. {clustering_imputation-1.0.0 → clustering_imputation-1.0.1}/clustering_imputation/dummyData/__init__.py +0 -0
  18. {clustering_imputation-1.0.0 → clustering_imputation-1.0.1}/clustering_imputation/dummyData/dataCreation.py +0 -0
  19. {clustering_imputation-1.0.0 → clustering_imputation-1.0.1}/clustering_imputation/getClusters.py +0 -0
  20. {clustering_imputation-1.0.0 → clustering_imputation-1.0.1}/clustering_imputation/main.py +0 -0
  21. {clustering_imputation-1.0.0 → clustering_imputation-1.0.1}/clustering_imputation.egg-info/SOURCES.txt +0 -0
  22. {clustering_imputation-1.0.0 → clustering_imputation-1.0.1}/clustering_imputation.egg-info/dependency_links.txt +0 -0
  23. {clustering_imputation-1.0.0 → clustering_imputation-1.0.1}/clustering_imputation.egg-info/requires.txt +0 -0
  24. {clustering_imputation-1.0.0 → clustering_imputation-1.0.1}/clustering_imputation.egg-info/top_level.txt +0 -0
  25. {clustering_imputation-1.0.0 → clustering_imputation-1.0.1}/setup.cfg +0 -0
@@ -0,0 +1,63 @@
+ Metadata-Version: 2.1
+ Name: clustering_imputation
+ Version: 1.0.1
+ Summary: Adding correlation to handle MNAR
+ Author: MRINAL KANGSA BANIK
+ Author-email: <manukbanik30@gmail.com>
+ Keywords: python,imputation,MNAR
+ Classifier: Development Status :: 1 - Planning
+ Classifier: Intended Audience :: Developers
+ Classifier: Programming Language :: Python :: 3
+ Classifier: Operating System :: Unix
+ Classifier: Operating System :: MacOS :: MacOS X
+ Classifier: Operating System :: Microsoft :: Windows
+ Description-Content-Type: text/markdown
+ License-File: LICENSE
+
+
+ # Clustering Imputation
+ ## Installation
+ To install the package, run:
+ ```bash
+ pip install clustering-imputation==1.0.0
+ ```
+ ## Usage
+
+ ```python
+ from clustering_imputation import clusterImputer
+ df = ... # Load your dataset
+ x = clusterImputer(df, "mice", "mean", 0.4, 10)
+ x.impute()
+ ```
+ # About the Package
+
+ ## Problem Statement
+
+ * Traditional imputation techniques face several challenges:
+
+ * High-Dimensional and Sparse Data: Existing methods struggle with large, sparse datasets; efficient techniques for such cases are needed.
+
+ * Temporal Dependencies: Current methods often overlook temporal correlations in data.
+ ## Need to develop a new algo
+ * Non-Random Missingness: Few methods address non-random missing patterns; improvements here could boost real-world application accuracy. We aim to develop an imputation method that considers "Missing Not at Random" (MNAR).
+
+ * Computational Complexity: MICE and EM methods are computationally expensive for high-dimensional data. Our approach aims to reduce time complexity.
+
+ ## Philosophy of Our Solution: Clustered MICE/EM
+
+ We propose a clustering-based approach:
+
+ * Identify correlations between features.
+
+ * Apply MICE/EM within clusters rather than on the entire dataset.
+
+ * Combine results to reconstruct the dataset.
+
+ * This method effectively handles MNAR data by leveraging feature correlations.
+ For further details refer this [ppt](https://docs.google.com/presentation/d/1UZ2uDkleSgB2ZttjG1D6nmQhqk7uz5FQRW5UmSkB0Sg/edit?usp=sharing)
+ ## Contributing
+
+ Pull requests are welcome. For major changes, please open an issue first
+ to discuss what you would like to change.
+
+ Please make sure to update tests as appropriate.
@@ -0,0 +1,46 @@
+ # Clustering Imputation
+ ## Installation
+ To install the package, run:
+ ```bash
+ pip install clustering-imputation==1.0.0
+ ```
+ ## Usage
+
+ ```python
+ from clustering_imputation import clusterImputer
+ df = ... # Load your dataset
+ x = clusterImputer(df, "mice", "mean", 0.4, 10)
+ x.impute()
+ ```
+ # About the Package
+
+ ## Problem Statement
+
+ * Traditional imputation techniques face several challenges:
+
+ * High-Dimensional and Sparse Data: Existing methods struggle with large, sparse datasets; efficient techniques for such cases are needed.
+
+ * Temporal Dependencies: Current methods often overlook temporal correlations in data.
+ ## Need to develop a new algo
+ * Non-Random Missingness: Few methods address non-random missing patterns; improvements here could boost real-world application accuracy. We aim to develop an imputation method that considers "Missing Not at Random" (MNAR).
+
+ * Computational Complexity: MICE and EM methods are computationally expensive for high-dimensional data. Our approach aims to reduce time complexity.
+
+ ## Philosophy of Our Solution: Clustered MICE/EM
+
+ We propose a clustering-based approach:
+
+ * Identify correlations between features.
+
+ * Apply MICE/EM within clusters rather than on the entire dataset.
+
+ * Combine results to reconstruct the dataset.
+
+ * This method effectively handles MNAR data by leveraging feature correlations.
+ For further details refer this [ppt](https://docs.google.com/presentation/d/1UZ2uDkleSgB2ZttjG1D6nmQhqk7uz5FQRW5UmSkB0Sg/edit?usp=sharing)
+ ## Contributing
+
+ Pull requests are welcome. For major changes, please open an issue first
+ to discuss what you would like to change.
+
+ Please make sure to update tests as appropriate.
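Note on the README hunk above: the "Clustered MICE/EM" steps (find correlated features, impute within each cluster, recombine) can be illustrated with a small standard-library sketch. This is not the package's implementation, which this diff does not show. All function names here (`fit_line`, `cluster_features`, `impute_cluster`, `clustered_impute`) are hypothetical, the clustering is assumed to be a simple threshold on absolute Pearson correlation, and a single pass of pairwise linear regression stands in for the MICE or EM step.

```python
from itertools import combinations
from statistics import mean


def complete_pairs(xs, ys):
    """Rows where both columns are observed (None marks a missing value)."""
    return [(x, y) for x, y in zip(xs, ys) if x is not None and y is not None]


def fit_line(xs, ys):
    """Return (slope, intercept, pearson_r) of ys on xs, or None if undefined."""
    pairs = complete_pairs(xs, ys)
    if len(pairs) < 3:
        return None
    mx = mean(x for x, _ in pairs)
    my = mean(y for _, y in pairs)
    sxx = sum((x - mx) ** 2 for x, _ in pairs)
    syy = sum((y - my) ** 2 for _, y in pairs)
    sxy = sum((x - mx) * (y - my) for x, y in pairs)
    if sxx == 0 or syy == 0:
        return None
    slope = sxy / sxx
    return slope, my - slope * mx, sxy / (sxx * syy) ** 0.5


def cluster_features(data, threshold):
    """Step 1: group columns whose |correlation| reaches the threshold."""
    parent = {name: name for name in data}

    def find(name):
        while parent[name] != name:
            name = parent[name]
        return name

    for a, b in combinations(data, 2):
        fit = fit_line(data[a], data[b])
        if fit is not None and abs(fit[2]) >= threshold:
            parent[find(a)] = find(b)  # merge the two groups

    groups = {}
    for name in data:
        groups.setdefault(find(name), []).append(name)
    return list(groups.values())


def impute_cluster(data, cols):
    """Step 2: fill gaps using only the columns of one cluster.

    Stand-in for MICE/EM: average the predictions of simple regressions on
    cluster-mates observed in the same row, else use the column mean
    (or 0.0 when the column has no observed values at all).
    """
    filled = {}
    for target in cols:
        observed = [v for v in data[target] if v is not None]
        fallback = mean(observed) if observed else 0.0
        fits = {c: fit_line(data[c], data[target]) for c in cols if c != target}
        column = []
        for i, value in enumerate(data[target]):
            if value is None:
                preds = [fit[0] * data[c][i] + fit[1]
                         for c, fit in fits.items()
                         if fit is not None and data[c][i] is not None]
                value = mean(preds) if preds else fallback
            column.append(value)
        filled[target] = column
    return filled


def clustered_impute(data, threshold):
    """Step 3: impute each cluster separately, then restore column order."""
    result = {}
    for cols in cluster_features(data, threshold):
        result.update(impute_cluster(data, cols))
    return {name: result[name] for name in data}


demo = {
    "a": [1.0, 2.0, 3.0, 4.0, 5.0],
    "b": [2.0, 4.0, None, 8.0, 10.0],   # strongly correlated with "a"
    "c": [5.0, 1.0, 4.0, None, 2.0],    # weakly correlated with the others
}
imputed = clustered_impute(demo, threshold=0.8)
print(imputed)
```

In this toy data, `a` and `b` end up in one cluster, so the gap in `b` is predicted from `a`, while `c` forms its own cluster and falls back to its column mean. Each regression only ever sees the columns of one cluster, which is where the reduced cost relative to running an imputer over every feature would come from.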
@@ -0,0 +1,63 @@
+ Metadata-Version: 2.1
+ Name: clustering-imputation
+ Version: 1.0.1
+ Summary: Adding correlation to handle MNAR
+ Author: MRINAL KANGSA BANIK
+ Author-email: <manukbanik30@gmail.com>
+ Keywords: python,imputation,MNAR
+ Classifier: Development Status :: 1 - Planning
+ Classifier: Intended Audience :: Developers
+ Classifier: Programming Language :: Python :: 3
+ Classifier: Operating System :: Unix
+ Classifier: Operating System :: MacOS :: MacOS X
+ Classifier: Operating System :: Microsoft :: Windows
+ Description-Content-Type: text/markdown
+ License-File: LICENSE
+
+
+ # Clustering Imputation
+ ## Installation
+ To install the package, run:
+ ```bash
+ pip install clustering-imputation==1.0.0
+ ```
+ ## Usage
+
+ ```python
+ from clustering_imputation import clusterImputer
+ df = ... # Load your dataset
+ x = clusterImputer(df, "mice", "mean", 0.4, 10)
+ x.impute()
+ ```
+ # About the Package
+
+ ## Problem Statement
+
+ * Traditional imputation techniques face several challenges:
+
+ * High-Dimensional and Sparse Data: Existing methods struggle with large, sparse datasets; efficient techniques for such cases are needed.
+
+ * Temporal Dependencies: Current methods often overlook temporal correlations in data.
+ ## Need to develop a new algo
+ * Non-Random Missingness: Few methods address non-random missing patterns; improvements here could boost real-world application accuracy. We aim to develop an imputation method that considers "Missing Not at Random" (MNAR).
+
+ * Computational Complexity: MICE and EM methods are computationally expensive for high-dimensional data. Our approach aims to reduce time complexity.
+
+ ## Philosophy of Our Solution: Clustered MICE/EM
+
+ We propose a clustering-based approach:
+
+ * Identify correlations between features.
+
+ * Apply MICE/EM within clusters rather than on the entire dataset.
+
+ * Combine results to reconstruct the dataset.
+
+ * This method effectively handles MNAR data by leveraging feature correlations.
+ For further details refer this [ppt](https://docs.google.com/presentation/d/1UZ2uDkleSgB2ZttjG1D6nmQhqk7uz5FQRW5UmSkB0Sg/edit?usp=sharing)
+ ## Contributing
+
+ Pull requests are welcome. For major changes, please open an issue first
+ to discuss what you would like to change.
+
+ Please make sure to update tests as appropriate.
@@ -7,7 +7,7 @@ here = os.path.abspath(os.path.dirname(__file__))
  with codecs.open(os.path.join(here, "README.md"), encoding="utf-8") as fh:
      long_description = "\n" + fh.read()

- VERSION = '1.0.0'
+ VERSION = '1.0.1'
  DESCRIPTION = 'Adding correlation to handle MNAR'
  LONG_DESCRIPTION = 'A package that allows us to impute for all types of missingness(MAR , MCAR , MNAR)'

@@ -1,18 +0,0 @@
- Metadata-Version: 2.1
- Name: clustering_imputation
- Version: 1.0.0
- Summary: Adding correlation to handle MNAR
- Author: MRINAL KANGSA BANIK
- Author-email: <manukbanik30@gmail.com>
- Keywords: python,imputation,MNAR
- Classifier: Development Status :: 1 - Planning
- Classifier: Intended Audience :: Developers
- Classifier: Programming Language :: Python :: 3
- Classifier: Operating System :: Unix
- Classifier: Operating System :: MacOS :: MacOS X
- Classifier: Operating System :: Microsoft :: Windows
- Description-Content-Type: text/markdown
- License-File: LICENSE
-
-
- Hey we will write it later
@@ -1 +0,0 @@
- Hey we will write it later
@@ -1,18 +0,0 @@
- Metadata-Version: 2.1
- Name: clustering-imputation
- Version: 1.0.0
- Summary: Adding correlation to handle MNAR
- Author: MRINAL KANGSA BANIK
- Author-email: <manukbanik30@gmail.com>
- Keywords: python,imputation,MNAR
- Classifier: Development Status :: 1 - Planning
- Classifier: Intended Audience :: Developers
- Classifier: Programming Language :: Python :: 3
- Classifier: Operating System :: Unix
- Classifier: Operating System :: MacOS :: MacOS X
- Classifier: Operating System :: Microsoft :: Windows
- Description-Content-Type: text/markdown
- License-File: LICENSE
-
-
- Hey we will write it later