imageatlas 0.1.0__tar.gz

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (58)
  1. imageatlas-0.1.0/CHANGELOG.md +16 -0
  2. imageatlas-0.1.0/CONTRIBUTING.md +55 -0
  3. imageatlas-0.1.0/LICENSE +21 -0
  4. imageatlas-0.1.0/MANIFEST.in +62 -0
  5. imageatlas-0.1.0/PKG-INFO +203 -0
  6. imageatlas-0.1.0/README.md +139 -0
  7. imageatlas-0.1.0/examples/example_apis.ipynb +376 -0
  8. imageatlas-0.1.0/examples/example_complete_workflow.py +162 -0
  9. imageatlas-0.1.0/imageatlas/__init__.py +42 -0
  10. imageatlas-0.1.0/imageatlas/clustering/__init__.py +14 -0
  11. imageatlas-0.1.0/imageatlas/clustering/base.py +129 -0
  12. imageatlas-0.1.0/imageatlas/clustering/factory.py +43 -0
  13. imageatlas-0.1.0/imageatlas/clustering/gmm.py +165 -0
  14. imageatlas-0.1.0/imageatlas/clustering/hdbscan_clustering.py +175 -0
  15. imageatlas-0.1.0/imageatlas/clustering/kmeans.py +148 -0
  16. imageatlas-0.1.0/imageatlas/core/__init__.py +15 -0
  17. imageatlas-0.1.0/imageatlas/core/clusterer.py +377 -0
  18. imageatlas-0.1.0/imageatlas/core/results.py +362 -0
  19. imageatlas-0.1.0/imageatlas/features/__init__.py +18 -0
  20. imageatlas-0.1.0/imageatlas/features/adapter.py +0 -0
  21. imageatlas-0.1.0/imageatlas/features/batch.py +142 -0
  22. imageatlas-0.1.0/imageatlas/features/cache.py +257 -0
  23. imageatlas-0.1.0/imageatlas/features/extractors/__init__.py +20 -0
  24. imageatlas-0.1.0/imageatlas/features/extractors/base.py +73 -0
  25. imageatlas-0.1.0/imageatlas/features/extractors/clip.py +26 -0
  26. imageatlas-0.1.0/imageatlas/features/extractors/convnext.py +58 -0
  27. imageatlas-0.1.0/imageatlas/features/extractors/dinov2.py +42 -0
  28. imageatlas-0.1.0/imageatlas/features/extractors/efficientnet.py +54 -0
  29. imageatlas-0.1.0/imageatlas/features/extractors/factory.py +47 -0
  30. imageatlas-0.1.0/imageatlas/features/extractors/mobilenet.py +58 -0
  31. imageatlas-0.1.0/imageatlas/features/extractors/resnet.py +63 -0
  32. imageatlas-0.1.0/imageatlas/features/extractors/swin.py +60 -0
  33. imageatlas-0.1.0/imageatlas/features/extractors/vgg.py +46 -0
  34. imageatlas-0.1.0/imageatlas/features/extractors/vit.py +67 -0
  35. imageatlas-0.1.0/imageatlas/features/loaders.py +187 -0
  36. imageatlas-0.1.0/imageatlas/features/metadata.py +81 -0
  37. imageatlas-0.1.0/imageatlas/features/pipeline.py +347 -0
  38. imageatlas-0.1.0/imageatlas/reduction/__init__.py +20 -0
  39. imageatlas-0.1.0/imageatlas/reduction/base.py +131 -0
  40. imageatlas-0.1.0/imageatlas/reduction/factory.py +51 -0
  41. imageatlas-0.1.0/imageatlas/reduction/pca.py +148 -0
  42. imageatlas-0.1.0/imageatlas/reduction/tsne.py +173 -0
  43. imageatlas-0.1.0/imageatlas/reduction/umap_reducer.py +110 -0
  44. imageatlas-0.1.0/imageatlas/visualization/__init__.py +10 -0
  45. imageatlas-0.1.0/imageatlas/visualization/grids.py +197 -0
  46. imageatlas-0.1.0/imageatlas.egg-info/PKG-INFO +203 -0
  47. imageatlas-0.1.0/imageatlas.egg-info/SOURCES.txt +56 -0
  48. imageatlas-0.1.0/imageatlas.egg-info/dependency_links.txt +1 -0
  49. imageatlas-0.1.0/imageatlas.egg-info/requires.txt +14 -0
  50. imageatlas-0.1.0/imageatlas.egg-info/top_level.txt +1 -0
  51. imageatlas-0.1.0/pyproject.toml +75 -0
  52. imageatlas-0.1.0/requirements.txt +9 -0
  53. imageatlas-0.1.0/setup.cfg +4 -0
  54. imageatlas-0.1.0/tests/test_batch_processing.py +130 -0
  55. imageatlas-0.1.0/tests/test_core_api.py +357 -0
  56. imageatlas-0.1.0/tests/test_features_pipeline.py +139 -0
  57. imageatlas-0.1.0/tests/test_reduction_module.py +262 -0
  58. imageatlas-0.1.0/tests/test_visualization.py +379 -0
@@ -0,0 +1,16 @@
1
+ # Changelog
2
+
3
+ All notable changes to this project will be documented in this file.
4
+
5
+ The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/),
6
+ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).
7
+
8
+ ## [0.1.0] - 2024-01-22
9
+ ### Added
10
+ - Core `ImageClusterer` API for high-level clustering workflows.
11
+ - Feature extraction support for DINOv2, ResNet, ViT, CLIP, Swin, and more.
12
+ - Clustering algorithms: K-Means, GMM, and HDBSCAN.
13
+ - Dimensionality reduction wrappers for PCA, UMAP, and t-SNE.
14
+ - Visualization tools (`GridVisualizer`) for creating image grids from clusters.
15
+ - HDF5 caching system for efficient feature storage.
16
+ - Export functionality to CSV, JSON, and Excel.
@@ -0,0 +1,55 @@
1
+ # Contributing to Image Clustering and Visualization Project
2
+
3
+
4
+ Thank you for considering contributing to this project! Your contributions are vital in making this tool more effective and accessible. This document outlines the guidelines for contributing to the project.
5
+
6
+
7
+ ## Getting Started
8
+
9
+ Before you begin contributing, make sure to follow the steps below to set up the project locally and familiarize yourself with the codebase.
10
+
11
+
12
+ ### Prerequisites
13
+
14
+ - Python 3.8+
15
+ - Pip or Conda for package management
16
+ - GPU (optional, but recommended for faster feature extraction)
17
+
18
+
19
+ ### Installation
20
+
21
+ 1. Clone the repository:
22
+
23
+ ```bash
24
+ git clone https://github.com/ahmadjaved97/ImageClusterViz.git
25
+ cd ImageClusterViz
26
+ 2. Set up a virtual environment:
27
+
28
+ ```bash
29
+ python -m venv venv
30
+ source venv/bin/activate
31
+ 3. Install the required dependencies:
32
+
33
+ ```bash
34
+ pip install -r requirements.txt
35
+ ## How to Contribute
36
+
37
+ We encourage you to contribute to the project by reporting bugs, suggesting new features, improving documentation, or submitting code contributions.
38
+
39
+ ### Reporting Issues
40
+
41
+ If you encounter any bugs or have suggestions for improvements, please create an issue on the [GitHub issue tracker](https://github.com/ahmadjaved97/ImageClusterViz/issues). Please provide a clear and concise description of the issue, including steps to reproduce it if it's a bug.
42
+
43
+ ### Suggesting Features
44
+
45
+ Feature requests are also welcome! Please open an issue for feature suggestions and provide details on how the feature would enhance the project. When possible, describe the use case to help us understand your needs.
46
+
47
+
48
+ ### Community Guidelines
49
+ - Be respectful and constructive in your communication.
50
+ - Provide clear and concise commit messages and issue descriptions.
51
+ - Ensure that any feedback provided to other contributors is thoughtful and actionable.
52
+
53
+ ## License
54
+
55
+ By contributing to this project, you agree that your contributions will be licensed under the MIT License.
@@ -0,0 +1,21 @@
1
+ MIT License
2
+
3
+ Copyright (c) 2024 Ahmad Javed
4
+
5
+ Permission is hereby granted, free of charge, to any person obtaining a copy
6
+ of this software and associated documentation files (the "Software"), to deal
7
+ in the Software without restriction, including without limitation the rights
8
+ to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
9
+ copies of the Software, and to permit persons to whom the Software is
10
+ furnished to do so, subject to the following conditions:
11
+
12
+ The above copyright notice and this permission notice shall be included in all
13
+ copies or substantial portions of the Software.
14
+
15
+ THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
16
+ IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
17
+ FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
18
+ AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
19
+ LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
20
+ OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
21
+ SOFTWARE.
@@ -0,0 +1,62 @@
1
+ # MANIFEST.in - Specifies additional files to include in the distribution
2
+
3
+ # Include important documentation files
4
+ include README.md
5
+ include LICENSE
6
+ include CONTRIBUTING.md
7
+ include CHANGELOG.md
8
+ include requirements.txt
9
+ include pyproject.toml
10
+
11
+ # Include all Python files in the package
12
+ recursive-include imageatlas *.py
13
+
14
+ # Include type hints marker
15
+ include imageatlas/py.typed
16
+
17
+ # Include test files
18
+ recursive-include tests *.py
19
+
20
+ # Include example files
21
+ recursive-include examples *.py
22
+ recursive-include examples *.ipynb
23
+
24
+ # Exclude compiled Python files
25
+ global-exclude __pycache__
26
+ global-exclude *.py[cod]
27
+ global-exclude *.so
28
+ global-exclude .DS_Store
29
+
30
+ # Exclude cache and build artifacts
31
+ global-exclude *.egg-info
32
+ global-exclude .pytest_cache
33
+ global-exclude .tox
34
+ global-exclude .coverage
35
+ global-exclude htmlcov
36
+
37
+ # Exclude version control files
38
+ global-exclude .git*
39
+ global-exclude .gitignore
40
+
41
+ # Exclude IDE and editor files
42
+ global-exclude .vscode
43
+ global-exclude .idea
44
+ global-exclude *.swp
45
+ global-exclude *.swo
46
+ global-exclude *~
47
+
48
+ # Exclude output directories
49
+ global-exclude output
50
+ global-exclude output_grids
51
+ global-exclude output_clusters
52
+
53
+ # Exclude virtual environment directories
54
+ global-exclude venv
55
+ global-exclude .venv
56
+ global-exclude env
57
+
58
+ # Exclude HDF5 cache files and pickle files
59
+ global-exclude *.h5
60
+ global-exclude *.hdf5
61
+ global-exclude *.pkl
62
+ global-exclude *.pickle
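As a rough reading aid for the `global-exclude` patterns in the MANIFEST.in hunk above, the sketch below applies a few of them to sample file names with Python's `fnmatch` module. It only approximates how setuptools matches patterns; the sample paths and the `is_excluded` helper are hypothetical and are not part of the package.

```python
from fnmatch import fnmatchcase
from pathlib import PurePosixPath

# A subset of the global-exclude patterns listed in MANIFEST.in.
EXCLUDE_PATTERNS = [
    "*.py[cod]", "*.so", ".DS_Store",
    "*.h5", "*.hdf5", "*.pkl", "*.pickle", "*~",
]


def is_excluded(path: str) -> bool:
    """Return True if the file name (last path component) matches any pattern."""
    name = PurePosixPath(path).name
    return any(fnmatchcase(name, pattern) for pattern in EXCLUDE_PATTERNS)


# Hypothetical paths, chosen only to exercise the patterns.
samples = [
    "imageatlas/core/clusterer.py",
    "imageatlas/core/clusterer.pyc",
    "cache/features.h5",
    "notes.txt~",
]
for sample in samples:
    print(sample, "->", "excluded" if is_excluded(sample) else "kept")
```

Note that `*.py[cod]` requires one extra character after `.py`, so ordinary `.py` sources are kept while `.pyc`, `.pyo`, and `.pyd` files are dropped.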
@@ -0,0 +1,203 @@
1
+ Metadata-Version: 2.4
2
+ Name: imageatlas
3
+ Version: 0.1.0
4
+ Summary: ImageAtlas: A toolkit for organizing, cleaning and analysing your image datasets.
5
+ Author-email: Ahmad Javed <ahmadjaved97@gmail.com>
6
+ Maintainer-email: Ahmad Javed <ahmadjaved97@gmail.com>
7
+ License: MIT License
8
+
9
+ Copyright (c) 2024 Ahmad Javed
10
+
11
+ Permission is hereby granted, free of charge, to any person obtaining a copy
12
+ of this software and associated documentation files (the "Software"), to deal
13
+ in the Software without restriction, including without limitation the rights
14
+ to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
15
+ copies of the Software, and to permit persons to whom the Software is
16
+ furnished to do so, subject to the following conditions:
17
+
18
+ The above copyright notice and this permission notice shall be included in all
19
+ copies or substantial portions of the Software.
20
+
21
+ THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
22
+ IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
23
+ FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
24
+ AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
25
+ LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
26
+ OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
27
+ SOFTWARE.
28
+
29
+ Project-URL: Homepage, https://github.com/ahmadjaved97/imageatlas
30
+ Project-URL: Documentation, https://github.com/ahmadjaved97/imageatlas
31
+ Project-URL: Repository, https://github.com/ahmadjaved97/imageatlas
32
+ Project-URL: Issues, https://github.com/ahmadjaved97/imageatlas/issues
33
+ Project-URL: Changelog, https://github.com/ahmadjaved97/imageatlas/blob/main/CHANGELOG.md
34
+ Keywords: machine-learning,computer-vision,clustering,embeddings,feature-extraction,dataset-visualization,deep-learning,image-processing,data-science,pytorch
35
+ Classifier: Development Status :: 4 - Beta
36
+ Classifier: Intended Audience :: Developers
37
+ Classifier: Intended Audience :: Science/Research
38
+ Classifier: License :: OSI Approved :: MIT License
39
+ Classifier: Operating System :: OS Independent
40
+ Classifier: Programming Language :: Python :: 3
41
+ Classifier: Programming Language :: Python :: 3.10
42
+ Classifier: Programming Language :: Python :: 3.11
43
+ Classifier: Programming Language :: Python :: 3.12
44
+ Classifier: Topic :: Scientific/Engineering :: Artificial Intelligence
45
+ Classifier: Topic :: Scientific/Engineering :: Image Recognition
46
+ Classifier: Topic :: Scientific/Engineering :: Visualization
47
+ Requires-Python: >=3.8
48
+ Description-Content-Type: text/markdown
49
+ License-File: LICENSE
50
+ Requires-Dist: torch>=2.0.1
51
+ Requires-Dist: torchvision>=0.15.2
52
+ Requires-Dist: numpy>=1.24.2
53
+ Requires-Dist: Pillow>=9.5.0
54
+ Requires-Dist: opencv-python>=4.8.0.74
55
+ Requires-Dist: scikit-learn>=1.3.0
56
+ Requires-Dist: tqdm>=4.67.1
57
+ Requires-Dist: h5py>=3.15.1
58
+ Requires-Dist: pandas>=2.3.3
59
+ Provides-Extra: full
60
+ Requires-Dist: umap-learn; extra == "full"
61
+ Requires-Dist: hdbscan; extra == "full"
62
+ Requires-Dist: openpyxl; extra == "full"
63
+ Dynamic: license-file
64
+
65
+ # ImageAtlas
66
+
67
+ ## Overview
68
+
69
+ ImageAtlas is a comprehensive toolkit designed to organize, clean, and analyze image datasets.
70
+
71
+ ⚠️ Note: ImageAtlas is currently in active development. The current version focuses on clustering and visualization functionality, with additional features coming soon.
72
+
73
+ It is intended for dataset curation, duplicate detection, quality control, and exploratory data analysis.
74
+
75
+ ## 📦 Installation
76
+
77
+ **Basic Installation**
78
+
79
+ ```
80
+ pip install imageatlas
81
+ ```
82
+
83
+ **Full Installation**
84
+
85
+ ```
86
+ pip install imageatlas[full]
87
+ ```
88
+
89
+ **From Source**
90
+ ```
91
+ git clone https://github.com/ahmadjaved97/ImageAtlas.git
92
+ cd ImageAtlas
93
+ pip install -e .
94
+ ```
95
+
96
+ ## 🚀 Quick Start
97
+
98
+ ### Minimal Working Example
99
+
100
+ ```python
101
+ import os
102
+ from imageatlas import ImageClusterer
103
+
104
+ # Initialize clusterer
105
+ clusterer = ImageClusterer(
106
+ model='dinov2', # State-of-the-art features
107
+ clustering_method='kmeans',
108
+ n_clusters=10,
109
+ device='cuda' # or 'cpu'
110
+ )
111
+
112
+ # Run clustering on your images
113
+ results = clusterer.fit("./path/to/images")
114
+
115
+ # Save results to JSON
116
+ results.to_json("./output/clustering_results.json")
117
+
118
+ # Create visual grids for each cluster
119
+ results.create_grids(
120
+ image_dir="./path/to/images",
121
+ output_dir="./output/grids"
122
+ )
123
+
124
+ # Organize images into cluster folders
125
+ results.create_cluster_folders(
126
+ image_dir="./path/to/images",
127
+ output_dir="./output/clusters"
128
+ )
129
+ ```
130
+
131
+ That's it! Your images are now clustered, visualized, and organized.
132
+
133
+ ## Available Models & Algorithms
134
+
135
+ ### Feature Extraction Models
136
+
137
+ | Model | Variants |
138
+ | ---------------- | --------------------------------------------------- |
139
+ | **DINOv2** | `vits14`, `vitb14`, `vitl14`, `vitg14` |
140
+ | **ViT** | `b_16`, `b_32`, `l_16`, `l_32`, `h_14` |
141
+ | **ResNet** | `18`, `34`, `50`, `101`, `152` |
142
+ | **EfficientNet** | `s`, `m`, `l` |
143
+ | **CLIP** | `RN50`, `RN101`, `ViT-B/32`, `ViT-B/16`, `ViT-L/14` |
144
+ | **ConvNeXt** | `tiny`, `small`, `base`, `large` |
145
+ | **Swin** | `t`, `s`, `b`, `v2_t`, `v2_s`, `v2_b` |
146
+ | **MobileNetV3** | `small`, `large` |
147
+ | **VGG16** | \- |
148
+
149
+ ### Clustering Algorithms
150
+
151
+ | Algorithm | Parameters |
152
+ | ----------- | --------------------------------- |
153
+ | **K-Means** | `n_clusters` |
154
+ | **HDBSCAN** | `min_cluster_size`, `min_samples` |
155
+ | **GMM** | `n_components`, `covariance_type` |
156
+
157
+ ### Dimensionality Reduction
158
+
159
+ | Method | Parameters |
160
+ | --------------------------| ----------------------------------------- |
161
+ | **PCA** | `n_components`, `whiten` |
162
+ | **UMAP** | `n_components`, `n_neighbors`, `min_dist` |
163
+ | **t-SNE (in development)** | `n_components`, `perplexity` |
164
+
165
+
166
+ ## 📝 Citation
167
+
168
+ If you use ImageAtlas in your research, please cite:
169
+
170
+ ```bibtex
171
+ @software{imageatlas2024,
172
+ author = {Javed, Ahmad},
173
+ title = {ImageAtlas: A Toolkit for Organizing and Analyzing Image Datasets},
174
+ year = {2024},
175
+ url = {https://github.com/ahmadjaved97/ImageAtlas}
176
+ }
177
+ ```
178
+ ## Acknowledgments
179
+
180
+ - [DINOv2](https://github.com/facebookresearch/dinov2): Facebook Research
181
+ - [CLIP](https://github.com/openai/CLIP): OpenAI
182
+ - [Vision Transformers](https://github.com/google-research/vision_transformer): Google Research
183
+ - Built with [PyTorch](https://github.com/pytorch/pytorch), [scikit-learn](https://github.com/scikit-learn/scikit-learn), and [OpenCV](https://github.com/opencv/opencv)
184
+
185
+
186
+ ### Sample Output
187
+ - Dataset Used: [Fruit and Vegetable Classification](https://www.kaggle.com/code/abdelrahman16/fruit-and-vegetable-classification/input)
188
+ - Number of Clusters: 8
189
+ - Model Used: ViT
190
+ - Clustering Method: K-Means
191
+ - Output:
192
+ <p align="center">
193
+ <img src="./output_grids/cluster_0.jpg" alt="Image 1" width="250" height="250">
194
+ <img src="./output_grids/cluster_1.jpg" alt="Image 2" width="250" height="250">
195
+ <img src="./output_grids/cluster_2.jpg" alt="Image 3" width="250" height="250">
196
+ <img src="./output_grids/cluster_3.jpg" alt="Image 4" width="250" height="250">
197
+ <img src="./output_grids/cluster_4.jpg" alt="Image 5" width="250" height="250">
198
+ <img src="./output_grids/cluster_5.jpg" alt="Image 6" width="250" height="250">
199
+ <img src="./output_grids/cluster_6.jpg" alt="Image 7" width="250" height="250">
200
+ <img src="./output_grids/cluster_7.jpg" alt="Image 8" width="250" height="250">
201
+ </p>
202
+
203
+
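The header block at the top of the PKG-INFO hunk above (`Name`, `Version`, `Requires-Dist`, `Provides-Extra`) uses an email-style `Key: value` layout, so it can be read with Python's standard `email` parser. The sketch below is a minimal illustration on an abbreviated copy of those headers. The way it splits core requirements from the `full` extra is a simplification that handles only the `extra == "..."` marker form seen here, not a general environment-marker parser.

```python
from email.parser import Parser

# Abbreviated copy of the PKG-INFO headers; the License text and the
# long description are left out for brevity.
PKG_INFO = """\
Metadata-Version: 2.4
Name: imageatlas
Version: 0.1.0
Requires-Python: >=3.8
Requires-Dist: torch>=2.0.1
Requires-Dist: numpy>=1.24.2
Provides-Extra: full
Requires-Dist: umap-learn; extra == "full"
Requires-Dist: hdbscan; extra == "full"
Requires-Dist: openpyxl; extra == "full"
"""

metadata = Parser().parsestr(PKG_INFO)

core = []    # requirements installed by `pip install imageatlas`
extras = {}  # extra name -> requirements added by `imageatlas[<extra>]`
for requirement in metadata.get_all("Requires-Dist", []):
    spec, _, marker = requirement.partition(";")
    marker = marker.strip()
    if marker.startswith("extra =="):
        extra_name = marker.split("==", 1)[1].strip().strip('"')
        extras.setdefault(extra_name, []).append(spec.strip())
    else:
        core.append(spec.strip())

print(metadata["Name"], metadata["Version"])
print("core:", core)
print("extras:", extras)
```

Read this way, the metadata shows that `umap-learn`, `hdbscan`, and `openpyxl` are installed only when the `full` extra is requested.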
@@ -0,0 +1,139 @@
1
+ # ImageAtlas
2
+
3
+ ## Overview
4
+
5
+ ImageAtlas is a comprehensive toolkit designed to organize, clean, and analyze image datasets.
6
+
7
+ ⚠️ Note: ImageAtlas is currently in active development. The current version focuses on clustering and visualization functionality, with additional features coming soon.
8
+
9
+ It is intended for dataset curation, duplicate detection, quality control, and exploratory data analysis.
10
+
11
+ ## 📦 Installation
12
+
13
+ **Basic Installation**
14
+
15
+ ```
16
+ pip install imageatlas
17
+ ```
18
+
19
+ **Full Installation**
20
+
21
+ ```
22
+ pip install imageatlas[full]
23
+ ```
24
+
25
+ **From Source**
26
+ ```
27
+ git clone https://github.com/ahmadjaved97/ImageAtlas.git
28
+ cd ImageAtlas
29
+ pip install -e .
30
+ ```
31
+
32
+ ## 🚀 Quick Start
33
+
34
+ ### Minimal Working Example
35
+
36
+ ```python
37
+ import os
38
+ from imageatlas import ImageClusterer
39
+
40
+ # Initialize clusterer
41
+ clusterer = ImageClusterer(
42
+ model='dinov2', # State-of-the-art features
43
+ clustering_method='kmeans',
44
+ n_clusters=10,
45
+ device='cuda' # or 'cpu'
46
+ )
47
+
48
+ # Run clustering on your images
49
+ results = clusterer.fit("./path/to/images")
50
+
51
+ # Save results to JSON
52
+ results.to_json("./output/clustering_results.json")
53
+
54
+ # Create visual grids for each cluster
55
+ results.create_grids(
56
+ image_dir="./path/to/images",
57
+ output_dir="./output/grids"
58
+ )
59
+
60
+ # Organize images into cluster folders
61
+ results.create_cluster_folders(
62
+ image_dir="./path/to/images",
63
+ output_dir="./output/clusters"
64
+ )
65
+ ```
66
+
67
+ That's it! Your images are now clustered, visualized, and organized.
68
+
69
+ ## Available Models & Algorithms
70
+
71
+ ### Feature Extraction Models
72
+
73
+ | Model | Variants |
74
+ | ---------------- | --------------------------------------------------- |
75
+ | **DINOv2** | `vits14`, `vitb14`, `vitl14`, `vitg14` |
76
+ | **ViT** | `b_16`, `b_32`, `l_16`, `l_32`, `h_14` |
77
+ | **ResNet** | `18`, `34`, `50`, `101`, `152` |
78
+ | **EfficientNet** | `s`, `m`, `l` |
79
+ | **CLIP** | `RN50`, `RN101`, `ViT-B/32`, `ViT-B/16`, `ViT-L/14` |
80
+ | **ConvNeXt** | `tiny`, `small`, `base`, `large` |
81
+ | **Swin** | `t`, `s`, `b`, `v2_t`, `v2_s`, `v2_b` |
82
+ | **MobileNetV3** | `small`, `large` |
83
+ | **VGG16** | \- |
84
+
85
+ ### Clustering Algorithms
86
+
87
+ | Algorithm | Parameters |
88
+ | ----------- | --------------------------------- |
89
+ | **K-Means** | `n_clusters` |
90
+ | **HDBSCAN** | `min_cluster_size`, `min_samples` |
91
+ | **GMM** | `n_components`, `covariance_type` |
92
+
93
+ ### Dimensionality Reduction
94
+
95
+ | Method | Parameters |
96
+ | --------------------------| ----------------------------------------- |
97
+ | **PCA** | `n_components`, `whiten` |
98
+ | **UMAP** | `n_components`, `n_neighbors`, `min_dist` |
99
+ | **t-SNE (in development)** | `n_components`, `perplexity` |
100
+
101
+
102
+ ## 📝 Citation
103
+
104
+ If you use ImageAtlas in your research, please cite:
105
+
106
+ ```bibtex
107
+ @software{imageatlas2024,
108
+ author = {Javed, Ahmad},
109
+ title = {ImageAtlas: A Toolkit for Organizing and Analyzing Image Datasets},
110
+ year = {2024},
111
+ url = {https://github.com/ahmadjaved97/ImageAtlas}
112
+ }
113
+ ```
114
+ ## Acknowledgments
115
+
116
+ - [DINOv2](https://github.com/facebookresearch/dinov2): Facebook Research
117
+ - [CLIP](https://github.com/openai/CLIP): OpenAI
118
+ - [Vision Transformers](https://github.com/google-research/vision_transformer): Google Research
119
+ - Built with [PyTorch](https://github.com/pytorch/pytorch), [scikit-learn](https://github.com/scikit-learn/scikit-learn), and [OpenCV](https://github.com/opencv/opencv)
120
+
121
+
122
+ ### Sample Output
123
+ - Dataset Used: [Fruit and Vegetable Classification](https://www.kaggle.com/code/abdelrahman16/fruit-and-vegetable-classification/input)
124
+ - Number of Clusters: 8
125
+ - Model Used: ViT
126
+ - Clustering Method: K-Means
127
+ - Output:
128
+ <p align="center">
129
+ <img src="./output_grids/cluster_0.jpg" alt="Image 1" width="250" height="250">
130
+ <img src="./output_grids/cluster_1.jpg" alt="Image 2" width="250" height="250">
131
+ <img src="./output_grids/cluster_2.jpg" alt="Image 3" width="250" height="250">
132
+ <img src="./output_grids/cluster_3.jpg" alt="Image 4" width="250" height="250">
133
+ <img src="./output_grids/cluster_4.jpg" alt="Image 5" width="250" height="250">
134
+ <img src="./output_grids/cluster_5.jpg" alt="Image 6" width="250" height="250">
135
+ <img src="./output_grids/cluster_6.jpg" alt="Image 7" width="250" height="250">
136
+ <img src="./output_grids/cluster_7.jpg" alt="Image 8" width="250" height="250">
137
+ </p>
138
+
139
+
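The quick-start example in the README calls `results.create_cluster_folders(...)` to organize images by cluster. The sketch below shows the general idea using only the standard library. It is not the library's implementation: the `copy_into_cluster_folders` helper, the `cluster_<label>` folder naming, and the file-name-to-label mapping are all assumptions made for illustration.

```python
import shutil
import tempfile
from pathlib import Path


def copy_into_cluster_folders(assignments, image_dir, output_dir):
    """Copy each image into output_dir/cluster_<label>/ (hypothetical layout)."""
    image_dir, output_dir = Path(image_dir), Path(output_dir)
    for filename, label in assignments.items():
        target = output_dir / f"cluster_{label}"
        target.mkdir(parents=True, exist_ok=True)
        shutil.copy2(image_dir / filename, target / filename)


# Demo with empty placeholder files standing in for real images.
with tempfile.TemporaryDirectory() as tmp:
    images = Path(tmp) / "images"
    images.mkdir()
    assignments = {"a.jpg": 0, "b.jpg": 1, "c.jpg": 0}  # file name -> cluster label
    for name in assignments:
        (images / name).touch()

    clusters = Path(tmp) / "clusters"
    copy_into_cluster_folders(assignments, images, clusters)

    layout = sorted(p.relative_to(clusters).as_posix() for p in clusters.rglob("*.jpg"))
    print(layout)
```

Copying rather than moving leaves the source directory untouched, which makes it safe to re-run with a different number of clusters.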