active-vision 0.0.4__tar.gz → 0.1.0__tar.gz

@@ -1,10 +1,11 @@
  Metadata-Version: 2.2
  Name: active-vision
- Version: 0.0.4
+ Version: 0.1.0
  Summary: Active learning for edge vision.
  Requires-Python: >=3.10
  Description-Content-Type: text/markdown
  License-File: LICENSE
+ Requires-Dist: accelerate>=1.2.1
  Requires-Dist: datasets>=3.2.0
  Requires-Dist: fastai>=2.7.18
  Requires-Dist: gradio>=5.12.0
@@ -13,6 +14,8 @@ Requires-Dist: ipywidgets>=8.1.5
  Requires-Dist: loguru>=0.7.3
  Requires-Dist: seaborn>=0.13.2
  Requires-Dist: timm>=1.0.13
+ Requires-Dist: transformers>=4.48.0
+ Requires-Dist: xinfer>=0.3.2

  ![Python Version](https://img.shields.io/badge/python-3.10%2B-blue?style=for-the-badge)
  ![License](https://img.shields.io/badge/License-Apache%202.0-green.svg?style=for-the-badge)
@@ -23,17 +26,38 @@ Requires-Dist: timm>=1.0.13
  <img src="https://raw.githubusercontent.com/dnth/active-vision/main/assets/logo.png" alt="active-vision">
  </p>

- Active learning at the edge for computer vision.
+ The goal of this project is to create a framework for the active learning loop for computer vision. The diagram below shows a general workflow of how the active learning loop works.

- The goal of this project is to create a framework for the active learning loop for computer vision deployed on edge devices.
+ <p align="center">
+ <img src="https://raw.githubusercontent.com/dnth/active-vision/main/assets/data_flywheel.gif" alt="active-vision" width="700">
+ </p>

- Supported tasks:
+ ### Supported tasks:
  - [X] Image classification
  - [ ] Object detection
  - [ ] Segmentation

+ ### Supported models:
+ - [X] Fastai models
+ - [X] Torchvision models
+ - [X] Timm models
+ - [ ] Hugging Face models
+
+ ### Supported Active Learning Strategies:
+
+ Uncertainty Sampling:
+ - [X] Least confidence
+ - [ ] Margin of confidence
+ - [ ] Ratio of confidence
+ - [ ] Entropy

- ## Installation
+ Diverse Sampling:
+ - [X] Random sampling
+ - [ ] Model-based outlier
+ - [ ] Cluster-based
+ - [ ] Representative
+
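Of the strategies above, least confidence is the one implemented so far: it ranks unlabeled images by `1 - max(p)` over the predicted class probabilities, so the images the model is least sure about get labeled first. A minimal standalone sketch of the idea in NumPy (illustrative only, not the package's internal implementation):

```python
import numpy as np

def least_confidence_sample(probs: np.ndarray, num_samples: int) -> np.ndarray:
    """Return indices of the num_samples most uncertain predictions.

    probs: (n_images, n_classes) softmax outputs over the unlabeled pool.
    """
    # Least confidence score: 1 minus the probability of the top class.
    scores = 1.0 - probs.max(axis=1)
    # Highest score = least confident = most informative to label next.
    return np.argsort(scores)[::-1][:num_samples]

# Toy example: 4 unlabeled images, 3 classes.
probs = np.array([
    [0.98, 0.01, 0.01],  # confident prediction -> low score
    [0.40, 0.35, 0.25],  # uncertain -> high score
    [0.70, 0.20, 0.10],
    [0.34, 0.33, 0.33],  # nearly uniform -> most uncertain
])
print(least_confidence_sample(probs, num_samples=2))  # [3 1]
```

The unimplemented strategies differ only in the score: margin uses the gap between the top two probabilities, ratio their quotient, and entropy uses `-sum(p * log(p))`.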
+ ## 📦 Installation

  Get a release from PyPI
  ```bash
@@ -58,18 +82,18 @@ uv sync
  Once the virtual environment is created, you can install the package using pip.

  > [!TIP]
- > If you're using uv add a uv before the pip install command to install into your virtual environment. Eg:
+ > If you're using uv, add `uv` before the pip install command to install into your virtual environment. E.g.:
  > ```bash
  > uv pip install active-vision
  > ```

- ## Usage
+ ## 🛠️ Usage
  See the [notebook](./nbs/04_relabel_loop.ipynb) for a complete example.

- Be sure to prepared 3 datasets:
- - [initial_samples](./nbs/initial_samples.parquet): A dataframe of an existing labeled training dataset to seed the training set.
- - [unlabeled](./nbs/unlabeled_samples.parquet): A dataframe of unlabeled data which we will sample from using active learning.
- - [eval](./nbs/evaluation_samples.parquet): A dataframe of labeled data which we will use to evaluate the performance of the model.
+ Be sure to prepare 3 subsets of the dataset:
+ - [Initial samples](./nbs/initial_samples.parquet): A dataframe of labeled images to train an initial model. If you don't have any labeled data, you can label some images yourself.
+ - [Unlabeled samples](./nbs/unlabeled_samples.parquet): A dataframe of *unlabeled* images. We will continuously sample from this set using active learning strategies.
+ - [Evaluation samples](./nbs/evaluation_samples.parquet): A dataframe of *labeled* images. We will use this set to evaluate the performance of the model. This is the test set, DO NOT use it for active learning. Split this out in the beginning.

  As a toy example I created the above 3 datasets from the imagenette dataset.

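A minimal sketch of loading the three subsets with pandas (the file names match the links above; the column name used in the sanity check is an assumption, so check the notebooks for the exact schema):

```python
import pandas as pd

# The three splits described above, as shipped with the example notebooks.
train_df = pd.read_parquet("nbs/initial_samples.parquet")        # labeled seed set
unlabeled_df = pd.read_parquet("nbs/unlabeled_samples.parquet")  # pool to sample from
eval_df = pd.read_parquet("nbs/evaluation_samples.parquet")      # held-out test set

# Keep the evaluation set disjoint from the pool; "filepath" is a
# hypothetical column name standing in for the real image identifier.
assert not set(eval_df["filepath"]) & set(unlabeled_df["filepath"])
```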
@@ -100,7 +124,7 @@ uncertain_df = al.sample_uncertain(pred_df, num_samples=10)
  al.label(uncertain_df, output_filename="uncertain")
  ```

- ![Gradio UI](./assets/labeling_ui.png)
+ ![Gradio UI](https://raw.githubusercontent.com/dnth/active-vision/main/assets/labeling_ui.png)

  Once complete, the labeled samples will be saved into a new df.
  We can now add the newly labeled data to the training set.
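Putting the hunk in context, one iteration of the loop looks roughly like the skeleton below. Only `sample_uncertain()` and `label()` appear verbatim in this diff; `ActiveLearner`, `train()`, `predict()`, and `add_to_train_set()` are placeholders for whatever the package actually exposes, so treat this as a hypothetical sketch and see nbs/04_relabel_loop.ipynb for the real API:

```python
# Hypothetical skeleton of one active learning iteration.
al = ActiveLearner("resnet18")          # placeholder: small proxy model
al.train(train_df)                      # placeholder: fit on labeled seed set
pred_df = al.predict(unlabeled_df)      # placeholder: score the unlabeled pool

# These two calls are quoted from the README snippet above: pick the 10
# most uncertain images and label them in the Gradio UI.
uncertain_df = al.sample_uncertain(pred_df, num_samples=10)
al.label(uncertain_df, output_filename="uncertain")

# Placeholder for the merge step: append the newly labeled rows to the
# training set, then repeat until the model is good enough.
al.add_to_train_set(uncertain_df)
```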
@@ -119,11 +143,77 @@ Repeat the process until the model is good enough. Use the dataset to train a la
  >
  > But using the dataset of 200+ images, I trained a more capable model (convnext_small_in22k) and got 99.3% accuracy on the evaluation set. See the [notebook](./nbs/05_retrain_larger.ipynb) for more details.

- ## Workflow
- There are two workflows for active learning at the edge that we can use depending on the availability of labeled data.
+
+ ## 📊 Benchmarks
+ This section contains the benchmarks I ran using the active learning loop on various datasets.
+
+ Column description:
+ - `#Labeled Images`: The number of labeled images used to train the model.
+ - `Evaluation Accuracy`: The accuracy of the model on the evaluation set.
+ - `Train Epochs`: The number of epochs used to train the model.
+ - `Model`: The model used for training.
+ - `Active Learning`: Whether active learning was used to train the model.
+ - `Source`: The source of the results.
+
+ ### Imagenette
+ - num classes: 10
+ - num images: 9469
+
+ To start the active learning loop, I labeled 100 images (10 images from each class) and iteratively relabeled the most informative images until I hit 275 labeled images.
+
+ The active learning loop is an iterative process and can keep going until you hit a stopping point. You can decide your own stopping point based on your use case. It could be:
+ - You ran out of data to label.
+ - You hit a performance goal.
+ - You hit a budget.
+ - Other criteria.
+
+ For this dataset, I decided to stop the active learning loop at 275 labeled images because the performance on the evaluation set is close to the top performing model on the leaderboard.
+
+ | #Labeled Images | Evaluation Accuracy | Train Epochs | Model                | Active Learning | Source |
+ |-----------------|---------------------|--------------|----------------------|-----------------|--------|
+ | 9469            | 94.90%              | 80           | xse_resnext50        | ❌              | [Link](https://github.com/fastai/imagenette) |
+ | 9469            | 95.11%              | 200          | xse_resnext50        | ❌              | [Link](https://github.com/fastai/imagenette) |
+ | 275             | 99.33%              | 6            | convnext_small_in22k | ✓               | [Link](https://github.com/dnth/active-vision/blob/main/nbs/05_retrain_larger.ipynb) |
+ | 275             | 93.40%              | 4            | resnet18             | ✓               | [Link](https://github.com/dnth/active-vision/blob/main/nbs/04_relabel_loop.ipynb) |
+
+ ### Dog Food
+ - num classes: 2
+ - num images: 2100
+
+ To start the active learning loop, I labeled 20 images (10 images from each class) and iteratively relabeled the most informative images until I hit 160 labeled images.
+
+ I decided to stop the active learning loop at 160 labeled images because the performance on the evaluation set is close to the top performing model on the leaderboard. You can decide your own stopping point based on your use case.
+
+ | #Labeled Images | Evaluation Accuracy | Train Epochs | Model | Active Learning | Source |
+ |-----------------|---------------------|--------------|-------|-----------------|--------|
+ | 2100 | 99.70% | ? | vit-base-patch16-224 | ❌ | [Link](https://huggingface.co/abhishek/autotrain-dog-vs-food) |
+ | 160 | 100.00% | 6 | convnext_small_in22k | ✓ | [Link](https://github.com/dnth/active-vision/blob/main/nbs/dog_food_dataset/02_train.ipynb) |
+ | 160 | 97.60% | 4 | resnet18 | ✓ | [Link](https://github.com/dnth/active-vision/blob/main/nbs/dog_food_dataset/01_label.ipynb) |
+
+ ### Oxford-IIIT Pet
+ - num classes: 37
+ - num images: 3680
+
+ To start the active learning loop, I labeled 370 images (10 images from each class) and iteratively relabeled the most informative images until I hit 612 labeled images.
+
+ I decided to stop the active learning loop at 612 labeled images because the performance on the evaluation set is close to the top performing model on the leaderboard. You can decide your own stopping point based on your use case.
+
+ | #Labeled Images | Evaluation Accuracy | Train Epochs | Model | Active Learning | Source |
+ |-----------------|---------------------|--------------|-------|-----------------|--------|
+ | 3680 | 95.40% | 5 | vit-base-patch16-224 | ❌ | [Link](https://huggingface.co/walterg777/vit-base-oxford-iiit-pets) |
+ | 612 | 90.26% | 11 | convnext_small_in22k | ✓ | [Link](https://github.com/dnth/active-vision/blob/main/nbs/oxford_iiit_pets/02_train.ipynb) |
+ | 612 | 91.38% | 11 | vit-base-patch16-224 | ✓ | [Link](https://github.com/dnth/active-vision/blob/main/nbs/oxford_iiit_pets/03_train_vit.ipynb) |
+
+
+ ## ➿ Workflow
+ This section describes a more detailed workflow for active learning. There are two workflows for active learning that we can use depending on the availability of labeled data.

  ### With unlabeled data
- If we have no labeled data, we can use active learning to iteratively improve the model and build a labeled dataset.
+ If we have no labeled data, the goal of the active learning loop is to build a reasonably good labeled dataset to train a larger model.
+
+ Steps:

  1. Load a small proxy model.
  2. Label an initial dataset. If there is none, you'll have to label some images.
@@ -155,24 +245,25 @@ graph TD
  ```
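Read as code, the loop in the diagram above amounts to the following toy simulation. It is fully self-contained (NumPy only): a random matrix stands in for model predictions and instant "labeling" stands in for the Gradio UI, so it demonstrates the control flow rather than the package API:

```python
import numpy as np

rng = np.random.default_rng(0)
pool = rng.random((100, 10))            # fake scores for 100 unlabeled images
labeled: list[int] = []                 # indices we have "labeled" so far

budget = 30                             # stopping point: a labeling budget
while len(labeled) < budget:
    remaining = [i for i in range(len(pool)) if i not in labeled]
    probs = pool[remaining]
    probs = probs / probs.sum(axis=1, keepdims=True)   # fake softmax
    scores = 1.0 - probs.max(axis=1)                   # least confidence
    batch = [remaining[i] for i in np.argsort(scores)[::-1][:10]]
    labeled.extend(batch)               # a human would label these in the UI

print(f"labeled {len(labeled)} of {len(pool)} images")  # labeled 30 of 100 images
```

In a real run the stopping condition is whatever criterion fits your project: a labeling budget as here, a performance goal on the evaluation set, or simply running out of unlabeled data.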

  ### With labeled data
- If we have a labeled dataset, we can use active learning to iteratively improve the dataset and the model by fixing the most important label errors.
+ If we already have a labeled dataset, the goal of the active learning loop is to iteratively improve the dataset and the model by fixing the most important label errors.
+
+ Steps:

  1. Load a small proxy model.
  2. Train the proxy model on the labeled dataset.
  3. Run inference on the entire labeled dataset.
- 4. Get the most important label errors with active learning.
+ 4. Get the most impactful label errors with active learning.
  5. Fix the label errors.
  6. Repeat steps 2-5 until the dataset is good enough.
  7. Save the labeled dataset.
  8. Train a larger model on the saved labeled dataset.


-
  ```mermaid
  graph TD
  A[Load a small proxy model] --> B[Train proxy model on labeled dataset]
  B --> C[Run inference on labeled dataset]
- C --> D[Get important label errors using active learning]
+ C --> D[Get label errors using active learning]
  D --> E[Fix label errors]
  E --> F{Dataset good enough?}
  F -->|No| B
@@ -181,6 +272,7 @@ graph TD
  ```
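A concrete reading of the "get label errors" step: flag images where the proxy model confidently disagrees with the stored label. A self-contained sketch of that idea (the `label` column name is an assumed schema; this is not the package's implementation):

```python
import numpy as np
import pandas as pd

def flag_label_errors(df: pd.DataFrame, probs: np.ndarray,
                      threshold: float = 0.9) -> pd.DataFrame:
    """Flag rows where the model confidently predicts a different class.

    df: labeled dataset with an integer `label` column (assumed schema).
    probs: (n_rows, n_classes) softmax outputs from the proxy model.
    """
    pred = probs.argmax(axis=1)
    conf = probs.max(axis=1)
    # Confident disagreement with the stored label = likely label error.
    suspects = (pred != df["label"].to_numpy()) & (conf >= threshold)
    out = df.loc[suspects].copy()
    out["predicted"] = pred[suspects]
    out["confidence"] = conf[suspects]
    # Review the most confident disagreements first.
    return out.sort_values("confidence", ascending=False)
```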


+
  <!-- ## Methodology
  To test out the workflows we will use the [imagenette dataset](https://huggingface.co/datasets/frgfm/imagenette). But this will be applicable to any dataset.

@@ -7,17 +7,38 @@
  <img src="https://raw.githubusercontent.com/dnth/active-vision/main/assets/logo.png" alt="active-vision">
  </p>

- Active learning at the edge for computer vision.
+ The goal of this project is to create a framework for the active learning loop for computer vision. The diagram below shows a general workflow of how the active learning loop works.

- The goal of this project is to create a framework for the active learning loop for computer vision deployed on edge devices.
+ <p align="center">
+ <img src="https://raw.githubusercontent.com/dnth/active-vision/main/assets/data_flywheel.gif" alt="active-vision" width="700">
+ </p>

- Supported tasks:
+ ### Supported tasks:
  - [X] Image classification
  - [ ] Object detection
  - [ ] Segmentation

+ ### Supported models:
+ - [X] Fastai models
+ - [X] Torchvision models
+ - [X] Timm models
+ - [ ] Hugging Face models
+
+ ### Supported Active Learning Strategies:
+
+ Uncertainty Sampling:
+ - [X] Least confidence
+ - [ ] Margin of confidence
+ - [ ] Ratio of confidence
+ - [ ] Entropy

- ## Installation
+ Diverse Sampling:
+ - [X] Random sampling
+ - [ ] Model-based outlier
+ - [ ] Cluster-based
+ - [ ] Representative
+
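For reference, the four uncertainty measures listed above have standard definitions (these are the textbook formulas, not something specific to this package). For a softmax output with sorted probabilities over C classes:

```latex
% Standard uncertainty-sampling scores; higher = more informative to label.
% p_(1) >= p_(2) are the two largest predicted class probabilities.
\begin{aligned}
\text{Least confidence:}     \quad & 1 - p_{(1)} \\
\text{Margin of confidence:} \quad & 1 - \left(p_{(1)} - p_{(2)}\right) \\
\text{Ratio of confidence:}  \quad & p_{(2)} / p_{(1)} \\
\text{Entropy:}              \quad & -\sum_{c=1}^{C} p_c \log p_c
\end{aligned}
```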
+ ## 📦 Installation

  Get a release from PyPI
  ```bash
@@ -42,18 +63,18 @@ uv sync
  Once the virtual environment is created, you can install the package using pip.

  > [!TIP]
- > If you're using uv add a uv before the pip install command to install into your virtual environment. Eg:
+ > If you're using uv, add `uv` before the pip install command to install into your virtual environment. E.g.:
  > ```bash
  > uv pip install active-vision
  > ```

- ## Usage
+ ## 🛠️ Usage
  See the [notebook](./nbs/04_relabel_loop.ipynb) for a complete example.

- Be sure to prepared 3 datasets:
- - [initial_samples](./nbs/initial_samples.parquet): A dataframe of an existing labeled training dataset to seed the training set.
- - [unlabeled](./nbs/unlabeled_samples.parquet): A dataframe of unlabeled data which we will sample from using active learning.
- - [eval](./nbs/evaluation_samples.parquet): A dataframe of labeled data which we will use to evaluate the performance of the model.
+ Be sure to prepare 3 subsets of the dataset:
+ - [Initial samples](./nbs/initial_samples.parquet): A dataframe of labeled images to train an initial model. If you don't have any labeled data, you can label some images yourself.
+ - [Unlabeled samples](./nbs/unlabeled_samples.parquet): A dataframe of *unlabeled* images. We will continuously sample from this set using active learning strategies.
+ - [Evaluation samples](./nbs/evaluation_samples.parquet): A dataframe of *labeled* images. We will use this set to evaluate the performance of the model. This is the test set, DO NOT use it for active learning. Split this out in the beginning.

  As a toy example I created the above 3 datasets from the imagenette dataset.

@@ -84,7 +105,7 @@ uncertain_df = al.sample_uncertain(pred_df, num_samples=10)
  al.label(uncertain_df, output_filename="uncertain")
  ```

- ![Gradio UI](./assets/labeling_ui.png)
+ ![Gradio UI](https://raw.githubusercontent.com/dnth/active-vision/main/assets/labeling_ui.png)

  Once complete, the labeled samples will be saved into a new df.
  We can now add the newly labeled data to the training set.
@@ -103,11 +124,77 @@ Repeat the process until the model is good enough. Use the dataset to train a la
  >
  > But using the dataset of 200+ images, I trained a more capable model (convnext_small_in22k) and got 99.3% accuracy on the evaluation set. See the [notebook](./nbs/05_retrain_larger.ipynb) for more details.

- ## Workflow
- There are two workflows for active learning at the edge that we can use depending on the availability of labeled data.
+
+ ## 📊 Benchmarks
+ This section contains the benchmarks I ran using the active learning loop on various datasets.
+
+ Column description:
+ - `#Labeled Images`: The number of labeled images used to train the model.
+ - `Evaluation Accuracy`: The accuracy of the model on the evaluation set.
+ - `Train Epochs`: The number of epochs used to train the model.
+ - `Model`: The model used for training.
+ - `Active Learning`: Whether active learning was used to train the model.
+ - `Source`: The source of the results.
+
+ ### Imagenette
+ - num classes: 10
+ - num images: 9469
+
+ To start the active learning loop, I labeled 100 images (10 images from each class) and iteratively relabeled the most informative images until I hit 275 labeled images.
+
+ The active learning loop is an iterative process and can keep going until you hit a stopping point. You can decide your own stopping point based on your use case. It could be:
+ - You ran out of data to label.
+ - You hit a performance goal.
+ - You hit a budget.
+ - Other criteria.
+
+ For this dataset, I decided to stop the active learning loop at 275 labeled images because the performance on the evaluation set is close to the top performing model on the leaderboard.
+
+ | #Labeled Images | Evaluation Accuracy | Train Epochs | Model                | Active Learning | Source |
+ |-----------------|---------------------|--------------|----------------------|-----------------|--------|
+ | 9469            | 94.90%              | 80           | xse_resnext50        | ❌              | [Link](https://github.com/fastai/imagenette) |
+ | 9469            | 95.11%              | 200          | xse_resnext50        | ❌              | [Link](https://github.com/fastai/imagenette) |
+ | 275             | 99.33%              | 6            | convnext_small_in22k | ✓               | [Link](https://github.com/dnth/active-vision/blob/main/nbs/05_retrain_larger.ipynb) |
+ | 275             | 93.40%              | 4            | resnet18             | ✓               | [Link](https://github.com/dnth/active-vision/blob/main/nbs/04_relabel_loop.ipynb) |
+
+ ### Dog Food
+ - num classes: 2
+ - num images: 2100
+
+ To start the active learning loop, I labeled 20 images (10 images from each class) and iteratively relabeled the most informative images until I hit 160 labeled images.
+
+ I decided to stop the active learning loop at 160 labeled images because the performance on the evaluation set is close to the top performing model on the leaderboard. You can decide your own stopping point based on your use case.
+
+ | #Labeled Images | Evaluation Accuracy | Train Epochs | Model | Active Learning | Source |
+ |-----------------|---------------------|--------------|-------|-----------------|--------|
+ | 2100 | 99.70% | ? | vit-base-patch16-224 | ❌ | [Link](https://huggingface.co/abhishek/autotrain-dog-vs-food) |
+ | 160 | 100.00% | 6 | convnext_small_in22k | ✓ | [Link](https://github.com/dnth/active-vision/blob/main/nbs/dog_food_dataset/02_train.ipynb) |
+ | 160 | 97.60% | 4 | resnet18 | ✓ | [Link](https://github.com/dnth/active-vision/blob/main/nbs/dog_food_dataset/01_label.ipynb) |
+
+ ### Oxford-IIIT Pet
+ - num classes: 37
+ - num images: 3680
+
+ To start the active learning loop, I labeled 370 images (10 images from each class) and iteratively relabeled the most informative images until I hit 612 labeled images.
+
+ I decided to stop the active learning loop at 612 labeled images because the performance on the evaluation set is close to the top performing model on the leaderboard. You can decide your own stopping point based on your use case.
+
+ | #Labeled Images | Evaluation Accuracy | Train Epochs | Model | Active Learning | Source |
+ |-----------------|---------------------|--------------|-------|-----------------|--------|
+ | 3680 | 95.40% | 5 | vit-base-patch16-224 | ❌ | [Link](https://huggingface.co/walterg777/vit-base-oxford-iiit-pets) |
+ | 612 | 90.26% | 11 | convnext_small_in22k | ✓ | [Link](https://github.com/dnth/active-vision/blob/main/nbs/oxford_iiit_pets/02_train.ipynb) |
+ | 612 | 91.38% | 11 | vit-base-patch16-224 | ✓ | [Link](https://github.com/dnth/active-vision/blob/main/nbs/oxford_iiit_pets/03_train_vit.ipynb) |
+
+
+ ## ➿ Workflow
+ This section describes a more detailed workflow for active learning. There are two workflows for active learning that we can use depending on the availability of labeled data.

  ### With unlabeled data
- If we have no labeled data, we can use active learning to iteratively improve the model and build a labeled dataset.
+ If we have no labeled data, the goal of the active learning loop is to build a reasonably good labeled dataset to train a larger model.
+
+ Steps:

  1. Load a small proxy model.
  2. Label an initial dataset. If there is none, you'll have to label some images.
@@ -139,24 +226,25 @@ graph TD
  ```

  ### With labeled data
- If we have a labeled dataset, we can use active learning to iteratively improve the dataset and the model by fixing the most important label errors.
+ If we already have a labeled dataset, the goal of the active learning loop is to iteratively improve the dataset and the model by fixing the most important label errors.
+
+ Steps:

  1. Load a small proxy model.
  2. Train the proxy model on the labeled dataset.
  3. Run inference on the entire labeled dataset.
- 4. Get the most important label errors with active learning.
+ 4. Get the most impactful label errors with active learning.
  5. Fix the label errors.
  6. Repeat steps 2-5 until the dataset is good enough.
  7. Save the labeled dataset.
  8. Train a larger model on the saved labeled dataset.


-
  ```mermaid
  graph TD
  A[Load a small proxy model] --> B[Train proxy model on labeled dataset]
  B --> C[Run inference on labeled dataset]
- C --> D[Get important label errors using active learning]
+ C --> D[Get label errors using active learning]
  D --> E[Fix label errors]
  E --> F{Dataset good enough?}
  F -->|No| B
@@ -165,6 +253,7 @@ graph TD
  ```


+
  <!-- ## Methodology
  To test out the workflows we will use the [imagenette dataset](https://huggingface.co/datasets/frgfm/imagenette). But this will be applicable to any dataset.

@@ -1,10 +1,11 @@
  [project]
  name = "active-vision"
- version = "0.0.4"
+ version = "0.1.0"
  description = "Active learning for edge vision."
  readme = "README.md"
  requires-python = ">=3.10"
  dependencies = [
+ "accelerate>=1.2.1",
  "datasets>=3.2.0",
  "fastai>=2.7.18",
  "gradio>=5.12.0",
@@ -13,4 +14,6 @@ dependencies = [
  "loguru>=0.7.3",
  "seaborn>=0.13.2",
  "timm>=1.0.13",
+ "transformers>=4.48.0",
+ "xinfer>=0.3.2",
  ]
@@ -0,0 +1,3 @@
+ __version__ = "0.1.0"
+
+ from .core import *
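The new `__init__.py` gives the package a version attribute, which makes for a quick post-upgrade sanity check (assuming the import name is `active_vision`, the underscore form of the distribution name):

```python
# Hypothetical check after `pip install -U active-vision`.
import active_vision  # import name assumed from the new __init__.py

print(active_vision.__version__)  # expected: 0.1.0
```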