active-vision 0.0.4__tar.gz → 0.1.0__tar.gz

@@ -1,10 +1,11 @@
  Metadata-Version: 2.2
  Name: active-vision
- Version: 0.0.4
+ Version: 0.1.0
  Summary: Active learning for edge vision.
  Requires-Python: >=3.10
  Description-Content-Type: text/markdown
  License-File: LICENSE
+ Requires-Dist: accelerate>=1.2.1
  Requires-Dist: datasets>=3.2.0
  Requires-Dist: fastai>=2.7.18
  Requires-Dist: gradio>=5.12.0
@@ -13,6 +14,8 @@ Requires-Dist: ipywidgets>=8.1.5
  Requires-Dist: loguru>=0.7.3
  Requires-Dist: seaborn>=0.13.2
  Requires-Dist: timm>=1.0.13
+ Requires-Dist: transformers>=4.48.0
+ Requires-Dist: xinfer>=0.3.2

  ![Python Version](https://img.shields.io/badge/python-3.10%2B-blue?style=for-the-badge)
  ![License](https://img.shields.io/badge/License-Apache%202.0-green.svg?style=for-the-badge)
@@ -23,17 +26,38 @@ Requires-Dist: timm>=1.0.13
  <img src="https://raw.githubusercontent.com/dnth/active-vision/main/assets/logo.png" alt="active-vision">
  </p>

- Active learning at the edge for computer vision.
+ The goal of this project is to create a framework for the active learning loop for computer vision. The diagram below shows a general workflow of how the active learning loop works.

- The goal of this project is to create a framework for the active learning loop for computer vision deployed on edge devices.
+ <p align="center">
+ <img src="https://raw.githubusercontent.com/dnth/active-vision/main/assets/data_flywheel.gif" alt="active-vision" width="700">
+ </p>

- Supported tasks:
+ ### Supported tasks:
  - [X] Image classification
  - [ ] Object detection
  - [ ] Segmentation

+ ### Supported models:
+ - [X] Fastai models
+ - [X] Torchvision models
+ - [X] Timm models
+ - [ ] Hugging Face models
+
+ ### Supported Active Learning Strategies:
+
+ Uncertainty Sampling:
+ - [X] Least confidence
+ - [ ] Margin of confidence
+ - [ ] Ratio of confidence
+ - [ ] Entropy

- ## Installation
+ Diverse Sampling:
+ - [X] Random sampling
+ - [ ] Model-based outlier
+ - [ ] Cluster-based
+ - [ ] Representative
+
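Of the strategies above, least confidence is the one implemented so far: it ranks unlabeled images by `1 - max(p)` over the predicted class probabilities, so the images the model is least sure about get labeled first. A minimal standalone sketch of the idea in NumPy (illustrative only, not the package's internal implementation):

```python
import numpy as np

def least_confidence_sample(probs: np.ndarray, num_samples: int) -> np.ndarray:
    """Return indices of the num_samples most uncertain predictions.

    probs: (n_images, n_classes) softmax outputs over the unlabeled pool.
    """
    # Least confidence score: 1 minus the probability of the top class.
    scores = 1.0 - probs.max(axis=1)
    # Highest score = least confident = most informative to label next.
    return np.argsort(scores)[::-1][:num_samples]

# Toy example: 4 unlabeled images, 3 classes.
probs = np.array([
    [0.98, 0.01, 0.01],  # confident prediction -> low score
    [0.40, 0.35, 0.25],  # uncertain -> high score
    [0.70, 0.20, 0.10],
    [0.34, 0.33, 0.33],  # nearly uniform -> most uncertain
])
print(least_confidence_sample(probs, num_samples=2))  # [3 1]
```

The unimplemented strategies differ only in the score: margin uses the gap between the top two probabilities, ratio their quotient, and entropy uses `-sum(p * log(p))`.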
+ ## 📦 Installation

  Get a release from PyPI
  ```bash
@@ -58,18 +82,18 @@ uv sync
  Once the virtual environment is created, you can install the package using pip.

  > [!TIP]
- > If you're using uv add a uv before the pip install command to install into your virtual environment. Eg:
+ > If you're using uv, add `uv` before the pip install command to install into your virtual environment. E.g.:
  > ```bash
  > uv pip install active-vision
  > ```

- ## Usage
+ ## 🛠️ Usage
  See the [notebook](./nbs/04_relabel_loop.ipynb) for a complete example.

- Be sure to prepared 3 datasets:
- - [initial_samples](./nbs/initial_samples.parquet): A dataframe of an existing labeled training dataset to seed the training set.
- - [unlabeled](./nbs/unlabeled_samples.parquet): A dataframe of unlabeled data which we will sample from using active learning.
- - [eval](./nbs/evaluation_samples.parquet): A dataframe of labeled data which we will use to evaluate the performance of the model.
+ Be sure to prepare 3 subsets of the dataset:
+ - [Initial samples](./nbs/initial_samples.parquet): A dataframe of labeled images to train an initial model. If you don't have any labeled data, you can label some images yourself.
+ - [Unlabeled samples](./nbs/unlabeled_samples.parquet): A dataframe of *unlabeled* images. We will continuously sample from this set using active learning strategies.
+ - [Evaluation samples](./nbs/evaluation_samples.parquet): A dataframe of *labeled* images. We will use this set to evaluate the performance of the model. This is the test set, DO NOT use it for active learning. Split this out in the beginning.

  As a toy example I created the above 3 datasets from the imagenette dataset.

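A minimal sketch of loading the three subsets with pandas (the file names match the links above; the column name used in the sanity check is an assumption, so check the notebooks for the exact schema):

```python
import pandas as pd

# The three splits described above, as shipped with the example notebooks.
train_df = pd.read_parquet("nbs/initial_samples.parquet")        # labeled seed set
unlabeled_df = pd.read_parquet("nbs/unlabeled_samples.parquet")  # pool to sample from
eval_df = pd.read_parquet("nbs/evaluation_samples.parquet")      # held-out test set

# Keep the evaluation set disjoint from the pool; "filepath" is a
# hypothetical column name standing in for the real image identifier.
assert not set(eval_df["filepath"]) & set(unlabeled_df["filepath"])
```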
@@ -100,7 +124,7 @@ uncertain_df = al.sample_uncertain(pred_df, num_samples=10)
  al.label(uncertain_df, output_filename="uncertain")
  ```

- ![Gradio UI](./assets/labeling_ui.png)
+ ![Gradio UI](https://raw.githubusercontent.com/dnth/active-vision/main/assets/labeling_ui.png)

  Once complete, the labeled samples will be saved into a new df.
  We can now add the newly labeled data to the training set.
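Putting the hunk in context, one iteration of the loop looks roughly like the skeleton below. Only `sample_uncertain()` and `label()` appear verbatim in this diff; `ActiveLearner`, `train()`, `predict()`, and `add_to_train_set()` are placeholders for whatever the package actually exposes, so treat this as a hypothetical sketch and see nbs/04_relabel_loop.ipynb for the real API:

```python
# Hypothetical skeleton of one active learning iteration.
al = ActiveLearner("resnet18")          # placeholder: small proxy model
al.train(train_df)                      # placeholder: fit on labeled seed set
pred_df = al.predict(unlabeled_df)      # placeholder: score the unlabeled pool

# These two calls are quoted from the README snippet above: pick the 10
# most uncertain images and label them in the Gradio UI.
uncertain_df = al.sample_uncertain(pred_df, num_samples=10)
al.label(uncertain_df, output_filename="uncertain")

# Placeholder for the merge step: append the newly labeled rows to the
# training set, then repeat until the model is good enough.
al.add_to_train_set(uncertain_df)
```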
@@ -119,11 +143,77 @@ Repeat the process until the model is good enough. Use the dataset to train a la
  >
  > But using the dataset of 200+ images, I trained a more capable model (convnext_small_in22k) and got 99.3% accuracy on the evaluation set. See the [notebook](./nbs/05_retrain_larger.ipynb) for more details.

- ## Workflow
- There are two workflows for active learning at the edge that we can use depending on the availability of labeled data.
+
+ ## 📊 Benchmarks
+ This section contains the benchmarks I ran using the active learning loop on various datasets.
+
+ Column description:
+ - `#Labeled Images`: The number of labeled images used to train the model.
+ - `Evaluation Accuracy`: The accuracy of the model on the evaluation set.
+ - `Train Epochs`: The number of epochs used to train the model.
+ - `Model`: The model used for training.
+ - `Active Learning`: Whether active learning was used to train the model.
+ - `Source`: The source of the results.
+
+ ### Imagenette
+ - num classes: 10
+ - num images: 9469
+
+ To start the active learning loop, I labeled 100 images (10 images from each class) and iteratively relabeled the most informative images until I hit 275 labeled images.
+
+ The active learning loop is an iterative process and can keep going until you hit a stopping point. You can decide your own stopping point based on your use case. It could be:
+ - You ran out of data to label.
+ - You hit a performance goal.
+ - You hit a budget.
+ - Other criteria.
+
+ For this dataset, I decided to stop the active learning loop at 275 labeled images because the performance on the evaluation set is close to the top performing model on the leaderboard.
+
+ | #Labeled Images | Evaluation Accuracy | Train Epochs | Model                | Active Learning | Source |
+ |-----------------|---------------------|--------------|----------------------|-----------------|--------|
+ | 9469            | 94.90%              | 80           | xse_resnext50        | ❌              | [Link](https://github.com/fastai/imagenette) |
+ | 9469            | 95.11%              | 200          | xse_resnext50        | ❌              | [Link](https://github.com/fastai/imagenette) |
+ | 275             | 99.33%              | 6            | convnext_small_in22k | ✓               | [Link](https://github.com/dnth/active-vision/blob/main/nbs/05_retrain_larger.ipynb) |
+ | 275             | 93.40%              | 4            | resnet18             | ✓               | [Link](https://github.com/dnth/active-vision/blob/main/nbs/04_relabel_loop.ipynb) |
+
+ ### Dog Food
+ - num classes: 2
+ - num images: 2100
+
+ To start the active learning loop, I labeled 20 images (10 images from each class) and iteratively relabeled the most informative images until I hit 160 labeled images.
+
+ I decided to stop the active learning loop at 160 labeled images because the performance on the evaluation set is close to the top performing model on the leaderboard. You can decide your own stopping point based on your use case.
+
+ | #Labeled Images | Evaluation Accuracy | Train Epochs | Model | Active Learning | Source |
+ |-----------------|---------------------|--------------|-------|-----------------|--------|
+ | 2100 | 99.70% | ? | vit-base-patch16-224 | ❌ | [Link](https://huggingface.co/abhishek/autotrain-dog-vs-food) |
+ | 160 | 100.00% | 6 | convnext_small_in22k | ✓ | [Link](https://github.com/dnth/active-vision/blob/main/nbs/dog_food_dataset/02_train.ipynb) |
+ | 160 | 97.60% | 4 | resnet18 | ✓ | [Link](https://github.com/dnth/active-vision/blob/main/nbs/dog_food_dataset/01_label.ipynb) |
+
+ ### Oxford-IIIT Pet
+ - num classes: 37
+ - num images: 3680
+
+ To start the active learning loop, I labeled 370 images (10 images from each class) and iteratively relabeled the most informative images until I hit 612 labeled images.
+
+ I decided to stop the active learning loop at 612 labeled images because the performance on the evaluation set is close to the top performing model on the leaderboard. You can decide your own stopping point based on your use case.
+
+ | #Labeled Images | Evaluation Accuracy | Train Epochs | Model | Active Learning | Source |
+ |-----------------|---------------------|--------------|-------|-----------------|--------|
+ | 3680 | 95.40% | 5 | vit-base-patch16-224 | ❌ | [Link](https://huggingface.co/walterg777/vit-base-oxford-iiit-pets) |
+ | 612 | 90.26% | 11 | convnext_small_in22k | ✓ | [Link](https://github.com/dnth/active-vision/blob/main/nbs/oxford_iiit_pets/02_train.ipynb) |
+ | 612 | 91.38% | 11 | vit-base-patch16-224 | ✓ | [Link](https://github.com/dnth/active-vision/blob/main/nbs/oxford_iiit_pets/03_train_vit.ipynb) |
+
+
+ ## ➿ Workflow
+ This section describes a more detailed workflow for active learning. There are two workflows for active learning that we can use depending on the availability of labeled data.

  ### With unlabeled data
- If we have no labeled data, we can use active learning to iteratively improve the model and build a labeled dataset.
+ If we have no labeled data, the goal of the active learning loop is to build a reasonably good labeled dataset to train a larger model.
+
+ Steps:

  1. Load a small proxy model.
  2. Label an initial dataset. If there is none, you'll have to label some images.
@@ -155,24 +245,25 @@ graph TD
  ```
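Read as code, the loop in the diagram above amounts to the following toy simulation. It is fully self-contained (NumPy only): a random matrix stands in for model predictions and instant "labeling" stands in for the Gradio UI, so it demonstrates the control flow rather than the package API:

```python
import numpy as np

rng = np.random.default_rng(0)
pool = rng.random((100, 10))            # fake scores for 100 unlabeled images
labeled: list[int] = []                 # indices we have "labeled" so far

budget = 30                             # stopping point: a labeling budget
while len(labeled) < budget:
    remaining = [i for i in range(len(pool)) if i not in labeled]
    probs = pool[remaining]
    probs = probs / probs.sum(axis=1, keepdims=True)   # fake softmax
    scores = 1.0 - probs.max(axis=1)                   # least confidence
    batch = [remaining[i] for i in np.argsort(scores)[::-1][:10]]
    labeled.extend(batch)               # a human would label these in the UI

print(f"labeled {len(labeled)} of {len(pool)} images")  # labeled 30 of 100 images
```

In a real run the stopping condition is whatever criterion fits your project: a labeling budget as here, a performance goal on the evaluation set, or simply running out of unlabeled data.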

  ### With labeled data
- If we have a labeled dataset, we can use active learning to iteratively improve the dataset and the model by fixing the most important label errors.
+ If we already have a labeled dataset, the goal of the active learning loop is to iteratively improve the dataset and the model by fixing the most important label errors.
+
+ Steps:

  1. Load a small proxy model.
  2. Train the proxy model on the labeled dataset.
  3. Run inference on the entire labeled dataset.
- 4. Get the most important label errors with active learning.
+ 4. Get the most impactful label errors with active learning.
  5. Fix the label errors.
  6. Repeat steps 2-5 until the dataset is good enough.
  7. Save the labeled dataset.
  8. Train a larger model on the saved labeled dataset.


-
  ```mermaid
  graph TD
  A[Load a small proxy model] --> B[Train proxy model on labeled dataset]
  B --> C[Run inference on labeled dataset]
- C --> D[Get important label errors using active learning]
+ C --> D[Get label errors using active learning]
  D --> E[Fix label errors]
  E --> F{Dataset good enough?}
  F -->|No| B
@@ -181,6 +272,7 @@ graph TD
  ```
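A concrete reading of the "get label errors" step: flag images where the proxy model confidently disagrees with the stored label. A self-contained sketch of that idea (the `label` column name is an assumed schema; this is not the package's implementation):

```python
import numpy as np
import pandas as pd

def flag_label_errors(df: pd.DataFrame, probs: np.ndarray,
                      threshold: float = 0.9) -> pd.DataFrame:
    """Flag rows where the model confidently predicts a different class.

    df: labeled dataset with an integer `label` column (assumed schema).
    probs: (n_rows, n_classes) softmax outputs from the proxy model.
    """
    pred = probs.argmax(axis=1)
    conf = probs.max(axis=1)
    # Confident disagreement with the stored label = likely label error.
    suspects = (pred != df["label"].to_numpy()) & (conf >= threshold)
    out = df.loc[suspects].copy()
    out["predicted"] = pred[suspects]
    out["confidence"] = conf[suspects]
    # Review the most confident disagreements first.
    return out.sort_values("confidence", ascending=False)
```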


+
  <!-- ## Methodology
  To test out the workflows we will use the [imagenette dataset](https://huggingface.co/datasets/frgfm/imagenette). But this will be applicable to any dataset.

@@ -7,17 +7,38 @@
  <img src="https://raw.githubusercontent.com/dnth/active-vision/main/assets/logo.png" alt="active-vision">
  </p>

- Active learning at the edge for computer vision.
+ The goal of this project is to create a framework for the active learning loop for computer vision. The diagram below shows a general workflow of how the active learning loop works.

- The goal of this project is to create a framework for the active learning loop for computer vision deployed on edge devices.
+ <p align="center">
+ <img src="https://raw.githubusercontent.com/dnth/active-vision/main/assets/data_flywheel.gif" alt="active-vision" width="700">
+ </p>

- Supported tasks:
+ ### Supported tasks:
  - [X] Image classification
  - [ ] Object detection
  - [ ] Segmentation

+ ### Supported models:
+ - [X] Fastai models
+ - [X] Torchvision models
+ - [X] Timm models
+ - [ ] Hugging Face models
+
+ ### Supported Active Learning Strategies:
+
+ Uncertainty Sampling:
+ - [X] Least confidence
+ - [ ] Margin of confidence
+ - [ ] Ratio of confidence
+ - [ ] Entropy

- ## Installation
+ Diverse Sampling:
+ - [X] Random sampling
+ - [ ] Model-based outlier
+ - [ ] Cluster-based
+ - [ ] Representative
+
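For reference, the four uncertainty measures listed above have standard definitions (these are the textbook formulas, not something specific to this package). For a softmax output with sorted probabilities over C classes:

```latex
% Standard uncertainty-sampling scores; higher = more informative to label.
% p_(1) >= p_(2) are the two largest predicted class probabilities.
\begin{aligned}
\text{Least confidence:}     \quad & 1 - p_{(1)} \\
\text{Margin of confidence:} \quad & 1 - \left(p_{(1)} - p_{(2)}\right) \\
\text{Ratio of confidence:}  \quad & p_{(2)} / p_{(1)} \\
\text{Entropy:}              \quad & -\sum_{c=1}^{C} p_c \log p_c
\end{aligned}
```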
+ ## 📦 Installation

  Get a release from PyPI
  ```bash
@@ -42,18 +63,18 @@ uv sync
  Once the virtual environment is created, you can install the package using pip.

  > [!TIP]
- > If you're using uv add a uv before the pip install command to install into your virtual environment. Eg:
+ > If you're using uv, add `uv` before the pip install command to install into your virtual environment. E.g.:
  > ```bash
  > uv pip install active-vision
  > ```

- ## Usage
+ ## 🛠️ Usage
  See the [notebook](./nbs/04_relabel_loop.ipynb) for a complete example.

- Be sure to prepared 3 datasets:
- - [initial_samples](./nbs/initial_samples.parquet): A dataframe of an existing labeled training dataset to seed the training set.
- - [unlabeled](./nbs/unlabeled_samples.parquet): A dataframe of unlabeled data which we will sample from using active learning.
- - [eval](./nbs/evaluation_samples.parquet): A dataframe of labeled data which we will use to evaluate the performance of the model.
+ Be sure to prepare 3 subsets of the dataset:
+ - [Initial samples](./nbs/initial_samples.parquet): A dataframe of labeled images to train an initial model. If you don't have any labeled data, you can label some images yourself.
+ - [Unlabeled samples](./nbs/unlabeled_samples.parquet): A dataframe of *unlabeled* images. We will continuously sample from this set using active learning strategies.
+ - [Evaluation samples](./nbs/evaluation_samples.parquet): A dataframe of *labeled* images. We will use this set to evaluate the performance of the model. This is the test set, DO NOT use it for active learning. Split this out in the beginning.

  As a toy example I created the above 3 datasets from the imagenette dataset.

@@ -84,7 +105,7 @@ uncertain_df = al.sample_uncertain(pred_df, num_samples=10)
  al.label(uncertain_df, output_filename="uncertain")
  ```

- ![Gradio UI](./assets/labeling_ui.png)
+ ![Gradio UI](https://raw.githubusercontent.com/dnth/active-vision/main/assets/labeling_ui.png)

  Once complete, the labeled samples will be saved into a new df.
  We can now add the newly labeled data to the training set.
@@ -103,11 +124,77 @@ Repeat the process until the model is good enough. Use the dataset to train a la
  >
  > But using the dataset of 200+ images, I trained a more capable model (convnext_small_in22k) and got 99.3% accuracy on the evaluation set. See the [notebook](./nbs/05_retrain_larger.ipynb) for more details.

- ## Workflow
- There are two workflows for active learning at the edge that we can use depending on the availability of labeled data.
+
+ ## 📊 Benchmarks
+ This section contains the benchmarks I ran using the active learning loop on various datasets.
+
+ Column description:
+ - `#Labeled Images`: The number of labeled images used to train the model.
+ - `Evaluation Accuracy`: The accuracy of the model on the evaluation set.
+ - `Train Epochs`: The number of epochs used to train the model.
+ - `Model`: The model used for training.
+ - `Active Learning`: Whether active learning was used to train the model.
+ - `Source`: The source of the results.
+
+ ### Imagenette
+ - num classes: 10
+ - num images: 9469
+
+ To start the active learning loop, I labeled 100 images (10 images from each class) and iteratively relabeled the most informative images until I hit 275 labeled images.
+
+ The active learning loop is an iterative process and can keep going until you hit a stopping point. You can decide your own stopping point based on your use case. It could be:
+ - You ran out of data to label.
+ - You hit a performance goal.
+ - You hit a budget.
+ - Other criteria.
+
+ For this dataset, I decided to stop the active learning loop at 275 labeled images because the performance on the evaluation set is close to the top performing model on the leaderboard.
+
+ | #Labeled Images | Evaluation Accuracy | Train Epochs | Model                | Active Learning | Source |
+ |-----------------|---------------------|--------------|----------------------|-----------------|--------|
+ | 9469            | 94.90%              | 80           | xse_resnext50        | ❌              | [Link](https://github.com/fastai/imagenette) |
+ | 9469            | 95.11%              | 200          | xse_resnext50        | ❌              | [Link](https://github.com/fastai/imagenette) |
+ | 275             | 99.33%              | 6            | convnext_small_in22k | ✓               | [Link](https://github.com/dnth/active-vision/blob/main/nbs/05_retrain_larger.ipynb) |
+ | 275             | 93.40%              | 4            | resnet18             | ✓               | [Link](https://github.com/dnth/active-vision/blob/main/nbs/04_relabel_loop.ipynb) |
+
+ ### Dog Food
+ - num classes: 2
+ - num images: 2100
+
+ To start the active learning loop, I labeled 20 images (10 images from each class) and iteratively relabeled the most informative images until I hit 160 labeled images.
+
+ I decided to stop the active learning loop at 160 labeled images because the performance on the evaluation set is close to the top performing model on the leaderboard. You can decide your own stopping point based on your use case.
+
+ | #Labeled Images | Evaluation Accuracy | Train Epochs | Model | Active Learning | Source |
+ |-----------------|---------------------|--------------|-------|-----------------|--------|
+ | 2100 | 99.70% | ? | vit-base-patch16-224 | ❌ | [Link](https://huggingface.co/abhishek/autotrain-dog-vs-food) |
+ | 160 | 100.00% | 6 | convnext_small_in22k | ✓ | [Link](https://github.com/dnth/active-vision/blob/main/nbs/dog_food_dataset/02_train.ipynb) |
+ | 160 | 97.60% | 4 | resnet18 | ✓ | [Link](https://github.com/dnth/active-vision/blob/main/nbs/dog_food_dataset/01_label.ipynb) |
+
+ ### Oxford-IIIT Pet
+ - num classes: 37
+ - num images: 3680
+
+ To start the active learning loop, I labeled 370 images (10 images from each class) and iteratively relabeled the most informative images until I hit 612 labeled images.
+
+ I decided to stop the active learning loop at 612 labeled images because the performance on the evaluation set is close to the top performing model on the leaderboard. You can decide your own stopping point based on your use case.
+
+ | #Labeled Images | Evaluation Accuracy | Train Epochs | Model | Active Learning | Source |
+ |-----------------|---------------------|--------------|-------|-----------------|--------|
+ | 3680 | 95.40% | 5 | vit-base-patch16-224 | ❌ | [Link](https://huggingface.co/walterg777/vit-base-oxford-iiit-pets) |
+ | 612 | 90.26% | 11 | convnext_small_in22k | ✓ | [Link](https://github.com/dnth/active-vision/blob/main/nbs/oxford_iiit_pets/02_train.ipynb) |
+ | 612 | 91.38% | 11 | vit-base-patch16-224 | ✓ | [Link](https://github.com/dnth/active-vision/blob/main/nbs/oxford_iiit_pets/03_train_vit.ipynb) |
+
+
+ ## ➿ Workflow
+ This section describes a more detailed workflow for active learning. There are two workflows for active learning that we can use depending on the availability of labeled data.

  ### With unlabeled data
- If we have no labeled data, we can use active learning to iteratively improve the model and build a labeled dataset.
+ If we have no labeled data, the goal of the active learning loop is to build a reasonably good labeled dataset to train a larger model.
+
+ Steps:

  1. Load a small proxy model.
  2. Label an initial dataset. If there is none, you'll have to label some images.
@@ -139,24 +226,25 @@ graph TD
  ```

  ### With labeled data
- If we have a labeled dataset, we can use active learning to iteratively improve the dataset and the model by fixing the most important label errors.
+ If we already have a labeled dataset, the goal of the active learning loop is to iteratively improve the dataset and the model by fixing the most important label errors.
+
+ Steps:

  1. Load a small proxy model.
  2. Train the proxy model on the labeled dataset.
  3. Run inference on the entire labeled dataset.
- 4. Get the most important label errors with active learning.
+ 4. Get the most impactful label errors with active learning.
  5. Fix the label errors.
  6. Repeat steps 2-5 until the dataset is good enough.
  7. Save the labeled dataset.
  8. Train a larger model on the saved labeled dataset.


-
  ```mermaid
  graph TD
  A[Load a small proxy model] --> B[Train proxy model on labeled dataset]
  B --> C[Run inference on labeled dataset]
- C --> D[Get important label errors using active learning]
+ C --> D[Get label errors using active learning]
  D --> E[Fix label errors]
  E --> F{Dataset good enough?}
  F -->|No| B
@@ -165,6 +253,7 @@ graph TD
  ```


+
  <!-- ## Methodology
  To test out the workflows we will use the [imagenette dataset](https://huggingface.co/datasets/frgfm/imagenette). But this will be applicable to any dataset.

@@ -1,10 +1,11 @@
  [project]
  name = "active-vision"
- version = "0.0.4"
+ version = "0.1.0"
  description = "Active learning for edge vision."
  readme = "README.md"
  requires-python = ">=3.10"
  dependencies = [
+ "accelerate>=1.2.1",
  "datasets>=3.2.0",
  "fastai>=2.7.18",
  "gradio>=5.12.0",
@@ -13,4 +14,6 @@ dependencies = [
  "loguru>=0.7.3",
  "seaborn>=0.13.2",
  "timm>=1.0.13",
+ "transformers>=4.48.0",
+ "xinfer>=0.3.2",
  ]
@@ -0,0 +1,3 @@
+ __version__ = "0.1.0"
+
+ from .core import *
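The new `__init__.py` gives the package a version attribute, which makes for a quick post-upgrade sanity check (assuming the import name is `active_vision`, the underscore form of the distribution name):

```python
# Hypothetical check after `pip install -U active-vision`.
import active_vision  # import name assumed from the new __init__.py

print(active_vision.__version__)  # expected: 0.1.0
```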