active-vision 0.0.3__tar.gz → 0.0.5__tar.gz
- {active_vision-0.0.3 → active_vision-0.0.5}/PKG-INFO +135 -29
- {active_vision-0.0.3 → active_vision-0.0.5}/README.md +133 -28
- {active_vision-0.0.3 → active_vision-0.0.5}/pyproject.toml +3 -2
- active_vision-0.0.5/src/active_vision/__init__.py +3 -0
- {active_vision-0.0.3 → active_vision-0.0.5}/src/active_vision/core.py +74 -13
- {active_vision-0.0.3 → active_vision-0.0.5}/src/active_vision.egg-info/PKG-INFO +135 -29
- {active_vision-0.0.3 → active_vision-0.0.5}/src/active_vision.egg-info/requires.txt +1 -0
- active_vision-0.0.3/src/active_vision/__init__.py +0 -3
- {active_vision-0.0.3 → active_vision-0.0.5}/LICENSE +0 -0
- {active_vision-0.0.3 → active_vision-0.0.5}/setup.cfg +0 -0
- {active_vision-0.0.3 → active_vision-0.0.5}/src/active_vision.egg-info/SOURCES.txt +0 -0
- {active_vision-0.0.3 → active_vision-0.0.5}/src/active_vision.egg-info/dependency_links.txt +0 -0
- {active_vision-0.0.3 → active_vision-0.0.5}/src/active_vision.egg-info/top_level.txt +0 -0
{active_vision-0.0.3 → active_vision-0.0.5}/PKG-INFO

@@ -1,6 +1,6 @@
 Metadata-Version: 2.2
 Name: active-vision
-Version: 0.0.3
+Version: 0.0.5
 Summary: Active learning for edge vision.
 Requires-Python: >=3.10
 Description-Content-Type: text/markdown
@@ -12,6 +12,7 @@ Requires-Dist: ipykernel>=6.29.5
 Requires-Dist: ipywidgets>=8.1.5
 Requires-Dist: loguru>=0.7.3
 Requires-Dist: seaborn>=0.13.2
+Requires-Dist: timm>=1.0.13
 
 ![Python Version](https://img.shields.io/badge/python-3.10%2B-blue?style=for-the-badge)
 ![License](https://img.shields.io/badge/License-Apache%202.0-green.svg?style=for-the-badge)
@@ -22,20 +23,38 @@ Requires-Dist: seaborn>=0.13.2
 <img src="https://raw.githubusercontent.com/dnth/active-vision/main/assets/logo.png" alt="active-vision">
 </p>
 
-
+The goal of this project is to create a framework for the active learning loop for computer vision. The diagram below shows a general workflow of how the active learning loop works.
 
-
+<p align="center">
+<img src="https://raw.githubusercontent.com/dnth/active-vision/main/assets/data_flywheel.gif" alt="active-vision" width="700">
+</p>
 
-
-
+### Supported tasks:
+- [X] Image classification
+- [ ] Object detection
+- [ ] Segmentation
 
-
+### Supported models:
+- [X] Fastai models
+- [X] Torchvision models
+- [X] Timm models
+- [ ] Hugging Face models
 
-
-
-
-
-
+### Supported Active Learning Strategies:
+
+Uncertainty Sampling:
+- [X] Least confidence
+- [ ] Margin of confidence
+- [ ] Ratio of confidence
+- [ ] Entropy
+
+Diverse Sampling:
+- [X] Random sampling
+- [ ] Model-based outlier
+- [ ] Cluster-based
+- [ ] Representative
+
+## 📦 Installation
 
 Get a release from PyPI
 ```bash
@@ -49,19 +68,31 @@ cd active-vision
 pip install -e .
 ```
 
+I recommend using [uv](https://docs.astral.sh/uv/) to set up a virtual environment and install the package. You can also use any other virtual environment of your choice.
+
+If you're using uv:
+
+```bash
+uv venv
+uv sync
+```
+Once the virtual environment is created, you can install the package using pip.
+
 > [!TIP]
-> If you're using uv add a uv before the pip install command to install into your virtual environment. Eg:
+> If you're using uv, add `uv` before the pip install command to install into your virtual environment. E.g.:
 > ```bash
 > uv pip install active-vision
 > ```
 
-## Usage
+## 🛠️ Usage
 See the [notebook](./nbs/04_relabel_loop.ipynb) for a complete example.
 
-Be sure to prepared 3
--
--
--
+Be sure to prepare 3 subsets of the dataset:
+- [Initial samples](./nbs/initial_samples.parquet): A dataframe of labeled images to train an initial model. If you don't have any labeled data, you can label some images yourself.
+- [Unlabeled samples](./nbs/unlabeled_samples.parquet): A dataframe of *unlabeled* images. We will continuously sample from this set using active learning strategies.
+- [Evaluation samples](./nbs/evaluation_samples.parquet): A dataframe of *labeled* images. We will use this set to evaluate the performance of the model. This is the test set, DO NOT use it for active learning. Split this out in the beginning.
+
+As a toy example, I created the above 3 datasets from the imagenette dataset.
 
 ```python
 from active_vision import ActiveLearner
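The three subsets are plain dataframes of image paths and labels (core.py in this release reads a `filepath` column and a configurable `label_col`). A minimal sketch of carving them out with pandas, assuming a single starting dataframe saved as `all_samples.parquet` with `filepath` and `label` columns; the file names are illustrative and the 10-per-class seed size mirrors the benchmark write-ups:

```python
import pandas as pd

df = pd.read_parquet("all_samples.parquet")  # assumed: "filepath" and "label" columns

# Split out the labeled evaluation set first; never touch it during active learning.
eval_df = df.dropna(subset=["label"]).sample(frac=0.2, random_state=42)
rest = df.drop(eval_df.index)

# Small labeled seed set: 10 images per class to train the initial proxy model.
initial_df = rest.dropna(subset=["label"]).groupby("label").head(10)

# Everything else becomes the unlabeled pool that active learning samples from.
unlabeled_df = rest.drop(initial_df.index).drop(columns=["label"])

initial_df.to_parquet("initial_samples.parquet")
unlabeled_df.to_parquet("unlabeled_samples.parquet")
eval_df.to_parquet("evaluation_samples.parquet")
```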
@@ -90,7 +121,7 @@ uncertain_df = al.sample_uncertain(pred_df, num_samples=10)
 al.label(uncertain_df, output_filename="uncertain")
 ```
 
-![Gradio UI](
+![Gradio UI](https://raw.githubusercontent.com/dnth/active-vision/main/assets/labeling_ui.png)
 
 Once complete, the labeled samples will be saved into a new df.
 We can now add the newly labeled data to the training set.
@@ -102,17 +133,90 @@ al.add_to_train_set(labeled_df, output_filename="active_labeled")
 
 Repeat the process until the model is good enough. Use the dataset to train a larger model and deploy.
 
-
-
+> [!TIP]
+> For the toy dataset, I got to about 93% accuracy on the evaluation set with 200+ labeled images. The best-performing model on the [leaderboard](https://github.com/fastai/imagenette) got 95.11% accuracy training on all 9469 labeled images.
+>
+> This took me about 6 iterations of relabeling. Each iteration took about 5 minutes to complete, including labeling and model training (resnet18). See the [notebook](./nbs/04_relabel_loop.ipynb) for more details.
+>
+> But using the dataset of 200+ images, I trained a more capable model (convnext_small_in22k) and got 99.3% accuracy on the evaluation set. See the [notebook](./nbs/05_retrain_larger.ipynb) for more details.
+
+
+## 📊 Benchmarks
+This section contains the benchmarks I ran using the active learning loop on various datasets.
+
+Column description:
+- `#Labeled Images`: The number of labeled images used to train the model.
+- `Evaluation Accuracy`: The accuracy of the model on the evaluation set.
+- `Train Epochs`: The number of epochs used to train the model.
+- `Model`: The model used for training.
+- `Active Learning`: Whether active learning was used to train the model.
+- `Source`: The source of the results.
+
+### Imagenette
+- num classes: 10
+- num images: 9469
+
+To start the active learning loop, I labeled 100 images (10 images from each class) and iteratively relabeled the most informative images until I hit 275 labeled images.
+
+The active learning loop is an iterative process and can keep going until you hit a stopping point. You can decide your own stopping point based on your use case. It could be:
+- You ran out of data to label.
+- You hit a performance goal.
+- You hit a budget.
+- Other criteria.
+
+For this dataset, I decided to stop the active learning loop at 275 labeled images because the performance on the evaluation set is close to the top-performing model on the leaderboard.
+
+
+| #Labeled Images | Evaluation Accuracy | Train Epochs | Model | Active Learning | Source |
+|-----------------|---------------------|--------------|----------------------|----------------|--------|
+| 9469 | 94.90% | 80 | xse_resnext50 | ❌ | [Link](https://github.com/fastai/imagenette) |
+| 9469 | 95.11% | 200 | xse_resnext50 | ❌ | [Link](https://github.com/fastai/imagenette) |
+| 275 | 99.33% | 6 | convnext_small_in22k | ✓ | [Link](https://github.com/dnth/active-vision/blob/main/nbs/05_retrain_larger.ipynb) |
+| 275 | 93.40% | 4 | resnet18 | ✓ | [Link](https://github.com/dnth/active-vision/blob/main/nbs/04_relabel_loop.ipynb) |
+
+### Dog Food
+- num classes: 2
+- num images: 2100
+
+To start the active learning loop, I labeled 20 images (10 images from each class) and iteratively relabeled the most informative images until I hit 160 labeled images.
+
+I decided to stop the active learning loop at 160 labeled images because the performance on the evaluation set is close to the top-performing model on the leaderboard. You can decide your own stopping point based on your use case.
+
+| #Labeled Images | Evaluation Accuracy | Train Epochs | Model | Active Learning | Source |
+|-----------------|---------------------|--------------|-------|----------------|--------|
+| 2100 | 99.70% | ? | vit-base-patch16-224 | ❌ | [Link](https://huggingface.co/abhishek/autotrain-dog-vs-food) |
+| 160 | 100.00% | 6 | convnext_small_in22k | ✓ | [Link](https://github.com/dnth/active-vision/blob/main/nbs/dog_food_dataset/02_train.ipynb) |
+| 160 | 97.60% | 4 | resnet18 | ✓ | [Link](https://github.com/dnth/active-vision/blob/main/nbs/dog_food_dataset/01_label.ipynb) |
+
+### Oxford-IIIT Pet
+- num classes: 37
+- num images: 3680
+
+To start the active learning loop, I labeled 370 images (10 images from each class) and iteratively relabeled the most informative images until I hit 612 labeled images.
+
+I decided to stop the active learning loop at 612 labeled images because the performance on the evaluation set is close to the top-performing model on the leaderboard. You can decide your own stopping point based on your use case.
+
+| #Labeled Images | Evaluation Accuracy | Train Epochs | Model | Active Learning | Source |
+|-----------------|---------------------|--------------|-------|----------------|--------|
+| 3680 | 95.40% | 5 | vit-base-patch16-224 | ❌ | [Link](https://huggingface.co/walterg777/vit-base-oxford-iiit-pets) |
+| 612 | 90.26% | 11 | convnext_small_in22k | ✓ | [Link](https://github.com/dnth/active-vision/blob/main/nbs/oxford_iiit_pets/02_train.ipynb) |
+| 612 | 91.38% | 11 | vit-base-patch16-224 | ✓ | [Link](https://github.com/dnth/active-vision/blob/main/nbs/oxford_iiit_pets/03_train_vit.ipynb) |
+
+
+
+## ➿ Workflow
+This section describes a more detailed workflow for active learning. There are two workflows for active learning that we can use depending on the availability of labeled data.
 
 ### With unlabeled data
-If we have no labeled data,
+If we have no labeled data, the goal of the active learning loop is to build a reasonably good labeled dataset to train a larger model.
+
+Steps:
 
 1. Load a small proxy model.
-2. Label an initial dataset.
+2. Label an initial dataset. If there is none, you'll have to label some images.
 3. Train the proxy model on the labeled dataset.
 4. Run inference on the unlabeled dataset.
-5. Evaluate the performance of the proxy model
+5. Evaluate the performance of the proxy model.
 6. Is model good enough?
     - Yes: Save the proxy model and the dataset.
     - No: Select the most informative images to label using active learning.
@@ -138,24 +242,25 @@ graph TD
 ```
 
 ### With labeled data
-If we have a labeled dataset,
+If we already have a labeled dataset, the goal of the active learning loop is to iteratively improve the dataset and the model by fixing the most important label errors.
+
+Steps:
 
 1. Load a small proxy model.
 2. Train the proxy model on the labeled dataset.
 3. Run inference on the entire labeled dataset.
-4. Get the most
+4. Get the most impactful label errors with active learning.
 5. Fix the label errors.
 6. Repeat steps 2-5 until the dataset is good enough.
 7. Save the labeled dataset.
 8. Train a larger model on the saved labeled dataset.
 
 
-
 ```mermaid
 graph TD
     A[Load a small proxy model] --> B[Train proxy model on labeled dataset]
     B --> C[Run inference on labeled dataset]
-    C --> D[Get
+    C --> D[Get label errors using active learning]
     D --> E[Fix label errors]
     E --> F{Dataset good enough?}
     F -->|No| B
@@ -164,7 +269,8 @@ graph TD
 ```
 
 
-
+
+<!-- ## Methodology
 To test out the workflows we will use the [imagenette dataset](https://huggingface.co/datasets/frgfm/imagenette). But this will be applicable to any dataset.
 
 Imagenette is a subset of the ImageNet dataset with 10 classes. We will use this dataset to test out the workflows. Additionally, Imagenette has an existing leaderboard which we can use to evaluate the performance of the models.
@@ -215,4 +321,4 @@ After the first iteration we got 94.57% accuracy on the validation set. See the
 > [!TIP]
 > | Train Epochs | Number of Images | Validation Accuracy | Source |
 > |--------------|-----------------|----------------------|------------------|
-> | 10 | 200 | 94.57% | First relabeling [notebook](./nbs/03_retrain_model.ipynb) |
+> | 10 | 200 | 94.57% | First relabeling [notebook](./nbs/03_retrain_model.ipynb) | -->
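Step 4 of the "With labeled data" workflow above ("Get the most impactful label errors with active learning") is the one step this release leaves abstract. One common concrete ranking, sketched here as an assumption rather than package code, works on a predictions dataframe shaped like the one `sample_uncertain` consumes (a `pred_conf` top-probability column, plus assumed `pred_label` and current `label` columns): surface confident disagreements first.

```python
import pandas as pd

def rank_label_errors(pred_df: pd.DataFrame, num_samples: int) -> pd.DataFrame:
    """Return the num_samples most suspicious rows: the model disagrees with
    the stored label and is confident about its own prediction."""
    disagreements = pred_df[pred_df["pred_label"] != pred_df["label"]]
    return disagreements.sort_values("pred_conf", ascending=False).head(num_samples)
```

The result can go straight into the same `al.label(...)` Gradio UI that handles uncertain samples.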
{active_vision-0.0.3 → active_vision-0.0.5}/README.md

The README.md changes are identical to the README portion of the PKG-INFO diff above: PKG-INFO embeds the README body after its metadata headers, so the same hunks repeat here with shifted line numbers.
{active_vision-0.0.3 → active_vision-0.0.5}/pyproject.toml

@@ -1,6 +1,6 @@
 [project]
 name = "active-vision"
-version = "0.0.3"
+version = "0.0.5"
 description = "Active learning for edge vision."
 readme = "README.md"
 requires-python = ">=3.10"
@@ -12,4 +12,5 @@ dependencies = [
     "ipywidgets>=8.1.5",
     "loguru>=0.7.3",
     "seaborn>=0.13.2",
-
+    "timm>=1.0.13",
+]
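The new `timm` pin lines up with the string branch of `load_model` in the core.py diff below, where a model given as a string is logged as a timm model. A quick sketch for browsing candidate architecture names with timm's own API; the filter pattern and the 10-class head are illustrative:

```python
import timm

# Browse architectures matching a pattern, e.g. the ConvNeXt family from the benchmarks.
print(timm.list_models("convnext*")[:10])

# Instantiate one to sanity-check the name, with a 10-class head for Imagenette.
model = timm.create_model("resnet18", pretrained=False, num_classes=10)
```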
{active_vision-0.0.3 → active_vision-0.0.5}/src/active_vision/core.py

@@ -1,6 +1,5 @@
 import pandas as pd
 from loguru import logger
-from fastai.vision.models import resnet18, resnet34
 from fastai.callback.all import ShowGraphCallback
 from fastai.vision.all import (
     ImageDataLoaders,
@@ -17,6 +16,7 @@ import torch
 import torch.nn.functional as F
 
 import warnings
+from typing import Callable
 
 warnings.filterwarnings("ignore", category=FutureWarning)
 
@@ -25,13 +25,14 @@ class ActiveLearner:
     def __init__(self, model_name: str):
         self.model = self.load_model(model_name)
 
-    def load_model(self, model_name: str):
-
-
-
-
-
+    def load_model(self, model_name: str | Callable):
+        if isinstance(model_name, Callable):
+            logger.info(f"Loading fastai model {model_name.__name__}")
+            return model_name
+
+        if isinstance(model_name, str):
+            logger.info(f"Loading timm model {model_name}")
+            return model_name
 
     def load_dataset(
         self,
@@ -41,6 +42,7 @@ class ActiveLearner:
         valid_pct: float = 0.2,
         batch_size: int = 16,
         image_size: int = 224,
+        batch_tfms: Callable = None,
     ):
         logger.info(f"Loading dataset from {filepath_col} and {label_col}")
         self.train_set = df.copy()
@@ -54,13 +56,16 @@ class ActiveLearner:
             label_col=label_col,
             bs=batch_size,
             item_tfms=Resize(image_size),
-            batch_tfms=
+            batch_tfms=batch_tfms,
         )
         logger.info("Creating learner")
         self.learn = vision_learner(self.dls, self.model, metrics=accuracy).to_fp16()
         self.class_names = self.dls.vocab
         logger.info("Done. Ready to train.")
 
+    def show_batch(self):
+        self.dls.show_batch()
+
     def lr_find(self):
         logger.info("Finding optimal learning rate")
         self.lrs = self.learn.lr_find(suggest_funcs=(minimum, steep, valley, slide))
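These two additions pair up: `batch_tfms` forwards batch-level augmentations through to `ImageDataLoaders.from_df`, and the new `show_batch` gives a quick visual check that they look sane. A sketch of a call site, assuming the keyword names match the logger line above and that a parquet file with `filepath`/`label` columns exists; `aug_transforms` is fastai's stock augmentation factory:

```python
import pandas as pd
from fastai.vision.all import aug_transforms
from active_vision import ActiveLearner

df = pd.read_parquet("initial_samples.parquet")  # assumed columns: filepath, label

al = ActiveLearner(model_name="resnet18")
al.load_dataset(
    df,
    filepath_col="filepath",
    label_col="label",
    batch_size=16,
    image_size=224,
    batch_tfms=aug_transforms(),  # flips, rotation, zoom, lighting changes by default
)
al.show_batch()  # eyeball the augmented batches
al.lr_find()     # then pick a learning rate
```

One wrinkle: `aug_transforms()` returns a list of transforms rather than a single callable; fastai accepts that for `batch_tfms`, even though the new parameter is annotated `Callable`.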
@@ -112,13 +117,69 @@ class ActiveLearner:
         logger.info(f"Accuracy: {accuracy:.2%}")
         return accuracy
 
-    def sample_uncertain(
+    def sample_uncertain(
+        self, df: pd.DataFrame, num_samples: int, strategy: str = "least-confidence"
+    ):
         """
         Sample top `num_samples` low confidence samples. Returns a df with filepaths and predicted labels, and confidence scores.
+
+        Strategies:
+        - least-confidence: Get top `num_samples` low confidence samples.
+        - margin-of-confidence: Get top `num_samples` samples with the smallest margin between the top two predictions.
+        - ratio-of-confidence: Get top `num_samples` samples with the highest ratio between the top two predictions.
+        - entropy: Get top `num_samples` samples with the highest entropy.
         """
-
-
-
+
+        # Remove samples that are already in the training set
+        df = df[~df["filepath"].isin(self.train_set["filepath"])]
+
+        if strategy == "least-confidence":
+            logger.info(f"Getting top {num_samples} low confidence samples")
+            uncertain_df = df.sort_values(by="pred_conf", ascending=True).head(
+                num_samples
+            )
+            return uncertain_df
+
+        # TODO: Implement margin of confidence strategy
+        elif strategy == "margin-of-confidence":
+            logger.error("Margin of confidence strategy not implemented")
+            raise NotImplementedError("Margin of confidence strategy not implemented")
+
+        # TODO: Implement ratio of confidence strategy
+        elif strategy == "ratio-of-confidence":
+            logger.error("Ratio of confidence strategy not implemented")
+            raise NotImplementedError("Ratio of confidence strategy not implemented")
+
+        # TODO: Implement entropy strategy
+        elif strategy == "entropy":
+            logger.error("Entropy strategy not implemented")
+            raise NotImplementedError("Entropy strategy not implemented")
+
+        else:
+            logger.error(f"Unknown strategy: {strategy}")
+            raise ValueError(f"Unknown strategy: {strategy}")
+
+    def sample_diverse(self, df: pd.DataFrame, num_samples: int):
+        """
+        Sample top `num_samples` diverse samples. Returns a df with filepaths and predicted labels, and confidence scores.
+
+        Strategies:
+        - model-based-outlier: Get top `num_samples` samples with lowest activation of the model's last layer.
+        - cluster-based: Get top `num_samples` samples with the highest distance to the nearest neighbor.
+        - representative: Get top `num_samples` samples with the highest distance to the centroid of the training set.
+        """
+        logger.error("Diverse sampling strategy not implemented")
+        raise NotImplementedError("Diverse sampling strategy not implemented")
+
+    def sample_random(self, df: pd.DataFrame, num_samples: int, seed: int = None):
+        """
+        Sample `num_samples` random samples. Returns a df with filepaths and predicted labels, and confidence scores.
+        """
+
+        logger.info(f"Sampling {num_samples} random samples")
+        if seed is not None:
+            logger.info(f"Using seed: {seed}")
+        return df.sample(n=num_samples, random_state=seed)
 
     def label(self, df: pd.DataFrame, output_filename: str = "labeled"):
         """
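The three `NotImplementedError` branches correspond to standard formulas over the full per-class probability vector; `pred_conf` (just the top probability) is only enough for least-confidence. As a reference, here is a self-contained sketch of the textbook scores the TODOs describe, not code from the package; `probs` is an assumed (n_samples, n_classes) array of softmax outputs:

```python
import numpy as np

def uncertainty_scores(probs: np.ndarray) -> dict[str, np.ndarray]:
    """probs: (n_samples, n_classes) softmax outputs.
    Higher score means more uncertain, for every strategy."""
    top2 = np.sort(probs, axis=1)[:, -2:]  # two largest probabilities per row
    first, second = top2[:, 1], top2[:, 0]
    return {
        "least-confidence": 1.0 - first,                 # low top probability
        "margin-of-confidence": 1.0 - (first - second),  # small gap between top two
        "ratio-of-confidence": second / first,           # ratio close to 1
        "entropy": -(probs * np.log(probs + 1e-12)).sum(axis=1),
    }

# Example: the 10 most uncertain of 100 fake predictions, by entropy.
probs = np.random.dirichlet(np.ones(10), size=100)
top10 = np.argsort(uncertainty_scores(probs)["entropy"])[::-1][:10]
```

Whichever score is used, the sampling step stays the same shape as the implemented branch: rank descending by the score and take the top `num_samples`.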
{active_vision-0.0.3 → active_vision-0.0.5}/src/active_vision.egg-info/PKG-INFO

The egg-info copy of PKG-INFO carries exactly the same changes as the PKG-INFO diff at the top of this page; setuptools writes the same metadata file to both locations.
File without changes: LICENSE
File without changes: setup.cfg
File without changes: src/active_vision.egg-info/SOURCES.txt
File without changes: src/active_vision.egg-info/dependency_links.txt
File without changes: src/active_vision.egg-info/top_level.txt