active-vision 0.0.2__tar.gz → 0.0.3__tar.gz

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
@@ -1,12 +1,13 @@
 Metadata-Version: 2.2
 Name: active-vision
-Version: 0.0.2
+Version: 0.0.3
 Summary: Active learning for edge vision.
 Requires-Python: >=3.10
 Description-Content-Type: text/markdown
 License-File: LICENSE
 Requires-Dist: datasets>=3.2.0
 Requires-Dist: fastai>=2.7.18
+Requires-Dist: gradio>=5.12.0
 Requires-Dist: ipykernel>=6.29.5
 Requires-Dist: ipywidgets>=8.1.5
 Requires-Dist: loguru>=0.7.3
@@ -14,40 +15,53 @@ Requires-Dist: seaborn>=0.13.2
 
 ![Python Version](https://img.shields.io/badge/python-3.10%2B-blue?style=for-the-badge)
 ![License](https://img.shields.io/badge/License-Apache%202.0-green.svg?style=for-the-badge)
-![PyPI](https://img.shields.io/pypi/v/active-vision?style=for-the-badge)
+[![PyPI](https://img.shields.io/pypi/v/active-vision?style=for-the-badge)](https://pypi.org/project/active-vision/)
 ![Downloads](https://img.shields.io/pepy/dt/active-vision?style=for-the-badge&logo=pypi&logoColor=white&label=Downloads&color=purple)
 
 <p align="center">
-<img src="https://github.com/dnth/active-vision/blob/main/assets/logo.png" alt="active-vision">
+<img src="https://raw.githubusercontent.com/dnth/active-vision/main/assets/logo.png" alt="active-vision">
 </p>
 
 Active learning at the edge for computer vision.
 
-The goal of this project is to create a framework for active learning at the edge for computer vision. We should be able to train a model on a small dataset and then use active learning to iteratively improve the model all on a local machine.
+The goal of this project is to create a framework for the active learning loop for computer vision, deployed on edge devices.
 
-## Tech Stack
+## Installation
+I recommend using [uv](https://docs.astral.sh/uv/) to set up a virtual environment and install the package. You can also use any other virtual environment tool of your choice.
 
-- Training framework: fastai
-- User interface: streamlit
-- Database: sqlite
-- Experiment tracking: wandb
+If you're using uv:
 
-## Installation
+```bash
+uv venv
+uv sync
+```
+Once the virtual environment is created, you can install the package using pip.
 
-PyPI
+Get a release from PyPI
 ```bash
 pip install active-vision
 ```
 
-Local install
+Install from source
 ```bash
 git clone https://github.com/dnth/active-vision.git
 cd active-vision
 pip install -e .
 ```
 
+> [!TIP]
+> If you're using uv, add `uv` before the pip command to install into your virtual environment, e.g.:
+> ```bash
+> uv pip install active-vision
+> ```
+
 ## Usage
-See the [notebook](./nbs/end-to-end.ipynb) for a complete example.
+See the [notebook](./nbs/04_relabel_loop.ipynb) for a complete example.
+
+Be sure to prepare 3 datasets:
+- train: A dataframe of an existing labeled training dataset.
+- unlabeled: A dataframe of unlabeled data which we will sample from using active learning.
+- eval: A dataframe of labeled data which we will use to evaluate the performance of the model. (Optional)
 
 ```python
 from active_vision import ActiveLearner
@@ -56,29 +70,38 @@ import pandas as pd
 # Create an active learner instance with a model
 al = ActiveLearner("resnet18")
 
-# Load the dataset into the active learner
+# Load dataset
 train_df = pd.read_parquet("training_samples.parquet")
-al.load_dataset(train_df, "filepath", "label")
+al.load_dataset(train_df, filepath_col="filepath", label_col="label")
 
-# Train the model
+# Train model
 al.train(epochs=3, lr=1e-3)
 
-# Load evaluation data
-eval_df = pd.read_parquet("evaluation_samples.parquet")
+# Evaluate the model on a *labeled* evaluation set
+accuracy = al.evaluate(eval_df, filepath_col="filepath", label_col="label")
 
-# Evaluate the model on a labeled evaluation set
-accuracy = al.evaluate(eval_df, "filepath", "label")
-
-# Get predictions from an unlabeled set
+# Get predictions from an *unlabeled* set
 pred_df = al.predict(filepaths)
 
-# Sample low confidence predictions
+# Sample low confidence predictions from unlabeled set
 uncertain_df = al.sample_uncertain(pred_df, num_samples=10)
 
-# Add newly labeled data to training set
-al.add_to_train_set(uncertain_df)
+# Launch a Gradio UI to label the low confidence samples
+al.label(uncertain_df, output_filename="uncertain")
 ```
 
+![Gradio UI](./assets/labeling_ui.png)
+
+Once complete, the labeled samples will be saved into a new dataframe.
+We can now add the newly labeled data to the training set.
+
+```python
+# Add newly labeled data to training set and save as a new file active_labeled
+al.add_to_train_set(labeled_df, output_filename="active_labeled")
+```
+
+Repeat the process until the model is good enough. Use the dataset to train a larger model and deploy.
+
 ## Workflow
 There are two workflows for active learning at the edge that we can use depending on the availability of labeled data.
 
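The usage snippet above assumes `train_df`, `eval_df`, and `filepaths` already exist. A minimal sketch of that preparation, with hypothetical folder layout and filenames (only the `filepath`/`label` column names and the parquet format match what `load_dataset` and `evaluate` expect):

```python
# Hypothetical data preparation for the three datasets the README assumes.
# Paths and parquet filenames are made up; the "filepath" and "label"
# columns are what ActiveLearner.load_dataset / evaluate expect.
from pathlib import Path

import pandas as pd

# Labeled training data: one row per image with its class label.
train_df = pd.DataFrame(
    {
        "filepath": ["data/labeled/cat_001.jpg", "data/labeled/dog_001.jpg"],
        "label": ["cat", "dog"],
    }
)
train_df.to_parquet("training_samples.parquet")

# Unlabeled pool: a plain list of filepaths; active learning samples from these.
filepaths = [str(p) for p in Path("data/unlabeled").glob("*.jpg")]

# Optional held-out evaluation set, labeled the same way as train_df.
eval_df = pd.DataFrame(
    {
        "filepath": ["data/eval/cat_100.jpg", "data/eval/dog_100.jpg"],
        "label": ["cat", "dog"],
    }
)
```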
@@ -1,39 +1,52 @@
 ![Python Version](https://img.shields.io/badge/python-3.10%2B-blue?style=for-the-badge)
 ![License](https://img.shields.io/badge/License-Apache%202.0-green.svg?style=for-the-badge)
-![PyPI](https://img.shields.io/pypi/v/active-vision?style=for-the-badge)
+[![PyPI](https://img.shields.io/pypi/v/active-vision?style=for-the-badge)](https://pypi.org/project/active-vision/)
 ![Downloads](https://img.shields.io/pepy/dt/active-vision?style=for-the-badge&logo=pypi&logoColor=white&label=Downloads&color=purple)
 
 <p align="center">
-<img src="https://github.com/dnth/active-vision/blob/main/assets/logo.png" alt="active-vision">
+<img src="https://raw.githubusercontent.com/dnth/active-vision/main/assets/logo.png" alt="active-vision">
 </p>
 
 Active learning at the edge for computer vision.
 
-The goal of this project is to create a framework for active learning at the edge for computer vision. We should be able to train a model on a small dataset and then use active learning to iteratively improve the model all on a local machine.
+The goal of this project is to create a framework for the active learning loop for computer vision, deployed on edge devices.
 
-## Tech Stack
+## Installation
+I recommend using [uv](https://docs.astral.sh/uv/) to set up a virtual environment and install the package. You can also use any other virtual environment tool of your choice.
 
-- Training framework: fastai
-- User interface: streamlit
-- Database: sqlite
-- Experiment tracking: wandb
+If you're using uv:
 
-## Installation
+```bash
+uv venv
+uv sync
+```
+Once the virtual environment is created, you can install the package using pip.
 
-PyPI
+Get a release from PyPI
 ```bash
 pip install active-vision
 ```
 
-Local install
+Install from source
 ```bash
 git clone https://github.com/dnth/active-vision.git
 cd active-vision
 pip install -e .
 ```
 
+> [!TIP]
+> If you're using uv, add `uv` before the pip command to install into your virtual environment, e.g.:
+> ```bash
+> uv pip install active-vision
+> ```
+
 ## Usage
-See the [notebook](./nbs/end-to-end.ipynb) for a complete example.
+See the [notebook](./nbs/04_relabel_loop.ipynb) for a complete example.
+
+Be sure to prepare 3 datasets:
+- train: A dataframe of an existing labeled training dataset.
+- unlabeled: A dataframe of unlabeled data which we will sample from using active learning.
+- eval: A dataframe of labeled data which we will use to evaluate the performance of the model. (Optional)
 
 ```python
 from active_vision import ActiveLearner
@@ -42,29 +55,38 @@ import pandas as pd
 # Create an active learner instance with a model
 al = ActiveLearner("resnet18")
 
-# Load the dataset into the active learner
+# Load dataset
 train_df = pd.read_parquet("training_samples.parquet")
-al.load_dataset(train_df, "filepath", "label")
+al.load_dataset(train_df, filepath_col="filepath", label_col="label")
 
-# Train the model
+# Train model
 al.train(epochs=3, lr=1e-3)
 
-# Load evaluation data
-eval_df = pd.read_parquet("evaluation_samples.parquet")
+# Evaluate the model on a *labeled* evaluation set
+accuracy = al.evaluate(eval_df, filepath_col="filepath", label_col="label")
 
-# Evaluate the model on a labeled evaluation set
-accuracy = al.evaluate(eval_df, "filepath", "label")
-
-# Get predictions from an unlabeled set
+# Get predictions from an *unlabeled* set
 pred_df = al.predict(filepaths)
 
-# Sample low confidence predictions
+# Sample low confidence predictions from unlabeled set
 uncertain_df = al.sample_uncertain(pred_df, num_samples=10)
 
-# Add newly labeled data to training set
-al.add_to_train_set(uncertain_df)
+# Launch a Gradio UI to label the low confidence samples
+al.label(uncertain_df, output_filename="uncertain")
 ```
 
+![Gradio UI](./assets/labeling_ui.png)
+
+Once complete, the labeled samples will be saved into a new dataframe.
+We can now add the newly labeled data to the training set.
+
+```python
+# Add newly labeled data to training set and save as a new file active_labeled
+al.add_to_train_set(labeled_df, output_filename="active_labeled")
+```
+
+Repeat the process until the model is good enough. Use the dataset to train a larger model and deploy.
+
 ## Workflow
 There are two workflows for active learning at the edge that we can use depending on the availability of labeled data.
 
@@ -1,14 +1,15 @@
 [project]
 name = "active-vision"
-version = "0.0.2"
+version = "0.0.3"
 description = "Active learning for edge vision."
 readme = "README.md"
 requires-python = ">=3.10"
 dependencies = [
     "datasets>=3.2.0",
     "fastai>=2.7.18",
+    "gradio>=5.12.0",
     "ipykernel>=6.29.5",
     "ipywidgets>=8.1.5",
     "loguru>=0.7.3",
     "seaborn>=0.13.2",
-]
+]
@@ -0,0 +1,3 @@
+__version__ = "0.0.3"
+
+from .core import *
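The new `__init__.py` pins the package version and re-exports the public API from `core`. A quick post-install sanity check might look like this (assuming 0.0.3 is installed in the current environment):

```python
# Hypothetical sanity check after installing active-vision 0.0.3.
import active_vision

print(active_vision.__version__)    # expected: "0.0.3"
print(active_vision.ActiveLearner)  # picked up via `from .core import *`
```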
@@ -0,0 +1,291 @@
+import pandas as pd
+from loguru import logger
+from fastai.vision.models import resnet18, resnet34
+from fastai.callback.all import ShowGraphCallback
+from fastai.vision.all import (
+    ImageDataLoaders,
+    aug_transforms,
+    Resize,
+    vision_learner,
+    accuracy,
+    valley,
+    slide,
+    minimum,
+    steep,
+)
+import torch
+import torch.nn.functional as F
+
+import warnings
+
+warnings.filterwarnings("ignore", category=FutureWarning)
+
+
+class ActiveLearner:
+    def __init__(self, model_name: str):
+        self.model = self.load_model(model_name)
+
+    def load_model(self, model_name: str):
+        models = {"resnet18": resnet18, "resnet34": resnet34}
+        logger.info(f"Loading model {model_name}")
+        if model_name not in models:
+            logger.error(f"Model {model_name} not found")
+            raise ValueError(f"Model {model_name} not found")
+        return models[model_name]
+
+    def load_dataset(
+        self,
+        df: pd.DataFrame,
+        filepath_col: str,
+        label_col: str,
+        valid_pct: float = 0.2,
+        batch_size: int = 16,
+        image_size: int = 224,
+    ):
+        logger.info(f"Loading dataset from {filepath_col} and {label_col}")
+        self.train_set = df.copy()
+
+        logger.info("Creating dataloaders")
+        self.dls = ImageDataLoaders.from_df(
+            df,
+            path=".",
+            valid_pct=valid_pct,
+            fn_col=filepath_col,
+            label_col=label_col,
+            bs=batch_size,
+            item_tfms=Resize(image_size),
+            batch_tfms=aug_transforms(size=image_size, min_scale=0.75),
+        )
+        logger.info("Creating learner")
+        self.learn = vision_learner(self.dls, self.model, metrics=accuracy).to_fp16()
+        self.class_names = self.dls.vocab
+        logger.info("Done. Ready to train.")
+
+    def lr_find(self):
+        logger.info("Finding optimal learning rate")
+        self.lrs = self.learn.lr_find(suggest_funcs=(minimum, steep, valley, slide))
+        logger.info(f"Optimal learning rate: {self.lrs.valley}")
+
+    def train(self, epochs: int, lr: float):
+        logger.info(f"Training for {epochs} epochs with learning rate: {lr}")
+        self.learn.fine_tune(epochs, lr, cbs=[ShowGraphCallback()])
+
+    def predict(self, filepaths: list[str], batch_size: int = 16):
+        """
+        Run inference on an unlabeled dataset. Returns a df with filepaths and predicted labels, and confidence scores.
+        """
+        logger.info(f"Running inference on {len(filepaths)} samples")
+        test_dl = self.dls.test_dl(filepaths, bs=batch_size)
+        preds, _, cls_preds = self.learn.get_preds(dl=test_dl, with_decoded=True)
+
+        self.pred_df = pd.DataFrame(
+            {
+                "filepath": filepaths,
+                "pred_label": [self.learn.dls.vocab[i] for i in cls_preds.numpy()],
+                "pred_conf": torch.max(F.softmax(preds, dim=1), dim=1)[0].numpy(),
+            }
+        )
+        return self.pred_df
+
+    def evaluate(
+        self, df: pd.DataFrame, filepath_col: str, label_col: str, batch_size: int = 16
+    ):
+        """
+        Evaluate on a labeled dataset. Returns a score.
+        """
+        self.eval_set = df.copy()
+
+        filepaths = self.eval_set[filepath_col].tolist()
+        labels = self.eval_set[label_col].tolist()
+        test_dl = self.dls.test_dl(filepaths, bs=batch_size)
+        preds, _, cls_preds = self.learn.get_preds(dl=test_dl, with_decoded=True)
+
+        self.eval_df = pd.DataFrame(
+            {
+                "filepath": filepaths,
+                "label": labels,
+                "pred_label": [self.learn.dls.vocab[i] for i in cls_preds.numpy()],
+            }
+        )
+
+        accuracy = float((self.eval_df["label"] == self.eval_df["pred_label"]).mean())
+        logger.info(f"Accuracy: {accuracy:.2%}")
+        return accuracy
+
+    def sample_uncertain(self, df: pd.DataFrame, num_samples: int):
+        """
+        Sample top `num_samples` low confidence samples. Returns a df with filepaths and predicted labels, and confidence scores.
+        """
+        logger.info(f"Getting top {num_samples} low confidence samples")
+        uncertain_df = df.sort_values(by="pred_conf", ascending=True).head(num_samples)
+        return uncertain_df
+
+    def label(self, df: pd.DataFrame, output_filename: str = "labeled"):
+        """
+        Launch a labeling interface for the user to label the samples.
+        Input is a df with filepaths listing the files to be labeled. Output is a df with filepaths and labels.
+        """
+        import gradio as gr
+
+        shortcut_js = """
+        <script>
+        function shortcuts(e) {
+            // Only block shortcuts if we're in a text input or textarea
+            if (e.target.tagName.toLowerCase() === "textarea" ||
+                (e.target.tagName.toLowerCase() === "input" && e.target.type.toLowerCase() === "text")) {
+                return;
+            }
+
+            if (e.key.toLowerCase() == "w") {
+                document.getElementById("submit_btn").click();
+            } else if (e.key.toLowerCase() == "d") {
+                document.getElementById("next_btn").click();
+            } else if (e.key.toLowerCase() == "a") {
+                document.getElementById("back_btn").click();
+            }
+        }
+        document.addEventListener('keypress', shortcuts, false);
+        </script>
+        """
+
+        logger.info(f"Launching labeling interface for {len(df)} samples")
+
+        filepaths = df["filepath"].tolist()
+
+        with gr.Blocks(head=shortcut_js) as demo:
+            current_index = gr.State(value=0)
+
+            filename = gr.Textbox(
+                label="Filename", value=filepaths[0], interactive=False
+            )
+
+            image = gr.Image(
+                type="filepath", label="Image", value=filepaths[0], height=500
+            )
+            category = gr.Radio(choices=self.class_names, label="Select Category")
+
+            with gr.Row():
+                back_btn = gr.Button("← Previous (A)", elem_id="back_btn")
+                submit_btn = gr.Button(
+                    "Submit (W)",
+                    variant="primary",
+                    elem_id="submit_btn",
+                    interactive=False,
+                )
+                next_btn = gr.Button("Next → (D)", elem_id="next_btn")
+
+            progress = gr.Slider(
+                minimum=0,
+                maximum=len(filepaths) - 1,
+                value=0,
+                label="Progress",
+                interactive=False,
+            )
+
+            finish_btn = gr.Button("Finish Labeling", variant="primary")
+
+            def update_submit_btn(choice):
+                return gr.Button(interactive=choice is not None)
+
+            category.change(
+                fn=update_submit_btn, inputs=[category], outputs=[submit_btn]
+            )
+
+            def navigate(current_idx, direction):
+                next_idx = current_idx + direction
+                if 0 <= next_idx < len(filepaths):
+                    return filepaths[next_idx], filepaths[next_idx], next_idx, next_idx
+                return (
+                    filepaths[current_idx],
+                    filepaths[current_idx],
+                    current_idx,
+                    current_idx,
+                )
+
+            def save_and_next(current_idx, selected_category):
+                if selected_category is None:
+                    return (
+                        filepaths[current_idx],
+                        filepaths[current_idx],
+                        current_idx,
+                        current_idx,
+                    )
+
+                # Save the current annotation
+                with open(f"{output_filename}.csv", "a") as f:
+                    f.write(f"{filepaths[current_idx]},{selected_category}\n")
+
+                # Move to next image if not at the end
+                next_idx = current_idx + 1
+                if next_idx >= len(filepaths):
+                    return (
+                        filepaths[current_idx],
+                        filepaths[current_idx],
+                        current_idx,
+                        current_idx,
+                    )
+                return filepaths[next_idx], filepaths[next_idx], next_idx, next_idx
+
+            def convert_csv_to_parquet():
+                try:
+                    df = pd.read_csv(f"{output_filename}.csv", header=None)
+                    df.columns = ["filepath", "label"]
+                    df = df.drop_duplicates(subset=["filepath"], keep="last")
+                    df.to_parquet(f"{output_filename}.parquet")
+                    gr.Info(f"Annotation saved to {output_filename}.parquet")
+                except Exception as e:
+                    logger.error(e)
+                    return
+
+            back_btn.click(
+                fn=lambda idx: navigate(idx, -1),
+                inputs=[current_index],
+                outputs=[filename, image, current_index, progress],
+            )
+
+            next_btn.click(
+                fn=lambda idx: navigate(idx, 1),
+                inputs=[current_index],
+                outputs=[filename, image, current_index, progress],
+            )
+
+            submit_btn.click(
+                fn=save_and_next,
+                inputs=[current_index, category],
+                outputs=[filename, image, current_index, progress],
+            )
+
+            finish_btn.click(fn=convert_csv_to_parquet)
+
+        demo.launch(height=1000)
+
+    def add_to_train_set(self, df: pd.DataFrame, output_filename: str):
+        """
+        Add samples to the training set.
+        """
+        new_train_set = df.copy()
+        # new_train_set.drop(columns=["pred_conf"], inplace=True)
+        # new_train_set.rename(columns={"pred_label": "label"}, inplace=True)
+
+        # len_old = len(self.train_set)
+
+        logger.info(f"Adding {len(new_train_set)} samples to training set")
+        self.train_set = pd.concat([self.train_set, new_train_set])
+
+        self.train_set = self.train_set.drop_duplicates(
+            subset=["filepath"], keep="last"
+        )
+        self.train_set.reset_index(drop=True, inplace=True)
+
+        self.train_set.to_parquet(f"{output_filename}.parquet")
+        logger.info(f"Saved training set to {output_filename}.parquet")
+
+        # if len(self.train_set) == len_old:
+        #     logger.warning("No new samples added to training set")
+
+        # elif len_old + len(new_train_set) < len(self.train_set):
+        #     logger.warning("Some samples were duplicates and removed from training set")
+
+        # else:
+        #     logger.info("All new samples added to training set")
+        # logger.info(f"Training set now has {len(self.train_set)} samples")
@@ -1,12 +1,13 @@
 Metadata-Version: 2.2
 Name: active-vision
-Version: 0.0.2
+Version: 0.0.3
 Summary: Active learning for edge vision.
 Requires-Python: >=3.10
 Description-Content-Type: text/markdown
 License-File: LICENSE
 Requires-Dist: datasets>=3.2.0
 Requires-Dist: fastai>=2.7.18
+Requires-Dist: gradio>=5.12.0
 Requires-Dist: ipykernel>=6.29.5
 Requires-Dist: ipywidgets>=8.1.5
 Requires-Dist: loguru>=0.7.3
@@ -14,40 +15,53 @@ Requires-Dist: seaborn>=0.13.2
 
 ![Python Version](https://img.shields.io/badge/python-3.10%2B-blue?style=for-the-badge)
 ![License](https://img.shields.io/badge/License-Apache%202.0-green.svg?style=for-the-badge)
-![PyPI](https://img.shields.io/pypi/v/active-vision?style=for-the-badge)
+[![PyPI](https://img.shields.io/pypi/v/active-vision?style=for-the-badge)](https://pypi.org/project/active-vision/)
 ![Downloads](https://img.shields.io/pepy/dt/active-vision?style=for-the-badge&logo=pypi&logoColor=white&label=Downloads&color=purple)
 
 <p align="center">
-<img src="https://github.com/dnth/active-vision/blob/main/assets/logo.png" alt="active-vision">
+<img src="https://raw.githubusercontent.com/dnth/active-vision/main/assets/logo.png" alt="active-vision">
 </p>
 
 Active learning at the edge for computer vision.
 
-The goal of this project is to create a framework for active learning at the edge for computer vision. We should be able to train a model on a small dataset and then use active learning to iteratively improve the model all on a local machine.
+The goal of this project is to create a framework for the active learning loop for computer vision, deployed on edge devices.
 
-## Tech Stack
+## Installation
+I recommend using [uv](https://docs.astral.sh/uv/) to set up a virtual environment and install the package. You can also use any other virtual environment tool of your choice.
 
-- Training framework: fastai
-- User interface: streamlit
-- Database: sqlite
-- Experiment tracking: wandb
+If you're using uv:
 
-## Installation
+```bash
+uv venv
+uv sync
+```
+Once the virtual environment is created, you can install the package using pip.
 
-PyPI
+Get a release from PyPI
 ```bash
 pip install active-vision
 ```
 
-Local install
+Install from source
 ```bash
 git clone https://github.com/dnth/active-vision.git
 cd active-vision
 pip install -e .
 ```
 
+> [!TIP]
+> If you're using uv, add `uv` before the pip command to install into your virtual environment, e.g.:
+> ```bash
+> uv pip install active-vision
+> ```
+
 ## Usage
-See the [notebook](./nbs/end-to-end.ipynb) for a complete example.
+See the [notebook](./nbs/04_relabel_loop.ipynb) for a complete example.
+
+Be sure to prepare 3 datasets:
+- train: A dataframe of an existing labeled training dataset.
+- unlabeled: A dataframe of unlabeled data which we will sample from using active learning.
+- eval: A dataframe of labeled data which we will use to evaluate the performance of the model. (Optional)
 
 ```python
 from active_vision import ActiveLearner
@@ -56,29 +70,38 @@ import pandas as pd
 # Create an active learner instance with a model
 al = ActiveLearner("resnet18")
 
-# Load the dataset into the active learner
+# Load dataset
 train_df = pd.read_parquet("training_samples.parquet")
-al.load_dataset(train_df, "filepath", "label")
+al.load_dataset(train_df, filepath_col="filepath", label_col="label")
 
-# Train the model
+# Train model
 al.train(epochs=3, lr=1e-3)
 
-# Load evaluation data
-eval_df = pd.read_parquet("evaluation_samples.parquet")
+# Evaluate the model on a *labeled* evaluation set
+accuracy = al.evaluate(eval_df, filepath_col="filepath", label_col="label")
 
-# Evaluate the model on a labeled evaluation set
-accuracy = al.evaluate(eval_df, "filepath", "label")
-
-# Get predictions from an unlabeled set
+# Get predictions from an *unlabeled* set
 pred_df = al.predict(filepaths)
 
-# Sample low confidence predictions
+# Sample low confidence predictions from unlabeled set
 uncertain_df = al.sample_uncertain(pred_df, num_samples=10)
 
-# Add newly labeled data to training set
-al.add_to_train_set(uncertain_df)
+# Launch a Gradio UI to label the low confidence samples
+al.label(uncertain_df, output_filename="uncertain")
 ```
 
+![Gradio UI](./assets/labeling_ui.png)
+
+Once complete, the labeled samples will be saved into a new dataframe.
+We can now add the newly labeled data to the training set.
+
+```python
+# Add newly labeled data to training set and save as a new file active_labeled
+al.add_to_train_set(labeled_df, output_filename="active_labeled")
+```
+
+Repeat the process until the model is good enough. Use the dataset to train a larger model and deploy.
+
 ## Workflow
 There are two workflows for active learning at the edge that we can use depending on the availability of labeled data.
 
@@ -1,5 +1,6 @@
 datasets>=3.2.0
 fastai>=2.7.18
+gradio>=5.12.0
 ipykernel>=6.29.5
 ipywidgets>=8.1.5
 loguru>=0.7.3
@@ -1,3 +0,0 @@
-__version__ = "0.0.2"
-
-from .core import *
@@ -1,149 +0,0 @@
-import pandas as pd
-from loguru import logger
-from fastai.vision.models import resnet18, resnet34
-from fastai.callback.all import ShowGraphCallback
-from fastai.vision.all import (
-    ImageDataLoaders,
-    aug_transforms,
-    Resize,
-    vision_learner,
-    accuracy,
-    valley,
-    slide,
-    minimum,
-    steep,
-)
-import torch
-import torch.nn.functional as F
-
-import warnings
-
-warnings.filterwarnings("ignore", category=FutureWarning)
-
-
-class ActiveLearner:
-    def __init__(self, model_name: str):
-        self.model = self.load_model(model_name)
-
-    def load_model(self, model_name: str):
-        models = {"resnet18": resnet18, "resnet34": resnet34}
-        logger.info(f"Loading model {model_name}")
-        if model_name not in models:
-            logger.error(f"Model {model_name} not found")
-            raise ValueError(f"Model {model_name} not found")
-        return models[model_name]
-
-    def load_dataset(
-        self,
-        df: pd.DataFrame,
-        filepath_col: str,
-        label_col: str,
-        valid_pct: float = 0.2,
-        batch_size: int = 16,
-        image_size: int = 224,
-    ):
-        logger.info(f"Loading dataset from {filepath_col} and {label_col}")
-        self.train_set = df.copy()
-
-        logger.info("Creating dataloaders")
-        self.dls = ImageDataLoaders.from_df(
-            df,
-            path=".",
-            valid_pct=valid_pct,
-            fn_col=filepath_col,
-            label_col=label_col,
-            bs=batch_size,
-            item_tfms=Resize(image_size),
-            batch_tfms=aug_transforms(size=image_size, min_scale=0.75),
-        )
-        logger.info("Creating learner")
-        self.learn = vision_learner(self.dls, self.model, metrics=accuracy).to_fp16()
-        self.class_names = self.dls.vocab
-        logger.info("Done. Ready to train.")
-
-    def lr_find(self):
-        logger.info("Finding optimal learning rate")
-        self.lrs = self.learn.lr_find(suggest_funcs=(minimum, steep, valley, slide))
-        logger.info(f"Optimal learning rate: {self.lrs.valley}")
-
-    def train(self, epochs: int, lr: float):
-        logger.info(f"Training for {epochs} epochs with learning rate: {lr}")
-        self.learn.fine_tune(epochs, lr, cbs=[ShowGraphCallback()])
-
-    def predict(self, filepaths: list[str], batch_size: int = 16):
-        """
-        Run inference on an unlabeled dataset. Returns a df with filepaths and predicted labels, and confidence scores.
-        """
-        logger.info(f"Running inference on {len(filepaths)} samples")
-        test_dl = self.dls.test_dl(filepaths, bs=batch_size)
-        preds, _, cls_preds = self.learn.get_preds(dl=test_dl, with_decoded=True)
-
-        self.pred_df = pd.DataFrame(
-            {
-                "filepath": filepaths,
-                "pred_label": [self.learn.dls.vocab[i] for i in cls_preds.numpy()],
-                "pred_conf": torch.max(F.softmax(preds, dim=1), dim=1)[0].numpy(),
-            }
-        )
-        return self.pred_df
-
-    def evaluate(self, df: pd.DataFrame, filepath_col: str, label_col: str, batch_size: int = 16):
-        """
-        Evaluate on a labeled dataset. Returns a score.
-        """
-        self.eval_set = df.copy()
-
-        filepaths = self.eval_set[filepath_col].tolist()
-        labels = self.eval_set[label_col].tolist()
-        test_dl = self.dls.test_dl(filepaths, bs=batch_size)
-        preds, _, cls_preds = self.learn.get_preds(dl=test_dl, with_decoded=True)
-
-        self.eval_df = pd.DataFrame(
-            {
-                "filepath": filepaths,
-                "label": labels,
-                "pred_label": [self.learn.dls.vocab[i] for i in cls_preds.numpy()],
-            }
-        )
-
-        accuracy = float((self.eval_df["label"] == self.eval_df["pred_label"]).mean())
-        logger.info(f"Accuracy: {accuracy:.2%}")
-        return accuracy
-
-    def sample_uncertain(self, df: pd.DataFrame, num_samples: int):
-        """
-        Sample top `num_samples` low confidence samples. Returns a df with filepaths and predicted labels, and confidence scores.
-        """
-        uncertain_df = df.sort_values(
-            by="pred_conf", ascending=True
-        ).head(num_samples)
-        return uncertain_df
-
-    def add_to_train_set(self, df: pd.DataFrame):
-        """
-        Add samples to the training set.
-        """
-        new_train_set = df.copy()
-        new_train_set.drop(columns=["pred_conf"], inplace=True)
-        new_train_set.rename(columns={"pred_label": "label"}, inplace=True)
-
-        len_old = len(self.train_set)
-
-        logger.info(f"Adding {len(new_train_set)} samples to training set")
-        self.train_set = pd.concat([self.train_set, new_train_set])
-
-        self.train_set = self.train_set.drop_duplicates(
-            subset=["filepath"], keep="last"
-        )
-        self.train_set.reset_index(drop=True, inplace=True)
-
-
-        if len(self.train_set) == len_old:
-            logger.warning("No new samples added to training set")
-
-        elif len_old + len(new_train_set) < len(self.train_set):
-            logger.warning("Some samples were duplicates and removed from training set")
-
-        else:
-            logger.info("All new samples added to training set")
-        logger.info(f"Training set now has {len(self.train_set)} samples")