labelr 0.7.0__tar.gz → 0.8.0__tar.gz

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (33)
  1. {labelr-0.7.0/src/labelr.egg-info → labelr-0.8.0}/PKG-INFO +8 -6
  2. {labelr-0.7.0 → labelr-0.8.0}/README.md +7 -5
  3. {labelr-0.7.0 → labelr-0.8.0}/pyproject.toml +1 -1
  4. {labelr-0.7.0 → labelr-0.8.0}/src/labelr/apps/datasets.py +12 -25
  5. labelr-0.8.0/src/labelr/apps/evaluate.py +41 -0
  6. labelr-0.8.0/src/labelr/apps/hugging_face.py +57 -0
  7. labelr-0.7.0/src/labelr/apps/projects.py → labelr-0.8.0/src/labelr/apps/label_studio.py +65 -9
  8. {labelr-0.7.0 → labelr-0.8.0}/src/labelr/apps/train.py +22 -4
  9. labelr-0.8.0/src/labelr/evaluate/__init__.py +0 -0
  10. labelr-0.8.0/src/labelr/evaluate/llm.py +0 -0
  11. labelr-0.8.0/src/labelr/evaluate/object_detection.py +100 -0
  12. {labelr-0.7.0 → labelr-0.8.0}/src/labelr/export.py +1 -7
  13. {labelr-0.7.0 → labelr-0.8.0}/src/labelr/main.py +17 -8
  14. {labelr-0.7.0 → labelr-0.8.0}/src/labelr/sample.py +30 -4
  15. {labelr-0.7.0 → labelr-0.8.0/src/labelr.egg-info}/PKG-INFO +8 -6
  16. {labelr-0.7.0 → labelr-0.8.0}/src/labelr.egg-info/SOURCES.txt +6 -2
  17. labelr-0.7.0/src/labelr/apps/users.py +0 -36
  18. {labelr-0.7.0 → labelr-0.8.0}/LICENSE +0 -0
  19. {labelr-0.7.0 → labelr-0.8.0}/setup.cfg +0 -0
  20. {labelr-0.7.0 → labelr-0.8.0}/src/labelr/__init__.py +0 -0
  21. {labelr-0.7.0 → labelr-0.8.0}/src/labelr/__main__.py +0 -0
  22. {labelr-0.7.0 → labelr-0.8.0}/src/labelr/annotate.py +0 -0
  23. {labelr-0.7.0 → labelr-0.8.0}/src/labelr/apps/__init__.py +0 -0
  24. {labelr-0.7.0 → labelr-0.8.0}/src/labelr/check.py +0 -0
  25. {labelr-0.7.0 → labelr-0.8.0}/src/labelr/config.py +0 -0
  26. {labelr-0.7.0 → labelr-0.8.0}/src/labelr/dataset_features.py +0 -0
  27. {labelr-0.7.0 → labelr-0.8.0}/src/labelr/project_config.py +0 -0
  28. {labelr-0.7.0 → labelr-0.8.0}/src/labelr/types.py +0 -0
  29. {labelr-0.7.0 → labelr-0.8.0}/src/labelr/utils.py +0 -0
  30. {labelr-0.7.0 → labelr-0.8.0}/src/labelr.egg-info/dependency_links.txt +0 -0
  31. {labelr-0.7.0 → labelr-0.8.0}/src/labelr.egg-info/entry_points.txt +0 -0
  32. {labelr-0.7.0 → labelr-0.8.0}/src/labelr.egg-info/requires.txt +0 -0
  33. {labelr-0.7.0 → labelr-0.8.0}/src/labelr.egg-info/top_level.txt +0 -0
@@ -1,6 +1,6 @@
  Metadata-Version: 2.4
  Name: labelr
- Version: 0.7.0
+ Version: 0.8.0
  Summary: A command-line tool to manage labeling tasks with Label Studio.
  Requires-Python: >=3.10
  Description-Content-Type: text/markdown
@@ -73,7 +73,7 @@ Once you have a Label Studio instance running, you can create a project easily.
  For an object detection task, a command allows you to create the configuration file automatically:

  ```bash
- labelr projects create-config --labels 'label1' --labels 'label2' --output-file label_config.xml
+ labelr ls create-config --labels 'label1' --labels 'label2' --output-file label_config.xml
  ```

  where `label1` and `label2` are the labels you want to use for the object detection task, and `label_config.xml` is the output file that will contain the configuration.
@@ -81,17 +81,19 @@ where `label1` and `label2` are the labels you want to use for the object detect
  Then, you can create a project on Label Studio with the following command:

  ```bash
- labelr projects create --title my_project --api-key API_KEY --config-file label_config.xml
+ labelr ls create --title my_project --api-key API_KEY --config-file label_config.xml
  ```

  where `API_KEY` is the API key of the Label Studio instance (API key is available at Account page), and `label_config.xml` is the configuration file of the project.

+ `ls` stands for Label Studio in the CLI.
+
  #### Create a dataset file

  If you have a list of images, for an object detection task, you can quickly create a dataset file with the following command:

  ```bash
- labelr projects create-dataset-file --input-file image_urls.txt --output-file dataset.json
+ labelr ls create-dataset-file --input-file image_urls.txt --output-file dataset.json
  ```

  where `image_urls.txt` is a file containing the URLs of the images, one per line, and `dataset.json` is the output file.
@@ -101,7 +103,7 @@ where `image_urls.txt` is a file containing the URLs of the images, one per line
  Next, import the generated data to a project with the following command:

  ```bash
- labelr projects import-data --project-id PROJECT_ID --dataset-path dataset.json
+ labelr ls import-data --project-id PROJECT_ID --dataset-path dataset.json
  ```

  where `PROJECT_ID` is the ID of the project you created.
@@ -117,7 +119,7 @@ To accelerate annotation, you can pre-annotate the images with an object detecti
  To pre-annotate the data with Triton, use the following command:

  ```bash
- labelr projects add-prediction --project-id PROJECT_ID --backend ultralytics --labels 'product' --labels 'price tag' --label-mapping '{"price tag": "price-tag"}'
+ labelr ls add-prediction --project-id PROJECT_ID --backend ultralytics --labels 'product' --labels 'price tag' --label-mapping '{"price tag": "price-tag"}'
  ```

  where `labels` is the list of labels to use for the object detection task (you can add as many labels as you want).
@@ -52,7 +52,7 @@ Once you have a Label Studio instance running, you can create a project easily.
  For an object detection task, a command allows you to create the configuration file automatically:

  ```bash
- labelr projects create-config --labels 'label1' --labels 'label2' --output-file label_config.xml
+ labelr ls create-config --labels 'label1' --labels 'label2' --output-file label_config.xml
  ```

  where `label1` and `label2` are the labels you want to use for the object detection task, and `label_config.xml` is the output file that will contain the configuration.
@@ -60,17 +60,19 @@ where `label1` and `label2` are the labels you want to use for the object detect
  Then, you can create a project on Label Studio with the following command:

  ```bash
- labelr projects create --title my_project --api-key API_KEY --config-file label_config.xml
+ labelr ls create --title my_project --api-key API_KEY --config-file label_config.xml
  ```

  where `API_KEY` is the API key of the Label Studio instance (API key is available at Account page), and `label_config.xml` is the configuration file of the project.

+ `ls` stands for Label Studio in the CLI.
+
  #### Create a dataset file

  If you have a list of images, for an object detection task, you can quickly create a dataset file with the following command:

  ```bash
- labelr projects create-dataset-file --input-file image_urls.txt --output-file dataset.json
+ labelr ls create-dataset-file --input-file image_urls.txt --output-file dataset.json
  ```

  where `image_urls.txt` is a file containing the URLs of the images, one per line, and `dataset.json` is the output file.
@@ -80,7 +82,7 @@ where `image_urls.txt` is a file containing the URLs of the images, one per line
  Next, import the generated data to a project with the following command:

  ```bash
- labelr projects import-data --project-id PROJECT_ID --dataset-path dataset.json
+ labelr ls import-data --project-id PROJECT_ID --dataset-path dataset.json
  ```

  where `PROJECT_ID` is the ID of the project you created.
@@ -96,7 +98,7 @@ To accelerate annotation, you can pre-annotate the images with an object detecti
  To pre-annotate the data with Triton, use the following command:

  ```bash
- labelr projects add-prediction --project-id PROJECT_ID --backend ultralytics --labels 'product' --labels 'price tag' --label-mapping '{"price tag": "price-tag"}'
+ labelr ls add-prediction --project-id PROJECT_ID --backend ultralytics --labels 'product' --labels 'price tag' --label-mapping '{"price tag": "price-tag"}'
  ```

  where `labels` is the list of labels to use for the object detection task (you can add as many labels as you want).
@@ -1,6 +1,6 @@
  [project]
  name = "labelr"
- version = "0.7.0"
+ version = "0.8.0"
  description = "A command-line tool to manage labeling tasks with Label Studio."
  readme = "README.md"
  requires-python = ">=3.10"
@@ -1,3 +1,6 @@
+ """Commands to manage datasets local datasets and export between platforms
+ (Label Studio, HuggingFace Hub, local dataset,...)."""
+
  import json
  import random
  import shutil
@@ -21,45 +24,29 @@ logger = get_logger(__name__)

  @app.command()
  def check(
- api_key: Annotated[
- Optional[str], typer.Option(envvar="LABEL_STUDIO_API_KEY")
- ] = None,
- project_id: Annotated[
- Optional[int], typer.Option(help="Label Studio Project ID")
- ] = None,
- label_studio_url: str = LABEL_STUDIO_DEFAULT_URL,
  dataset_dir: Annotated[
- Optional[Path],
+ Path,
  typer.Option(
  help="Path to the dataset directory", exists=True, file_okay=False
  ),
- ] = None,
+ ],
  remove: Annotated[
  bool,
- typer.Option(
- help="Remove duplicate images from the dataset, only for local datasets"
- ),
+ typer.Option(help="Remove duplicate images from the dataset"),
  ] = False,
  ):
- """Check a dataset for duplicate images."""
- from label_studio_sdk.client import LabelStudio
+ """Check a local dataset in Ultralytics format for duplicate images."""

- from ..check import check_local_dataset, check_ls_dataset
+ from ..check import check_local_dataset

- if project_id is not None:
- ls = LabelStudio(base_url=label_studio_url, api_key=api_key)
- check_ls_dataset(ls, project_id)
- elif dataset_dir is not None:
- check_local_dataset(dataset_dir, remove=remove)
- else:
- raise typer.BadParameter("Either project ID or dataset directory is required")
+ check_local_dataset(dataset_dir, remove=remove)


  @app.command()
  def split_train_test(
  task_type: TaskType, dataset_dir: Path, output_dir: Path, train_ratio: float = 0.8
  ):
- """Split a dataset into training and test sets.
+ """Split a local dataset into training and test sets.

  Only classification tasks are supported.
  """
@@ -112,7 +99,7 @@ def convert_object_detection_dataset(
  Studio format, and save it to a JSON file."""
  from datasets import load_dataset

- from labelr.sample import format_object_detection_sample_from_hf
+ from labelr.sample import format_object_detection_sample_from_hf_to_ls

  logger.info("Loading dataset: %s", repo_id)
  ds = load_dataset(repo_id)
@@ -122,7 +109,7 @@ def convert_object_detection_dataset(
  for split in ds.keys():
  logger.info("Processing split: %s", split)
  for sample in ds[split]:
- label_studio_sample = format_object_detection_sample_from_hf(
+ label_studio_sample = format_object_detection_sample_from_hf_to_ls(
  sample, split=split
  )
  f.write(json.dumps(label_studio_sample) + "\n")
@@ -0,0 +1,41 @@
+ from typing import Annotated
+
+ import typer
+
+ app = typer.Typer()
+
+
+ @app.command()
+ def visualize_object_detection(
+ hf_repo_id: Annotated[
+ str,
+ typer.Option(
+ ...,
+ help="Hugging Face repository ID of the trained model. "
+ "A `predictions.parquet` file is expected in the repo. Revision can be specified "
+ "by appending `@<revision>` to the repo ID.",
+ ),
+ ],
+ dataset_name: Annotated[
+ str | None, typer.Option(..., help="Name of the FiftyOne dataset to create.")
+ ] = None,
+ persistent: Annotated[
+ bool,
+ typer.Option(
+ ...,
+ help="Whether to make the FiftyOne dataset persistent (i.e., saved to disk).",
+ ),
+ ] = False,
+ ):
+ """Visualize object detection model predictions stored in a Hugging Face
+ repository using FiftyOne."""
+ from labelr.evaluate import object_detection
+
+ if dataset_name is None:
+ dataset_name = hf_repo_id.replace("/", "-").replace("@", "-")
+
+ object_detection.visualize(
+ hf_repo_id=hf_repo_id,
+ dataset_name=dataset_name,
+ persistent=persistent,
+ )
@@ -0,0 +1,57 @@
+ from pathlib import Path
+ from typing import Annotated
+
+ import typer
+
+ app = typer.Typer()
+
+
+ @app.command()
+ def show_hf_sample(
+ repo_id: Annotated[
+ str,
+ typer.Argument(
+ ...,
+ help="Hugging Face Datasets repo ID. The revision can be specified by "
+ "appending `@<revision>` to the repo ID.",
+ ),
+ ],
+ image_id: Annotated[
+ str,
+ typer.Argument(
+ ...,
+ help="ID of the image associated with the sample to display (field: `image_id`)",
+ ),
+ ],
+ output_image_path: Annotated[
+ Path | None,
+ typer.Option(help="Path to save the sample image (optional)", exists=False),
+ ] = None,
+ ):
+ """Display a sample from a Hugging Face Datasets repository by image ID."""
+ from labelr.utils import parse_hf_repo_id
+
+ repo_id, revision = parse_hf_repo_id(repo_id)
+
+ from datasets import load_dataset
+
+ ds = load_dataset(repo_id, revision=revision)
+
+ sample = None
+ for split in ds.keys():
+ samples = ds[split].filter(lambda x: x == image_id, input_columns="image_id")
+ if len(samples) > 0:
+ sample = samples[0]
+ break
+ if sample is None:
+ typer.echo(f"Sample with image ID {image_id} not found in dataset {repo_id}")
+ raise typer.Exit(code=1)
+
+ else:
+ for key, value in sample.items():
+ typer.echo(f"{key}: {value}")
+
+ if output_image_path is not None:
+ image = sample["image"]
+ image.save(output_image_path)
+ typer.echo(f"Image saved to {output_image_path}")
@@ -6,12 +6,7 @@ from typing import Annotated, Optional

  import typer
  from openfoodfacts.utils import get_logger
- from PIL import Image

- from ..annotate import (
- format_annotation_results_from_robotoff,
- format_annotation_results_from_ultralytics,
- )
  from ..config import LABEL_STUDIO_DEFAULT_URL

  app = typer.Typer()
@@ -43,14 +38,20 @@ def import_data(
  api_key: Annotated[str, typer.Option(envvar="LABEL_STUDIO_API_KEY")],
  project_id: Annotated[int, typer.Option(help="Label Studio Project ID")],
  dataset_path: Annotated[
- Path, typer.Option(help="Path to the Label Studio dataset file", file_okay=True)
+ Path,
+ typer.Option(
+ help="Path to the Label Studio dataset JSONL file", file_okay=True
+ ),
  ],
  label_studio_url: str = LABEL_STUDIO_DEFAULT_URL,
  batch_size: int = 25,
  ):
  """Import tasks from a dataset file to a Label Studio project.

- The dataset file should contain one JSON object per line."""
+ The dataset file must be a JSONL file: it should contain one JSON object
+ per line. To generate such a file, you can use the `create-dataset-file`
+ command.
+ """
  import more_itertools
  import tqdm
  from label_studio_sdk.client import LabelStudio
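The JSONL convention the new docstring refers to (one JSON object per line) can be produced and consumed with the standard library alone. A minimal sketch, with made-up task records:

```python
import json
import tempfile
from pathlib import Path


def write_jsonl(path: Path, records: list[dict]) -> None:
    # One JSON object per line, as expected by `import-data`
    path.write_text("".join(json.dumps(record) + "\n" for record in records))


def read_jsonl(path: Path) -> list[dict]:
    return [json.loads(line) for line in path.read_text().splitlines()]


with tempfile.TemporaryDirectory() as tmp:
    dataset_path = Path(tmp) / "dataset.jsonl"
    tasks = [
        {"data": {"image_url": "https://example.com/1.jpg"}},
        {"data": {"image_url": "https://example.com/2.jpg"}},
    ]
    write_jsonl(dataset_path, tasks)
    assert read_jsonl(dataset_path) == tasks
```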
@@ -279,6 +280,12 @@ def add_prediction(
  import tqdm
  from label_studio_sdk.client import LabelStudio
  from openfoodfacts.utils import get_image_from_url, http_session
+ from PIL import Image
+
+ from ..annotate import (
+ format_annotation_results_from_robotoff,
+ format_annotation_results_from_ultralytics,
+ )

  label_mapping_dict = None
  if label_mapping:
@@ -375,11 +382,16 @@ def create_dataset_file(
  typer.Option(help="Path to a list of image URLs", exists=True),
  ],
  output_file: Annotated[
- Path, typer.Option(help="Path to the output JSON file", exists=False)
+ Path, typer.Option(help="Path to the output JSONL file", exists=False)
  ],
  ):
  """Create a Label Studio object detection dataset file from a list of
- image URLs."""
+ image URLs.
+
+ The output file is a JSONL file. It cannot be imported directly in Label
+ Studio (which requires a JSON file as input), the `import-data` command
+ should be used to import the generated dataset file.
+ """
  from urllib.parse import urlparse

  import tqdm
@@ -432,3 +444,47 @@ def create_config_file(
  config = create_object_detection_label_config(labels)
  output_file.write_text(config)
  logger.info("Label config file created: %s", output_file)
+
+
+ @app.command()
+ def check_dataset(
+ project_id: Annotated[int, typer.Option(help="Label Studio Project ID")],
+ api_key: Annotated[
+ Optional[str], typer.Option(envvar="LABEL_STUDIO_API_KEY")
+ ] = None,
+ label_studio_url: str = LABEL_STUDIO_DEFAULT_URL,
+ ):
+ """Check a dataset for duplicate images on Label Studio."""
+ from label_studio_sdk.client import LabelStudio
+
+ from ..check import check_ls_dataset
+
+ ls = LabelStudio(base_url=label_studio_url, api_key=api_key)
+ check_ls_dataset(ls, project_id)
+
+
+ @app.command()
+ def list_users(
+ api_key: Annotated[str, typer.Option(envvar="LABEL_STUDIO_API_KEY")],
+ label_studio_url: str = LABEL_STUDIO_DEFAULT_URL,
+ ):
+ """List all users in Label Studio."""
+ from label_studio_sdk.client import LabelStudio
+
+ ls = LabelStudio(base_url=label_studio_url, api_key=api_key)
+
+ for user in ls.users.list():
+ print(f"{user.id:02d}: {user.email}")
+
+
+ @app.command()
+ def delete_user(
+ user_id: int,
+ api_key: Annotated[str, typer.Option(envvar="LABEL_STUDIO_API_KEY")],
+ label_studio_url: str = LABEL_STUDIO_DEFAULT_URL,
+ ):
+ """Delete a user from Label Studio."""
+ from label_studio_sdk.client import LabelStudio
+
+ ls = LabelStudio(base_url=label_studio_url, api_key=api_key)
+ ls.users.delete(user_id)
@@ -1,7 +1,6 @@
  import datetime

  import typer
- from google.cloud import batch_v1

  app = typer.Typer()

@@ -28,6 +27,11 @@ AVAILABLE_OBJECT_DETECTION_MODELS = [
  "yolo11m.pt",
  "yolo11l.pt",
  "yolo11x.pt",
+ "yolo12n.pt",
+ "yolo12s.pt",
+ "yolo12m.pt",
+ "yolo12l.pt",
+ "yolo12x.pt",
  ]


@@ -42,6 +46,9 @@ def train_object_detection(
  help="The Hugging Face token, used to push the trained model to Hugging Face Hub.",
  ),
  run_name: str = typer.Option(..., help="A name for the training run."),
+ add_date_to_run_name: bool = typer.Option(
+ True, help="Whether to append the date to the run name."
+ ),
  hf_repo_id: str = typer.Option(
  ..., help="The Hugging Face dataset repository ID to use to train."
  ),
@@ -64,6 +71,11 @@ def train_object_detection(
  f"Invalid model name '{model_name}'. Available models are: {', '.join(AVAILABLE_OBJECT_DETECTION_MODELS)}"
  )

+ datestamp = datetime.datetime.now().strftime("%Y%m%d-%H%M%S")
+
+ if add_date_to_run_name:
+ run_name = f"{run_name}-{datestamp}"
+
  env_variables = {
  "HF_REPO_ID": hf_repo_id,
  "HF_TRAINED_MODEL_REPO_ID": hf_trained_model_repo_id,
@@ -77,8 +89,12 @@ def train_object_detection(
  "USE_AWS_IMAGE_CACHE": "False",
  "YOLO_MODEL_NAME": model_name,
  }
- job_name = "train-yolo-job"
- job_name = job_name + "-" + datetime.datetime.now().strftime("%Y%m%d%H%M%S")
+
+ job_name = f"train-yolo-job-{run_name}"
+ if not add_date_to_run_name:
+ # Ensure job name is unique by adding a datestamp if date is not added to run name
+ job_name = f"{job_name}-{datestamp}"
+
  job = launch_job(
  job_name=job_name,
  container_image_uri="europe-west9-docker.pkg.dev/robotoff/gcf-artifacts/train-yolo",
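The naming change in the two hunks above can be summarized as: the run name carries the datestamp by default, and the datestamp is appended to the job name only when it is not already part of the run name, so the GCP Batch job name stays unique either way. A pure-function sketch of that logic (the helper name is illustrative):

```python
def build_names(
    run_name: str, add_date_to_run_name: bool, datestamp: str
) -> tuple[str, str]:
    # Mirrors the run/job naming logic in train_object_detection
    if add_date_to_run_name:
        run_name = f"{run_name}-{datestamp}"
    job_name = f"train-yolo-job-{run_name}"
    if not add_date_to_run_name:
        # Keep job names unique even when the run name is stable
        job_name = f"{job_name}-{datestamp}"
    return run_name, job_name


print(build_names("price-tags", True, "20240101-120000"))
# ('price-tags-20240101-120000', 'train-yolo-job-price-tags-20240101-120000')
```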
@@ -112,7 +128,7 @@ def launch_job(
  accelerators_count: int = 1,
  region: str = "europe-west4",
  install_gpu_drivers: bool = True,
- ) -> batch_v1.Job:
+ ):
  """This method creates a Batch Job on GCP.

  Sources:
@@ -126,6 +142,8 @@ def launch_job(
  Returns:
  Batch job information.
  """
+ from google.cloud import batch_v1
+
  client = batch_v1.BatchServiceClient()

  # Define what will be done as part of the job.
File without changes
File without changes
@@ -0,0 +1,100 @@
+ import tempfile
+ from pathlib import Path
+
+ import datasets
+ import fiftyone as fo
+ from huggingface_hub import hf_hub_download
+
+ from labelr.dataset_features import OBJECT_DETECTION_DS_PREDICTION_FEATURES
+ from labelr.utils import parse_hf_repo_id
+
+
+ def convert_bbox_to_fo_format(
+ bbox: tuple[float, float, float, float],
+ ) -> tuple[float, float, float, float]:
+ # Bounding box coordinates should be relative values
+ # in [0, 1] in the following format:
+ # [top-left-x, top-left-y, width, height]
+ y_min, x_min, y_max, x_max = bbox
+ return (
+ x_min,
+ y_min,
+ (x_max - x_min),
+ (y_max - y_min),
+ )
+
+
+ def visualize(
+ hf_repo_id: str,
+ dataset_name: str,
+ persistent: bool,
+ ):
+ hf_repo_id, hf_revision = parse_hf_repo_id(hf_repo_id)
+
+ file_path = hf_hub_download(
+ hf_repo_id,
+ filename="predictions.parquet",
+ revision=hf_revision,
+ repo_type="model",
+ # local_dir="./predictions/",
+ )
+ file_path = Path(file_path).absolute()
+ prediction_dataset = datasets.load_dataset(
+ "parquet",
+ data_files=str(file_path),
+ split="train",
+ features=OBJECT_DETECTION_DS_PREDICTION_FEATURES,
+ )
+ fo_dataset = fo.Dataset(name=dataset_name, persistent=persistent)
+
+ with tempfile.TemporaryDirectory() as tmpdir_str:
+ tmp_dir = Path(tmpdir_str)
+ for i, hf_sample in enumerate(prediction_dataset):
+ image = hf_sample["image"]
+ image_path = tmp_dir / f"{i}.jpg"
+ image.save(image_path)
+ split = hf_sample["split"]
+ sample = fo.Sample(
+ filepath=image_path,
+ split=split,
+ tags=[split],
+ image=hf_sample["image_id"],
+ )
+ ground_truth_detections = [
+ fo.Detection(
+ label=hf_sample["objects"]["category_name"][i],
+ bounding_box=convert_bbox_to_fo_format(
+ bbox=hf_sample["objects"]["bbox"][i],
+ ),
+ )
+ for i in range(len(hf_sample["objects"]["bbox"]))
+ ]
+ sample["ground_truth"] = fo.Detections(detections=ground_truth_detections)
+
+ if hf_sample["detected"] is not None and hf_sample["detected"]["bbox"]:
+ model_detections = [
+ fo.Detection(
+ label=hf_sample["detected"]["category_name"][i],
+ bounding_box=convert_bbox_to_fo_format(
+ bbox=hf_sample["detected"]["bbox"][i]
+ ),
+ confidence=hf_sample["detected"]["confidence"][i],
+ )
+ for i in range(len(hf_sample["detected"]["bbox"]))
+ ]
+ sample["model"] = fo.Detections(detections=model_detections)
+
+ fo_dataset.add_sample(sample)
+
+ # View summary info about the dataset
+ print(fo_dataset)
+
+ # Print the first few samples in the dataset
+ print(fo_dataset.head())
+
+ # Visualize the dataset in the FiftyOne App
+ session = fo.launch_app(fo_dataset)
+ fo_dataset.evaluate_detections(
+ "model", gt_field="ground_truth", eval_key="eval", compute_mAP=True
+ )
+ session.wait()
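The `convert_bbox_to_fo_format` helper above takes boxes in `(y_min, x_min, y_max, x_max)` order with relative coordinates and returns FiftyOne's `[top-left-x, top-left-y, width, height]` format. A quick, dependency-free check of the arithmetic (standalone copy of the function for illustration):

```python
def convert_bbox_to_fo_format(
    bbox: tuple[float, float, float, float],
) -> tuple[float, float, float, float]:
    # (y_min, x_min, y_max, x_max) -> (x, y, width, height), all relative [0, 1]
    y_min, x_min, y_max, x_max = bbox
    return (x_min, y_min, x_max - x_min, y_max - y_min)


print(convert_bbox_to_fo_format((0.25, 0.125, 0.75, 0.625)))
# (0.125, 0.25, 0.5, 0.5)
```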
@@ -77,13 +77,7 @@ def export_from_ls_to_hf_object_detection(
  functools.partial(_pickle_sample_generator, tmp_dir),
  features=HF_DS_OBJECT_DETECTION_FEATURES,
  )
- hf_ds.push_to_hub(
- repo_id,
- split=split,
- revision=revision,
- # Create a PR if not pushing to main branch
- create_pr=revision != "main",
- )
+ hf_ds.push_to_hub(repo_id, split=split, revision=revision)


  def export_from_ls_to_ultralytics_object_detection(
@@ -4,9 +4,10 @@ import typer
  from openfoodfacts.utils import get_logger

  from labelr.apps import datasets as dataset_app
- from labelr.apps import projects as project_app
+ from labelr.apps import evaluate as evaluate_app
+ from labelr.apps import hugging_face as hf_app
+ from labelr.apps import label_studio as ls_app
  from labelr.apps import train as train_app
- from labelr.apps import users as user_app

  app = typer.Typer(pretty_exceptions_show_locals=False)

@@ -58,22 +59,30 @@ def predict(
  typer.echo(result)


- app.add_typer(user_app.app, name="users", help="Manage Label Studio users")
  app.add_typer(
- project_app.app,
- name="projects",
- help="Manage Label Studio projects (create, import data, etc.)",
+ ls_app.app,
+ name="ls",
+ help="Manage Label Studio projects (create, import data, etc.).",
+ )
+ app.add_typer(
+ hf_app.app,
+ name="hf",
+ help="Manage Hugging Face Datasets repositories.",
  )
  app.add_typer(
  dataset_app.app,
  name="datasets",
  help="Manage datasets (convert, export, check, etc.)",
  )
-
  app.add_typer(
  train_app.app,
  name="train",
- help="Train models",
+ help="Train models.",
+ )
+ app.add_typer(
+ evaluate_app.app,
+ name="evaluate",
+ help="Visualize and evaluate trained models.",
  )

  if __name__ == "__main__":
@@ -1,16 +1,19 @@
  import logging
  import random
  import string
+ import typing

  import datasets
+ import PIL
  from openfoodfacts import Flavor
  from openfoodfacts.barcode import normalize_barcode
  from openfoodfacts.images import download_image, generate_image_url
+ from PIL import ImageOps

  logger = logging.getLogger(__name__)


- def format_annotation_results_from_hf(
+ def format_annotation_results_from_hf_to_ls(
  objects: dict, image_width: int, image_height: int
  ):
  """Format annotation results from a HF object detection dataset into Label
@@ -56,12 +59,12 @@ def format_annotation_results_from_hf(
  return annotation_results


- def format_object_detection_sample_from_hf(hf_sample: dict, split: str) -> dict:
+ def format_object_detection_sample_from_hf_to_ls(hf_sample: dict, split: str) -> dict:
  hf_meta = hf_sample["meta"]
  objects = hf_sample["objects"]
  image_width = hf_sample["width"]
  image_height = hf_sample["height"]
- annotation_results = format_annotation_results_from_hf(
+ annotation_results = format_annotation_results_from_hf_to_ls(
  objects, image_width, image_height
  )
  image_id = hf_sample["image_id"]
@@ -149,8 +152,24 @@ def format_object_detection_sample_to_hf(
  annotations: list[dict],
  label_names: list[str],
  merge_labels: bool = False,
- use_aws_cache: bool = True,
+ use_aws_cache: bool = False,
  ) -> dict | None:
+ """Format a Label Studio object detection sample to Hugging Face format.
+
+ Args:
+ task_data: The task data from Label Studio.
+ annotations: The annotations from Label Studio.
+ label_names: The list of label names.
+ merge_labels: Whether to merge all labels into a single label (the
+ first label in `label_names`).
+ use_aws_cache: Whether to use AWS cache when downloading images.
+
+ Returns:
+ The formatted sample, or None in the following cases:
+ - More than one annotation is found
+ - No annotation is found
+ - An error occurs when downloading the image
+ """
  if len(annotations) > 1:
  logger.info("More than one annotation found, skipping")
  return None
@@ -186,6 +205,13 @@ def format_object_detection_sample_to_hf(
  logger.error("Failed to download image: %s", image_url)
  return None

+ # Correct image orientation using EXIF data
+ # Label Studio provides bounding boxes based on the displayed image (after
+ # eventual EXIF rotation), so we need to apply the same transformation to
+ # the image.
+ # Indeed, Hugging Face stores images without applying EXIF rotation, and
+ # EXIF data is not preserved in the dataset.
+ ImageOps.exif_transpose(typing.cast(PIL.Image.Image, image), in_place=True)
  return {
  "image_id": task_data["image_id"],
  "image": image,
@@ -1,6 +1,6 @@
  Metadata-Version: 2.4
  Name: labelr
- Version: 0.7.0
+ Version: 0.8.0
  Summary: A command-line tool to manage labeling tasks with Label Studio.
  Requires-Python: >=3.10
  Description-Content-Type: text/markdown
@@ -73,7 +73,7 @@ Once you have a Label Studio instance running, you can create a project easily.
  For an object detection task, a command allows you to create the configuration file automatically:

  ```bash
- labelr projects create-config --labels 'label1' --labels 'label2' --output-file label_config.xml
+ labelr ls create-config --labels 'label1' --labels 'label2' --output-file label_config.xml
  ```

  where `label1` and `label2` are the labels you want to use for the object detection task, and `label_config.xml` is the output file that will contain the configuration.
@@ -81,17 +81,19 @@ where `label1` and `label2` are the labels you want to use for the object detect
  Then, you can create a project on Label Studio with the following command:

  ```bash
- labelr projects create --title my_project --api-key API_KEY --config-file label_config.xml
+ labelr ls create --title my_project --api-key API_KEY --config-file label_config.xml
  ```

  where `API_KEY` is the API key of the Label Studio instance (API key is available at Account page), and `label_config.xml` is the configuration file of the project.

+ `ls` stands for Label Studio in the CLI.
+
  #### Create a dataset file

  If you have a list of images, for an object detection task, you can quickly create a dataset file with the following command:

  ```bash
- labelr projects create-dataset-file --input-file image_urls.txt --output-file dataset.json
+ labelr ls create-dataset-file --input-file image_urls.txt --output-file dataset.json
  ```

  where `image_urls.txt` is a file containing the URLs of the images, one per line, and `dataset.json` is the output file.
@@ -101,7 +103,7 @@ where `image_urls.txt` is a file containing the URLs of the images, one per line
  Next, import the generated data to a project with the following command:

  ```bash
- labelr projects import-data --project-id PROJECT_ID --dataset-path dataset.json
+ labelr ls import-data --project-id PROJECT_ID --dataset-path dataset.json
  ```

  where `PROJECT_ID` is the ID of the project you created.
@@ -117,7 +119,7 @@ To accelerate annotation, you can pre-annotate the images with an object detecti
  To pre-annotate the data with Triton, use the following command:

  ```bash
- labelr projects add-prediction --project-id PROJECT_ID --backend ultralytics --labels 'product' --labels 'price tag' --label-mapping '{"price tag": "price-tag"}'
+ labelr ls add-prediction --project-id PROJECT_ID --backend ultralytics --labels 'product' --labels 'price tag' --label-mapping '{"price tag": "price-tag"}'
  ```

  where `labels` is the list of labels to use for the object detection task (you can add as many labels as you want).
@@ -21,6 +21,10 @@ src/labelr.egg-info/requires.txt
  src/labelr.egg-info/top_level.txt
  src/labelr/apps/__init__.py
  src/labelr/apps/datasets.py
- src/labelr/apps/projects.py
+ src/labelr/apps/evaluate.py
+ src/labelr/apps/hugging_face.py
+ src/labelr/apps/label_studio.py
  src/labelr/apps/train.py
- src/labelr/apps/users.py
+ src/labelr/evaluate/__init__.py
+ src/labelr/evaluate/llm.py
+ src/labelr/evaluate/object_detection.py
@@ -1,36 +0,0 @@
- from typing import Annotated
-
- import typer
-
- from ..config import LABEL_STUDIO_DEFAULT_URL
-
- app = typer.Typer()
-
- # Label Studio user management
-
-
- @app.command()
- def list(
- api_key: Annotated[str, typer.Option(envvar="LABEL_STUDIO_API_KEY")],
- label_studio_url: str = LABEL_STUDIO_DEFAULT_URL,
- ):
- """List all users in Label Studio."""
- from label_studio_sdk.client import LabelStudio
-
- ls = LabelStudio(base_url=label_studio_url, api_key=api_key)
-
- for user in ls.users.list():
- print(f"{user.id:02d}: {user.email}")
-
-
- @app.command()
- def delete(
- user_id: int,
- api_key: Annotated[str, typer.Option(envvar="LABEL_STUDIO_API_KEY")],
- label_studio_url: str = LABEL_STUDIO_DEFAULT_URL,
- ):
- """Delete a user from Label Studio."""
- from label_studio_sdk.client import LabelStudio
-
- ls = LabelStudio(base_url=label_studio_url, api_key=api_key)
- ls.users.delete(user_id)
File without changes
File without changes
File without changes
File without changes
File without changes
File without changes
File without changes
File without changes
File without changes