PyPI: labelpull 0.1.0.tar.gz → 0.1.1.tar.gz

Files changed (18)

labelpull-0.1.1/PKG-INFO ADDED

@@ -0,0 +1,100 @@
+Metadata-Version: 2.4
+Name: labelpull
+Version: 0.1.1
+Summary: Pull the latest Labelbox annotations into a tidy, ontology-agnostic table.
+Author-email: Wietze Suijker <wietze.suijker@gmail.com>
+License-Expression: MIT
+Requires-Python: >=3.10
+Requires-Dist: typer>=0.9
+Provides-Extra: dev
+Requires-Dist: mypy>=1.8; extra == 'dev'
+Requires-Dist: pytest-cov>=4.1; extra == 'dev'
+Requires-Dist: pytest>=7.4; extra == 'dev'
+Requires-Dist: ruff>=0.4; extra == 'dev'
+Provides-Extra: live
+Requires-Dist: labelbox>=7.0; extra == 'live'
+Description-Content-Type: text/markdown
+# labelpull
+Pull the latest Labelbox annotations into a tidy, ontology-agnostic table.
+The Labelbox SDK already exports a project's labels and streams them. What it
+doesn't give you is a *tabular* view of that deeply nested JSON, the correctness
+logic to pick the right label when a row was reviewed, or a workflow status that
+is always populated. `labelpull` is exactly that thin layer on top of the SDK.
+## Quick start
+```bash
+pip install 'labelpull[live]'
+export LABELBOX_API_KEY=...                 # Labelbox → Workspace settings → API keys
+labelpull pull <PROJECT_ID> -o labels.csv   # <PROJECT_ID> is in your project's URL
+```
+You get one row per annotation (any ontology):
+```
+global_key,data_row_id,feature_kind,feature_name,value,workflow_status,labeled_by,created_at,parent_feature_id
+photo_001.jpg,clz…,label,,,Done,bot@lab.org,2026-06-05T08:00:00Z,
+photo_001.jpg,clz…,radio,Species,Ficus insipida,Done,bot@lab.org,2026-06-05T08:00:00Z,
+photo_001.jpg,clz…,checklist,Organs,leaf;flower,Done,bot@lab.org,2026-06-05T08:00:00Z,
+```
+> The first row of each photo (`feature_kind=label`, empty `feature_name`) is a
+> marker that the photo *was reached and labelled* — it carries who/when even
+> when the annotator left it blank, so empty-but-labelled photos still show up.
+> Ignore it if you only want answers: filter to `feature_kind != "label"`.
+## Install
+```bash
+pip install labelpull          # offline parsing + CLI (no SDK)
+pip install 'labelpull[live]'  # + the Labelbox SDK, for live pulls from the API
+```
+## CLI
+```bash
+labelpull pull <PROJECT_ID> -o labels.csv               # everything, generic long CSV
+labelpull pull <PROJECT_ID> --status Done               # only verified rows
+labelpull pull <PROJECT_ID> --since 2026-06-01          # only labels created since a date
+labelpull pull <PROJECT_ID> --from-export export.ndjson # offline: a UI "Export" file, no API key
+```
+`--status` takes `ToLabel | InReview | InRework | Done`. Every run prints a
+summary (rows, labelled count, feature kinds, latest label timestamp).
+If your project is a single-classification task and you want one row per item
+instead of the long format, filter the CSV to your feature (e.g. keep
+`feature_name == "Species"`), or write a 10-line `Adapter` (see below).
+## Library
+```python
+import labelpull
+rows = list(labelpull.export("proj_id", status="Done"))   # live; needs labelpull[live]
+# or, offline from a UI export:
+# rows = labelpull.read_export_file("export.ndjson")
+features = [f for r in rows for f in labelpull.flatten(r, "proj_id")]
+labelpull.write_csv("labels.csv", labelpull.GenericAdapter(), features)
+print(labelpull.summarize(rows, features))
+```
+`flatten()` handles radio / checklist / text classifications and bbox / polygon /
+line / point / mask objects (with nested classifications linked to their parent),
+and always selects the most recently created label so a QC-reviewed row reports
+the reviewer's answer, not the annotator's.
+## Custom output shape
+`GenericAdapter` (the default) writes one row per feature. To collapse features
+into a project-specific wide table, write an `Adapter` — given the flattened
+`FeatureRow`s, yield your own columns. `SpeciesAdapter` is a worked example
+(it pivots a `Taxon` radio + `Organs` checklist into one row per photo):
+```bash
+labelpull pull <PROJECT_ID> --schema species -o taxa.csv
+```
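
Note on the long format: the README's advice to drop the `feature_kind=label` marker rows, or to keep a single feature, needs nothing beyond the standard library. The following is a minimal sketch, not code from the package. The sample rows are copied from the Quick start output (the elided `clz…` IDs are left as they appear there), and `answer_rows` is a hypothetical helper name.

```python
import csv
import io

# Sample copied from the README's Quick start output; the "clz…" IDs are elided there too.
SAMPLE = """\
global_key,data_row_id,feature_kind,feature_name,value,workflow_status,labeled_by,created_at,parent_feature_id
photo_001.jpg,clz…,label,,,Done,bot@lab.org,2026-06-05T08:00:00Z,
photo_001.jpg,clz…,radio,Species,Ficus insipida,Done,bot@lab.org,2026-06-05T08:00:00Z,
photo_001.jpg,clz…,checklist,Organs,leaf;flower,Done,bot@lab.org,2026-06-05T08:00:00Z,
"""


def answer_rows(lines, feature_name=None):
    """Yield rows that carry an answer, optionally restricted to one feature.

    Hypothetical helper: skips the per-photo marker rows (feature_kind == "label").
    """
    for row in csv.DictReader(lines):
        if row["feature_kind"] == "label":
            continue  # marker row: records who/when, carries no answer
        if feature_name is not None and row["feature_name"] != feature_name:
            continue
        yield row


answers = list(answer_rows(io.StringIO(SAMPLE)))
species = list(answer_rows(io.StringIO(SAMPLE), feature_name="Species"))
print(len(answers), species[0]["value"])
```

For a real file, pass an open file handle (`open("labels.csv", newline="")`) instead of the `StringIO` sample.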

labelpull-0.1.1/README.md ADDED

@@ -0,0 +1,83 @@
+# labelpull
+Pull the latest Labelbox annotations into a tidy, ontology-agnostic table.
+The Labelbox SDK already exports a project's labels and streams them. What it
+doesn't give you is a *tabular* view of that deeply nested JSON, the correctness
+logic to pick the right label when a row was reviewed, or a workflow status that
+is always populated. `labelpull` is exactly that thin layer on top of the SDK.
+## Quick start
+```bash
+pip install 'labelpull[live]'
+export LABELBOX_API_KEY=...                 # Labelbox → Workspace settings → API keys
+labelpull pull <PROJECT_ID> -o labels.csv   # <PROJECT_ID> is in your project's URL
+```
+You get one row per annotation (any ontology):
+```
+global_key,data_row_id,feature_kind,feature_name,value,workflow_status,labeled_by,created_at,parent_feature_id
+photo_001.jpg,clz…,label,,,Done,bot@lab.org,2026-06-05T08:00:00Z,
+photo_001.jpg,clz…,radio,Species,Ficus insipida,Done,bot@lab.org,2026-06-05T08:00:00Z,
+photo_001.jpg,clz…,checklist,Organs,leaf;flower,Done,bot@lab.org,2026-06-05T08:00:00Z,
+```
+> The first row of each photo (`feature_kind=label`, empty `feature_name`) is a
+> marker that the photo *was reached and labelled* — it carries who/when even
+> when the annotator left it blank, so empty-but-labelled photos still show up.
+> Ignore it if you only want answers: filter to `feature_kind != "label"`.
+## Install
+```bash
+pip install labelpull          # offline parsing + CLI (no SDK)
+pip install 'labelpull[live]'  # + the Labelbox SDK, for live pulls from the API
+```
+## CLI
+```bash
+labelpull pull <PROJECT_ID> -o labels.csv               # everything, generic long CSV
+labelpull pull <PROJECT_ID> --status Done               # only verified rows
+labelpull pull <PROJECT_ID> --since 2026-06-01          # only labels created since a date
+labelpull pull <PROJECT_ID> --from-export export.ndjson # offline: a UI "Export" file, no API key
+```
+`--status` takes `ToLabel | InReview | InRework | Done`. Every run prints a
+summary (rows, labelled count, feature kinds, latest label timestamp).
+If your project is a single-classification task and you want one row per item
+instead of the long format, filter the CSV to your feature (e.g. keep
+`feature_name == "Species"`), or write a 10-line `Adapter` (see below).
+## Library
+```python
+import labelpull
+rows = list(labelpull.export("proj_id", status="Done"))   # live; needs labelpull[live]
+# or, offline from a UI export:
+# rows = labelpull.read_export_file("export.ndjson")
+features = [f for r in rows for f in labelpull.flatten(r, "proj_id")]
+labelpull.write_csv("labels.csv", labelpull.GenericAdapter(), features)
+print(labelpull.summarize(rows, features))
+```
+`flatten()` handles radio / checklist / text classifications and bbox / polygon /
+line / point / mask objects (with nested classifications linked to their parent),
+and always selects the most recently created label so a QC-reviewed row reports
+the reviewer's answer, not the annotator's.
+## Custom output shape
+`GenericAdapter` (the default) writes one row per feature. To collapse features
+into a project-specific wide table, write an `Adapter` — given the flattened
+`FeatureRow`s, yield your own columns. `SpeciesAdapter` is a worked example
+(it pivots a `Taxon` radio + `Organs` checklist into one row per photo):
+```bash
+labelpull pull <PROJECT_ID> --schema species -o taxa.csv
+```
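
Note on the wide format: the `Adapter` interface itself does not appear in this diff, so the sketch below deliberately avoids it. It only illustrates the long-to-wide pivot the README describes, using plain dictionaries whose keys match the generic CSV columns. `pivot_wide` and the sample rows are hypothetical, chosen to mirror the `Taxon` + `Organs` example.

```python
def pivot_wide(feature_rows, columns):
    """Collapse long-format rows into one dict per global_key.

    `columns` lists the feature names to keep; each becomes an output column.
    Marker rows and unlisted features are ignored. If a feature repeats for
    the same item, the last value seen wins.
    """
    wide = {}
    for row in feature_rows:
        # setdefault registers the item even if none of its features match,
        # so empty-but-labelled items still get a row.
        values = wide.setdefault(row["global_key"], {})
        if row["feature_name"] in columns:
            values[row["feature_name"]] = row["value"]
    return [
        {"global_key": key, **{c: values.get(c, "") for c in columns}}
        for key, values in wide.items()
    ]


long_rows = [
    {"global_key": "photo_001.jpg", "feature_kind": "label", "feature_name": "", "value": ""},
    {"global_key": "photo_001.jpg", "feature_kind": "radio", "feature_name": "Taxon", "value": "Ficus insipida"},
    {"global_key": "photo_001.jpg", "feature_kind": "checklist", "feature_name": "Organs", "value": "leaf;flower"},
    {"global_key": "photo_002.jpg", "feature_kind": "label", "feature_name": "", "value": ""},
]

table = pivot_wide(long_rows, ["Taxon", "Organs"])
for record in table:
    print(record)
```

Keeping the marker rows in the input is what lets `photo_002.jpg` appear with empty columns rather than vanish from the table.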

{labelpull-0.1.0 → labelpull-0.1.1}/pyproject.toml RENAMED

@@ -4,7 +4,7 @@ build-backend = "hatchling.build"
 [project]
 name = "labelpull"
-version = "0.1.0"
+version = "0.1.1"
 description = "Pull the latest Labelbox annotations into a tidy, ontology-agnostic table."
 readme = "README.md"
 requires-python = ">=3.10"

{labelpull-0.1.0 → labelpull-0.1.1}/src/labelpull/__init__.py RENAMED

@@ -25,7 +25,7 @@ from labelpull.core import (
     summarize,
 )
-__version__ = "0.1.0"
+__version__ = "0.1.1"
 __all__ = [
     "ADAPTERS",

labelpull-0.1.0/PKG-INFO DELETED

@@ -1,69 +0,0 @@
-Metadata-Version: 2.4
-Name: labelpull
-Version: 0.1.0
-Summary: Pull the latest Labelbox annotations into a tidy, ontology-agnostic table.
-Author-email: Wietze Suijker <wietze.suijker@gmail.com>
-License-Expression: MIT
-Requires-Python: >=3.10
-Requires-Dist: typer>=0.9
-Provides-Extra: dev
-Requires-Dist: mypy>=1.8; extra == 'dev'
-Requires-Dist: pytest-cov>=4.1; extra == 'dev'
-Requires-Dist: pytest>=7.4; extra == 'dev'
-Requires-Dist: ruff>=0.4; extra == 'dev'
-Provides-Extra: live
-Requires-Dist: labelbox>=7.0; extra == 'live'
-Description-Content-Type: text/markdown
-# labelpull
-Pull the latest Labelbox annotations into a tidy, ontology-agnostic table.
-The Labelbox SDK already exports a project's labels and streams them. What it
-doesn't give you is a *tabular* view of that deeply nested JSON, the correctness
-logic to pick the right label when a row was reviewed, or a workflow status that
-is always populated. `labelpull` is exactly that thin layer on top of the SDK.
-## Install
-```bash
-pip install labelpull            # offline parsing + CLI
-pip install 'labelpull[live]'    # + the Labelbox SDK for live pulls
-```
-## CLI
-```bash
-export LABELBOX_API_KEY=...
-labelpull pull <PROJECT_ID> -o labels.csv               # generic long CSV (any ontology)
-labelpull pull <PROJECT_ID> --status Done               # only verified rows
-labelpull pull <PROJECT_ID> --since 2026-06-01          # only the latest labels
-labelpull pull <PROJECT_ID> --from-export export.ndjson # offline, no API key
-labelpull pull <PROJECT_ID> --schema species -o taxa.csv # speciesfirst Taxon/Organs wide CSV
-```
-`--schema generic` (default) writes one row per feature — every classification
-and object, any ontology:
-```
-global_key,data_row_id,feature_kind,feature_name,value,workflow_status,labeled_by,created_at,parent_feature_id
-```
-## Library
-```python
-import labelpull
-rows = list(labelpull.export("proj_id", status="Done"))   # or read_export_file("export.ndjson")
-features = [f for r in rows for f in labelpull.flatten(r, "proj_id")]
-labelpull.write_csv("labels.csv", labelpull.GenericAdapter(), features)
-print(labelpull.summarize(rows, features))
-```
-`flatten()` handles radio / checklist / text classifications and bbox / polygon /
-line / point / mask objects (with nested classifications linked to their parent),
-and always selects the most recently created label so a QC-reviewed row reports
-the reviewer's answer, not the annotator's.
-Write your own `Adapter` to collapse features into a project-specific wide table;
-`SpeciesAdapter` is the reference implementation.

labelpull-0.1.0/README.md DELETED

@@ -1,52 +0,0 @@
-# labelpull
-Pull the latest Labelbox annotations into a tidy, ontology-agnostic table.
-The Labelbox SDK already exports a project's labels and streams them. What it
-doesn't give you is a *tabular* view of that deeply nested JSON, the correctness
-logic to pick the right label when a row was reviewed, or a workflow status that
-is always populated. `labelpull` is exactly that thin layer on top of the SDK.
-## Install
-```bash
-pip install labelpull            # offline parsing + CLI
-pip install 'labelpull[live]'    # + the Labelbox SDK for live pulls
-```
-## CLI
-```bash
-export LABELBOX_API_KEY=...
-labelpull pull <PROJECT_ID> -o labels.csv               # generic long CSV (any ontology)
-labelpull pull <PROJECT_ID> --status Done               # only verified rows
-labelpull pull <PROJECT_ID> --since 2026-06-01          # only the latest labels
-labelpull pull <PROJECT_ID> --from-export export.ndjson # offline, no API key
-labelpull pull <PROJECT_ID> --schema species -o taxa.csv # speciesfirst Taxon/Organs wide CSV
-```
-`--schema generic` (default) writes one row per feature — every classification
-and object, any ontology:
-```
-global_key,data_row_id,feature_kind,feature_name,value,workflow_status,labeled_by,created_at,parent_feature_id
-```
-## Library
-```python
-import labelpull
-rows = list(labelpull.export("proj_id", status="Done"))   # or read_export_file("export.ndjson")
-features = [f for r in rows for f in labelpull.flatten(r, "proj_id")]
-labelpull.write_csv("labels.csv", labelpull.GenericAdapter(), features)
-print(labelpull.summarize(rows, features))
-```
-`flatten()` handles radio / checklist / text classifications and bbox / polygon /
-line / point / mask objects (with nested classifications linked to their parent),
-and always selects the most recently created label so a QC-reviewed row reports
-the reviewer's answer, not the annotator's.
-Write your own `Adapter` to collapse features into a project-specific wide table;
-`SpeciesAdapter` is the reference implementation.
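
Note on label selection: both versions of the README state that `flatten()` always selects the most recently created label, so a reviewed row reports the reviewer's answer. How the package does this is not visible in this diff. As an illustration only, the rule could be sketched as below; the `labels` structure, its `created_at`, `labeled_by`, and `value` keys, and the helper names are assumptions, not the package's or the Labelbox SDK's actual schema.

```python
from datetime import datetime


def parse_ts(ts: str) -> datetime:
    """Parse an ISO 8601 timestamp.

    The trailing "Z" is rewritten to "+00:00" because
    datetime.fromisoformat() only accepts "Z" from Python 3.11 on.
    """
    return datetime.fromisoformat(ts.replace("Z", "+00:00"))


def latest_label(labels):
    """Return the label with the newest created_at, or None for an empty list."""
    return max(labels, key=lambda label: parse_ts(label["created_at"]), default=None)


# Hypothetical data row with an annotator's label and a later reviewer's label.
labels = [
    {"labeled_by": "annotator@lab.org", "created_at": "2026-06-05T08:00:00Z", "value": "Ficus maxima"},
    {"labeled_by": "reviewer@lab.org", "created_at": "2026-06-06T09:30:00Z", "value": "Ficus insipida"},
]

chosen = latest_label(labels)
print(chosen["labeled_by"], chosen["value"])
```

Comparing parsed timestamps rather than raw strings keeps the ordering correct if the timestamps ever carry different UTC offsets.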