labelpull 0.1.0__tar.gz → 0.1.1__tar.gz

@@ -0,0 +1,100 @@
+ Metadata-Version: 2.4
+ Name: labelpull
+ Version: 0.1.1
+ Summary: Pull the latest Labelbox annotations into a tidy, ontology-agnostic table.
+ Author-email: Wietze Suijker <wietze.suijker@gmail.com>
+ License-Expression: MIT
+ Requires-Python: >=3.10
+ Requires-Dist: typer>=0.9
+ Provides-Extra: dev
+ Requires-Dist: mypy>=1.8; extra == 'dev'
+ Requires-Dist: pytest-cov>=4.1; extra == 'dev'
+ Requires-Dist: pytest>=7.4; extra == 'dev'
+ Requires-Dist: ruff>=0.4; extra == 'dev'
+ Provides-Extra: live
+ Requires-Dist: labelbox>=7.0; extra == 'live'
+ Description-Content-Type: text/markdown
+
+ # labelpull
+
+ Pull the latest Labelbox annotations into a tidy, ontology-agnostic table.
+
+ The Labelbox SDK already exports a project's labels and streams them. What it
+ doesn't give you is a *tabular* view of that deeply nested JSON, the correctness
+ logic to pick the right label when a row was reviewed, or a workflow status that
+ is always populated. `labelpull` is exactly that thin layer on top of the SDK.
+
+ ## Quick start
+
+ ```bash
+ pip install 'labelpull[live]'
+ export LABELBOX_API_KEY=... # Labelbox → Workspace settings → API keys
+ labelpull pull <PROJECT_ID> -o labels.csv # <PROJECT_ID> is in your project's URL
+ ```
+
+ You get one row per annotation (any ontology):
+
+ ```
+ global_key,data_row_id,feature_kind,feature_name,value,workflow_status,labeled_by,created_at,parent_feature_id
+ photo_001.jpg,clz…,label,,,Done,bot@lab.org,2026-06-05T08:00:00Z,
+ photo_001.jpg,clz…,radio,Species,Ficus insipida,Done,bot@lab.org,2026-06-05T08:00:00Z,
+ photo_001.jpg,clz…,checklist,Organs,leaf;flower,Done,bot@lab.org,2026-06-05T08:00:00Z,
+ ```
+
+ > The first row of each photo (`feature_kind=label`, empty `feature_name`) is a
+ > marker that the photo *was reached and labelled* — it carries who/when even
+ > when the annotator left it blank, so empty-but-labelled photos still show up.
+ > Ignore it if you only want answers: filter to `feature_kind != "label"`.
+
+ ## Install
+
+ ```bash
+ pip install labelpull # offline parsing + CLI (no SDK)
+ pip install 'labelpull[live]' # + the Labelbox SDK, for live pulls from the API
+ ```
+
+ ## CLI
+
+ ```bash
+ labelpull pull <PROJECT_ID> -o labels.csv # everything, generic long CSV
+ labelpull pull <PROJECT_ID> --status Done # only verified rows
+ labelpull pull <PROJECT_ID> --since 2026-06-01 # only labels created since a date
+ labelpull pull <PROJECT_ID> --from-export export.ndjson # offline: a UI "Export" file, no API key
+ ```
+
+ `--status` takes `ToLabel | InReview | InRework | Done`. Every run prints a
+ summary (rows, labelled count, feature kinds, latest label timestamp).
+
+ If your project is a single-classification task and you want one row per item
+ instead of the long format, filter the CSV to your feature (e.g. keep
+ `feature_name == "Species"`), or write a 10-line `Adapter` (see below).
+
+ ## Library
+
+ ```python
+ import labelpull
+
+ rows = list(labelpull.export("proj_id", status="Done")) # live; needs labelpull[live]
+ # or, offline from a UI export:
+ # rows = labelpull.read_export_file("export.ndjson")
+
+ features = [f for r in rows for f in labelpull.flatten(r, "proj_id")]
+ labelpull.write_csv("labels.csv", labelpull.GenericAdapter(), features)
+ print(labelpull.summarize(rows, features))
+ ```
+
+ `flatten()` handles radio / checklist / text classifications and bbox / polygon /
+ line / point / mask objects (with nested classifications linked to their parent),
+ and always selects the most recently created label so a QC-reviewed row reports
+ the reviewer's answer, not the annotator's.
+
+ ## Custom output shape
+
+ `GenericAdapter` (the default) writes one row per feature. To collapse features
+ into a project-specific wide table, write an `Adapter` — given the flattened
+ `FeatureRow`s, yield your own columns. `SpeciesAdapter` is a worked example
+ (it pivots a `Taxon` radio + `Organs` checklist into one row per photo):
+
+ ```bash
+ labelpull pull <PROJECT_ID> --schema species -o taxa.csv
+ ```
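
Reviewer note: the README added in this hunk recommends filtering the long CSV to `feature_kind != "label"` when only answers are wanted, or to a single `feature_name` for one row per item. A minimal sketch of that post-processing with the standard library, using the README's own sample rows (ids are elided there as well). The helper names `answers_only` and `one_feature` are hypothetical and are not part of labelpull.

```python
import csv
import io

# Sample rows copied from the README's example output; the data_row_id is elided there too.
SAMPLE = """\
global_key,data_row_id,feature_kind,feature_name,value,workflow_status,labeled_by,created_at,parent_feature_id
photo_001.jpg,clz…,label,,,Done,bot@lab.org,2026-06-05T08:00:00Z,
photo_001.jpg,clz…,radio,Species,Ficus insipida,Done,bot@lab.org,2026-06-05T08:00:00Z,
photo_001.jpg,clz…,checklist,Organs,leaf;flower,Done,bot@lab.org,2026-06-05T08:00:00Z,
"""


def answers_only(rows):
    """Drop the per-photo marker rows (feature_kind == "label")."""
    return [r for r in rows if r["feature_kind"] != "label"]


def one_feature(rows, feature_name):
    """Map global_key -> value for a single feature, e.g. "Species"."""
    return {
        r["global_key"]: r["value"]
        for r in rows
        if r["feature_name"] == feature_name
    }


rows = list(csv.DictReader(io.StringIO(SAMPLE)))
answers = answers_only(rows)
species = one_feature(answers, "Species")
print(len(rows), len(answers), species)
```

For a real file, replace `io.StringIO(SAMPLE)` with `open("labels.csv", newline="", encoding="utf-8")`.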
@@ -0,0 +1,83 @@
+ # labelpull
+
+ Pull the latest Labelbox annotations into a tidy, ontology-agnostic table.
+
+ The Labelbox SDK already exports a project's labels and streams them. What it
+ doesn't give you is a *tabular* view of that deeply nested JSON, the correctness
+ logic to pick the right label when a row was reviewed, or a workflow status that
+ is always populated. `labelpull` is exactly that thin layer on top of the SDK.
+
+ ## Quick start
+
+ ```bash
+ pip install 'labelpull[live]'
+ export LABELBOX_API_KEY=... # Labelbox → Workspace settings → API keys
+ labelpull pull <PROJECT_ID> -o labels.csv # <PROJECT_ID> is in your project's URL
+ ```
+
+ You get one row per annotation (any ontology):
+
+ ```
+ global_key,data_row_id,feature_kind,feature_name,value,workflow_status,labeled_by,created_at,parent_feature_id
+ photo_001.jpg,clz…,label,,,Done,bot@lab.org,2026-06-05T08:00:00Z,
+ photo_001.jpg,clz…,radio,Species,Ficus insipida,Done,bot@lab.org,2026-06-05T08:00:00Z,
+ photo_001.jpg,clz…,checklist,Organs,leaf;flower,Done,bot@lab.org,2026-06-05T08:00:00Z,
+ ```
+
+ > The first row of each photo (`feature_kind=label`, empty `feature_name`) is a
+ > marker that the photo *was reached and labelled* — it carries who/when even
+ > when the annotator left it blank, so empty-but-labelled photos still show up.
+ > Ignore it if you only want answers: filter to `feature_kind != "label"`.
+
+ ## Install
+
+ ```bash
+ pip install labelpull # offline parsing + CLI (no SDK)
+ pip install 'labelpull[live]' # + the Labelbox SDK, for live pulls from the API
+ ```
+
+ ## CLI
+
+ ```bash
+ labelpull pull <PROJECT_ID> -o labels.csv # everything, generic long CSV
+ labelpull pull <PROJECT_ID> --status Done # only verified rows
+ labelpull pull <PROJECT_ID> --since 2026-06-01 # only labels created since a date
+ labelpull pull <PROJECT_ID> --from-export export.ndjson # offline: a UI "Export" file, no API key
+ ```
+
+ `--status` takes `ToLabel | InReview | InRework | Done`. Every run prints a
+ summary (rows, labelled count, feature kinds, latest label timestamp).
+
+ If your project is a single-classification task and you want one row per item
+ instead of the long format, filter the CSV to your feature (e.g. keep
+ `feature_name == "Species"`), or write a 10-line `Adapter` (see below).
+
+ ## Library
+
+ ```python
+ import labelpull
+
+ rows = list(labelpull.export("proj_id", status="Done")) # live; needs labelpull[live]
+ # or, offline from a UI export:
+ # rows = labelpull.read_export_file("export.ndjson")
+
+ features = [f for r in rows for f in labelpull.flatten(r, "proj_id")]
+ labelpull.write_csv("labels.csv", labelpull.GenericAdapter(), features)
+ print(labelpull.summarize(rows, features))
+ ```
+
+ `flatten()` handles radio / checklist / text classifications and bbox / polygon /
+ line / point / mask objects (with nested classifications linked to their parent),
+ and always selects the most recently created label so a QC-reviewed row reports
+ the reviewer's answer, not the annotator's.
+
+ ## Custom output shape
+
+ `GenericAdapter` (the default) writes one row per feature. To collapse features
+ into a project-specific wide table, write an `Adapter` — given the flattened
+ `FeatureRow`s, yield your own columns. `SpeciesAdapter` is a worked example
+ (it pivots a `Taxon` radio + `Organs` checklist into one row per photo):
+
+ ```bash
+ labelpull pull <PROJECT_ID> --schema species -o taxa.csv
+ ```
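
Reviewer note: the "Custom output shape" section above describes collapsing long-format features into one row per photo, as `SpeciesAdapter` is said to do for a `Taxon` radio and an `Organs` checklist. The diff does not show the `Adapter` interface, so the sketch below does not use it. It only illustrates the long-to-wide pivot in plain Python over dicts keyed by the long-CSV column names. `pivot_wide` and the second photo (`photo_002.jpg`) are hypothetical.

```python
def pivot_wide(rows, columns):
    """Collapse long-format rows into one dict per global_key.

    `rows` are dicts using the long-CSV column names; `columns` lists the
    feature names to turn into wide columns. Marker rows (feature_kind ==
    "label") contribute no value but still register the photo, so
    empty-but-labelled photos appear with blank columns.
    """
    wide = {}
    for r in rows:
        feats = wide.setdefault(r["global_key"], {})
        if r["feature_kind"] != "label" and r["feature_name"] in columns:
            feats[r["feature_name"]] = r["value"]
    return [
        {"global_key": key, **{c: feats.get(c, "") for c in columns}}
        for key, feats in wide.items()
    ]


# Illustrative input only; photo_002.jpg is a made-up empty-but-labelled photo.
long_rows = [
    {"global_key": "photo_001.jpg", "feature_kind": "label", "feature_name": "", "value": ""},
    {"global_key": "photo_001.jpg", "feature_kind": "radio", "feature_name": "Taxon", "value": "Ficus insipida"},
    {"global_key": "photo_001.jpg", "feature_kind": "checklist", "feature_name": "Organs", "value": "leaf;flower"},
    {"global_key": "photo_002.jpg", "feature_kind": "label", "feature_name": "", "value": ""},
]

table = pivot_wide(long_rows, ["Taxon", "Organs"])
for row in table:
    print(row)
```

Keeping the marker rows in the input is what lets the blank photo survive the pivot; filtering them out first would drop it from the wide table.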
@@ -4,7 +4,7 @@ build-backend = "hatchling.build"
 
  [project]
  name = "labelpull"
- version = "0.1.0"
+ version = "0.1.1"
  description = "Pull the latest Labelbox annotations into a tidy, ontology-agnostic table."
  readme = "README.md"
  requires-python = ">=3.10"
@@ -25,7 +25,7 @@ from labelpull.core import (
  summarize,
  )
 
- __version__ = "0.1.0"
+ __version__ = "0.1.1"
 
  __all__ = [
  "ADAPTERS",
labelpull-0.1.0/PKG-INFO DELETED
@@ -1,69 +0,0 @@
- Metadata-Version: 2.4
- Name: labelpull
- Version: 0.1.0
- Summary: Pull the latest Labelbox annotations into a tidy, ontology-agnostic table.
- Author-email: Wietze Suijker <wietze.suijker@gmail.com>
- License-Expression: MIT
- Requires-Python: >=3.10
- Requires-Dist: typer>=0.9
- Provides-Extra: dev
- Requires-Dist: mypy>=1.8; extra == 'dev'
- Requires-Dist: pytest-cov>=4.1; extra == 'dev'
- Requires-Dist: pytest>=7.4; extra == 'dev'
- Requires-Dist: ruff>=0.4; extra == 'dev'
- Provides-Extra: live
- Requires-Dist: labelbox>=7.0; extra == 'live'
- Description-Content-Type: text/markdown
-
- # labelpull
-
- Pull the latest Labelbox annotations into a tidy, ontology-agnostic table.
-
- The Labelbox SDK already exports a project's labels and streams them. What it
- doesn't give you is a *tabular* view of that deeply nested JSON, the correctness
- logic to pick the right label when a row was reviewed, or a workflow status that
- is always populated. `labelpull` is exactly that thin layer on top of the SDK.
-
- ## Install
-
- ```bash
- pip install labelpull # offline parsing + CLI
- pip install 'labelpull[live]' # + the Labelbox SDK for live pulls
- ```
-
- ## CLI
-
- ```bash
- export LABELBOX_API_KEY=...
- labelpull pull <PROJECT_ID> -o labels.csv # generic long CSV (any ontology)
- labelpull pull <PROJECT_ID> --status Done # only verified rows
- labelpull pull <PROJECT_ID> --since 2026-06-01 # only the latest labels
- labelpull pull <PROJECT_ID> --from-export export.ndjson # offline, no API key
- labelpull pull <PROJECT_ID> --schema species -o taxa.csv # speciesfirst Taxon/Organs wide CSV
- ```
-
- `--schema generic` (default) writes one row per feature — every classification
- and object, any ontology:
-
- ```
- global_key,data_row_id,feature_kind,feature_name,value,workflow_status,labeled_by,created_at,parent_feature_id
- ```
-
- ## Library
-
- ```python
- import labelpull
-
- rows = list(labelpull.export("proj_id", status="Done")) # or read_export_file("export.ndjson")
- features = [f for r in rows for f in labelpull.flatten(r, "proj_id")]
- labelpull.write_csv("labels.csv", labelpull.GenericAdapter(), features)
- print(labelpull.summarize(rows, features))
- ```
-
- `flatten()` handles radio / checklist / text classifications and bbox / polygon /
- line / point / mask objects (with nested classifications linked to their parent),
- and always selects the most recently created label so a QC-reviewed row reports
- the reviewer's answer, not the annotator's.
-
- Write your own `Adapter` to collapse features into a project-specific wide table;
- `SpeciesAdapter` is the reference implementation.
labelpull-0.1.0/README.md DELETED
@@ -1,52 +0,0 @@
- # labelpull
-
- Pull the latest Labelbox annotations into a tidy, ontology-agnostic table.
-
- The Labelbox SDK already exports a project's labels and streams them. What it
- doesn't give you is a *tabular* view of that deeply nested JSON, the correctness
- logic to pick the right label when a row was reviewed, or a workflow status that
- is always populated. `labelpull` is exactly that thin layer on top of the SDK.
-
- ## Install
-
- ```bash
- pip install labelpull # offline parsing + CLI
- pip install 'labelpull[live]' # + the Labelbox SDK for live pulls
- ```
-
- ## CLI
-
- ```bash
- export LABELBOX_API_KEY=...
- labelpull pull <PROJECT_ID> -o labels.csv # generic long CSV (any ontology)
- labelpull pull <PROJECT_ID> --status Done # only verified rows
- labelpull pull <PROJECT_ID> --since 2026-06-01 # only the latest labels
- labelpull pull <PROJECT_ID> --from-export export.ndjson # offline, no API key
- labelpull pull <PROJECT_ID> --schema species -o taxa.csv # speciesfirst Taxon/Organs wide CSV
- ```
-
- `--schema generic` (default) writes one row per feature — every classification
- and object, any ontology:
-
- ```
- global_key,data_row_id,feature_kind,feature_name,value,workflow_status,labeled_by,created_at,parent_feature_id
- ```
-
- ## Library
-
- ```python
- import labelpull
-
- rows = list(labelpull.export("proj_id", status="Done")) # or read_export_file("export.ndjson")
- features = [f for r in rows for f in labelpull.flatten(r, "proj_id")]
- labelpull.write_csv("labels.csv", labelpull.GenericAdapter(), features)
- print(labelpull.summarize(rows, features))
- ```
-
- `flatten()` handles radio / checklist / text classifications and bbox / polygon /
- line / point / mask objects (with nested classifications linked to their parent),
- and always selects the most recently created label so a QC-reviewed row reports
- the reviewer's answer, not the annotator's.
-
- Write your own `Adapter` to collapse features into a project-specific wide table;
- `SpeciesAdapter` is the reference implementation.
File without changes