hirundo 0.1.9__py3-none-any.whl → 0.1.18__py3-none-any.whl

@@ -1,6 +1,6 @@
- Metadata-Version: 2.1
+ Metadata-Version: 2.4
  Name: hirundo
- Version: 0.1.9
+ Version: 0.1.18
  Summary: This package is used to interface with Hirundo's platform. It provides a simple API to optimize your ML datasets.
  Author-email: Hirundo <dev@hirundo.io>
  License: MIT License
@@ -31,7 +31,6 @@ Requires-Dist: typer>=0.12.3
  Requires-Dist: httpx>=0.27.0
  Requires-Dist: stamina>=24.2.0
  Requires-Dist: httpx-sse>=0.4.0
- Requires-Dist: pandas>=2.2.2
  Requires-Dist: tqdm>=4.66.5
  Provides-Extra: dev
  Requires-Dist: pyyaml>=6.0.1; extra == "dev"
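In this metadata, `pandas` is dropped from the unconditional requirements and reappears further down as an optional extra, next to a new `polars` extra. How `hirundo` itself chooses a DataFrame backend is not shown in this diff. As a rough sketch only, the hypothetical helper below (`available_dataframe_backends` is not part of the package) checks which of the two libraries can be imported in the current environment:

```python
import importlib.util


def available_dataframe_backends(candidates=("pandas", "polars")):
    """Return the candidate module names that are importable here.

    Hypothetical helper for illustration; it only probes the import
    system and does not import the modules themselves.
    """
    return [
        name
        for name in candidates
        if importlib.util.find_spec(name) is not None
    ]


backends = available_dataframe_backends()
if backends:
    print("DataFrame backends found:", ", ".join(backends))
else:
    # The extras names come from the Provides-Extra lines in the metadata.
    print('None found; try: pip install "hirundo[pandas]" or "hirundo[polars]"')
```

Code that relied on `pandas` arriving transitively with 0.1.9 would presumably need one of these extras requested explicitly after upgrading.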
@@ -50,13 +49,13 @@ Requires-Dist: pytest-asyncio>=0.23.6; extra == "dev"
  Requires-Dist: uv>=0.5.8; extra == "dev"
  Requires-Dist: pre-commit>=3.7.1; extra == "dev"
  Requires-Dist: virtualenv>=20.6.6; extra == "dev"
- Requires-Dist: ruff>=0.8.2; extra == "dev"
+ Requires-Dist: ruff>=0.11.6; extra == "dev"
  Requires-Dist: bumpver; extra == "dev"
  Requires-Dist: platformdirs>=4.3.6; extra == "dev"
  Requires-Dist: safety>=3.2.13; extra == "dev"
  Provides-Extra: docs
  Requires-Dist: sphinx>=7.4.7; extra == "docs"
- Requires-Dist: sphinx-autobuild>=2024.4.16; extra == "docs"
+ Requires-Dist: sphinx-autobuild>=2024.9.3; extra == "docs"
  Requires-Dist: sphinx-click>=5.0.1; extra == "docs"
  Requires-Dist: autodoc_pydantic>=2.2.0; extra == "docs"
  Requires-Dist: furo; extra == "docs"
@@ -64,6 +63,11 @@ Requires-Dist: sphinx-multiversion; extra == "docs"
  Requires-Dist: esbonio; extra == "docs"
  Requires-Dist: starlette>0.40.0; extra == "docs"
  Requires-Dist: markupsafe>=3.0.2; extra == "docs"
+ Provides-Extra: pandas
+ Requires-Dist: pandas>=2.2.3; extra == "pandas"
+ Provides-Extra: polars
+ Requires-Dist: polars>=1.0.0; extra == "polars"
+ Dynamic: license-file
 
  # Hirundo
 
@@ -71,40 +75,62 @@ This package exposes access to Hirundo APIs for dataset optimization for Machine
 
  Dataset optimization is currently available for datasets labelled for classification and object detection.
 
-
  Support dataset storage configs include:
- - Google Cloud (GCP) Storage
- - Amazon Web Services (AWS) S3
- - Git LFS (Large File Storage) repositories (e.g. GitHub or HuggingFace)
+
+ - Google Cloud (GCP) Storage
+ - Amazon Web Services (AWS) S3
+ - Git LFS (Large File Storage) repositories (e.g. GitHub or HuggingFace)
+
+ Note: This Python package must be used alongside a Hirundo server, either the SaaS platform, a custom VPC deployment or an on-premises installation.
 
  Optimizing a classification dataset
  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
 
- Currently ``hirundo`` requires a CSV file with the following columns (all columns are required):
- - ``image_path``: The location of the image within the dataset ``root``
- - ``label``: The label of the image, i.e. which the class that was annotated for this image
+ Currently `hirundo` requires a CSV file with the following columns (all columns are required):
+
+ - `image_path`: The location of the image within the dataset `data_root_url`
+ - `class_name`: The semantic label, i.e. the class name of the class that the image was annotated as belonging to
+
+ And outputs two Pandas DataFrames with the dataset columns as well as:
+
+ Suspect DataFrame (filename: `mislabel_suspects.csv`) columns:
 
- And outputs a CSV with the same columns and:
- - ``suspect_level``: mislabel suspect level
- - ``suggested_label``: suggested label
- - ``suggested_label_conf``: suggested label confidence
+ - ``suspect_score``: mislabel suspect score
+ - ``suspect_level``: mislabel suspect level
+ - ``suspect_rank``: mislabel suspect ranking
+ - ``suggested_class_name``: suggested semantic label
+ - ``suggested_class_conf``: suggested semantic label confidence
+
+ Errors and warnings DataFrame (filename: `invalid_data.csv`) columns:
+
+ - ``status``: status message (one of ``NO_LABELS`` / ``MISSING_IMAGE`` / ``INVALID_IMAGE``)
 
  Optimizing an object detection (OD) dataset
  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
 
  Currently ``hirundo`` requires a CSV file with the following columns (all columns are required):
- - ``image_path``: The location of the image within the dataset ``root``
- - ``bbox_id``: The index of the bounding box within the dataset. Used to indicate label suspects
- - ``label``: The label of the image, i.e. which the class that was annotated for this image
- - ``x1``, ``y1``, ``x2``, ``y2``: The bounding box coordinates of the object within the image
 
- And outputs a CSV with the same columns and:
- - ``suspect_level``: object mislabel suspect level
- - ``suggested_label``: suggested object label
- - ``suggested_label_conf``: suggested object label confidence
+ - ``image_path``: The location of the image within the dataset ``data_root_url``
+ - ``object_id``: The ID of the bounding box within the dataset. Used to indicate object suspects
+ - ``class_name``: Object semantic label, i.e. the class name of the object that was annotated
+ - ``xmin``: leftmost horizontal pixel coordinate of the object's bounding box
+ - ``ymin``: uppermost vertical pixel coordinate of the object's bounding box
+ - ``xmax``: rightmost horizontal pixel coordinate of the object's bounding box
+ - ``ymax``: lowermost vertical pixel coordinate of the object's bounding box
 
- Note: This Python package must be used alongside a Hirundo server, either the SaaS platform, a custom VPC deployment or an on-premises installation.
 
+ And outputs two Pandas DataFrames with the dataset columns as well as:
+
+ Suspect DataFrame (filename: `mislabel_suspects.csv`) columns:
+
+ - ``suspect_score``: object mislabel suspect score
+ - ``suspect_level``: object mislabel suspect level
+ - ``suspect_rank``: object mislabel suspect ranking
+ - ``suggested_class_name``: suggested object semantic label
+ - ``suggested_class_conf``: suggested object semantic label confidence
+
+ Errors and warnings DataFrame (filename: `invalid_data.csv`) columns:
+ - ``status``: status message (one of ``NO_LABELS`` / ``MISSING_IMAGE`` / ``INVALID_IMAGE`` / ``INVALID_BBOX`` / ``INVALID_BBOX_SIZE``)
 
  ## Installation
 
@@ -113,6 +139,7 @@ You can install the codebase with a simple `pip install hirundo` to install the
  ## Usage
 
  Classification example:
+
  ```python
  from hirundo import (
  HirundoCSV,
@@ -148,7 +175,6 @@ results = test_dataset.check_run()
  print(results)
  ```
 
-
  Object detection example:
 
  ```python
@@ -165,7 +191,7 @@ from hirundo import (
  git_storage = StorageGit(
  repo=GitRepo(
  name="BDD-100k-validation-dataset",
- repository_url="https://git@hf.co/datasets/hirundo-io/bdd100k-validation-only.git",
+ repository_url="https://huggingface.co/datasets/hirundo-io/bdd100k-validation-only",
  ),
  branch="main",
  )
@@ -183,21 +209,6 @@ test_dataset = OptimizationDataset(
  path="/BDD100K Val from Hirundo.zip/bdd100k/bdd100k.csv"
  ),
  ),
- classes=[
- "traffic light",
- "traffic sign",
- "car",
- "pedestrian",
- "bus",
- "truck",
- "rider",
- "bicycle",
- "motorcycle",
- "train",
- "other vehicle",
- "other person",
- "trailer",
- ],
  )
 
  test_dataset.run_optimization()
@@ -205,8 +216,8 @@ results = test_dataset.check_run()
  print(results)
  ```
 
- Note: Currently we only support the main CPython release 3.9, 3.10 and 3.11. PyPy support may be introduced in the future.
+ Note: Currently we only support the main CPython release 3.9, 3.10, 3.11, 3.12 & 3.13. PyPy support may be introduced in the future.
 
  ## Further documentation
 
- To learn more about how to use this library, please visit the [http://docs.hirundo.io/](documentation) or see the Google Colab examples.
+ To learn more about how to use this library, please visit the [http://docs.hirundo.io/](documentation) or see the [Google Colab examples](https://github.com/Hirundo-io/hirundo-client/tree/main/notebooks).
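The updated README describes `mislabel_suspects.csv` only by its column names. As a rough sketch of how such a file might be triaged using just the standard library, the snippet below parses a tiny in-memory CSV and returns the most suspicious rows first. The sample rows and all of their values are invented for illustration. Treating `suspect_rank` as an integer where 1 means "most suspect" is an assumption, not something the README states, and `top_suspects` is a hypothetical helper rather than part of `hirundo`.

```python
import csv
import io

# Invented sample rows; only the column names come from the README.
SAMPLE = """image_path,class_name,suspect_score,suspect_level,suspect_rank,suggested_class_name,suggested_class_conf
a.jpg,cat,0.91,0.8,2,dog,0.88
b.jpg,dog,0.97,0.9,1,cat,0.93
c.jpg,cat,0.12,0.1,3,cat,0.99
"""


def top_suspects(csv_text, limit):
    """Return up to `limit` rows ordered by ascending suspect_rank.

    Assumes suspect_rank parses as an integer and that rank 1 is the
    most suspect row; neither is confirmed by the README.
    """
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    rows.sort(key=lambda row: int(row["suspect_rank"]))
    return rows[:limit]


for row in top_suspects(SAMPLE, 2):
    print(row["image_path"], row["class_name"], "->", row["suggested_class_name"])
```

The same ordering could be done with `pandas` or `polars` once the matching extra is installed; the standard-library version is used here only to keep the sketch free of dependencies.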
@@ -0,0 +1,25 @@
+ hirundo/__init__.py,sha256=1Uy9UZhaZPQQSMfAOJ0A_Of70tM8_MDq-HHdhrmpO6g,1301
+ hirundo/__main__.py,sha256=wcCrL4PjG51r5wVKqJhcoJPTLfHW0wNbD31DrUN0MWI,28
+ hirundo/_constraints.py,sha256=tgJfvp7ydyXilT8ViNk837rNRlpOVXLLeCSMt_YUUYA,6013
+ hirundo/_dataframe.py,sha256=sXEEbCNcLi83wyU9ii884YikCzfASo_3nnrDxhuCv7U,758
+ hirundo/_env.py,sha256=efX2sjvYlHkFr2Lcstelei67YSTFpVGT0l08ZsfiMuE,622
+ hirundo/_headers.py,sha256=3hybpD_X4SODv3cFZPt9AjGY2vvZaag5OKT3z1SHSjA,521
+ hirundo/_http.py,sha256=izlnuxStyPugjTAbD8Lo30tA4lZJ5d3kOENNduqrbX4,573
+ hirundo/_iter_sse_retrying.py,sha256=U331_wZRIbVzi-jnMqo8bp9jBC8MtFBLEs-X0ZvhSDw,4634
+ hirundo/_timeouts.py,sha256=gE58NU0t2e4KgKq2sk5rZcezDJAkgvRIbM5AVYFY6Ho,86
+ hirundo/_urls.py,sha256=0C85EbL0T-Bj25vJwjNs_obUG8ROSADpmbFdTAyhzlw,1375
+ hirundo/cli.py,sha256=5Tn0eXZGG92BR9HJYUaYozjFbS1t6UTw_I2R0tZBE04,7824
+ hirundo/dataset_enum.py,sha256=QnS3fy1OF4wvUtiIAHubKRhc611idS8huopEEolgqEM,1217
+ hirundo/dataset_optimization.py,sha256=fXi8MeI0PWwSyc5NuOzCrkgXT_mz24NV-dGOHDPkBR0,31256
+ hirundo/dataset_optimization_results.py,sha256=A9YyF5zaZXVtzeDE08I_05v90dhZQADpSjDcS_6eLMc,1129
+ hirundo/git.py,sha256=8LVnF4WCjZsxMHoRaVxbLiDAKpGCBEwlcZp7a30n9Zo,6573
+ hirundo/labeling.py,sha256=zXQCaqfdaLIG4qbzFGbb94L3FDdRMpdzHwbrDJE07Yk,5006
+ hirundo/logger.py,sha256=MUqrYp0fBlxWFhGl6P5t19_uqO7T_PNhrLN5bqY3i7s,275
+ hirundo/storage.py,sha256=y7cr_dngkfZq0gKnwWxrSqUXb1SycGpwFRVmS9Cn3h8,15942
+ hirundo/unzip.py,sha256=XJqvt2m5pWR-G-fnzgW75VOdd-K4_Rw2r4wiEhZgKZA,8245
+ hirundo-0.1.18.dist-info/licenses/LICENSE,sha256=fusGGjqT2RGlU6kbkaOk7d-gDnsjk17wq67AO0mwBZI,1065
+ hirundo-0.1.18.dist-info/METADATA,sha256=F_F0-EfUxVVCcgFue_hwCtxfIfmqBlwnpvzELuhMkAc,9302
+ hirundo-0.1.18.dist-info/WHEEL,sha256=_zCd3N1l69ArxyTb8rzEoP9TpbYXkqRFSNOD5OuxnTs,91
+ hirundo-0.1.18.dist-info/entry_points.txt,sha256=4ZtnA_Nl1Af8fLnHp3lwjbGDEGU1S6ujb_JwtuQ7ZPM,44
+ hirundo-0.1.18.dist-info/top_level.txt,sha256=cmyNqrNZOAYxnywJGFI1AJBLe4SkH8HGsfFx6ncdrbI,8
+ hirundo-0.1.18.dist-info/RECORD,,
@@ -1,5 +1,5 @@
  Wheel-Version: 1.0
- Generator: setuptools (75.6.0)
+ Generator: setuptools (80.9.0)
  Root-Is-Purelib: true
  Tag: py3-none-any
 
hirundo/enum.py DELETED
@@ -1,23 +0,0 @@
- from enum import Enum
-
-
- class LabelingType(str, Enum):
- """
- Enum indicate what type of labeling is used for the given dataset.
- Supported types are:
- """
-
- SINGLE_LABEL_CLASSIFICATION = "SingleLabelClassification"
- OBJECT_DETECTION = "ObjectDetection"
- SPEECH_TO_TEXT = "SpeechToText"
-
-
- class DatasetMetadataType(str, Enum):
- """
- Enum indicate what type of metadata is provided for the given dataset.
- Supported types are:
- """
-
- HIRUNDO_CSV = "HirundoCSV"
- COCO = "COCO"
- YOLO = "YOLO"
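The deleted module declared both enums with a `str` mixin. A minimal sketch of what that pattern provides follows, re-declaring `LabelingType` locally so the snippet runs without `hirundo` installed. This diff lists a new `hirundo/dataset_enum.py` but does not show its contents, so where these enums live in 0.1.18 is not asserted here.

```python
import json
from enum import Enum


class LabelingType(str, Enum):
    """Local copy of the removed enum, for illustration only."""

    SINGLE_LABEL_CLASSIFICATION = "SingleLabelClassification"
    OBJECT_DETECTION = "ObjectDetection"
    SPEECH_TO_TEXT = "SpeechToText"


# Members compare equal to their plain string values.
is_equal = LabelingType.OBJECT_DETECTION == "ObjectDetection"

# A member can be looked up from a string, e.g. one read from a response.
parsed = LabelingType("SpeechToText")

# Because members are str instances, json serializes them as plain strings
# without a custom encoder.
payload = json.dumps({"labeling_type": LabelingType.OBJECT_DETECTION})

print(is_equal, parsed.name, payload)
```

Any code that still does `from hirundo.enum import LabelingType` will fail against 0.1.18, since the module no longer ships in the wheel.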
@@ -1,20 +0,0 @@
- hirundo/__init__.py,sha256=U_wcm3e0r1T66OQ7KHlWaOiwlPxf6e4RkTxA5uvaOOA,781
- hirundo/__main__.py,sha256=wcCrL4PjG51r5wVKqJhcoJPTLfHW0wNbD31DrUN0MWI,28
- hirundo/_constraints.py,sha256=gRv7fXwtjPGqYWIhkVYxu1B__3PdlYRqFyDkTpa9f74,1032
- hirundo/_env.py,sha256=dXUFPeEL1zPe-eBdWD4_WZvlgiY2cpWuVDzf41Qjuto,609
- hirundo/_headers.py,sha256=ggTyBwVT3nGyPidCcmYMX6pv0idzMxCI2S1BJQE-Bbs,253
- hirundo/_http.py,sha256=izlnuxStyPugjTAbD8Lo30tA4lZJ5d3kOENNduqrbX4,573
- hirundo/_iter_sse_retrying.py,sha256=U331_wZRIbVzi-jnMqo8bp9jBC8MtFBLEs-X0ZvhSDw,4634
- hirundo/_timeouts.py,sha256=IfX8-mrLp809-A_xSLv1DhIqZnO-Qvy4FcTtOtvqLog,42
- hirundo/cli.py,sha256=4-pdV483zqRJl8d-R9p_9YOGlehOnoMJzb3XAAdPRb0,6634
- hirundo/dataset_optimization.py,sha256=CuSrauzXiSa4kGBREao3nn-vmLVwMKTeHM7yEXesuso,33756
- hirundo/enum.py,sha256=ZEYBP-lrlVqfNWptlmw7JgLNhCyDirtWWPtoMvtg2AE,531
- hirundo/git.py,sha256=zzpEHGqoQXwOBQzNSmyf5lpUMc2FbomPqiokwMc4M8o,6777
- hirundo/logger.py,sha256=MUqrYp0fBlxWFhGl6P5t19_uqO7T_PNhrLN5bqY3i7s,275
- hirundo/storage.py,sha256=RsEmtbn79_iCY7pE1AKcBoAEqzXNkOc_UPUTaxSE0BM,16075
- hirundo-0.1.9.dist-info/LICENSE,sha256=fusGGjqT2RGlU6kbkaOk7d-gDnsjk17wq67AO0mwBZI,1065
- hirundo-0.1.9.dist-info/METADATA,sha256=8jjs7OGtVZZwFmyfdFGoTxC-de-1V6OLFJW26pYOB2E,8363
- hirundo-0.1.9.dist-info/WHEEL,sha256=PZUExdf71Ui_so67QXpySuHtCi3-J3wvF4ORK6k_S8U,91
- hirundo-0.1.9.dist-info/entry_points.txt,sha256=4ZtnA_Nl1Af8fLnHp3lwjbGDEGU1S6ujb_JwtuQ7ZPM,44
- hirundo-0.1.9.dist-info/top_level.txt,sha256=cmyNqrNZOAYxnywJGFI1AJBLe4SkH8HGsfFx6ncdrbI,8
- hirundo-0.1.9.dist-info/RECORD,,
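Each RECORD row in the hunks above has the form `path,sha256=<digest>,size`. Per the wheel RECORD format, the digest is the URL-safe base64 encoding of the raw SHA-256 hash with trailing `=` padding removed, and the RECORD file lists itself with empty hash and size fields. A minimal sketch of recomputing and checking such an entry follows; `record_digest` and `check_record_line` are hypothetical helper names, not part of `hirundo` or `pip`.

```python
import base64
import csv
import hashlib


def record_digest(data: bytes) -> str:
    """Return the RECORD-style digest string for `data`."""
    raw = hashlib.sha256(data).digest()
    encoded = base64.urlsafe_b64encode(raw).rstrip(b"=").decode("ascii")
    return "sha256=" + encoded


def check_record_line(line: str, data: bytes) -> bool:
    """Check one RECORD row against the file contents `data`."""
    path, digest, size = next(csv.reader([line]))
    if not digest:
        # RECORD's own row carries no hash or size, so there is nothing to check.
        return True
    return digest == record_digest(data) and int(size) == len(data)


# Build a row for a three-byte file and verify it round-trips.
content = b"abc"
row = "example.py," + record_digest(content) + "," + str(len(content))
print(row)
print(check_record_line(row, content))
```

Identical digests across the two versions (for example `hirundo/_http.py` and `hirundo/logger.py`) indicate files whose bytes did not change between 0.1.9 and 0.1.18.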