hafnia 0.1.27__tar.gz → 0.2.1__tar.gz
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- {hafnia-0.1.27 → hafnia-0.2.1}/.github/workflows/ci_cd.yaml +1 -1
- {hafnia-0.1.27 → hafnia-0.2.1}/.github/workflows/publish_docker.yaml +1 -1
- {hafnia-0.1.27 → hafnia-0.2.1}/.vscode/launch.json +2 -2
- {hafnia-0.1.27 → hafnia-0.2.1}/.vscode/settings.json +10 -2
- {hafnia-0.1.27 → hafnia-0.2.1}/PKG-INFO +209 -99
- hafnia-0.2.1/README.md +447 -0
- {hafnia-0.1.27 → hafnia-0.2.1}/docs/cli.md +3 -9
- hafnia-0.2.1/examples/example_dataset_recipe.py +165 -0
- hafnia-0.2.1/examples/example_hafnia_dataset.py +129 -0
- {hafnia-0.1.27 → hafnia-0.2.1}/examples/example_torchvision_dataloader.py +14 -8
- {hafnia-0.1.27 → hafnia-0.2.1}/pyproject.toml +7 -2
- {hafnia-0.1.27 → hafnia-0.2.1}/src/cli/__main__.py +2 -2
- {hafnia-0.1.27 → hafnia-0.2.1}/src/cli/config.py +17 -4
- hafnia-0.2.1/src/cli/dataset_cmds.py +60 -0
- {hafnia-0.1.27 → hafnia-0.2.1}/src/cli/runc_cmds.py +1 -1
- hafnia-0.2.1/src/hafnia/data/__init__.py +3 -0
- hafnia-0.2.1/src/hafnia/data/factory.py +23 -0
- hafnia-0.2.1/src/hafnia/dataset/dataset_helpers.py +91 -0
- hafnia-0.2.1/src/hafnia/dataset/dataset_names.py +72 -0
- hafnia-0.2.1/src/hafnia/dataset/dataset_recipe/dataset_recipe.py +327 -0
- hafnia-0.2.1/src/hafnia/dataset/dataset_recipe/recipe_transforms.py +53 -0
- hafnia-0.2.1/src/hafnia/dataset/dataset_recipe/recipe_types.py +140 -0
- hafnia-0.2.1/src/hafnia/dataset/dataset_upload_helper.py +468 -0
- hafnia-0.2.1/src/hafnia/dataset/hafnia_dataset.py +624 -0
- hafnia-0.2.1/src/hafnia/dataset/operations/dataset_stats.py +15 -0
- hafnia-0.2.1/src/hafnia/dataset/operations/dataset_transformations.py +82 -0
- hafnia-0.2.1/src/hafnia/dataset/operations/table_transformations.py +183 -0
- hafnia-0.2.1/src/hafnia/dataset/primitives/__init__.py +16 -0
- hafnia-0.2.1/src/hafnia/dataset/primitives/bbox.py +137 -0
- hafnia-0.2.1/src/hafnia/dataset/primitives/bitmask.py +182 -0
- hafnia-0.2.1/src/hafnia/dataset/primitives/classification.py +56 -0
- hafnia-0.2.1/src/hafnia/dataset/primitives/point.py +25 -0
- hafnia-0.2.1/src/hafnia/dataset/primitives/polygon.py +100 -0
- hafnia-0.2.1/src/hafnia/dataset/primitives/primitive.py +44 -0
- hafnia-0.2.1/src/hafnia/dataset/primitives/segmentation.py +51 -0
- hafnia-0.2.1/src/hafnia/dataset/primitives/utils.py +51 -0
- {hafnia-0.1.27 → hafnia-0.2.1}/src/hafnia/experiment/hafnia_logger.py +7 -7
- hafnia-0.2.1/src/hafnia/helper_testing.py +108 -0
- {hafnia-0.1.27 → hafnia-0.2.1}/src/hafnia/http.py +5 -3
- {hafnia-0.1.27 → hafnia-0.2.1}/src/hafnia/platform/__init__.py +2 -2
- hafnia-0.2.1/src/hafnia/platform/datasets.py +197 -0
- {hafnia-0.1.27 → hafnia-0.2.1}/src/hafnia/platform/download.py +85 -23
- hafnia-0.2.1/src/hafnia/torch_helpers.py +255 -0
- {hafnia-0.1.27 → hafnia-0.2.1}/src/hafnia/utils.py +21 -2
- hafnia-0.2.1/src/hafnia/visualizations/colors.py +267 -0
- hafnia-0.2.1/src/hafnia/visualizations/image_visualizations.py +202 -0
- hafnia-0.2.1/tests/conftest.py +81 -0
- hafnia-0.2.1/tests/data/expected_images/test_samples/test_check_dataset[caltech-101].png +0 -0
- hafnia-0.2.1/tests/data/expected_images/test_samples/test_check_dataset[caltech-256].png +0 -0
- hafnia-0.2.1/tests/data/expected_images/test_samples/test_check_dataset[cifar100].png +0 -0
- hafnia-0.2.1/tests/data/expected_images/test_samples/test_check_dataset[cifar10].png +0 -0
- hafnia-0.2.1/tests/data/expected_images/test_samples/test_check_dataset[coco-2017].png +0 -0
- hafnia-0.2.1/tests/data/expected_images/test_samples/test_check_dataset[midwest-vehicle-detection].png +0 -0
- hafnia-0.2.1/tests/data/expected_images/test_samples/test_check_dataset[mnist].png +0 -0
- hafnia-0.2.1/tests/data/expected_images/test_samples/test_check_dataset[tiny-dataset].png +0 -0
- hafnia-0.2.1/tests/data/expected_images/test_samples/test_dataset_draw_image_and_target[caltech-101].png +0 -0
- hafnia-0.2.1/tests/data/expected_images/test_samples/test_dataset_draw_image_and_target[caltech-256].png +0 -0
- hafnia-0.2.1/tests/data/expected_images/test_samples/test_dataset_draw_image_and_target[cifar100].png +0 -0
- hafnia-0.2.1/tests/data/expected_images/test_samples/test_dataset_draw_image_and_target[cifar10].png +0 -0
- hafnia-0.2.1/tests/data/expected_images/test_samples/test_dataset_draw_image_and_target[coco-2017].png +0 -0
- hafnia-0.2.1/tests/data/expected_images/test_samples/test_dataset_draw_image_and_target[midwest-vehicle-detection].png +0 -0
- hafnia-0.2.1/tests/data/expected_images/test_samples/test_dataset_draw_image_and_target[mnist].png +0 -0
- hafnia-0.2.1/tests/data/expected_images/test_samples/test_dataset_draw_image_and_target[tiny-dataset].png +0 -0
- hafnia-0.2.1/tests/data/expected_images/test_visualizations/test_blur_anonymization[coco-2017].png +0 -0
- hafnia-0.2.1/tests/data/expected_images/test_visualizations/test_blur_anonymization[tiny-dataset].png +0 -0
- hafnia-0.2.1/tests/data/expected_images/test_visualizations/test_draw_annotations[coco-2017].png +0 -0
- hafnia-0.2.1/tests/data/expected_images/test_visualizations/test_draw_annotations[tiny-dataset].png +0 -0
- hafnia-0.2.1/tests/data/expected_images/test_visualizations/test_mask_region[coco-2017].png +0 -0
- hafnia-0.2.1/tests/data/expected_images/test_visualizations/test_mask_region[tiny-dataset].png +0 -0
- hafnia-0.2.1/tests/data/micro_test_datasets/coco-2017/annotations.jsonl +3 -0
- hafnia-0.2.1/tests/data/micro_test_datasets/coco-2017/annotations.parquet +0 -0
- hafnia-0.2.1/tests/data/micro_test_datasets/coco-2017/data/182a2c0a3ce312cf.jpg +0 -0
- hafnia-0.2.1/tests/data/micro_test_datasets/coco-2017/data/4e95c6eb6209880a.jpg +0 -0
- hafnia-0.2.1/tests/data/micro_test_datasets/coco-2017/data/cf86c7a23edb55ce.jpg +0 -0
- hafnia-0.2.1/tests/data/micro_test_datasets/coco-2017/dataset_info.json +232 -0
- hafnia-0.2.1/tests/data/micro_test_datasets/tiny-dataset/annotations.jsonl +3 -0
- hafnia-0.2.1/tests/data/micro_test_datasets/tiny-dataset/annotations.parquet +0 -0
- hafnia-0.2.1/tests/data/micro_test_datasets/tiny-dataset/data/222bbd5721a8a86e.png +0 -0
- hafnia-0.2.1/tests/data/micro_test_datasets/tiny-dataset/data/3251d85443622e4c.png +0 -0
- hafnia-0.2.1/tests/data/micro_test_datasets/tiny-dataset/data/3657ababa44af9b6.png +0 -0
- hafnia-0.2.1/tests/data/micro_test_datasets/tiny-dataset/dataset_info.json +108 -0
- hafnia-0.2.1/tests/dataset/dataset_recipe/test_dataset_recipe_helpers.py +120 -0
- hafnia-0.2.1/tests/dataset/dataset_recipe/test_dataset_recipes.py +260 -0
- hafnia-0.2.1/tests/dataset/dataset_recipe/test_recipe_transformations.py +224 -0
- hafnia-0.2.1/tests/dataset/operations/test_dataset_transformations.py +0 -0
- hafnia-0.2.1/tests/dataset/operations/test_table_transformations.py +94 -0
- hafnia-0.2.1/tests/dataset/test_colors.py +8 -0
- hafnia-0.2.1/tests/dataset/test_dataset_helpers.py +79 -0
- hafnia-0.2.1/tests/dataset/test_hafnia_dataset.py +110 -0
- hafnia-0.2.1/tests/dataset/test_shape_primitives.py +70 -0
- {hafnia-0.1.27 → hafnia-0.2.1}/tests/test_check_example_scripts.py +4 -3
- {hafnia-0.1.27 → hafnia-0.2.1}/tests/test_cli.py +47 -44
- {hafnia-0.1.27 → hafnia-0.2.1}/tests/test_hafnia_logger.py +8 -10
- hafnia-0.2.1/tests/test_samples.py +171 -0
- hafnia-0.2.1/tests/test_visualizations.py +62 -0
- {hafnia-0.1.27 → hafnia-0.2.1}/uv.lock +135 -658
- hafnia-0.1.27/README.md +0 -342
- hafnia-0.1.27/examples/dataset_builder.py +0 -537
- hafnia-0.1.27/examples/example_load_dataset.py +0 -14
- hafnia-0.1.27/src/cli/data_cmds.py +0 -53
- hafnia-0.1.27/src/hafnia/data/__init__.py +0 -3
- hafnia-0.1.27/src/hafnia/data/factory.py +0 -67
- hafnia-0.1.27/src/hafnia/torch_helpers.py +0 -170
- hafnia-0.1.27/tests/test_samples.py +0 -174
- {hafnia-0.1.27 → hafnia-0.2.1}/.devcontainer/devcontainer.json +0 -0
- {hafnia-0.1.27 → hafnia-0.2.1}/.devcontainer/hooks/post_create +0 -0
- {hafnia-0.1.27 → hafnia-0.2.1}/.github/dependabot.yaml +0 -0
- {hafnia-0.1.27 → hafnia-0.2.1}/.github/workflows/Dockerfile +0 -0
- {hafnia-0.1.27 → hafnia-0.2.1}/.github/workflows/build.yaml +0 -0
- {hafnia-0.1.27 → hafnia-0.2.1}/.github/workflows/check_release.yaml +0 -0
- {hafnia-0.1.27 → hafnia-0.2.1}/.github/workflows/lint.yaml +0 -0
- {hafnia-0.1.27 → hafnia-0.2.1}/.github/workflows/publish_pypi.yaml +0 -0
- {hafnia-0.1.27 → hafnia-0.2.1}/.github/workflows/tests.yaml +0 -0
- {hafnia-0.1.27 → hafnia-0.2.1}/.gitignore +0 -0
- {hafnia-0.1.27 → hafnia-0.2.1}/.pre-commit-config.yaml +0 -0
- {hafnia-0.1.27 → hafnia-0.2.1}/.python-version +0 -0
- {hafnia-0.1.27 → hafnia-0.2.1}/.vscode/extensions.json +0 -0
- {hafnia-0.1.27 → hafnia-0.2.1}/LICENSE +0 -0
- {hafnia-0.1.27 → hafnia-0.2.1}/docs/release.md +0 -0
- {hafnia-0.1.27 → hafnia-0.2.1}/examples/example_logger.py +0 -0
- {hafnia-0.1.27 → hafnia-0.2.1}/src/cli/__init__.py +0 -0
- {hafnia-0.1.27 → hafnia-0.2.1}/src/cli/consts.py +0 -0
- {hafnia-0.1.27 → hafnia-0.2.1}/src/cli/experiment_cmds.py +0 -0
- {hafnia-0.1.27 → hafnia-0.2.1}/src/cli/profile_cmds.py +0 -0
- {hafnia-0.1.27 → hafnia-0.2.1}/src/cli/recipe_cmds.py +0 -0
- {hafnia-0.1.27 → hafnia-0.2.1}/src/hafnia/__init__.py +0 -0
- {hafnia-0.1.27 → hafnia-0.2.1}/src/hafnia/experiment/__init__.py +0 -0
- {hafnia-0.1.27 → hafnia-0.2.1}/src/hafnia/log.py +0 -0
- {hafnia-0.1.27 → hafnia-0.2.1}/src/hafnia/platform/builder.py +0 -0
- {hafnia-0.1.27 → hafnia-0.2.1}/src/hafnia/platform/experiment.py +0 -0
- {hafnia-0.1.27 → hafnia-0.2.1}/tests/test_builder.py +0 -0
- {hafnia-0.1.27 → hafnia-0.2.1}/tests/test_utils.py +0 -0
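The file moves above reorganize the dataset code from the Hugging Face `datasets`-based loader into a native `hafnia.dataset` package. A minimal import sketch of the new 0.2.1 layout follows; module paths are taken from the file list and the README diff further down, but exactly which names are re-exported where is an assumption, not confirmed by the diff:

```python
# Import sketch for the reorganized 0.2.1 layout. Paths mirror the file list
# above; re-export locations (especially HafniaLogger) are assumptions.
from hafnia.data import load_dataset                              # src/hafnia/data/factory.py
from hafnia.dataset.hafnia_dataset import HafniaDataset, Sample   # src/hafnia/dataset/hafnia_dataset.py
from hafnia.dataset.primitives import Bbox, Classification        # src/hafnia/dataset/primitives/__init__.py
from hafnia.experiment.hafnia_logger import HafniaLogger          # src/hafnia/experiment/hafnia_logger.py
```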
{hafnia-0.1.27 → hafnia-0.2.1}/.vscode/launch.json

````diff
@@ -48,12 +48,12 @@
             ],
         },
         {
-            "name": "debug (hafnia
+            "name": "debug (hafnia dataset X)",
             "type": "debugpy",
             "request": "launch",
             "program": "${workspaceFolder}/src/cli/__main__.py",
             "args": [
-                "
+                "dataset",
                 "download",
                 "mnist",
                 //"./.data",
````
{hafnia-0.1.27 → hafnia-0.2.1}/.vscode/settings.json

````diff
@@ -22,11 +22,19 @@
 
     "python.testing.pytestArgs": [
         "tests",
-        "-vv"
+        "-vv",
+        "--durations=20",
     ],
     "python.testing.unittestEnabled": false,
     "python.testing.pytestEnabled": true,
     "cSpell.words": [
-        "
+        "bboxes",
+        "Bitmask",
+        "bitmasks",
+        "flatnonzero",
+        "fromarray",
+        "HAFNIA",
+        "ndarray",
+        "rprint"
     ]
 }
````
{hafnia-0.1.27 → hafnia-0.2.1}/PKG-INFO

````diff
@@ -1,22 +1,27 @@
 Metadata-Version: 2.4
 Name: hafnia
-Version: 0.1.27
+Version: 0.2.1
 Summary: Python SDK for communication with Hafnia platform.
 Author-email: Milestone Systems <hafniaplatform@milestone.dk>
 License-File: LICENSE
 Requires-Python: >=3.10
 Requires-Dist: boto3>=1.35.91
 Requires-Dist: click>=8.1.8
-Requires-Dist: datasets>=3.2.0
 Requires-Dist: emoji>=2.14.1
 Requires-Dist: flatten-dict>=0.4.2
+Requires-Dist: more-itertools>=10.7.0
+Requires-Dist: opencv-python-headless>=4.11.0.86
 Requires-Dist: pathspec>=0.12.1
 Requires-Dist: pillow>=11.1.0
+Requires-Dist: polars>=1.30.0
 Requires-Dist: pyarrow>=18.1.0
+Requires-Dist: pycocotools>=2.0.10
 Requires-Dist: pydantic>=2.10.4
 Requires-Dist: rich>=13.9.4
+Requires-Dist: s5cmd>=0.2.0
 Requires-Dist: seedir>=0.5.0
 Requires-Dist: tqdm>=4.67.1
+Requires-Dist: xxhash>=3.5.0
 Description-Content-Type: text/markdown
 
 # Hafnia
````
{hafnia-0.1.27 → hafnia-0.2.1}/PKG-INFO

````diff
@@ -28,8 +33,8 @@ The package includes the following interfaces:
 
 - `cli`: A Command Line Interface (CLI) to 1) configure/connect to Hafnia's [Training-aaS](https://hafnia.readme.io/docs/training-as-a-service) and 2) create and
   launch recipe scripts.
-- `hafnia`: A python package
-
+- `hafnia`: A Python package including `HafniaDataset` to manage datasets and `HafniaLogger` for
+  experiment tracking.
 
 
 ## The Concept: Training as a Service (Training-aaS)
````
{hafnia-0.1.27 → hafnia-0.2.1}/PKG-INFO

````diff
@@ -76,7 +81,7 @@ Copy the key and save it for later use.
 1. Download `mnist` from terminal to verify that your configuration is working.
 
 ```bash
-hafnia
+hafnia dataset download mnist --force
 ```
 
 ## Getting started: Loading datasets samples
````
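The same sanity check can be done from Python. A minimal sketch, assuming `hafnia configure` has already been run; the helper names come from the README changes below (`src/hafnia/data/factory.py`):

```python
# Sketch: programmatic equivalent of `hafnia dataset download mnist` above.
# Helper names are taken from the README diff below; behavior is assumed.
from hafnia.data import get_dataset_path, load_dataset

path_dataset = get_dataset_path("mnist")  # downloads to .data/datasets/ if not cached
dataset = load_dataset("mnist")           # download + load in one step
dataset.print_stats()
```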
{hafnia-0.1.27 → hafnia-0.2.1}/PKG-INFO

````diff
@@ -84,115 +89,220 @@ With Hafnia configured on your local machine, it is now possible to download
 and explore the dataset sample with a python script:
 
 ```python
-from hafnia.data import load_dataset
+from hafnia.data import load_dataset, get_dataset_path
+from hafnia.dataset.hafnia_dataset import HafniaDataset
 
-
+# To download the sample dataset use:
+path_dataset = get_dataset_path("midwest-vehicle-detection")
 ```
 
-
-
-
+This will download the dataset sample `midwest-vehicle-detection` to the local `.data/datasets/` folder
+in a human-readable format.
+
+Images are stored in the `data` folder, general dataset information is stored in `dataset_info.json`,
+and annotations are stored as both `annotations.jsonl` and `annotations.parquet`.
+
+```bash
+$ cd .data/datasets/
+$ tree midwest-vehicle-detection
+midwest-vehicle-detection
+└── sample
+    ├── annotations.jsonl
+    ├── annotations.parquet
+    ├── data
+    │   ├── video_0026a86b-2f43-49f2-a17c-59244d10a585_1fps_mp4_frame_00000.png
+    │   ....
+    │   ├── video_ff17d777-e783-44e2-9bff-a4adac73de4b_1fps_mp4_frame_00000.png
+    │   └── video_ff17d777-e783-44e2-9bff-a4adac73de4b_1fps_mp4_frame_00100.png
+    └── dataset_info.json
+
+3 directories, 217 files
+```
+
+You can interact with the data as you want, but we also provide `HafniaDataset`
+for loading/saving, managing and interacting with the dataset.
+
+We recommend visiting and potentially executing the example script [examples/example_hafnia_dataset.py](examples/example_hafnia_dataset.py)
+to see how to use the `HafniaDataset` class and its methods.
+
+Below is a short introduction to the `HafniaDataset` class.
+
+```python
+from hafnia.dataset.hafnia_dataset import HafniaDataset, Sample
+
+# Load dataset
+dataset = HafniaDataset.read_from_path(path_dataset)
+
+# Alternatively, you can use the 'load_dataset' function to download and load the dataset in one go.
+# dataset = load_dataset("midwest-vehicle-detection")
+
+# Print dataset information
+dataset.print_stats()
+
+# Create a dataset split for training
+dataset_train = dataset.create_split_dataset("train")
+```
+
+The `HafniaDataset` object provides a convenient way to interact with the dataset, including methods for
+creating splits, accessing samples, printing statistics, and saving to and loading from disk.
+
+In essence, the `HafniaDataset` class contains `dataset.info` with dataset information
+and `dataset.samples` with annotations as a polars DataFrame.
 
 ```python
-
-
-
-
-
-
-
-
-
-
-
-
-    test: Dataset({
-        features: ['image_id', 'image', 'height', 'width', 'objects', 'Weather', 'Surface Conditions'],
-        num_rows: 21
-    })
-})
+# Annotations are stored in a polars DataFrame
+print(dataset.samples.head(2))
+shape: (2, 14)
+┌──────────────┬─────────────────────────────────┬────────┬───────┬───┬─────────────────────────────────┬──────────┬──────────┬─────────────────────────────────┐
+│ sample_index ┆ file_name                       ┆ height ┆ width ┆ … ┆ objects                         ┆ bitmasks ┆ polygons ┆ meta                            │
+│ ---          ┆ ---                             ┆ ---    ┆ ---   ┆   ┆ ---                             ┆ ---      ┆ ---      ┆ ---                             │
+│ u32          ┆ str                             ┆ i64    ┆ i64   ┆   ┆ list[struct[11]]                ┆ null     ┆ null     ┆ struct[5]                       │
+╞══════════════╪═════════════════════════════════╪════════╪═══════╪═══╪═════════════════════════════════╪══════════╪══════════╪═════════════════════════════════╡
+│ 0            ┆ /home/ubuntu/code/hafnia/.data… ┆ 1080   ┆ 1920  ┆ … ┆ [{0.0492,0.0357,0.2083,0.23,"V… ┆ null     ┆ null     ┆ {120.0,1.0,"2024-07-10T18:30:0… │
+│ 100          ┆ /home/ubuntu/code/hafnia/.data… ┆ 1080   ┆ 1920  ┆ … ┆ [{0.146382,0.078704,0.42963,0.… ┆ null     ┆ null     ┆ {120.0,1.0,"2024-07-10T18:30:0… │
+└──────────────┴─────────────────────────────────┴────────┴───────┴───┴─────────────────────────────────┴──────────┴──────────┴─────────────────────────────────┘
+```
 
+```python
+# General dataset information is stored in `dataset.info`
+rich.print(dataset.info)
+DatasetInfo(
+    dataset_name='midwest-vehicle-detection',
+    version='1.0.0',
+    tasks=[
+        TaskInfo(
+            primitive=<class 'hafnia.dataset.primitives.Bbox'>,
+            class_names=[
+                'Person',
+                'Vehicle.Bicycle',
+                'Vehicle.Motorcycle',
+                'Vehicle.Car',
+                'Vehicle.Van',
+                'Vehicle.RV',
+                'Vehicle.Single_Truck',
+                'Vehicle.Combo_Truck',
+                'Vehicle.Pickup_Truck',
+                'Vehicle.Trailer',
+                'Vehicle.Emergency_Vehicle',
+                'Vehicle.Bus',
+                'Vehicle.Heavy_Duty_Vehicle'
+            ],
+            name='bboxes'
+        ),
+        TaskInfo(primitive=<class 'hafnia.dataset.primitives.Classification'>, class_names=['Clear', 'Foggy'], name='Weather'),
+        TaskInfo(primitive=<class 'hafnia.dataset.primitives.Classification'>, class_names=['Dry', 'Wet'], name='Surface Conditions')
+    ],
+    meta={
+        'n_videos': 109,
+        'n_cameras': 20,
+        'duration': 13080.0,
+        'duration_average': 120.0,
+        ...
+    }
+)
 ```
 
-
-Each
+You can iterate over and access samples in the dataset using the `HafniaDataset` object.
+Each sample contains image and annotation information.
 
-The features of the dataset can be viewed with the `features` attribute.
 ```python
-
-
-
-
-
-
-
-
-            id=None),
-    length=-1,
-    id=None),
-    'class_idx': ClassLabel(names=['Vehicle.Bicycle',
-        'Vehicle.Motorcycle',
-        'Vehicle.Car',
-        'Vehicle.Van',
-        'Vehicle.RV',
-        'Vehicle.Single_Truck',
-        'Vehicle.Combo_Truck',
-        'Vehicle.Pickup_Truck',
-        'Vehicle.Trailer',
-        'Vehicle.Emergency_Vehicle',
-        'Vehicle.Bus',
-        'Vehicle.Heavy_Duty_Vehicle'],
-        id=None),
-    'class_name': Value(dtype='string', id=None),
-    'id': Value(dtype='string', id=None)},
-    length=-1,
-    id=None),
-    'width': Value(dtype='int64', id=None)}
+from hafnia.dataset.hafnia_dataset import HafniaDataset, Sample
+# Access the first sample in the dataset either by index or by iterating over the dataset
+sample_dict = dataset[0]
+
+for sample_dict in dataset:
+    sample = Sample(**sample_dict)
+    print(sample.sample_id, sample.objects)
+    break
 ```
+Note that it is possible to create a `Sample` object from the sample dictionary.
+This is useful for accessing the image and annotations in a structured way.
 
-View the first sample in the training set:
 ```python
-#
-
-
-
-
-
-
-    'Surface Conditions': 0,
-    'objects': {'bbox': [[441, 180, 121, 126],
-        [549, 151, 131, 103],
-        [1845, 722, 68, 130],
-        [1810, 571, 110, 149]],
-    'class_idx': [7, 7, 2, 2],
-    'class_name': ['Vehicle.Pickup_Truck',
-        'Vehicle.Pickup_Truck',
-        'Vehicle.Car',
-        'Vehicle.Car'],
-    'id': ['HW6WiLAJ', 'T/ccFpRi', 'CS0O8B6W', 'DKrJGzjp']},
-    'width': 1920}
+# By unpacking the sample dictionary, you can create a `Sample` object.
+sample = Sample(**sample_dict)
+
+# Use the `Sample` object to easily read the image and draw annotations
+image = sample.read_image()
+image_annotations = sample.draw_annotations()
+```
 
+Note that the `Sample` object contains all information about the sample, including image and metadata.
+It also contains annotations as primitive types such as `Bbox` and `Classification`.
+
+```python
+rich.print(sample)
+Sample(
+    sample_index=120,
+    file_name='/home/ubuntu/code/hafnia/.data/datasets/midwest-vehicle-detection/data/343403325f27e390.png',
+    height=1080,
+    width=1920,
+    split='train',
+    is_sample=True,
+    collection_index=None,
+    collection_id=None,
+    remote_path='s3://mdi-production-midwest-vehicle-detection/sample/data/343403325f27e390.png',
+    classifications=[
+        Classification(
+            class_name='Clear',
+            class_idx=0,
+            object_id=None,
+            confidence=None,
+            ground_truth=True,
+            task_name='Weather',
+            meta=None
+        ),
+        Classification(
+            class_name='Day',
+            class_idx=3,
+            object_id=None,
+            confidence=None,
+            ground_truth=True,
+            task_name='Time of Day',
+            meta=None
+        ),
+        ...
+    ],
+    objects=[
+        Bbox(
+            height=0.0492,
+            width=0.0357,
+            top_left_x=0.2083,
+            top_left_y=0.23,
+            class_name='Vehicle.Car',
+            class_idx=3,
+            object_id='cXT4NRVu',
+            confidence=None,
+            ground_truth=True,
+            task_name='bboxes',
+            meta=None
+        ),
+        Bbox(
+            height=0.0457,
+            width=0.0408,
+            top_left_x=0.2521,
+            top_left_y=0.2153,
+            class_name='Vehicle.Car',
+            class_idx=3,
+            object_id='MelbIIDU',
+            confidence=None,
+            ground_truth=True,
+            task_name='bboxes',
+            meta=None
+        ),
+        ...
+    ],
+    bitmasks=None,  # Optionally a list of Bitmask objects (List[Bitmask])
+    polygons=None,  # Optionally a list of Polygon objects (List[Polygon])
+    meta={
+        'video.data_duration': 120.0,
+        'video.data_fps': 1.0,
+        ...
+    }
+)
 ```
 
-
-We have defined a set of features that are common across all datasets in the Hafnia data library.
-
-- `image`: The image itself, stored as a PIL image
-- `height`: The height of the image in pixels
-- `width`: The width of the image in pixels
-- `[IMAGE_CLASSIFICATION_TASK]`: [Optional] Image classification tasks are top-level `ClassLabel` feature.
-  `ClassLabel` is a Hugging Face feature that maps class indices to class names.
-  In above example we have two classification tasks:
-  - `Weather`: Classifies the weather conditions in the image, with possible values `Clear` and `Foggy`
-  - `Surface Conditions`: Classifies the surface conditions in the image, with possible values `Dry` and `Wet`
-- `objects`: A dictionary containing information about objects in the image, including:
-  - `bbox`: Bounding boxes for each object, represented with a list of bounding box coordinates
-    `[xmin, ymin, bbox_width, bbox_height]`. Each bounding box is defined with a top-left corner coordinate
-    `(xmin, ymin)` and bounding box width and height `(bbox_width, bbox_height)` in pixels.
-  - `class_idx`: Class indices for each detected object. This is a
-    `ClassLabel` feature that maps to the `class_name` feature.
-  - `class_name`: Class names for each detected object
-  - `id`: Unique identifiers for each detected object
+To learn more, view and potentially execute the example script [examples/example_hafnia_dataset.py](examples/example_hafnia_dataset.py).
 
 ### Dataset Locally vs. Training-aaS
 An important feature of `load_dataset` is that it will return the full dataset
````
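Because `dataset.samples` is an ordinary polars DataFrame, standard polars operations compose with the `Sample` helpers shown above. A hedged sketch: the `split` column name is inferred from the `Sample` printout, and `draw_annotations()` returning a numpy array is an assumption suggested by the `fromarray`/`ndarray` spell-check entries, not confirmed by the diff:

```python
# Sketch: combine polars filtering on dataset.samples with Sample helpers.
import polars as pl
from PIL import Image

from hafnia.data import load_dataset
from hafnia.dataset.hafnia_dataset import Sample

dataset = load_dataset("midwest-vehicle-detection")

# Count samples per split directly on the annotation table.
print(dataset.samples.group_by("split").len())

# Draw annotations for the first training sample.
sample_dict = dataset.samples.filter(pl.col("split") == "train").row(0, named=True)
sample = Sample(**sample_dict)
image_annotations = sample.draw_annotations()  # assumed to return a numpy array
Image.fromarray(image_annotations).save("annotated_example.png")
```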
{hafnia-0.1.27 → hafnia-0.2.1}/PKG-INFO

````diff
@@ -354,7 +464,7 @@ curl -LsSf https://astral.sh/uv/install.sh | sh
 Create virtual environment and install python dependencies
 
 ```bash
-uv sync
+uv sync --dev
 ```
 
 Run tests:
````