hafnia 0.1.25__py3-none-any.whl → 0.1.26__py3-none-any.whl

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
@@ -1,6 +1,6 @@
1
1
  Metadata-Version: 2.4
2
2
  Name: hafnia
3
- Version: 0.1.25
3
+ Version: 0.1.26
4
4
  Summary: Python SDK for communication with Hafnia platform.
5
5
  Author-email: Milestone Systems <hafniaplatform@milestone.dk>
6
6
  License-File: LICENSE
@@ -17,10 +17,6 @@ Requires-Dist: pydantic>=2.10.4
17
17
  Requires-Dist: rich>=13.9.4
18
18
  Requires-Dist: seedir>=0.5.0
19
19
  Requires-Dist: tqdm>=4.67.1
20
- Provides-Extra: torch
21
- Requires-Dist: flatten-dict>=0.4.2; extra == 'torch'
22
- Requires-Dist: torch>=2.6.0; extra == 'torch'
23
- Requires-Dist: torchvision>=0.21.0; extra == 'torch'
24
20
  Description-Content-Type: text/markdown
25
21
 
26
22
  # Hafnia
@@ -91,14 +87,117 @@ and explore the dataset sample with a python script:
91
87
  from hafnia.data import load_dataset
92
88
 
93
89
  dataset_splits = load_dataset("mnist")
94
- print(dataset_splits)
95
- print(dataset_splits["train"])
96
90
  ```
91
+
92
+ ### Dataset Format
97
93
  The returned sample dataset is a [Hugging Face dataset](https://huggingface.co/docs/datasets/index)
98
94
  and contains train, validation and test splits.
99
95
 
96
+ ```python
97
+ print(dataset_splits)
98
+
99
+ # Output:
100
+ >>> DatasetDict({
101
+ train: Dataset({
102
+ features: ['image_id', 'image', 'height', 'width', 'objects', 'Weather', 'Surface Conditions'],
103
+ num_rows: 172
104
+ })
105
+ validation: Dataset({
106
+ features: ['image_id', 'image', 'height', 'width', 'objects', 'Weather', 'Surface Conditions'],
107
+ num_rows: 21
108
+ })
109
+ test: Dataset({
110
+ features: ['image_id', 'image', 'height', 'width', 'objects', 'Weather', 'Surface Conditions'],
111
+ num_rows: 21
112
+ })
113
+ })
114
+
115
+ ```
116
+
117
+ A Hugging Face dataset is a dictionary with splits, where each split is a `Dataset` object.
118
+ Each `Dataset` is structured as a table with a set of columns (also called features) and a row for each sample.
119
+
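For illustration, a minimal sketch of working with the splits (this assumes the `dataset_splits` object loaded above and uses only the standard Hugging Face `datasets` API):

```python
# Pick one split: a `datasets.Dataset` behaves like a table.
train_split = dataset_splits["train"]

print(train_split.num_rows)      # number of rows (samples) in the split
print(train_split.column_names)  # the columns / features of the table

# Indexing returns a single sample as a plain Python dict.
first_sample = train_split[0]
```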
120
+ The features of the dataset can be viewed with the `features` attribute.
121
+ ```python
122
+ import pprint
+ 
+ # View features of the train split
123
+ pprint.pprint(dataset_splits["train"].features)
124
+ {'Surface Conditions': ClassLabel(names=['Dry', 'Wet'], id=None),
125
+ 'Weather': ClassLabel(names=['Clear', 'Foggy'], id=None),
126
+ 'height': Value(dtype='int64', id=None),
127
+ 'image': Image(mode=None, decode=True, id=None),
128
+ 'image_id': Value(dtype='int64', id=None),
129
+ 'objects': Sequence(feature={'bbox': Sequence(feature=Value(dtype='int64',
130
+ id=None),
131
+ length=-1,
132
+ id=None),
133
+ 'class_idx': ClassLabel(names=['Vehicle.Bicycle',
134
+ 'Vehicle.Motorcycle',
135
+ 'Vehicle.Car',
136
+ 'Vehicle.Van',
137
+ 'Vehicle.RV',
138
+ 'Vehicle.Single_Truck',
139
+ 'Vehicle.Combo_Truck',
140
+ 'Vehicle.Pickup_Truck',
141
+ 'Vehicle.Trailer',
142
+ 'Vehicle.Emergency_Vehicle',
143
+ 'Vehicle.Bus',
144
+ 'Vehicle.Heavy_Duty_Vehicle'],
145
+ id=None),
146
+ 'class_name': Value(dtype='string', id=None),
147
+ 'id': Value(dtype='string', id=None)},
148
+ length=-1,
149
+ id=None),
150
+ 'width': Value(dtype='int64', id=None)}
151
+ ```
152
+
153
+ View the first sample in the training set:
154
+ ```python
155
+ # Print sample from the training set
156
+ pprint.pprint(dataset_splits["train"][0])
157
+
158
+ {'image': <PIL.PngImagePlugin.PngImageFile image mode=RGB size=1920x1080 at 0x79D6292C5ED0>,
159
+ 'image_id': 4920,
160
+ 'height': 1080,
161
+ 'Weather': 0,
162
+ 'Surface Conditions': 0,
163
+ 'objects': {'bbox': [[441, 180, 121, 126],
164
+ [549, 151, 131, 103],
165
+ [1845, 722, 68, 130],
166
+ [1810, 571, 110, 149]],
167
+ 'class_idx': [7, 7, 2, 2],
168
+ 'class_name': ['Vehicle.Pickup_Truck',
169
+ 'Vehicle.Pickup_Truck',
170
+ 'Vehicle.Car',
171
+ 'Vehicle.Car'],
172
+ 'id': ['HW6WiLAJ', 'T/ccFpRi', 'CS0O8B6W', 'DKrJGzjp']},
173
+ 'width': 1920}
174
+
175
+ ```
176
+
177
+ For Hafnia-based datasets, we want to standardize how a dataset and its tasks are represented.
178
+ We have defined a set of features that are common across all datasets in the Hafnia data library.
179
+
180
+ - `image`: The image itself, stored as a PIL image
181
+ - `height`: The height of the image in pixels
182
+ - `width`: The width of the image in pixels
183
+ - `[IMAGE_CLASSIFICATION_TASK]`: [Optional] Image classification tasks are represented as top-level `ClassLabel` features (see the sketch after this list).
184
+ `ClassLabel` is a Hugging Face feature that maps class indices to class names.
185
+ In the above example, there are two classification tasks:
186
+ - `Weather`: Classifies the weather conditions in the image, with possible values `Clear` and `Foggy`
187
+ - `Surface Conditions`: Classifies the surface conditions in the image, with possible values `Dry` and `Wet`
188
+ - `objects`: A dictionary containing information about objects in the image, including:
189
+ - `bbox`: Bounding boxes for each object, represented as a list of bounding box coordinates
190
+ `[xmin, ymin, bbox_width, bbox_height]`. Each bounding box is defined with a top-left corner coordinate
191
+ `(xmin, ymin)` and bounding box width and height `(bbox_width, bbox_height)` in pixels.
192
+ - `class_idx`: Class indices for each detected object. This is a
193
+ `ClassLabel` feature that maps to the `class_name` feature.
194
+ - `class_name`: Class names for each detected object
195
+ - `id`: Unique identifiers for each detected object
196
+
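For illustration, a minimal sketch of reading these standardized features back (this assumes the vehicle-detection sample whose features are printed above and the standard Hugging Face `datasets` feature API):

```python
train_split = dataset_splits["train"]
sample = train_split[0]

# Top-level classification tasks are ClassLabel features:
# map the stored class index back to its human-readable name.
weather = train_split.features["Weather"].int2str(sample["Weather"])  # e.g. 'Clear'

# Object-level labels: the `objects` Sequence exposes its inner features via `.feature`.
class_label = train_split.features["objects"].feature["class_idx"]
class_names = class_label.int2str(sample["objects"]["class_idx"])  # e.g. ['Vehicle.Pickup_Truck', ...]

# Bounding boxes are [xmin, ymin, bbox_width, bbox_height];
# convert to corner format (xmin, ymin, xmax, ymax) if a downstream library expects it.
boxes_xyxy = [(x, y, x + w, y + h) for x, y, w, h in sample["objects"]["bbox"]]
```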
197
+ ### Dataset Loading: Locally vs. Training-aaS
100
198
  An important feature of `load_dataset` is that it will return the full dataset
101
- when loaded on the Hafnia platform.
199
+ when loaded with Training-aaS on the Hafnia platform.
200
+
102
201
  This enables seamless switching between running/validating a training script
103
202
  locally (on the sample dataset) and running full model training with Training-aaS (on the full dataset),
104
203
  without changing code or configurations for the training script.
@@ -160,12 +259,58 @@ with a dataloader that performs data augmentations and batching of the dataset a
160
259
  To support this, we have provided a torch dataloader example script
161
260
  [example_torchvision_dataloader.py](./examples/example_torchvision_dataloader.py).
162
261
 
163
- The script demonstrates how to make a dataloader with data augmentation (`torchvision.transforms.v2`)
164
- and a helper function for visualizing image and labels.
262
+ The script demonstrates how to load a dataset sample, apply data augmentations using
263
+ `torchvision.transforms.v2`, and visualize the dataset with `torch_helpers.draw_image_and_targets`.
264
+
265
+ Note also how `torch_helpers.TorchVisionCollateFn` is used in combination with the `DataLoader` from
266
+ `torch.utils.data` to control how dataset samples are collated into batches.
165
267
 
166
268
  The dataloader and visualization function support computer vision tasks
167
269
  and datasets available in the data library.
168
270
 
271
+ ```python
272
+ # Imports required by this example
+ import torch
+ import torchvision
+ from torch.utils.data import DataLoader
+ from torchvision.transforms import v2
+ 
+ from hafnia import torch_helpers
+ from hafnia.data import load_dataset
+ 
+ # Load Hugging Face dataset
273
+ dataset_splits = load_dataset("midwest-vehicle-detection")
274
+
275
+ # Define transforms
276
+ train_transforms = v2.Compose(
277
+ [
278
+ v2.RandomResizedCrop(size=(224, 224), antialias=True),
279
+ v2.RandomHorizontalFlip(p=0.5),
280
+ v2.ToDtype(torch.float32, scale=True),
281
+ v2.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
282
+ ]
283
+ )
284
+ test_transforms = v2.Compose(
285
+ [
286
+ v2.Resize(size=(224, 224), antialias=True),
287
+ v2.ToDtype(torch.float32, scale=True),
288
+ v2.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
289
+ ]
290
+ )
291
+
292
+ keep_metadata = True
293
+ train_dataset = torch_helpers.TorchvisionDataset(
294
+ dataset_splits["train"], transforms=train_transforms, keep_metadata=keep_metadata
295
+ )
296
+ test_dataset = torch_helpers.TorchvisionDataset(
297
+ dataset_splits["test"], transforms=test_transforms, keep_metadata=keep_metadata
298
+ )
299
+
300
+ # Visualize sample
301
+ image, targets = train_dataset[0]
302
+ visualize_image = torch_helpers.draw_image_and_targets(image=image, targets=targets)
303
+ pil_image = torchvision.transforms.functional.to_pil_image(visualize_image)
304
+ pil_image.save("visualized_labels.png")
305
+
306
+ # Create DataLoaders - using TorchVisionCollateFn
307
+ collate_fn = torch_helpers.TorchVisionCollateFn(
308
+ skip_stacking=["objects.bbox", "objects.class_idx"]
309
+ )
310
+ train_loader = DataLoader(train_dataset, batch_size=8, shuffle=True, collate_fn=collate_fn)
311
+ ```
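As a quick sanity check, a single batch can be pulled from the loader. A minimal sketch; the exact batch layout is defined by `TorchVisionCollateFn`, so this only inspects the result rather than assuming a structure:

```python
# Fields listed in `skip_stacking` (objects.bbox, objects.class_idx) are expected
# to remain per-sample sequences instead of being stacked into a single tensor.
batch = next(iter(train_loader))
print(type(batch))
```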
312
+
313
+
169
314
  ## Example: Training-aaS
170
315
  By combining logging and dataset loading, we can now construct our model training recipe.
171
316
 
@@ -206,10 +351,10 @@ Install uv
206
351
  curl -LsSf https://astral.sh/uv/install.sh | sh
207
352
  ```
208
353
 
209
- Install python dependencies including developer (`--dev`) and optional dependencies (`--all-extras`).
354
+ Create a virtual environment and install the Python dependencies.
210
355
 
211
356
  ```bash
212
- uv sync --all-extras --dev
357
+ uv sync
213
358
  ```
214
359
 
215
360
  Run tests:
@@ -0,0 +1,27 @@
1
+ cli/__init__.py,sha256=47DEQpj8HBSa-_TImW-5JCeuQeRkm5NMpJWZG3hSuFU,0
2
+ cli/__main__.py,sha256=MX0RT1BP3t59rzCvdUqfw39Kw05HOF4OEtjDTwIU9h8,1594
3
+ cli/config.py,sha256=R9w0NKIOtIxRKNs7ieeUrIKwRkrTlK5PqOVjc5VYljE,4923
4
+ cli/consts.py,sha256=sj0MRwbbCT2Yl77FPddck1VWkFxp7QY6I9l1o75j_aE,963
5
+ cli/data_cmds.py,sha256=BQiythAPwAwudgdUa68v50a345uw5flrcDiBHLGp9lo,1460
6
+ cli/experiment_cmds.py,sha256=L-k_ZJ4B7I4cA8OvHcheSwXM6nx9aTF9G7eKBzAcOzQ,1961
7
+ cli/profile_cmds.py,sha256=-HQcFgYI6Rqaefi0Nj-91KhiqPKUj7zOaiJWbHx_bac,3196
8
+ cli/recipe_cmds.py,sha256=qnMfF-te47HXNkgyA0hm9X3etDQsqMnrVEGDCrzVjZU,1462
9
+ cli/runc_cmds.py,sha256=6fHMi_dEd8g3Cx9PEfU4gJMZf5-G0IUPDcZh6DNq8Mw,4953
10
+ hafnia/__init__.py,sha256=Zphq-cQoX95Z11zm4lkrU-YiAJxddR7IBfwDkxeHoDE,108
11
+ hafnia/http.py,sha256=psCWdNKfKYiBrYD6bezat1AeHh77JJtJrPePiUAjTIk,2948
12
+ hafnia/log.py,sha256=sWF8tz78yBtwZ9ddzm19L1MBSBJ3L4G704IGeT1_OEU,784
13
+ hafnia/torch_helpers.py,sha256=P_Jl4IwqUebKVCOXNe6iTorJZA3S-3d92HV274UHIko,7456
14
+ hafnia/utils.py,sha256=mJ5aOjSVSOrrQnpmaKLK71ld5jYpmtd3HciTIk_Wk88,4659
15
+ hafnia/data/__init__.py,sha256=Pntmo_1fst8OhyrHB60jQ8mhJJ4hL38tdjLvt0YXEJo,73
16
+ hafnia/data/factory.py,sha256=4fZDkWNyOK1QNCmsxsXfSztPJkJW_HBIa_PTdGCYHCM,2551
17
+ hafnia/experiment/__init__.py,sha256=OEFE6HqhO5zcTCLZcPcPVjIg7wMFFnvZ1uOtAVhRz7M,85
18
+ hafnia/experiment/hafnia_logger.py,sha256=usL5pl7XLJP-g1vZrwvbky5YiD6Bg-rOODYYAX5z43I,6830
19
+ hafnia/platform/__init__.py,sha256=I-VIVXDxwBAUzxx8Zx0g_wykyDdFGTsjb_mYLmvxk2Y,443
20
+ hafnia/platform/builder.py,sha256=OFPnOjE3bAbWjUgYErXtffDKTiW_9ol95eVzKqL27WM,5433
21
+ hafnia/platform/download.py,sha256=t055axPNHlXTYCQgZHOS2YMQt1I2_bc4G8dltsOKttY,4760
22
+ hafnia/platform/experiment.py,sha256=-nAfTmn1c8sE6pHDCTNZvWDTopkXndarJAPIGvsnk60,2389
23
+ hafnia-0.1.26.dist-info/METADATA,sha256=Lds8gx_ffd8_l9kByvK_e-HPFehSSUv8E_85d8ZelSE,14990
24
+ hafnia-0.1.26.dist-info/WHEEL,sha256=qtCwoSJWgHk21S1Kb4ihdzI2rlJ1ZKaIurTj_ngOhyQ,87
25
+ hafnia-0.1.26.dist-info/entry_points.txt,sha256=FCJVIQ8GP2VE9I3eeGVF5eLxVDNW_01pOJCpG_CGnMM,45
26
+ hafnia-0.1.26.dist-info/licenses/LICENSE,sha256=wLZw1B7_mod_CO1H8LXqQgfqlWD6QceJR8--LJYRZGE,1078
27
+ hafnia-0.1.26.dist-info/RECORD,,
hafnia/platform/api.py DELETED
@@ -1,12 +0,0 @@
1
- import urllib3
2
-
3
- from hafnia.http import fetch
4
-
5
-
6
- def get_organization_id(endpoint: str, api_key: str) -> str:
7
- headers = {"X-APIKEY": api_key}
8
- try:
9
- org_info = fetch(endpoint, headers=headers)
10
- except urllib3.exceptions.HTTPError as e:
11
- raise ValueError("Failed to fetch organization ID. Verify platform URL and API key.") from e
12
- return org_info[0]["id"]
hafnia/platform/executor.py DELETED
@@ -1,111 +0,0 @@
1
- import os
2
- import subprocess
3
- import sys
4
- from dataclasses import dataclass
5
- from pathlib import Path
6
- from typing import Dict
7
-
8
- from hafnia.log import logger
9
-
10
-
11
- @dataclass
12
- class PythonModule:
13
- """Dataclass to store Python module details."""
14
-
15
- module_name: str
16
- runner_path: str
17
-
18
-
19
- def handle_mount(source: str) -> None:
20
- """
21
- Mounts the Hafnia environment by adding source directories to PYTHONPATH.
22
-
23
- Args:
24
- source (str): Path to the root directory containing 'src' and 'scripts' subdirectories
25
-
26
- Raises:
27
- FileNotFoundError: If the required directory structure is not found
28
- """
29
- source_path = Path(source)
30
- src_dir = source_path / "src"
31
- scripts_dir = source_path / "scripts"
32
-
33
- if not src_dir.exists() and not scripts_dir.exists():
34
- logger.error(f"Filestructure is not supported. Expected 'src' and 'scripts' directories in {source_path}.")
35
- exit(1)
36
-
37
- sys.path.extend([src_dir.as_posix(), scripts_dir.as_posix()])
38
- python_path = os.getenv("PYTHONPATH", "")
39
- os.environ["PYTHONPATH"] = f"{python_path}:{src_dir.as_posix()}:{scripts_dir.as_posix()}"
40
- logger.info(f"Mounted codebase from {source_path}")
41
-
42
-
43
- def collect_python_modules(directory: Path) -> Dict[str, PythonModule]:
44
- """
45
- Collects Python modules from a directory and its subdirectories.
46
-
47
- This function dynamically imports Python modules found in the specified directory,
48
- excluding files that start with '_' or '.'. It's used to discover available tasks
49
- in the Hafnia environment.
50
-
51
- Args:
52
- directory (Path): The directory to search for Python modules
53
-
54
- Returns:
55
- Dict[str, Dict[str, str]]: A dictionary mapping task names to module details, where each detail contains:
56
- - module_name (str): The full module name
57
- - runner_path (str): The absolute path to the module file
58
- """
59
- from importlib.util import module_from_spec, spec_from_file_location
60
-
61
- modules = {}
62
- for fname in directory.rglob("*.py"):
63
- if fname.name.startswith("-"):
64
- continue
65
-
66
- task_name = fname.stem
67
- module_name = f"{directory.name}.{task_name}"
68
-
69
- spec = spec_from_file_location(module_name, fname)
70
- if spec is None:
71
- logger.warning(f"Was not able to load {module_name} from {fname}")
72
- continue
73
- if spec.loader is None:
74
- logger.warning(f"Loader is None for {module_name} from {fname}")
75
- continue
76
- module = module_from_spec(spec)
77
- spec.loader.exec_module(module)
78
-
79
- modules[task_name] = PythonModule(module_name, str(fname.resolve()))
80
-
81
- return modules
82
-
83
-
84
- def handle_launch(task: str) -> None:
85
- """
86
- Launch and execute a specified Hafnia task.
87
-
88
- This function verifies the Hafnia environment status, locates the task script,
89
- and executes it in a subprocess with output streaming.
90
-
91
- Args:
92
- task (str): Name of the task to execute
93
-
94
- Raises:
95
- ValueError: If the task is not found or scripts directory is not in PYTHONPATH
96
- """
97
- recipe_dir = os.getenv("RECIPE_DIR", None)
98
- if recipe_dir is None:
99
- raise ValueError("RECIPE_DIR environment variable not set.")
100
- handle_mount(recipe_dir)
101
- scripts_dir = [p for p in sys.path if "scripts" in p][0]
102
- scripts = collect_python_modules(Path(scripts_dir))
103
- if task not in scripts:
104
- available_tasks = ", ".join(sorted(scripts.keys()))
105
- logger.error(f"Task '{task}' not found. Available tasks: {available_tasks}")
106
- exit(1)
107
- try:
108
- subprocess.check_call(["python", scripts[task].runner_path], stdout=sys.stdout, stderr=sys.stdout)
109
- except subprocess.CalledProcessError as e:
110
- logger.error(f"Error executing task: {str(e)}")
111
- exit(1)
@@ -1,29 +0,0 @@
1
- cli/__init__.py,sha256=47DEQpj8HBSa-_TImW-5JCeuQeRkm5NMpJWZG3hSuFU,0
2
- cli/__main__.py,sha256=8JgZHtFpWAOUlEFvV0YWviLwesWSA-PTYH_v9COl2xw,1786
3
- cli/config.py,sha256=Js_dCn39l7hLhA3ovHorOyVqj-LCLzUg_figSy4jNjs,5279
4
- cli/consts.py,sha256=ybpWMhjkrqqevL7eVmYtdn_13a5-bV_5lCpA6_Wzcz0,964
5
- cli/data_cmds.py,sha256=BQiythAPwAwudgdUa68v50a345uw5flrcDiBHLGp9lo,1460
6
- cli/experiment_cmds.py,sha256=nJCnI0kzmFJ1_vmxIzOYWk_2eiiw1Ub0j02jXi2vW_s,2239
7
- cli/profile_cmds.py,sha256=Rg-5wLHSWlZhNPUZBO7LdyJS-Y-SgI6qKLoAac2gSdk,2534
8
- cli/recipe_cmds.py,sha256=TnUAoO643NeSio8akVUEJHs6Ttuu2JuprxyTPqzzb4k,1592
9
- cli/runc_cmds.py,sha256=6qvVfjxQ_1nkm7lrrIzYETdnBzfiXrmdnWo4jpbbdPk,4830
10
- hafnia/__init__.py,sha256=Zphq-cQoX95Z11zm4lkrU-YiAJxddR7IBfwDkxeHoDE,108
11
- hafnia/http.py,sha256=rID6Krn9wRGXwsJYvpffsFlt5cwxFgkcihYppqtdT-8,2974
12
- hafnia/log.py,sha256=ii--Q6IThsWOluRp_Br9WGhwBtKChU80BXk5pK_NU5A,819
13
- hafnia/torch_helpers.py,sha256=P_Jl4IwqUebKVCOXNe6iTorJZA3S-3d92HV274UHIko,7456
14
- hafnia/utils.py,sha256=jLq2S8n7W4HS7TsXnDgxTze463Mcatd_wC6pd54a7Os,4221
15
- hafnia/data/__init__.py,sha256=Pntmo_1fst8OhyrHB60jQ8mhJJ4hL38tdjLvt0YXEJo,73
16
- hafnia/data/factory.py,sha256=scsXrAHlBEP16AJH8RyQ1fyzhei5GxIwsmMgwEru3Pc,2536
17
- hafnia/experiment/__init__.py,sha256=OEFE6HqhO5zcTCLZcPcPVjIg7wMFFnvZ1uOtAVhRz7M,85
18
- hafnia/experiment/hafnia_logger.py,sha256=8baV6SUtCVIijypU-FfgAOIyWIf_eeJ5a62oFzQesmc,6794
19
- hafnia/platform/__init__.py,sha256=Oz1abs40hEKspLg6mVIokdtsp1tZJF9Pndv8uSMOgtQ,522
20
- hafnia/platform/api.py,sha256=aJvlQGjzqm-D3WYb2xTEYX60YoJoWN_kyYdlkvqt_MI,382
21
- hafnia/platform/builder.py,sha256=6xLy64a4cytMZEfqiA0kPzxiATEBbHXmDbf7igTMAiM,6595
22
- hafnia/platform/download.py,sha256=AWnlSYj9FD7GvZ_-9Sw5jrcxi3RyBSSUVph8U9T9ZbQ,4711
23
- hafnia/platform/executor.py,sha256=8E6cGmEMr5xYb3OReBuWj8ZnVXc0Es0UkfPamsmjH4g,3759
24
- hafnia/platform/experiment.py,sha256=951ppXdrp075pW2xGFOM0oiGYGE1I53tP9azQjjIUe8,2305
25
- hafnia-0.1.25.dist-info/METADATA,sha256=Q5dBhUXq-6lgaIVwR2ndWPsF7GFu4m8-G7dIjcW0iug,8660
26
- hafnia-0.1.25.dist-info/WHEEL,sha256=qtCwoSJWgHk21S1Kb4ihdzI2rlJ1ZKaIurTj_ngOhyQ,87
27
- hafnia-0.1.25.dist-info/entry_points.txt,sha256=FCJVIQ8GP2VE9I3eeGVF5eLxVDNW_01pOJCpG_CGnMM,45
28
- hafnia-0.1.25.dist-info/licenses/LICENSE,sha256=wLZw1B7_mod_CO1H8LXqQgfqlWD6QceJR8--LJYRZGE,1078
29
- hafnia-0.1.25.dist-info/RECORD,,