mlarray 0.0.52__tar.gz → 0.0.53__tar.gz

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (58)
  1. {mlarray-0.0.52 → mlarray-0.0.53}/PKG-INFO +2 -10
  2. {mlarray-0.0.52 → mlarray-0.0.53}/README.md +1 -9
  3. mlarray-0.0.53/docs/cli.md +29 -0
  4. mlarray-0.0.53/mlarray/cli.py +60 -0
  5. {mlarray-0.0.52 → mlarray-0.0.53}/mlarray/mlarray.py +116 -49
  6. {mlarray-0.0.52 → mlarray-0.0.53}/mlarray.egg-info/PKG-INFO +2 -10
  7. {mlarray-0.0.52 → mlarray-0.0.53}/mlarray.egg-info/SOURCES.txt +0 -8
  8. {mlarray-0.0.52 → mlarray-0.0.53}/tests/test_optimization.py +0 -38
  9. mlarray-0.0.52/bench/.gitignore +0 -2
  10. mlarray-0.0.52/bench/README.md +0 -56
  11. mlarray-0.0.52/bench/bench_convert_nii_to_mla_random_read.py +0 -586
  12. mlarray-0.0.52/bench/bench_io_blosc2_layouts.py +0 -1178
  13. mlarray-0.0.52/bench/helper/print_mla_layouts.py +0 -85
  14. mlarray-0.0.52/docs/cli.md +0 -34
  15. mlarray-0.0.52/mlarray/blosc2_layout_strategies.py +0 -766
  16. mlarray-0.0.52/mlarray/cli.py +0 -133
  17. mlarray-0.0.52/tests/test_cli.py +0 -139
  18. {mlarray-0.0.52 → mlarray-0.0.53}/.github/workflows/workflow.yml +0 -0
  19. {mlarray-0.0.52 → mlarray-0.0.53}/.gitignore +0 -0
  20. {mlarray-0.0.52 → mlarray-0.0.53}/LICENSE +0 -0
  21. {mlarray-0.0.52 → mlarray-0.0.53}/MANIFEST.in +0 -0
  22. {mlarray-0.0.52 → mlarray-0.0.53}/assets/banner.png +0 -0
  23. {mlarray-0.0.52 → mlarray-0.0.53}/assets/banner.png~ +0 -0
  24. {mlarray-0.0.52 → mlarray-0.0.53}/docs/api.md +0 -0
  25. {mlarray-0.0.52 → mlarray-0.0.53}/docs/index.md +0 -0
  26. {mlarray-0.0.52 → mlarray-0.0.53}/docs/optimization.md +0 -0
  27. {mlarray-0.0.52 → mlarray-0.0.53}/docs/schema.md +0 -0
  28. {mlarray-0.0.52 → mlarray-0.0.53}/docs/usage.md +0 -0
  29. {mlarray-0.0.52 → mlarray-0.0.53}/docs/why.md +0 -0
  30. {mlarray-0.0.52 → mlarray-0.0.53}/examples/example_asarray.py +0 -0
  31. {mlarray-0.0.52 → mlarray-0.0.53}/examples/example_bboxes_only.py +0 -0
  32. {mlarray-0.0.52 → mlarray-0.0.53}/examples/example_channel.py +0 -0
  33. {mlarray-0.0.52 → mlarray-0.0.53}/examples/example_compress_decompress.py +0 -0
  34. {mlarray-0.0.52 → mlarray-0.0.53}/examples/example_compressed_vs_uncompressed.py +0 -0
  35. {mlarray-0.0.52 → mlarray-0.0.53}/examples/example_in_memory_constructors.py +0 -0
  36. {mlarray-0.0.52 → mlarray-0.0.53}/examples/example_metadata_only.py +0 -0
  37. {mlarray-0.0.52 → mlarray-0.0.53}/examples/example_non_spatial.py +0 -0
  38. {mlarray-0.0.52 → mlarray-0.0.53}/examples/example_open.py +0 -0
  39. {mlarray-0.0.52 → mlarray-0.0.53}/examples/example_save_load.py +0 -0
  40. {mlarray-0.0.52 → mlarray-0.0.53}/mkdocs.yml +0 -0
  41. {mlarray-0.0.52 → mlarray-0.0.53}/mlarray/__init__.py +0 -0
  42. {mlarray-0.0.52 → mlarray-0.0.53}/mlarray/meta.py +0 -0
  43. {mlarray-0.0.52 → mlarray-0.0.53}/mlarray/utils.py +0 -0
  44. {mlarray-0.0.52 → mlarray-0.0.53}/mlarray.egg-info/dependency_links.txt +0 -0
  45. {mlarray-0.0.52 → mlarray-0.0.53}/mlarray.egg-info/entry_points.txt +0 -0
  46. {mlarray-0.0.52 → mlarray-0.0.53}/mlarray.egg-info/requires.txt +0 -0
  47. {mlarray-0.0.52 → mlarray-0.0.53}/mlarray.egg-info/top_level.txt +0 -0
  48. {mlarray-0.0.52 → mlarray-0.0.53}/pyproject.toml +0 -0
  49. {mlarray-0.0.52 → mlarray-0.0.53}/setup.cfg +0 -0
  50. {mlarray-0.0.52 → mlarray-0.0.53}/tests/test_asarray.py +0 -0
  51. {mlarray-0.0.52 → mlarray-0.0.53}/tests/test_bboxes.py +0 -0
  52. {mlarray-0.0.52 → mlarray-0.0.53}/tests/test_compress_decompress.py +0 -0
  53. {mlarray-0.0.52 → mlarray-0.0.53}/tests/test_constructors.py +0 -0
  54. {mlarray-0.0.52 → mlarray-0.0.53}/tests/test_create.py +0 -0
  55. {mlarray-0.0.52 → mlarray-0.0.53}/tests/test_meta_safety.py +0 -0
  56. {mlarray-0.0.52 → mlarray-0.0.53}/tests/test_metadata.py +0 -0
  57. {mlarray-0.0.52 → mlarray-0.0.53}/tests/test_open.py +0 -0
  58. {mlarray-0.0.52 → mlarray-0.0.53}/tests/test_usage.py +0 -0
@@ -1,6 +1,6 @@
1
1
  Metadata-Version: 2.4
2
2
  Name: mlarray
3
- Version: 0.0.52
3
+ Version: 0.0.53
4
4
  Summary: Array format specialized for Machine Learning with Blosc2 backend and standardized metadata.
5
5
  Author-email: Karol Gotkowski <karol.gotkowski@dkfz.de>
6
6
  License: MIT
@@ -236,18 +236,10 @@ mlarray_header sample.mla
236
236
 
237
237
  ### mlarray_convert
238
238
 
239
- Convert between MLArray and NIfTI/NRRD files.
240
-
241
- When converting from NIfTI/NRRD to MLArray, source metadata is copied into
242
- `meta.source`.
243
-
244
- When converting from MLArray to NIfTI/NRRD, only `meta.source` is copied into
245
- the output header. Spatial metadata (`spacing`, `origin`, `direction`) is set
246
- explicitly from `meta.spatial`.
239
+ Convert a NIfTI or NRRD file to MLArray and copy metadata.
247
240
 
248
241
  ```bash
249
242
  mlarray_convert sample.nii.gz output.mla
250
- mlarray_convert sample.mla output.nii.gz
251
243
  ```
252
244
 
253
245
  ## Contributing
@@ -202,18 +202,10 @@ mlarray_header sample.mla
202
202
 
203
203
  ### mlarray_convert
204
204
 
205
- Convert between MLArray and NIfTI/NRRD files.
206
-
207
- When converting from NIfTI/NRRD to MLArray, source metadata is copied into
208
- `meta.source`.
209
-
210
- When converting from MLArray to NIfTI/NRRD, only `meta.source` is copied into
211
- the output header. Spatial metadata (`spacing`, `origin`, `direction`) is set
212
- explicitly from `meta.spatial`.
205
+ Convert a NIfTI or NRRD file to MLArray and copy metadata.
213
206
 
214
207
  ```bash
215
208
  mlarray_convert sample.nii.gz output.mla
216
- mlarray_convert sample.mla output.nii.gz
217
209
  ```
218
210
 
219
211
  ## Contributing
@@ -0,0 +1,29 @@
1
+ # CLI
2
+
3
+ MLArray includes a small command-line interface for common tasks such as **inspecting file headers** and **converting existing image formats** into MLArray. This is especially useful when you want to quickly verify metadata, debug a dataset, or batch-convert files without writing Python code.
4
+
5
+ The CLI currently focuses on core workflows (header inspection and conversion). Support for converting a wider range of image formats will be added over time.
6
+
7
+ ---
8
+
9
+ ## `mlarray_header`
10
+
11
+ Print the metadata header from a `.mla` file.
12
+
13
+ This command is useful for quickly checking spatial metadata, stored schemas, and other file-level information without loading the full array into memory.
14
+
15
+ ```bash
16
+ mlarray_header sample.mla
17
+ ```
18
+
19
+ ---
20
+
21
+ ## `mlarray_convert`
22
+
23
+ Convert a NIfTI or NRRD file to MLArray and copy metadata.
24
+
25
+ This provides an easy way to bring existing medical imaging data into an MLArray-based workflow while preserving the original metadata for downstream analysis and visualization.
26
+
27
+ ```bash
28
+ mlarray_convert sample.nii.gz output.mla
29
+ ```
@@ -0,0 +1,60 @@
1
+ import argparse
2
+ import json
3
+ from typing import Union
4
+ from pathlib import Path
5
+ from mlarray import MLArray
6
+ from mlarray.meta import _meta_internal_write
7
+
8
+ try:
9
+ from medvol import MedVol
10
+ except ImportError:
11
+ MedVol = None
12
+
13
+
14
+ def print_header(filepath: Union[str, Path]) -> None:
15
+ """Print the MLArray metadata header for a file.
16
+
17
+ Args:
18
+ filepath: Path to a ".mla" file.
19
+ """
20
+ meta = MLArray(filepath).meta
21
+ if meta is None:
22
+ print("null")
23
+ return
24
+ print(json.dumps(meta.to_plain(include_none=True), indent=2, sort_keys=True))
25
+
26
+
27
+ def convert_to_mlarray(load_filepath: Union[str, Path], save_filepath: Union[str, Path]):
28
+ if MedVol is None:
29
+ raise RuntimeError("medvol is required for mlarray_convert; install with 'pip install mlarray[all]'.")
30
+ image_meta_format = None
31
+ if str(load_filepath).endswith((".nii.gz", ".nii")):
32
+ image_meta_format = "nifti"
33
+ elif str(load_filepath).endswith(".nrrd"):
34
+ image_meta_format = "nrrd"
35
+ image_medvol = MedVol(load_filepath)
36
+ image_mlarray = MLArray(image_medvol.array, spacing=image_medvol.spacing, origin=image_medvol.origin, direction=image_medvol.direction, meta=image_medvol.header)
37
+ with _meta_internal_write():
38
+ image_mlarray.meta._image_meta_format = image_meta_format
39
+ image_mlarray.save(save_filepath)
40
+
41
+
42
+ def cli_print_header() -> None:
43
+ parser = argparse.ArgumentParser(
44
+ prog="mlarray_header",
45
+ description="Print the MLArray metadata header for a file.",
46
+ )
47
+ parser.add_argument("filepath", help="Path to a .mla file.")
48
+ args = parser.parse_args()
49
+ print_header(args.filepath)
50
+
51
+
52
+ def cli_convert_to_mlarray() -> None:
53
+ parser = argparse.ArgumentParser(
54
+ prog="mlarray_convert",
55
+ description="Convert a NIfTI or NRRD file to MLArray and copy all metadata.",
56
+ )
57
+ parser.add_argument("load_filepath", help="Path to the NIfTI (.nii.gz, .nii) or NRRD (.nrrd) file to load.")
58
+ parser.add_argument("save_filepath", help="Path to the MLArray (.mla) file to save.")
59
+ args = parser.parse_args()
60
+ convert_to_mlarray(args.load_filepath, args.save_filepath)
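The suffix check in `convert_to_mlarray` above can be looked at in isolation. The following is a minimal sketch, not the package's code: the helper name `detect_image_meta_format` is hypothetical, and it only mirrors the three suffixes handled in the hunk.

```python
from pathlib import Path
from typing import Optional, Union


def detect_image_meta_format(filepath: Union[str, Path]) -> Optional[str]:
    """Return "nifti", "nrrd", or None based on the file suffix.

    Hypothetical helper for illustration only; it mirrors the suffix
    checks in the hunk above and is not part of mlarray.
    """
    name = str(filepath)
    # str.endswith accepts a tuple, so one call covers both NIfTI suffixes.
    if name.endswith((".nii.gz", ".nii")):
        return "nifti"
    if name.endswith(".nrrd"):
        return "nrrd"
    # Unknown suffixes map to None, like the initial value in the hunk.
    return None
```

As in the hunk, an unrecognized suffix is not an error at this stage; the format marker simply stays `None`.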
@@ -12,9 +12,6 @@ from mlarray.meta import (
12
12
  _spatial_axis_mask,
13
13
  _meta_internal_write,
14
14
  )
15
- from mlarray.blosc2_layout_strategies import (
16
- comp_blosc2_params_spatial_only_magnitude,
17
- )
18
15
  from mlarray.utils import is_serializable
19
16
  import pickle
20
17
  import gzip
@@ -1327,8 +1324,8 @@ class MLArray:
1327
1324
  @classmethod
1328
1325
  def comp_blosc2_params(
1329
1326
  cls,
1330
- image_size: Tuple[int, ...],
1331
- patch_size: Tuple[int, ...],
1327
+ image_size: Union[Tuple[int, int], Tuple[int, int, int], Tuple[int, int, int, int]],
1328
+ patch_size: Union[Tuple[int, int], Tuple[int, int, int]],
1332
1329
  spatial_axis_mask: Optional[list[bool]] = None,
1333
1330
  bytes_per_pixel: int = 4, # 4 bytes per element (float32)
1334
1331
  l1_cache_size_per_core_in_bytes: int = 32768, # 1 Kibibyte (KiB) = 2^10 Byte; 32 KiB = 32768 Byte
@@ -1336,35 +1333,31 @@ class MLArray:
1336
1333
  safety_factor: float = 0.8 # we don't fill the caches to the brim; 0.8 means we target 80% of the caches
1337
1334
  ):
1338
1335
  """
1339
- Compute recommended Blosc2 chunk and block sizes from a patch-size hint.
1340
-
1341
- This method uses the ``comp_blosc2_params_spatial_only_magnitude``
1342
- strategy from :mod:`mlarray.blosc2_layout_strategies`.
1343
-
1344
- Strategy summary:
1345
- 1. Split axes into spatial and non-spatial using
1346
- ``spatial_axis_mask``.
1347
- 2. Keep non-spatial axes at ``1`` in both blocks and chunks so the
1348
- layout is driven by spatial patch sampling instead of stretching
1349
- cache budgets across non-spatial dimensions.
1350
- 3. Grow block sizes along spatial axes under the L1 cache budget.
1351
- Growth is weighted by the relative magnitude of the requested
1352
- patch size, so larger patch axes are allowed to grow faster.
1353
- 4. Grow chunk sizes in multiples of the block sizes under the L3
1354
- cache budget, again weighted by patch-size magnitude.
1355
- 5. Enforce structural constraints that keep the layout regular:
1356
- non-clipped spatial axes stay even, and non-clipped chunk axes
1357
- remain multiples of their corresponding block axes.
1358
-
1359
- This strategy supports arbitrary numbers of spatial and non-spatial axes
1360
- as long as the patch size dimensionality matches the number of spatial axes.
1336
+ Computes a recommended block and chunk size for saving arrays with Blosc2.
1337
+
1338
+ Blosc2 NDIM documentation:
1339
+ "Having a second partition allows for greater flexibility in fitting different partitions to different CPU cache levels.
1340
+ Typically, the first partition (also known as chunks) should be sized to fit within the L3 cache,
1341
+ while the second partition (also known as blocks) should be sized to fit within the L2 or L1 caches,
1342
+ depending on whether the priority is compression ratio or speed."
1343
+ (Source: https://www.blosc.org/posts/blosc2-ndim-intro/)
1344
+
1345
+ Our approach is not fully optimized for this yet.
1346
+ Currently, we aim to fit the uncompressed block within the L1 cache and accept that it might occasionally spill over into L2.
1347
+
1348
+ Note: This configuration is specifically optimized for nnU-Net data loading, where each read operation is performed by a single core, so multi-threading is not an option.
1349
+
1350
+ The default cache values are based on an older Intel 4110 CPU with 32KB L1, 128KB L2, and 1408KB L3 cache per core.
1351
+ We have not optimized further for modern CPUs with larger caches, because our data must remain compatible with older systems.
1361
1352
 
1362
1353
  Args:
1363
- image_size (Tuple[int, ...]): Full array shape.
1364
- patch_size (Tuple[int, ...]): Patch size over spatial axes only.
1365
- spatial_axis_mask (Optional[list[bool]]): Mask indicating for every
1366
- array axis whether it is spatial. If omitted, all axes are
1367
- treated as spatial.
1354
+ image_size (Union[Tuple[int, int], Tuple[int, int, int], Tuple[int, int, int, int]]):
1355
+ Image shape. Use a 2D, 3D, or 4D size; 2D/3D inputs are
1356
+ internally expanded to 4D (with non-spatial axes first).
1357
+ patch_size (Union[Tuple[int, int], Tuple[int, int, int]]): Patch
1358
+ size for spatial dimensions. Use a 2-tuple (x, y) or 3-tuple
1359
+ (x, y, z).
1360
+ spatial_axis_mask (Optional[list[bool]]): Mask indicating for every axis whether it is spatial or not.
1368
1361
  bytes_per_pixel (int): Number of bytes per element. Defaults to 4
1369
1362
  for float32.
1370
1363
  l1_cache_size_per_core_in_bytes (int): L1 cache per core in bytes.
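The cache budgets described in the docstring reduce to simple arithmetic. The sketch below is illustrative only: `max_elements_in_cache` is a hypothetical helper, and the L3 figure is derived from the 1408KB stated in the docstring (taken as 1408 KiB), since the actual default argument is not visible in this diff.

```python
def max_elements_in_cache(
    cache_size_in_bytes: int,
    bytes_per_pixel: int = 4,
    safety_factor: float = 0.8,
) -> int:
    """Largest number of array elements that fits the cache budget.

    Hypothetical helper: the budget is the cache size scaled by the
    safety factor, divided by the size of one element.
    """
    budget_in_bytes = cache_size_in_bytes * safety_factor
    return int(budget_in_bytes // bytes_per_pixel)


# 32 KiB L1 with float32 elements and an 80% target.
l1_elements = max_elements_in_cache(32768)
# 1408 KiB L3, assuming the docstring's "1408KB" means KiB.
l3_elements = max_elements_in_cache(1408 * 1024)
```

With these numbers, a 16 x 16 x 16 float32 block (4096 elements) fits the L1 budget, while a 16 x 16 x 32 block (8192 elements) does not.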
@@ -1374,15 +1367,93 @@ class MLArray:
1374
1367
  Returns:
1375
1368
  Tuple[List[int], List[int]]: Recommended chunk size and block size.
1376
1369
  """
1377
- return comp_blosc2_params_spatial_only_magnitude(
1378
- image_size=tuple(int(v) for v in image_size),
1379
- patch_size=tuple(int(v) for v in patch_size),
1380
- spatial_axis_mask=spatial_axis_mask,
1381
- bytes_per_pixel=bytes_per_pixel,
1382
- l1_cache_size_per_core_in_bytes=l1_cache_size_per_core_in_bytes,
1383
- l3_cache_size_per_core_in_bytes=l3_cache_size_per_core_in_bytes,
1384
- safety_factor=safety_factor,
1385
- )
1370
+ def _move_index_list(a, src, dst):
1371
+ a = list(a)
1372
+ x = a.pop(src)
1373
+ a.insert(dst, x)
1374
+ return a
1375
+
1376
+ num_squeezes = 0
1377
+ if len(image_size) == 2:
1378
+ image_size = (1, 1, *image_size)
1379
+ num_squeezes = 2
1380
+ elif len(image_size) == 3:
1381
+ image_size = (1, *image_size)
1382
+ num_squeezes = 1
1383
+
1384
+ non_spatial_axis = None
1385
+ if spatial_axis_mask is not None:
1386
+ non_spatial_axis_mask = [not b for b in spatial_axis_mask]
1387
+ if sum(non_spatial_axis_mask) > 1:
1388
+ raise RuntimeError("Automatic blosc2 optimization currently only supports one non-spatial axis. Please set chunk and block size manually.")
1389
+ non_spatial_axis = next((i for i, v in enumerate(non_spatial_axis_mask) if v), None)
1390
+ if non_spatial_axis is not None:
1391
+ image_size = _move_index_list(image_size, non_spatial_axis+num_squeezes, 0)
1392
+
1393
+ if len(image_size) != 4:
1394
+ raise RuntimeError("Image size must be 4D.")
1395
+
1396
+ if not (len(patch_size) == 2 or len(patch_size) == 3):
1397
+ raise RuntimeError("Patch size must be 2D or 3D.")
1398
+
1399
+ non_spatial_size = image_size[0]
1400
+ if len(patch_size) == 2:
1401
+ patch_size = [1, *patch_size]
1402
+ patch_size = np.array(patch_size)
1403
+ block_size = np.array((non_spatial_size, *[2 ** (max(0, math.ceil(math.log2(i)))) for i in patch_size]))
1404
+
1405
+ # shrink the block size until it fits in L1
1406
+ estimated_nbytes_block = np.prod(block_size) * bytes_per_pixel
1407
+ while estimated_nbytes_block > (l1_cache_size_per_core_in_bytes * safety_factor):
1408
+ # pick largest deviation from patch_size that is not 1
1409
+ axis_order = np.argsort(block_size[1:] / patch_size)[::-1]
1410
+ idx = 0
1411
+ picked_axis = axis_order[idx]
1412
+ while block_size[picked_axis + 1] == 1:
1413
+ idx += 1
1414
+ picked_axis = axis_order[idx]
1415
+ # now reduce that axis to the next lowest power of 2
1416
+ block_size[picked_axis + 1] = 2 ** (max(0, math.floor(math.log2(block_size[picked_axis + 1] - 1))))
1417
+ block_size[picked_axis + 1] = min(block_size[picked_axis + 1], image_size[picked_axis + 1])
1418
+ estimated_nbytes_block = np.prod(block_size) * bytes_per_pixel
1419
+
1420
+ block_size = np.array([min(i, j) for i, j in zip(image_size, block_size)])
1421
+
1422
+ # note: there is no use extending the chunk size to 3d when we have a 2d patch size! This would unnecessarily
1423
+ # load data into L3
1424
+ # now tile the blocks into chunks until we hit image_size or the l3 cache per core limit
1425
+ chunk_size = deepcopy(block_size)
1426
+ estimated_nbytes_chunk = np.prod(chunk_size) * bytes_per_pixel
1427
+ while estimated_nbytes_chunk < (l3_cache_size_per_core_in_bytes * safety_factor):
1428
+ if patch_size[0] == 1 and all([i == j for i, j in zip(chunk_size[2:], image_size[2:])]):
1429
+ break
1430
+ if all([i == j for i, j in zip(chunk_size, image_size)]):
1431
+ break
1432
+ # find axis that deviates from block_size the most
1433
+ axis_order = np.argsort(chunk_size[1:] / block_size[1:])
1434
+ idx = 0
1435
+ picked_axis = axis_order[idx]
1436
+ while chunk_size[picked_axis + 1] == image_size[picked_axis + 1] or patch_size[picked_axis] == 1:
1437
+ idx += 1
1438
+ picked_axis = axis_order[idx]
1439
+ chunk_size[picked_axis + 1] += block_size[picked_axis + 1]
1440
+ chunk_size[picked_axis + 1] = min(chunk_size[picked_axis + 1], image_size[picked_axis + 1])
1441
+ estimated_nbytes_chunk = np.prod(chunk_size) * bytes_per_pixel
1442
+ if np.mean([i / j for i, j in zip(chunk_size[1:], patch_size)]) > 1.5:
1443
+ # chunk size should not exceed patch size * 1.5 on average
1444
+ chunk_size[picked_axis + 1] -= block_size[picked_axis + 1]
1445
+ break
1446
+ # better safe than sorry
1447
+ chunk_size = [min(i, j) for i, j in zip(image_size, chunk_size)]
1448
+
1449
+ if non_spatial_axis is not None:
1450
+ block_size = _move_index_list(block_size, 0, non_spatial_axis+num_squeezes)
1451
+ chunk_size = _move_index_list(chunk_size, 0, non_spatial_axis+num_squeezes)
1452
+
1453
+ block_size = block_size[num_squeezes:]
1454
+ chunk_size = chunk_size[num_squeezes:]
1455
+
1456
+ return [int(value) for value in chunk_size], [int(value) for value in block_size]
1386
1457
 
1387
1458
  def _open(
1388
1459
  self,
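The block-size step of `comp_blosc2_params` can be illustrated with a reduced version. This is a simplified sketch under stated assumptions, not the method itself: it handles spatial axes only, halves the chosen axis instead of recomputing the next lower power of two, clamps to the image size once at the end, and breaks ties by taking the first axis. The name `sketch_block_size` is hypothetical.

```python
import math
from typing import List, Sequence


def sketch_block_size(
    patch_size: Sequence[int],
    image_size: Sequence[int],
    bytes_per_pixel: int = 4,
    l1_cache_size_in_bytes: int = 32768,
    safety_factor: float = 0.8,
) -> List[int]:
    """Shrink a power-of-two block until it fits the L1 budget (sketch)."""
    # Start from the next power of two at or above each patch axis.
    block = [2 ** max(0, math.ceil(math.log2(p))) for p in patch_size]
    budget_in_bytes = l1_cache_size_in_bytes * safety_factor
    while math.prod(block) * bytes_per_pixel > budget_in_bytes:
        # Only axes larger than 1 can still be reduced.
        candidates = [i for i in range(len(block)) if block[i] > 1]
        # Pick the axis that overshoots its patch axis the most.
        axis = max(candidates, key=lambda i: block[i] / patch_size[i])
        # Halving a power of two gives the next lower power of two.
        block[axis] //= 2
    # A block can never be larger than the image itself.
    return [min(b, s) for b, s in zip(block, image_size)]
```

For a float32 image of shape (512, 512, 512) and a patch of (128, 128, 128), this sketch settles on a block of [16, 16, 16], which is 16384 bytes and therefore below the 80% L1 target of roughly 26214 bytes.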
@@ -1814,6 +1885,9 @@ class MLArray:
1814
1885
  MetaBlosc2: Validated Blosc2 metadata instance.
1815
1886
  """
1816
1887
  num_spatial_axes = sum(spatial_axis_mask)
1888
+ num_non_spatial_axes = sum([not b for b in spatial_axis_mask])
1889
+ if patch_size is not None and patch_size != "default" and (num_spatial_axes == 1 or num_spatial_axes > 3 or num_non_spatial_axes > 1):
1890
+ raise NotImplementedError("Chunk and block size optimization based on patch size is only implemented for 2D and 3D spatial images with at most one further non-spatial axis. Please set the chunk and block size manually or set to None for blosc2 to determine a chunk and block size.")
1817
1891
  if patch_size is not None and patch_size != "default" and (chunk_size is not None or block_size is not None):
1818
1892
  raise RuntimeError("patch_size and chunk_size / block_size cannot both be explicitly set.")
1819
1893
  if (chunk_size is not None and block_size is None) or (chunk_size is None and block_size is not None):
@@ -1830,14 +1904,7 @@ class MLArray:
1830
1904
  if chunk_size is not None or block_size is not None:
1831
1905
  patch_size = None
1832
1906
 
1833
- patch_size = [patch_size] * num_spatial_axes if isinstance(patch_size, int) else patch_size
1834
-
1835
- if patch_size is not None and num_spatial_axes == 0:
1836
- raise RuntimeError(
1837
- "Automatic patch-size optimization requires at least one spatial axis. "
1838
- "Set patch_size=None and provide chunk_size/block_size manually, "
1839
- "or let Blosc2 determine the layout."
1840
- )
1907
+ patch_size = [patch_size] * len(shape) if isinstance(patch_size, int) else patch_size
1841
1908
 
1842
1909
  if patch_size is not None:
1843
1910
  chunk_size, block_size = MLArray.comp_blosc2_params(shape, patch_size, spatial_axis_mask, bytes_per_pixel=dtype_itemsize)
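The new guard in this hunk depends only on the spatial axis mask. The following sketch restates the same condition as a predicate; `supports_patch_optimization` is a hypothetical name used for illustration and does not exist in mlarray.

```python
from typing import Sequence


def supports_patch_optimization(spatial_axis_mask: Sequence[bool]) -> bool:
    """Mirror the axis-count condition of the guard above (sketch).

    Returns False exactly when the hunk would raise NotImplementedError
    for an explicitly given patch size.
    """
    num_spatial_axes = sum(bool(b) for b in spatial_axis_mask)
    num_non_spatial_axes = len(spatial_axis_mask) - num_spatial_axes
    unsupported = (
        num_spatial_axes == 1
        or num_spatial_axes > 3
        or num_non_spatial_axes > 1
    )
    return not unsupported
```

A plain 2D or 3D image, optionally with one channel-like axis, passes the check; a single spatial axis, four or more spatial axes, or two non-spatial axes do not. This matches the two tests removed from `tests/test_optimization.py` in this release, which exercised the multi-axis cases.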
@@ -1,6 +1,6 @@
1
1
  Metadata-Version: 2.4
2
2
  Name: mlarray
3
- Version: 0.0.52
3
+ Version: 0.0.53
4
4
  Summary: Array format specialized for Machine Learning with Blosc2 backend and standardized metadata.
5
5
  Author-email: Karol Gotkowski <karol.gotkowski@dkfz.de>
6
6
  License: MIT
@@ -236,18 +236,10 @@ mlarray_header sample.mla
236
236
 
237
237
  ### mlarray_convert
238
238
 
239
- Convert between MLArray and NIfTI/NRRD files.
240
-
241
- When converting from NIfTI/NRRD to MLArray, source metadata is copied into
242
- `meta.source`.
243
-
244
- When converting from MLArray to NIfTI/NRRD, only `meta.source` is copied into
245
- the output header. Spatial metadata (`spacing`, `origin`, `direction`) is set
246
- explicitly from `meta.spatial`.
239
+ Convert a NIfTI or NRRD file to MLArray and copy metadata.
247
240
 
248
241
  ```bash
249
242
  mlarray_convert sample.nii.gz output.mla
250
- mlarray_convert sample.mla output.nii.gz
251
243
  ```
252
244
 
253
245
  ## Contributing
@@ -5,7 +5,6 @@ README.md
5
5
  mkdocs.yml
6
6
  pyproject.toml
7
7
  ./mlarray/__init__.py
8
- ./mlarray/blosc2_layout_strategies.py
9
8
  ./mlarray/cli.py
10
9
  ./mlarray/meta.py
11
10
  ./mlarray/mlarray.py
@@ -13,11 +12,6 @@ pyproject.toml
13
12
  .github/workflows/workflow.yml
14
13
  assets/banner.png
15
14
  assets/banner.png~
16
- bench/.gitignore
17
- bench/README.md
18
- bench/bench_convert_nii_to_mla_random_read.py
19
- bench/bench_io_blosc2_layouts.py
20
- bench/helper/print_mla_layouts.py
21
15
  docs/api.md
22
16
  docs/cli.md
23
17
  docs/index.md
@@ -36,7 +30,6 @@ examples/example_non_spatial.py
36
30
  examples/example_open.py
37
31
  examples/example_save_load.py
38
32
  mlarray/__init__.py
39
- mlarray/blosc2_layout_strategies.py
40
33
  mlarray/cli.py
41
34
  mlarray/meta.py
42
35
  mlarray/mlarray.py
@@ -49,7 +42,6 @@ mlarray.egg-info/requires.txt
49
42
  mlarray.egg-info/top_level.txt
50
43
  tests/test_asarray.py
51
44
  tests/test_bboxes.py
52
- tests/test_cli.py
53
45
  tests/test_compress_decompress.py
54
46
  tests/test_constructors.py
55
47
  tests/test_create.py
@@ -5,7 +5,6 @@ from pathlib import Path
5
5
  import numpy as np
6
6
 
7
7
  from mlarray import MLArray, MLARRAY_DEFAULT_PATCH_SIZE
8
- from mlarray.meta import MetaSpatial
9
8
 
10
9
 
11
10
  def _make_array(shape=(16, 32, 32), seed=0, dtype=np.float32):
@@ -112,43 +111,6 @@ class TestOptimizationExamples(unittest.TestCase):
112
111
  self.assertIsNotNone(loaded.meta.blosc2.chunk_size)
113
112
  self.assertIsNotNone(loaded.meta.blosc2.block_size)
114
113
 
115
- def test_patch_optimization_supports_multiple_non_spatial_axes(self):
116
- with tempfile.TemporaryDirectory() as tmpdir:
117
- array = _make_array(shape=(2, 3, 16, 32, 32))
118
- path = Path(tmpdir) / "multi-non-spatial.mla"
119
- axis_labels = [
120
- MetaSpatial.AxisLabel.channel,
121
- MetaSpatial.AxisLabel.temporal,
122
- MetaSpatial.AxisLabel.spatial_z,
123
- MetaSpatial.AxisLabel.spatial_y,
124
- MetaSpatial.AxisLabel.spatial_x,
125
- ]
126
-
127
- MLArray(array, axis_labels=axis_labels, patch_size=8).save(path)
128
- loaded = MLArray(path)
129
-
130
- self.assertEqual(loaded.meta.blosc2.patch_size, [8, 8, 8])
131
- self.assertEqual(len(loaded.meta.blosc2.chunk_size), 5)
132
- self.assertEqual(len(loaded.meta.blosc2.block_size), 5)
133
- self.assertEqual(loaded.meta.blosc2.chunk_size[:2], [1, 1])
134
- self.assertEqual(loaded.meta.blosc2.block_size[:2], [1, 1])
135
-
136
- def test_patch_optimization_supports_more_than_three_spatial_axes(self):
137
- array = _make_array(shape=(2, 6, 8, 10, 12))
138
- axis_labels = [
139
- MetaSpatial.AxisLabel.channel,
140
- MetaSpatial.AxisLabel.spatial,
141
- MetaSpatial.AxisLabel.spatial,
142
- MetaSpatial.AxisLabel.spatial,
143
- MetaSpatial.AxisLabel.spatial,
144
- ]
145
-
146
- image = MLArray(array, axis_labels=axis_labels, patch_size=(2, 4, 4, 6))
147
-
148
- self.assertEqual(image.meta.blosc2.patch_size, [2, 4, 4, 6])
149
- self.assertEqual(len(image.meta.blosc2.chunk_size), 5)
150
- self.assertEqual(len(image.meta.blosc2.block_size), 5)
151
-
152
114
 
153
115
  if __name__ == "__main__":
154
116
  unittest.main()
@@ -1,2 +0,0 @@
1
- data/
2
- results/
@@ -1,56 +0,0 @@
1
- # Benchmark Scripts
2
-
3
- This folder contains benchmarking scripts for MLArray IO/layout experiments.
4
-
5
- ## `bench_io_blosc2_layouts.py`
6
-
7
- Benchmarks IO throughput across:
8
-
9
- - layout method(s) based on `comp_blosc2_params` (currently baseline copy only)
10
- - image size tiers (`small`, `medium`, `large`, `very_large`)
11
- - 2D / 3D / 4D-total array cases with spatial and optional non-spatial axis
12
- - multiple patch sizes (2D and 3D patch vectors)
13
- - `MLArray.open(...)` mode/mmap combinations
14
- - operations:
15
- - `read_full`
16
- - `read_patch_random`
17
- - `write_patch_random`
18
- - warm and cold cache runs
19
-
20
- Outputs are printed to console and written to:
21
-
22
- - `bench/results/bench_io_blosc2_layouts.csv`
23
- - `bench/results/bench_io_blosc2_layouts.json`
24
-
25
- ### Example
26
-
27
- ```bash
28
- python bench/bench_io_blosc2_layouts.py \
29
- --tiers small medium \
30
- --runs 3 \
31
- --cache-mode both \
32
- --nthreads 1
33
- ```
34
-
35
- If you hit native segfaults in Blosc2 during long runs, isolate each measured run
36
- in a subprocess (slower, but robust):
37
-
38
- ```bash
39
- python bench/bench_io_blosc2_layouts.py \
40
- --tiers small medium \
41
- --runs 3 \
42
- --cache-mode both \
43
- --nthreads 1 \
44
- --isolate-runs
45
- ```
46
-
47
- ### Cold cache note (Linux)
48
-
49
- For cold-cache read measurements, the script drops Linux page cache **after the dataset has been created on disk and immediately before measured open/read runs**.
50
-
51
- This requires root:
52
-
53
- - run as root, or
54
- - run via `sudo`
55
-
56
- If cache dropping fails, those runs are recorded with error status in results.