careamics 0.0.11__py3-none-any.whl → 0.0.13__py3-none-any.whl
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- careamics/careamist.py +24 -7
- careamics/cli/utils.py +1 -1
- careamics/config/algorithms/n2v_algorithm_model.py +1 -1
- careamics/config/architectures/unet_model.py +3 -0
- careamics/config/callback_model.py +23 -34
- careamics/config/configuration.py +55 -4
- careamics/config/configuration_factories.py +288 -23
- careamics/config/data/__init__.py +2 -0
- careamics/config/data/data_model.py +41 -4
- careamics/config/data/ng_data_model.py +381 -0
- careamics/config/data/patching_strategies/__init__.py +14 -0
- careamics/config/data/patching_strategies/_overlapping_patched_model.py +103 -0
- careamics/config/data/patching_strategies/_patched_model.py +56 -0
- careamics/config/data/patching_strategies/random_patching_model.py +21 -0
- careamics/config/data/patching_strategies/sequential_patching_model.py +25 -0
- careamics/config/data/patching_strategies/tiled_patching_model.py +40 -0
- careamics/config/data/patching_strategies/whole_patching_model.py +12 -0
- careamics/config/inference_model.py +6 -3
- careamics/config/optimizer_models.py +1 -3
- careamics/config/support/supported_data.py +7 -0
- careamics/config/support/supported_patching_strategies.py +22 -0
- careamics/config/training_model.py +0 -2
- careamics/config/validators/validator_utils.py +4 -3
- careamics/dataset/dataset_utils/iterate_over_files.py +2 -2
- careamics/dataset/in_memory_dataset.py +2 -1
- careamics/dataset/iterable_dataset.py +2 -2
- careamics/dataset/iterable_pred_dataset.py +2 -2
- careamics/dataset/iterable_tiled_pred_dataset.py +2 -2
- careamics/dataset/patching/patching.py +3 -2
- careamics/dataset/tiling/lvae_tiled_patching.py +16 -6
- careamics/dataset/tiling/tiled_patching.py +2 -1
- careamics/dataset_ng/README.md +212 -0
- careamics/dataset_ng/dataset.py +229 -0
- careamics/dataset_ng/demos/bsd68_demo.ipynb +361 -0
- careamics/dataset_ng/demos/care_U2OS_demo.ipynb +330 -0
- careamics/dataset_ng/demos/demo_custom_image_stack.ipynb +734 -0
- careamics/dataset_ng/demos/demo_datamodule.ipynb +447 -0
- careamics/dataset_ng/{demo_dataset.ipynb → demos/demo_dataset.ipynb} +60 -53
- careamics/dataset_ng/{demo_patch_extractor.py → demos/demo_patch_extractor.py} +7 -9
- careamics/dataset_ng/demos/mouse_nuclei_demo.ipynb +292 -0
- careamics/dataset_ng/factory.py +451 -0
- careamics/dataset_ng/legacy_interoperability.py +170 -0
- careamics/dataset_ng/patch_extractor/__init__.py +3 -8
- careamics/dataset_ng/patch_extractor/demo_custom_image_stack_loader.py +7 -5
- careamics/dataset_ng/patch_extractor/image_stack/__init__.py +4 -1
- careamics/dataset_ng/patch_extractor/image_stack/czi_image_stack.py +360 -0
- careamics/dataset_ng/patch_extractor/image_stack/image_stack_protocol.py +5 -1
- careamics/dataset_ng/patch_extractor/image_stack/in_memory_image_stack.py +1 -1
- careamics/dataset_ng/patch_extractor/image_stack_loader.py +5 -75
- careamics/dataset_ng/patch_extractor/patch_extractor.py +5 -4
- careamics/dataset_ng/patch_extractor/patch_extractor_factory.py +114 -105
- careamics/dataset_ng/patching_strategies/__init__.py +6 -1
- careamics/dataset_ng/patching_strategies/patching_strategy_protocol.py +31 -0
- careamics/dataset_ng/patching_strategies/random_patching.py +5 -1
- careamics/dataset_ng/patching_strategies/sequential_patching.py +5 -5
- careamics/dataset_ng/patching_strategies/tiling_strategy.py +172 -0
- careamics/dataset_ng/patching_strategies/whole_sample.py +36 -0
- careamics/file_io/read/get_func.py +2 -1
- careamics/lightning/dataset_ng/__init__.py +1 -0
- careamics/lightning/dataset_ng/data_module.py +678 -0
- careamics/lightning/dataset_ng/lightning_modules/__init__.py +9 -0
- careamics/lightning/dataset_ng/lightning_modules/care_module.py +97 -0
- careamics/lightning/dataset_ng/lightning_modules/n2v_module.py +106 -0
- careamics/lightning/dataset_ng/lightning_modules/unet_module.py +212 -0
- careamics/lightning/lightning_module.py +5 -1
- careamics/lightning/predict_data_module.py +2 -1
- careamics/lightning/train_data_module.py +2 -1
- careamics/losses/loss_factory.py +2 -1
- careamics/lvae_training/dataset/__init__.py +8 -3
- careamics/lvae_training/dataset/config.py +3 -3
- careamics/lvae_training/dataset/ms_dataset_ref.py +1067 -0
- careamics/lvae_training/dataset/multich_dataset.py +46 -17
- careamics/lvae_training/dataset/multicrop_dset.py +196 -0
- careamics/lvae_training/dataset/types.py +3 -3
- careamics/lvae_training/dataset/utils/index_manager.py +259 -0
- careamics/lvae_training/eval_utils.py +93 -3
- careamics/model_io/bioimage/bioimage_utils.py +1 -1
- careamics/model_io/bioimage/model_description.py +1 -1
- careamics/model_io/bmz_io.py +1 -1
- careamics/model_io/model_io_utils.py +2 -2
- careamics/models/activation.py +2 -1
- careamics/prediction_utils/prediction_outputs.py +1 -1
- careamics/prediction_utils/stitch_prediction.py +1 -1
- careamics/transforms/compose.py +1 -0
- careamics/transforms/n2v_manipulate_torch.py +15 -9
- careamics/transforms/normalize.py +18 -7
- careamics/transforms/pixel_manipulation_torch.py +59 -92
- careamics/utils/lightning_utils.py +25 -11
- careamics/utils/metrics.py +2 -1
- careamics/utils/torch_utils.py +23 -0
- {careamics-0.0.11.dist-info → careamics-0.0.13.dist-info}/METADATA +12 -11
- {careamics-0.0.11.dist-info → careamics-0.0.13.dist-info}/RECORD +95 -69
- careamics/dataset_ng/dataset/__init__.py +0 -3
- careamics/dataset_ng/dataset/dataset.py +0 -184
- careamics/dataset_ng/demo_patch_extractor_factory.py +0 -37
- {careamics-0.0.11.dist-info → careamics-0.0.13.dist-info}/WHEEL +0 -0
- {careamics-0.0.11.dist-info → careamics-0.0.13.dist-info}/entry_points.txt +0 -0
- {careamics-0.0.11.dist-info → careamics-0.0.13.dist-info}/licenses/LICENSE +0 -0
careamics/config/support/supported_data.py

@@ -16,12 +16,15 @@ class SupportedData(str, BaseEnum):
         Array data.
     TIFF : str
         TIFF image data.
+    CZI : str
+        CZI image data.
     CUSTOM : str
         Custom data.
     """

     ARRAY = "array"
     TIFF = "tiff"
+    CZI = "czi"
     CUSTOM = "custom"
     # ZARR = "zarr"

@@ -78,6 +81,8 @@ class SupportedData(str, BaseEnum):
             raise NotImplementedError(f"Data '{data_type}' is not loaded from a file.")
         elif data_type == cls.TIFF:
             return "*.tif*"
+        elif data_type == cls.CZI:
+            return "*.czi"
         elif data_type == cls.CUSTOM:
             return "*.*"
         else:
@@ -102,6 +107,8 @@ class SupportedData(str, BaseEnum):
             raise NotImplementedError(f"Data '{data_type}' is not loaded from a file.")
         elif data_type == cls.TIFF:
             return ".tiff"
+        elif data_type == cls.CZI:
+            return ".czi"
         elif data_type == cls.CUSTOM:
             # TODO: improve this message
             raise NotImplementedError("Custom extensions have to be passed elsewhere.")
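The new CZI branches mirror the existing TIFF ones: one glob pattern for discovering files and one canonical extension. As a rough illustration of how such a pattern can be used to gather input files (the `collect_files` helper below is hypothetical, not part of careamics):

```python
from pathlib import Path


def collect_files(data_dir: Path, pattern: str) -> list[Path]:
    """Collect files matching a glob pattern such as "*.czi" or "*.tif*"."""
    return sorted(data_dir.glob(pattern))


# Hypothetical usage: gather CZI acquisitions from a folder.
czi_files = collect_files(Path("data/acquisitions"), "*.czi")
```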
careamics/config/support/supported_patching_strategies.py (new file)

@@ -0,0 +1,22 @@
+"""Patching strategies supported by Careamics."""
+
+from careamics.utils import BaseEnum
+
+
+class SupportedPatchingStrategy(str, BaseEnum):
+    """Patching strategies supported by Careamics."""
+
+    FIXED_RANDOM = "fixed_random"
+    """Fixed random patching strategy, used during training."""
+
+    RANDOM = "random"
+    """Random patching strategy, used during training."""
+
+    # SEQUENTIAL = "sequential"
+    # """Sequential patching strategy, used during training."""
+
+    TILED = "tiled"
+    """Tiled patching strategy, used during prediction."""
+
+    WHOLE = "whole"
+    """Whole image patching strategy, used during prediction."""
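Because the enum members subclass `str`, configuration strings map directly onto them and compare equal to their values. A minimal sketch using a plain `enum.Enum` stand-in (careamics' `BaseEnum` adds convenience helpers not reproduced here):

```python
from enum import Enum


class PatchingStrategyDemo(str, Enum):
    """Stand-in for SupportedPatchingStrategy, for illustration only."""

    FIXED_RANDOM = "fixed_random"
    RANDOM = "random"
    TILED = "tiled"
    WHOLE = "whole"


# A config string resolves to a member, and members compare equal to plain strings.
strategy = PatchingStrategyDemo("tiled")
assert strategy is PatchingStrategyDemo.TILED
assert strategy == "tiled"
```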
careamics/config/training_model.py

@@ -39,8 +39,6 @@ class TrainingConfig(BaseModel):
     """Maximum number of steps to train for. -1 means no limit."""
     check_val_every_n_epoch: int = Field(default=1, ge=1)
     """Validation step frequency."""
-    enable_progress_bar: bool = Field(default=True)
-    """Whether to enable the progress bar."""
     accumulate_grad_batches: int = Field(default=1, ge=1)
     """Number of batches to accumulate gradients over before stepping the optimizer."""
     gradient_clip_val: Optional[Union[int, float]] = None
careamics/config/validators/validator_utils.py

@@ -4,7 +4,8 @@ Validator functions.
 These functions are used to validate dimensions and axes of inputs.
 """

-from
+from collections.abc import Sequence
+from typing import Optional

 _AXES = "STCZYX"

@@ -79,14 +80,14 @@ def value_ge_than_8_power_of_2(


 def patch_size_ge_than_8_power_of_2(
-    patch_list: Optional[
+    patch_list: Optional[Sequence[int]],
 ) -> None:
     """
     Validate that each entry is greater or equal than 8 and a power of 2.

     Parameters
     ----------
-    patch_list :
+    patch_list : Sequence of int, or None
         Patch size.

     Raises
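The constraint named in the docstring, greater than or equal to 8 and a power of two, is commonly expressed with a bit trick; a small illustration of the check (not the careamics implementation itself):

```python
def is_valid_patch_dim(value: int) -> bool:
    """Return True if value is >= 8 and a power of two (8, 16, 32, ...)."""
    return value >= 8 and (value & (value - 1)) == 0


assert is_valid_patch_dim(64)
assert not is_valid_patch_dim(12)
```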
careamics/dataset/dataset_utils/iterate_over_files.py

@@ -2,9 +2,9 @@

 from __future__ import annotations

-from collections.abc import Generator
+from collections.abc import Callable, Generator
 from pathlib import Path
-from typing import
+from typing import Optional, Union

 from numpy.typing import NDArray
 from torch.utils.data import get_worker_info
careamics/dataset/in_memory_dataset.py

@@ -3,8 +3,9 @@
 from __future__ import annotations

 import copy
+from collections.abc import Callable
 from pathlib import Path
-from typing import Any,
+from typing import Any, Optional, Union

 import numpy as np
 from torch.utils.data import Dataset
careamics/dataset/iterable_dataset.py

@@ -3,9 +3,9 @@
 from __future__ import annotations

 import copy
-from collections.abc import Generator
+from collections.abc import Callable, Generator
 from pathlib import Path
-from typing import
+from typing import Optional

 import numpy as np
 from torch.utils.data import IterableDataset
careamics/dataset/iterable_pred_dataset.py

@@ -2,9 +2,9 @@

 from __future__ import annotations

-from collections.abc import Generator
+from collections.abc import Callable, Generator
 from pathlib import Path
-from typing import Any
+from typing import Any

 from numpy.typing import NDArray
 from torch.utils.data import IterableDataset
careamics/dataset/iterable_tiled_pred_dataset.py

@@ -2,9 +2,9 @@

 from __future__ import annotations

-from collections.abc import Generator
+from collections.abc import Callable, Generator
 from pathlib import Path
-from typing import Any
+from typing import Any

 from numpy.typing import NDArray
 from torch.utils.data import IterableDataset
careamics/dataset/patching/patching.py

@@ -1,8 +1,9 @@
 """Patching functions."""

+from collections.abc import Callable
 from dataclasses import dataclass
 from pathlib import Path
-from typing import
+from typing import Union

 import numpy as np
 from numpy.typing import NDArray
@@ -89,7 +90,7 @@ def prepare_patches_supervised(
     """
     means, stds, num_samples = 0, 0, 0
     all_patches, all_targets = [], []
-    for train_filename, target_filename in zip(train_files, target_files):
+    for train_filename, target_filename in zip(train_files, target_files, strict=False):
         try:
             sample: np.ndarray = read_source_func(train_filename, axes)
             target: np.ndarray = read_source_func(target_filename, axes)
careamics/dataset/tiling/lvae_tiled_patching.py

@@ -78,7 +78,9 @@ def extract_tiles(
             ...,
             *[
                 slice(coords, coords + extent)
-                for coords, extent in zip(
+                for coords, extent in zip(
+                    crop_coords_start, tile_size, strict=False
+                )
             ],
         )
         tile = sample[crop_slices]
@@ -159,11 +161,14 @@ def compute_tile_info_legacy(

     # --- combine start and end
     stitch_coords = tuple(
-        (start, end)
+        (start, end)
+        for start, end in zip(stitch_coords_start, stitch_coords_end, strict=False)
     )
     overlap_crop_coords = tuple(
         (start, end)
-        for start, end in zip(
+        for start, end in zip(
+            overlap_crop_coords_start, overlap_crop_coords_end, strict=False
+        )
     )

     tile_info = TileInformation(
@@ -229,11 +234,14 @@ def compute_tile_info(

     # --- combine start and end
     stitch_coords = tuple(
-        (start, end)
+        (start, end)
+        for start, end in zip(stitch_coords_start, stitch_coords_end, strict=False)
     )
     overlap_crop_coords = tuple(
         (start, end)
-        for start, end in zip(
+        for start, end in zip(
+            overlap_crop_coords_start, overlap_crop_coords_end, strict=False
+        )
     )

     # --- Check if last tile
@@ -284,7 +292,9 @@ def compute_padding(
     pad_before = overlaps // 2
     pad_after = covered_shape - data_shape[-len(tile_size) :] - pad_before

-    return tuple(
+    return tuple(
+        (before, after) for before, after in zip(pad_before, pad_after, strict=False)
+    )


 def n_tiles_1d(axis_size: int, tile_size: int, overlap: int) -> int:
careamics/dataset/tiling/tiled_patching.py

@@ -127,7 +127,7 @@ def extract_tiles(
         # Rearrange crop coordinates from a list of coordinate pairs per axis to a list
         # grouped by type.
         all_crop_coords, all_stitch_coords, all_overlap_crop_coords = zip(
-            *crop_and_stitch_coords_list
+            *crop_and_stitch_coords_list, strict=False
         )

         # Maximum tile index
@@ -139,6 +139,7 @@ def extract_tiles(
             itertools.product(*all_crop_coords),
             itertools.product(*all_stitch_coords),
             itertools.product(*all_overlap_crop_coords),
+            strict=False,
         )
     ):
         # Extract tile from the sample
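The `strict=False` arguments added in these hunks (and in `patching.py` above) use Python 3.10's `zip(..., strict=...)` keyword: with `strict=False` the call keeps the historical behaviour of silently truncating to the shortest iterable, while `strict=True` raises `ValueError` on a length mismatch. A quick illustration:

```python
coords = [0, 32, 64]
extents = [32, 32]

# strict=False (as used in the hunks above): extra items are silently dropped.
assert list(zip(coords, extents, strict=False)) == [(0, 32), (32, 32)]

# strict=True: a length mismatch raises instead of being ignored.
try:
    list(zip(coords, extents, strict=True))
except ValueError:
    print("zip raised: iterables have different lengths")
```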
careamics/dataset_ng/README.md (new file)

@@ -0,0 +1,212 @@
+# The CAREamics Dataset
+
+Welcome to the CAREamics dataset!
+
+A PyTorch-based dataset, designed to be used with microscopy data. It is universal across the training, validation and prediction stages of a machine learning pipeline.
+
+The key ethos is to create a modular and maintainable dataset composed of swappable components that interact through interfaces. This should facilitate a smooth development process when extending the dataset with new features, and also enable advanced users to easily customize the dataset to their needs by writing custom components. This is achieved by following a few key software engineering principles, detailed at the end of this README file.
+
+
+## Dataset Component overview
+
+```mermaid
+---
+title: CAREamicsDataset
+---
+classDiagram
+    class CAREamicsDataset{
+        +PatchExtractor input_extractor
+        +Optional[PatchExtractor] target_extractor
+        +PatchingStrategy patching_strategy
+        +list~Transform~ transforms
+        +\_\_getitem\_\_(int index) NDArray
+    }
+    class PatchingStrategy{
+        <<interface>>
+        +n_patches int
+        +get_patch_spec(index: int) PatchSpecs
+    }
+    class RandomPatchingStrategy{
+    }
+    class FixedRandomPatchingStrategy{
+    }
+    class SequentialPatchingStrategy{
+    }
+    class TilingStrategy{
+        +get_patch_spec(index: int) TileSpecs
+    }
+
+    class PatchExtractor{
+        +list~ImageStack~ image_stacks
+        +extract_patch(PatchSpecs) NDArray
+    }
+    class PatchSpecs {
+        <<TypedDict>>
+        +int data_idx
+        +int sample_idx
+        +Sequence~int~ coords
+        +Sequence~int~ patch_size
+    }
+    class TileSpecs {
+        <<TypedDict>>
+        +Sequence~int~ crop_coords
+        +Sequence~int~ crop_size
+        +Sequence~int~ stitch_coords
+    }
+
+    class ImageStack{
+        <<interface>>
+        +Union[Path, Literal["array"]] source
+        +Sequence~int~ data_shape
+        +DTypeLike data_type
+        +extract_patch(sample_idx, coords, patch_size) NDArray
+    }
+    class InMemoryImageStack {
+    }
+    class ZarrImageStack {
+        +Path source
+    }
+
+    CAREamicsDataset --* PatchExtractor: Is composed of
+    CAREamicsDataset --* PatchingStrategy: Is composed of
+    PatchExtractor --o ImageStack: Aggregates
+    ImageStack <|-- InMemoryImageStack: Implements
+    ImageStack <|-- ZarrImageStack: Implements
+    PatchingStrategy <|-- RandomPatchingStrategy: Implements
+    PatchingStrategy <|-- FixedRandomPatchingStrategy: Implements
+    PatchingStrategy <|-- SequentialPatchingStrategy: Implements
+    PatchingStrategy <|-- TilingStrategy: Implements
+    PatchSpecs <|-- TileSpecs: Inherits from
+```
+
+### `ImageStack` and implementations
+
+This interface represents a set of image data, which can be saved with any subset of the
+axes STCZYX, in any order; see below for a description of the dimensions. The `ImageStack`
+interface's job is to act as an adapter for different data storage types, so that higher-level
+classes can access the image data without having to know the implementation details of
+how to load or read data from each storage type. This means we can decide to support new storage
+types by implementing a new concrete `ImageStack` class without having to change anything
+in the `CAREamicsDataset` class. Advanced users can also choose to create their own
+`ImageStack` class if they want to work with their own data storage type.
+
+The interface provides an `extract_patch` method which will produce a patch from the image,
+as a NumPy array, with the dimensions C(Z)YX. This method should be thought of as simply
+a wrapper for the equivalent of NumPy slicing for each of the storage types.
+
+#### Concrete implementations
+
+- `InMemoryImageStack`: The underlying data is stored as a NumPy array in memory. It has some
+additional constructor methods to load the data from known file formats such as TIFF files.
+- `ZarrImageStack`: The underlying data is stored as a Zarr file on disk.
+
+#### Axes description
+
+- S is a generic sample dimension,
+- T is a time dimension,
+- C is a channel dimension,
+- Z is a spatial dimension,
+- Y is a spatial dimension,
+- X is a spatial dimension.
+
+### `PatchExtractor`
+
+The `PatchExtractor` class aggregates many `ImageStack` instances; this allows multiple
+images with different dimensions, and possibly different storage types, to be treated as a single entity.
+The class has an `extract_patch` method to extract a patch from any one of its `ImageStack`
+objects. It can also be extended when extra patch-extraction logic is needed,
+for example when constructing lateral-context inputs for the MicroSplit LVAE models.
+
+### `PatchingStrategy`
+
+The `PatchingStrategy` class is an interface to generate patch specifications, where each of the
+concrete implementations produces a set of patch specifications using a different strategy.
+
+It has an `n_patches` attribute that can be accessed to find out how many patches the
+strategy will produce, given the shapes of the image stacks it has been initialized with.
+This is needed by the `CAREamicsDataset` to return its length.
+
+Most importantly, it has a `get_patch_spec` method that takes an index and returns a
+patch specification. For deterministic patching strategies, this method will always
+return the same patch specification given the same index, but there are also random strategies
+where the returned patch specification will change every time. The given index must always
+be smaller than `n_patches`.
+
+#### Concrete implementations
+
+- `RandomPatchingStrategy`: this strategy will produce random patches that will change
+even if the `get_patch_spec` method is called with the same index.
+- `FixedRandomPatchingStrategy`: this strategy will produce random patches, but the patch
+will be the same if the `get_patch_spec` method is called with the same index. This is
+useful for making sure validation is comparable from epoch to epoch.
+- `SequentialPatchingStrategy`: this strategy is deterministic and the patches will be
+sequential with some specified overlap.
+- `TilingStrategy`: this strategy is deterministic and the patches will be
+sequential with some specified overlap. Rather than a `PatchSpecs` dictionary, it produces
+a `TileSpecs` dictionary, which includes some extra fields that are used for
+stitching the tiles back together.
+
+#### PatchSpecs
+
+The `get_patch_spec` method returns a dictionary containing the keys `data_idx`, `sample_idx`, `coords` and `patch_size`.
+These are the exact arguments that the `PatchExtractor.extract_patch` method takes. The patch specification
+produced by the patching strategy is passed to the `PatchExtractor`, which in turn produces an image patch.
+
+For type hinting, `PatchSpecs` is defined as a `TypedDict`.
+
+## Key Principles
+
+The aim of all these principles is to create a system of interacting classes that have
+low coupling. This allows one part of the code to be changed or extended without breaking
+functionality elsewhere in the codebase.
+
+### Composition over inheritance
+
+The principle of composition over inheritance is that, rather than using inheritance to
+extend or change the behavior of a class, the class is composed of modules
+that can be swapped to extend or change its behavior.
+
+The reason to use composition is that it promotes easy reuse of the underlying
+components, prevents a subclass explosion, and leads to a maintainable and
+easily extendable design. A software architecture based on composition is normally
+maintainable and extendable because, if a component needs to change, the whole class
+does not have to be refactored, and a new feature can usually be supported by adding
+another component to the class.
+
+The `CAREamicsDataset` is composed of `PatchExtractor`, `PatchingStrategy` and `Transform` components.
+The `PatchingStrategy` classes implement an interface so the dataset can switch between
+different strategies. The `PatchExtractor` is composed of many `ImageStack` instances;
+new image stack implementations can be added to extend the types of data the dataset can read.
+
+### Dependency Inversion
+
+The dependency inversion principle states:
+
+1. High-level modules should not depend on low-level modules. Both high-level and
+low-level modules should depend on abstractions (e.g. interfaces).
+2. Abstractions should not depend on details (concrete implementations). Details should
+depend on abstractions.
+
+In other words, high-level modules that provide complex logic should be easily reusable
+and should not depend on implementation details of low-level modules that provide utility functionality.
+This can be achieved by introducing abstractions that decouple the high-level and low-level modules.
+
+An example of the dependency inversion principle in use is how the `PatchExtractor` only
+depends on the `ImageStack` interface, and does not need any knowledge of the
+concrete implementations. The concrete `ImageStack` implementations also do not have
+any knowledge of the `PatchExtractor` or any other higher-level functionality that the
+dataset needs.
+
+### Single Responsibility Principle
+
+Each component should have a small scope of responsibility that is easily defined. This
+should make the code easier to maintain and hopefully reduce the number of places in the
+code that have to change when introducing a new feature.
+
+- `ImageStack` responsibility: to act as an adapter for loading and reading image data
+from different underlying storage.
+- `PatchExtractor` responsibility: to extract patches from a set of image stacks.
+- `PatchingStrategy` responsibility: to produce patch specifications given an index, through
+an interface that hides the underlying implementation.
+- `CAREamicsDataset` responsibility: to orchestrate the interactions of its underlying
+components to produce an input patch (and target patch when required) given an index.