PyPI - datamint - Versions diffs - 2.3.5__tar.gz → 2.4.0__tar.gz - Mend

datamint 2.3.5__tar.gz → 2.4.0__tar.gz

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Potentially problematic release.

This version of datamint might be problematic.

Files changed (68)

datamint-2.4.0/PKG-INFO ADDED

@@ -0,0 +1,320 @@
+Metadata-Version: 2.4
+Name: datamint
+Version: 2.4.0
+Summary: A library for interacting with the Datamint API, designed for efficient data management, processing and Deep Learning workflows.
+Requires-Python: >=3.10
+Classifier: Programming Language :: Python :: 3
+Classifier: Programming Language :: Python :: 3.10
+Classifier: Programming Language :: Python :: 3.11
+Classifier: Programming Language :: Python :: 3.12
+Classifier: Programming Language :: Python :: 3.13
+Classifier: Programming Language :: Python :: 3.14
+Provides-Extra: dev
+Provides-Extra: docs
+Requires-Dist: Deprecated (>=1.2.0)
+Requires-Dist: aiohttp (>=3.0.0,<4.0.0)
+Requires-Dist: aioresponses (>=0.7.8,<0.8.0) ; extra == "dev"
+Requires-Dist: albumentations (>=2.0.0)
+Requires-Dist: backports-strenum ; python_version < "3.11"
+Requires-Dist: datamintapi (==0.0.*)
+Requires-Dist: httpx
+Requires-Dist: humanize (>=4.0.0,<5.0.0)
+Requires-Dist: lazy-loader (>=0.3.0)
+Requires-Dist: lightning (>=2.0.0,!=2.5.1,!=2.5.1.post0)
+Requires-Dist: matplotlib
+Requires-Dist: medimgkit (>=0.7.3)
+Requires-Dist: mlflow (>=2.0.0,<3.0.0)
+Requires-Dist: nest-asyncio (>=1.0.0,<2.0.0)
+Requires-Dist: nibabel (>=4.0.0)
+Requires-Dist: numpy
+Requires-Dist: opencv-python (>=4.0.0)
+Requires-Dist: pandas (>=2.0.0)
+Requires-Dist: platformdirs (>=4.0.0,<5.0.0)
+Requires-Dist: pydantic (>=2.6.4)
+Requires-Dist: pydicom (>=3.0.0,<4.0.0)
+Requires-Dist: pylibjpeg (>=2.0.0,<3.0.0)
+Requires-Dist: pylibjpeg-libjpeg (>=2.0.0,<3.0.0)
+Requires-Dist: pytest (>=7.0.0,<8.0.0) ; extra == "dev"
+Requires-Dist: pytest-cov (>=4.0.0,<5.0.0) ; extra == "dev"
+Requires-Dist: pyyaml (>=5.0.0)
+Requires-Dist: requests (>=2.0.0,<3.0.0)
+Requires-Dist: responses (>=0.20.0,<0.21.0) ; extra == "dev"
+Requires-Dist: rich (>=10.0.0)
+Requires-Dist: setuptools (>=57.0) ; extra == "docs"
+Requires-Dist: sphinx (>=5.0) ; extra == "docs"
+Requires-Dist: sphinx-tabs (>=3.0.0) ; extra == "docs"
+Requires-Dist: sphinx_rtd_theme (>=2.0.0) ; extra == "docs"
+Requires-Dist: torch (>=1.2.0,!=2.3.0)
+Requires-Dist: torchvision (>=0.18.0)
+Requires-Dist: tqdm (>=4.0.0,<5.0.0)
+Requires-Dist: typing_extensions (>=4.0.0)
+Description-Content-Type: text/markdown
+# Datamint Python API
+![Build Status](https://github.com/SonanceAI/datamint-python-api/actions/workflows/run_test.yaml/badge.svg)
+[![Python 3.10+](https://img.shields.io/badge/python-3.10+-blue.svg)](https://www.python.org/downloads/)
+A comprehensive Python SDK for interacting with the Datamint platform, providing seamless integration for medical imaging workflows, dataset management, and machine learning experiments.
+## 📋 Table of Contents
+- [Features](#-features)
+- [Installation](#-installation)
+- [Setup API key](#setup-api-key)
+- [Documentation](#-documentation)
+- [Key Components](#-key-components)
+- [Command Line Tools](#️-command-line-tools)
+- [Examples](#-examples)
+- [Support](#-support)
+## 🚀 Features
+- **Dataset Management**: Download, upload, and manage medical imaging datasets
+- **Annotation Tools**: Create, upload, and manage annotations (segmentations, labels, measurements)
+- **Experiment Tracking**: Integrated MLflow support for experiment management
+- **PyTorch Lightning Integration**: Streamlined ML workflows with Lightning DataModules and callbacks
+- **DICOM Support**: Native handling of DICOM files with anonymization capabilities
+- **Multi-format Support**: PNG, JPEG, NIfTI, and other medical imaging formats
+See the full documentation at https://sonanceai.github.io/datamint-python-api/
+## 📦 Installation
+> [!NOTE]
+> We recommend using a virtual environment to avoid package conflicts.
+### From PyPI
+To be released soon
+### From Source
+```bash
+pip install git+https://github.com/SonanceAI/datamint-python-api
+```
+### Virtual Environment Setup
+<details>
+<summary>Click to expand virtual environment setup instructions</summary>
+We recommend that you install Datamint in a dedicated virtual environment, to avoid conflicting with your system packages.
+For instance, create the environment once with `python3 -m venv datamint-env` and then activate it whenever you need it with:
+1. **Create the environment** (one-time setup):
+   ```bash
+   python3 -m venv datamint-env
+   ```
+2. **Activate the environment** (run whenever you need it):
+   | Platform | Command |
+   |----------|---------|
+   | Linux/macOS | `source datamint-env/bin/activate` |
+   | Windows CMD | `datamint-env\Scripts\activate.bat` |
+   | Windows PowerShell | `datamint-env\Scripts\Activate.ps1` |
+3. **Install the package**:
+   ```bash
+   pip install git+https://github.com/SonanceAI/datamint-python-api
+   ```
+</details>
+## Setup API key
+To use the Datamint API, you need to set up your API key (ask your administrator if you don't have one). Use one of the following methods:
+### Method 1: Command-line tool (recommended)
+Run ``datamint-config`` in the terminal and follow the instructions. See [command_line_tools](https://sonanceai.github.io/datamint-python-api/command_line_tools.html) for more details.
+### Method 2: Environment variable
+Specify the API key as an environment variable.
+**Bash:**
+```bash
+export DATAMINT_API_KEY="my_api_key"
+# run your commands (e.g., `datamint-upload`, `python script.py`)
+```
+**Python:**
+```python
+import os
+os.environ["DATAMINT_API_KEY"] = "my_api_key"
+```
+## 📚 Documentation
+| Resource | Description |
+|----------|-------------|
+| [🚀 Getting Started](docs/source/getting_started.rst) | Step-by-step setup and basic usage |
+| [📖 API Reference](docs/source/client_api.rst) | Complete API documentation |
+| [🔥 PyTorch Integration](docs/source/pytorch_integration.rst) | ML workflow integration |
+| [💡 Examples](examples/) | Practical usage examples |
+## 🔗 Key Components
+### Dataset Management
+```python
+from datamint import Dataset
+# Load dataset with annotations
+dataset = Dataset(
+    project_name="medical-segmentation",
+)
+# Access data
+for sample in dataset:
+    image = sample['image']       # torch.Tensor
+    mask = sample['segmentation'] # torch.Tensor (if available)
+    metadata = sample['metainfo'] # dict
+```
+### PyTorch Lightning Integration
+```python
+import lightning as L
+from datamint.lightning import DatamintDataModule
+from datamint.mlflow.lightning.callbacks import MLFlowModelCheckpoint
+# Data module
+datamodule = DatamintDataModule(
+    project_name="your-project",
+    batch_size=16,
+    train_split=0.8
+)
+# ML tracking callback
+checkpoint_callback = MLFlowModelCheckpoint(
+    monitor="val_loss",
+    save_top_k=1,
+    register_model_name="best-model"
+)
+# Trainer with MLflow logging
+trainer = L.Trainer(
+    max_epochs=100,
+    callbacks=[checkpoint_callback],
+    logger=L.pytorch.loggers.MLFlowLogger(
+        experiment_name="medical-segmentation"
+    )
+)
+```
+### Annotation Management
+```python
+# Upload segmentation masks
+api.upload_segmentations(
+    resource_id="resource-123",
+    file_path="segmentation.nii.gz",
+    name="liver_segmentation",
+    frame_index=0
+)
+# Add categorical annotations
+api.add_image_category_annotation(
+    resource_id="resource-123",
+    identifier="diagnosis",
+    value="positive"
+)
+# Add geometric annotations
+api.add_line_annotation(
+    point1=(10, 20),
+    point2=(50, 80),
+    resource_id="resource-123",
+    identifier="measurement",
+    frame_index=5
+)
+```
+## 🛠️ Command Line Tools
+### Upload Resources
+**Upload DICOM files with anonymization:**
+```bash
+datamint-upload \
+    --path /path/to/dicoms \
+    --recursive \
+    --channel "training-data" \
+    --anonymize \
+    --publish
+```
+**Upload with segmentation masks:**
+```bash
+datamint-upload \
+    --path /path/to/images \
+    --segmentation_path /path/to/masks \
+    --segmentation_names segmentation_config.yaml
+```
+### Configuration Management
+```bash
+# Interactive setup
+datamint-config
+# Set API key
+datamint-config --api-key "your-key"
+```
+## 🔍 Examples
+### Medical Image Segmentation Pipeline
+```python
+import torch
+import lightning as L
+from datamint.lightning import DatamintDataModule
+from datamint.mlflow.lightning.callbacks import MLFlowModelCheckpoint
+class SegmentationModel(L.LightningModule):
+    def __init__(self):
+        super().__init__()
+        # Model definition...
+    def training_step(self, batch, batch_idx):
+        # Training logic...
+        pass
+# Setup data
+datamodule = DatamintDataModule(
+    project_name="liver-segmentation",
+    batch_size=8,
+    train_split=0.8
+)
+# Setup model with MLflow tracking
+model = SegmentationModel()
+checkpoint_cb = MLFlowModelCheckpoint(
+    monitor="val_dice",
+    mode="max",
+    register_model_name="liver-segmentation-model"
+)
+# Train
+trainer = L.Trainer(
+    max_epochs=50,
+    callbacks=[checkpoint_cb],
+    logger=L.pytorch.loggers.MLFlowLogger()
+)
+trainer.fit(model, datamodule)
+```
+## 🆘 Support
+[Full Documentation](https://datamint-python-api.readthedocs.io/)
+[GitHub Issues](https://github.com/SonanceAI/datamint-python-api/issues)

datamint-2.4.0/README.md ADDED

@@ -0,0 +1,267 @@
+# Datamint Python API
+![Build Status](https://github.com/SonanceAI/datamint-python-api/actions/workflows/run_test.yaml/badge.svg)
+[![Python 3.10+](https://img.shields.io/badge/python-3.10+-blue.svg)](https://www.python.org/downloads/)
+A comprehensive Python SDK for interacting with the Datamint platform, providing seamless integration for medical imaging workflows, dataset management, and machine learning experiments.
+## 📋 Table of Contents
+- [Features](#-features)
+- [Installation](#-installation)
+- [Setup API key](#setup-api-key)
+- [Documentation](#-documentation)
+- [Key Components](#-key-components)
+- [Command Line Tools](#️-command-line-tools)
+- [Examples](#-examples)
+- [Support](#-support)
+## 🚀 Features
+- **Dataset Management**: Download, upload, and manage medical imaging datasets
+- **Annotation Tools**: Create, upload, and manage annotations (segmentations, labels, measurements)
+- **Experiment Tracking**: Integrated MLflow support for experiment management
+- **PyTorch Lightning Integration**: Streamlined ML workflows with Lightning DataModules and callbacks
+- **DICOM Support**: Native handling of DICOM files with anonymization capabilities
+- **Multi-format Support**: PNG, JPEG, NIfTI, and other medical imaging formats
+See the full documentation at https://sonanceai.github.io/datamint-python-api/
+## 📦 Installation
+> [!NOTE]
+> We recommend using a virtual environment to avoid package conflicts.
+### From PyPI
+To be released soon
+### From Source
+```bash
+pip install git+https://github.com/SonanceAI/datamint-python-api
+```
+### Virtual Environment Setup
+<details>
+<summary>Click to expand virtual environment setup instructions</summary>
+We recommend that you install Datamint in a dedicated virtual environment, to avoid conflicting with your system packages.
+For instance, create the environment once with `python3 -m venv datamint-env` and then activate it whenever you need it with:
+1. **Create the environment** (one-time setup):
+   ```bash
+   python3 -m venv datamint-env
+   ```
+2. **Activate the environment** (run whenever you need it):
+   | Platform | Command |
+   |----------|---------|
+   | Linux/macOS | `source datamint-env/bin/activate` |
+   | Windows CMD | `datamint-env\Scripts\activate.bat` |
+   | Windows PowerShell | `datamint-env\Scripts\Activate.ps1` |
+3. **Install the package**:
+   ```bash
+   pip install git+https://github.com/SonanceAI/datamint-python-api
+   ```
+</details>
+## Setup API key
+To use the Datamint API, you need to set up your API key (ask your administrator if you don't have one). Use one of the following methods:
+### Method 1: Command-line tool (recommended)
+Run ``datamint-config`` in the terminal and follow the instructions. See [command_line_tools](https://sonanceai.github.io/datamint-python-api/command_line_tools.html) for more details.
+### Method 2: Environment variable
+Specify the API key as an environment variable.
+**Bash:**
+```bash
+export DATAMINT_API_KEY="my_api_key"
+# run your commands (e.g., `datamint-upload`, `python script.py`)
+```
+**Python:**
+```python
+import os
+os.environ["DATAMINT_API_KEY"] = "my_api_key"
+```
+## 📚 Documentation
+| Resource | Description |
+|----------|-------------|
+| [🚀 Getting Started](docs/source/getting_started.rst) | Step-by-step setup and basic usage |
+| [📖 API Reference](docs/source/client_api.rst) | Complete API documentation |
+| [🔥 PyTorch Integration](docs/source/pytorch_integration.rst) | ML workflow integration |
+| [💡 Examples](examples/) | Practical usage examples |
+## 🔗 Key Components
+### Dataset Management
+```python
+from datamint import Dataset
+# Load dataset with annotations
+dataset = Dataset(
+    project_name="medical-segmentation",
+)
+# Access data
+for sample in dataset:
+    image = sample['image']       # torch.Tensor
+    mask = sample['segmentation'] # torch.Tensor (if available)
+    metadata = sample['metainfo'] # dict
+```
+### PyTorch Lightning Integration
+```python
+import lightning as L
+from datamint.lightning import DatamintDataModule
+from datamint.mlflow.lightning.callbacks import MLFlowModelCheckpoint
+# Data module
+datamodule = DatamintDataModule(
+    project_name="your-project",
+    batch_size=16,
+    train_split=0.8
+)
+# ML tracking callback
+checkpoint_callback = MLFlowModelCheckpoint(
+    monitor="val_loss",
+    save_top_k=1,
+    register_model_name="best-model"
+)
+# Trainer with MLflow logging
+trainer = L.Trainer(
+    max_epochs=100,
+    callbacks=[checkpoint_callback],
+    logger=L.pytorch.loggers.MLFlowLogger(
+        experiment_name="medical-segmentation"
+    )
+)
+```
+### Annotation Management
+```python
+# Upload segmentation masks
+api.upload_segmentations(
+    resource_id="resource-123",
+    file_path="segmentation.nii.gz",
+    name="liver_segmentation",
+    frame_index=0
+)
+# Add categorical annotations
+api.add_image_category_annotation(
+    resource_id="resource-123",
+    identifier="diagnosis",
+    value="positive"
+)
+# Add geometric annotations
+api.add_line_annotation(
+    point1=(10, 20),
+    point2=(50, 80),
+    resource_id="resource-123",
+    identifier="measurement",
+    frame_index=5
+)
+```
+## 🛠️ Command Line Tools
+### Upload Resources
+**Upload DICOM files with anonymization:**
+```bash
+datamint-upload \
+    --path /path/to/dicoms \
+    --recursive \
+    --channel "training-data" \
+    --anonymize \
+    --publish
+```
+**Upload with segmentation masks:**
+```bash
+datamint-upload \
+    --path /path/to/images \
+    --segmentation_path /path/to/masks \
+    --segmentation_names segmentation_config.yaml
+```
+### Configuration Management
+```bash
+# Interactive setup
+datamint-config
+# Set API key
+datamint-config --api-key "your-key"
+```
+## 🔍 Examples
+### Medical Image Segmentation Pipeline
+```python
+import torch
+import lightning as L
+from datamint.lightning import DatamintDataModule
+from datamint.mlflow.lightning.callbacks import MLFlowModelCheckpoint
+class SegmentationModel(L.LightningModule):
+    def __init__(self):
+        super().__init__()
+        # Model definition...
+    def training_step(self, batch, batch_idx):
+        # Training logic...
+        pass
+# Setup data
+datamodule = DatamintDataModule(
+    project_name="liver-segmentation",
+    batch_size=8,
+    train_split=0.8
+)
+# Setup model with MLflow tracking
+model = SegmentationModel()
+checkpoint_cb = MLFlowModelCheckpoint(
+    monitor="val_dice",
+    mode="max",
+    register_model_name="liver-segmentation-model"
+)
+# Train
+trainer = L.Trainer(
+    max_epochs=50,
+    callbacks=[checkpoint_cb],
+    logger=L.pytorch.loggers.MLFlowLogger()
+)
+trainer.fit(model, datamodule)
+```
+## 🆘 Support
+[Full Documentation](https://datamint-python-api.readthedocs.io/)
+[GitHub Issues](https://github.com/SonanceAI/datamint-python-api/issues)

{datamint-2.3.5 → datamint-2.4.0}/datamint/api/base_api.py RENAMED

@@ -61,22 +61,56 @@ class BaseApi:
             client: Optional HTTP client instance. If None, a new one will be created.
         """
         self.config = config
-        self.client = client or self._create_client()
+        self._owns_client = client is None  # Track if we created the client
+        self.client = client or BaseApi._create_client(config)
         self.semaphore = asyncio.Semaphore(20)
         self._api_instance: 'Api | None' = None  # Injected by Api class
-    def _create_client(self) -> httpx.Client:
-        """Create and configure HTTP client with authentication and timeouts."""
-        headers = None
-        if self.config.api_key:
-            headers = {"apikey": self.config.api_key}
+    @staticmethod
+    def _create_client(config: ApiConfig) -> httpx.Client:
+        """Create and configure HTTP client with authentication and timeouts.
+        The client is designed to be long-lived and reused across multiple requests.
+        It maintains connection pooling for improved performance.
+        Default limits: max_keepalive_connections=20, max_connections=100
+        """
+        headers = {"apikey": config.api_key} if config.api_key else None
         return httpx.Client(
-            base_url=self.config.server_url,
+            base_url=config.server_url,
             headers=headers,
-            timeout=self.config.timeout
+            timeout=config.timeout,
+            limits=httpx.Limits(
+                max_keepalive_connections=5,  # Lower than the httpx default of 20
+                max_connections=20,  # Lower than the httpx default of 100
+                keepalive_expiry=8
+            )
         )
+    def close(self) -> None:
+        """Close the HTTP client and release resources.
+        Should be called when the API instance is no longer needed.
+        Only closes the client if it was created by this instance.
+        """
+        if self._owns_client and self.client is not None:
+            self.client.close()
+    def __enter__(self):
+        """Context manager entry."""
+        return self
+    def __exit__(self, exc_type, exc_val, exc_tb):
+        """Context manager exit - ensures client is closed."""
+        self.close()
+    def __del__(self):
+        """Destructor - ensures client is closed when instance is garbage collected."""
+        try:
+            self.close()
+        except Exception:
+            pass  # Ignore errors during cleanup
     def _stream_request(self, method: str, endpoint: str, **kwargs):
         """Make streaming HTTP request with error handling.

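The hunk above makes `BaseApi` close its HTTP client only when it created that client itself. A minimal sketch of that ownership pattern follows, assuming a stand-in `FakeClient` in place of `httpx.Client` so the example runs without third-party packages; `FakeClient` and `OwningApi` are hypothetical names for illustration and are not part of datamint.

```python
class FakeClient:
    """Stand-in for an HTTP client such as httpx.Client (hypothetical)."""

    def __init__(self):
        self.closed = False

    def close(self):
        self.closed = True


class OwningApi:
    """Closes its client on exit only if it created that client itself."""

    def __init__(self, client=None):
        # Remember whether the client came from the caller or from us.
        self._owns_client = client is None
        self.client = client or FakeClient()

    def close(self):
        if self._owns_client and self.client is not None:
            self.client.close()

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc_val, exc_tb):
        self.close()


# Client created internally: closed when the with-block exits.
with OwningApi() as api:
    owned = api.client

# Client supplied by the caller: left open for the caller to manage.
shared = FakeClient()
with OwningApi(shared):
    pass
```

Under these assumptions, a client passed in by the caller can be shared across several API objects without one of them closing it underneath the others.
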
{datamint-2.3.5 → datamint-2.4.0}/datamint/api/client.py RENAMED

@@ -68,6 +68,8 @@ class Api:
                                     f" Please check your api_key and/or other configurations. {e}")
     def _get_endpoint(self, name: str):
+        if self._client is None:
+            self._client = BaseApi._create_client(self.config)
         if name not in self._endpoints:
             api_class = self._API_MAP[name]
             endpoint = api_class(self.config, self._client)
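
The two added lines create the shared client lazily, on the first endpoint lookup, when none was supplied. A rough sketch of that idea, assuming a hypothetical `LazyApi` class whose `_make_client` stands in for `BaseApi._create_client(config)`; the endpoint objects here are plain tuples rather than real API classes.

```python
class LazyApi:
    """Creates one shared client on first use and reuses it afterwards."""

    def __init__(self, client=None):
        self._client = client
        self._endpoints = {}
        self.created = 0  # counts how many clients were built

    def _make_client(self):
        # Hypothetical stand-in for building a real HTTP client.
        self.created += 1
        return object()

    def _get_endpoint(self, name):
        if self._client is None:
            self._client = self._make_client()
        if name not in self._endpoints:
            # Every endpoint receives the same client instance.
            self._endpoints[name] = (name, self._client)
        return self._endpoints[name]


api = LazyApi()
projects = api._get_endpoint("projects")
resources = api._get_endpoint("resources")
```
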

{datamint-2.3.5 → datamint-2.4.0}/datamint/apihandler/base_api_handler.py RENAMED

@@ -30,7 +30,6 @@ ResourceFields: TypeAlias = Literal['modality', 'created_by', 'published_by', 'p
 _PAGE_LIMIT = 5000
 @deprecated(reason="Please use `from datamint import Api` instead.", version="2.0.0")
 class BaseAPIHandler:
     """

{datamint-2.3.5 → datamint-2.4.0}/datamint/apihandler/dto/annotation_dto.py RENAMED

@@ -178,6 +178,8 @@ class CreateAnnotationDto:
         if model_id is not None:
             if is_model == False:
                 raise ValueError("is_model==False while model_id is provided.")
+            if not isinstance(model_id, str):
+                raise ValueError("model_id must be a string if provided.")
             is_model = True
         self.is_model = is_model
         self.geometry = geometry
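
The added check rejects a non-string `model_id` before `is_model` is forced to `True`. A minimal sketch of the same validation order as a standalone function; `resolve_is_model` is a hypothetical helper, and its behaviour when `model_id` is `None` (falling back to `bool(is_model)`) is an assumption, since the hunk does not show that branch.

```python
def resolve_is_model(model_id=None, is_model=None):
    """Validate model_id and derive the is_model flag (illustrative only)."""
    if model_id is not None:
        if is_model is False:
            raise ValueError("is_model is False but model_id was provided.")
        if not isinstance(model_id, str):
            raise ValueError("model_id must be a string if provided.")
        return True
    # Assumption: without a model_id, keep whatever the caller asked for.
    return bool(is_model)
```

With this ordering, a call such as `resolve_is_model(123)` fails fast with a `ValueError` instead of passing a non-string identifier further along.
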

{datamint-2.3.5 → datamint-2.4.0}/datamint/dataset/base_dataset.py RENAMED

@@ -307,6 +307,10 @@ class DatamintBaseDataset:
         self.image_lsets, self.image_lcodes = self._get_labels_set(framed=False)
         worklist_id = self.get_info()['worklist_id']
         groups: dict[str, dict] = self.api.annotationsets.get_segmentation_group(worklist_id)['groups']
+        if not groups:
+            self.seglabel_list = []
+            self.seglabel2code = {}
+            return
         # order by 'index' key
         max_index = max([g['index'] for g in groups.values()])
         self.seglabel_list : list[str] = ['UNKNOWN'] * max_index  # 1-based
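
The early return matters because `max()` raises `ValueError` on an empty sequence, so a project with no segmentation groups would otherwise fail at the `max_index` line. A small sketch of the guarded logic, assuming `groups` maps a label name to a dict with a 1-based `'index'`; `build_seglabels` is a hypothetical function, and the way the list and code mapping are filled is an assumption because the rest of the method is not shown in the hunk.

```python
def build_seglabels(groups):
    """Return (label_list, label2code) for a mapping of name -> {'index': int}."""
    if not groups:
        # max() of an empty sequence raises ValueError, so return early.
        return [], {}
    max_index = max(g["index"] for g in groups.values())
    labels = ["UNKNOWN"] * max_index  # position i holds the label with index i + 1
    for name, group in groups.items():
        labels[group["index"] - 1] = name
    label2code = {name: group["index"] for name, group in groups.items()}
    return labels, label2code
```
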

datamint-2.4.0/datamint/lightning/__init__.py ADDED

@@ -0,0 +1 @@
+from .datamintdatamodule import DatamintDataModule
