PyPI - hydraflow - Versions diffs - 0.15.0__tar.gz → 0.16.0__tar.gz - Mend

Files changed (105)

{hydraflow-0.15.0 → hydraflow-0.16.0}/.gitignore RENAMED

@@ -1,11 +1,11 @@
 *.db
+**/mlruns/
+**/multirun/
+**/outputs/
 .coverage*
 .env
 .venv/
 __pycache__/
 dist/
 lcov.info
-mlruns/
-multirun/
-outputs/
 uv.lock

{hydraflow-0.15.0 → hydraflow-0.16.0}/PKG-INFO RENAMED

@@ -1,6 +1,6 @@
 Metadata-Version: 2.4
 Name: hydraflow
-Version: 0.15.0
+Version: 0.16.0
 Summary: HydraFlow seamlessly integrates Hydra and MLflow to streamline ML experiment management, combining Hydra's configuration management with MLflow's tracking capabilities.
 Project-URL: Documentation, https://daizutabi.github.io/hydraflow/
 Project-URL: Source, https://github.com/daizutabi/hydraflow
@@ -51,7 +51,7 @@ Requires-Dist: ruff>=0.11
 Requires-Dist: typer>=0.15
 Description-Content-Type: text/markdown
-# Hydraflow
+# HydraFlow
 [![PyPI Version][pypi-v-image]][pypi-v-link]
 [![Build Status][GHAction-image]][GHAction-link]
@@ -60,6 +60,7 @@ Description-Content-Type: text/markdown
 [![Python Version][python-v-image]][python-v-link]
 <!-- Badges -->
 [pypi-v-image]: https://img.shields.io/pypi/v/hydraflow.svg
 [pypi-v-link]: https://pypi.org/project/hydraflow/
 [GHAction-image]: https://github.com/daizutabi/hydraflow/actions/workflows/ci.yaml/badge.svg?branch=main&event=push
@@ -73,117 +74,125 @@ Description-Content-Type: text/markdown
 ## Overview
-Hydraflow is a library designed to seamlessly integrate
-[Hydra](https://hydra.cc/) and [MLflow](https://mlflow.org/), making it easier to
-manage and track machine learning experiments. By combining the flexibility of
-Hydra's configuration management with the robust experiment tracking capabilities
-of MLflow, Hydraflow provides a comprehensive solution for managing complex
-machine learning workflows.
+HydraFlow seamlessly integrates [Hydra](https://hydra.cc/) and [MLflow](https://mlflow.org/) to streamline machine learning experiment workflows. By combining Hydra's powerful configuration management with MLflow's robust experiment tracking, HydraFlow provides a comprehensive solution for defining, executing, and analyzing machine learning experiments.
+## Design Principles
+HydraFlow is built on the following design principles:
+1. **Type Safety** - Utilizing Python dataclasses for configuration type checking and IDE support
+2. **Reproducibility** - Automatically tracking all experiment configurations for fully reproducible experiments
+3. **Analysis Capabilities** - Providing powerful APIs for easily analyzing experiment results
+4. **Workflow Integration** - Creating a cohesive workflow by integrating Hydra's configuration management with MLflow's experiment tracking
 ## Key Features
-- **Configuration Management**: Utilize Hydra's advanced configuration management
-  to handle complex parameter sweeps and experiment setups.
-- **Experiment Tracking**: Leverage MLflow's tracking capabilities to log parameters,
-  metrics, and artifacts for each run.
-- **Artifact Management**: Automatically log and manage artifacts, such as model
-  checkpoints and configuration files, with MLflow.
-- **Seamless Integration**: Easily integrate Hydra and MLflow in your machine learning
-  projects with minimal setup.
-- **Rich CLI Interface**: Command-line tools for managing experiments and viewing results.
-- **Cross-Platform Support**: Works consistently across different operating systems.
+- **Type-safe Configuration Management** - Define experiment parameters using Python dataclasses with full IDE support and validation
+- **Seamless Hydra-MLflow Integration** - Automatically register configurations with Hydra and track experiments with MLflow
+- **Advanced Parameter Sweeps** - Define complex parameter spaces using extended sweep syntax for numerical ranges, combinations, and SI prefixes
+- **Workflow Automation** - Create reusable experiment workflows with YAML-based job definitions
+- **Powerful Analysis Tools** - Filter, group, and analyze experiment results with type-aware APIs
+- **Custom Implementation Support** - Extend experiment analysis with domain-specific functionality
 ## Installation
-You can install Hydraflow via pip:
 ```bash
 pip install hydraflow
 ```
 **Requirements:** Python 3.13+
-## Quick Start
-Here is a simple example to get you started with Hydraflow:
+## Quick Example
 ```python
-from __future__ import annotations
 from dataclasses import dataclass
-from typing import TYPE_CHECKING
+from mlflow.entities import Run
 import hydraflow
-import mlflow
-if TYPE_CHECKING:
-    from mlflow.entities import Run
+@dataclass
+class Config:
+    width: int = 1024
+    height: int = 768
+@hydraflow.main(Config)
+def app(run: Run, cfg: Config) -> None:
+    # Your experiment code here
+    print(f"Running with width={cfg.width}, height={cfg.height}")
+    # Log metrics
+    hydraflow.log_metric("area", cfg.width * cfg.height)
+if __name__ == "__main__":
+    app()
+```
+Execute a parameter sweep with:
+```bash
+python app.py -m width=800,1200 height=600,900
+```
+## Core Components
+HydraFlow consists of the following key components:
+### Configuration Management
+Define type-safe configurations using Python dataclasses:
+```python
 @dataclass
 class Config:
-    """Configuration for the ML training experiment."""
-    # Training hyperparameters
     learning_rate: float = 0.001
     batch_size: int = 32
     epochs: int = 10
+```
-    # Model architecture parameters
-    hidden_size: int = 128
-    dropout: float = 0.1
-    # Dataset parameters
-    train_size: float = 0.8
-    random_seed: int = 42
+### Main Decorator
+The `@hydraflow.main` decorator integrates Hydra and MLflow:
+```python
 @hydraflow.main(Config)
-def app(run: Run, cfg: Config):
-    """Train a model with the given configuration.
-    This example demonstrates how to:
+def train(run: Run, cfg: Config) -> None:
+    # Your experiment code
+```
-    1. Define a configuration using dataclasses
-    2. Use Hydraflow to integrate with MLflow
-    3. Track metrics and parameters automatically
+### Workflow Automation
-    Args:
-        run: MLflow run for the experiment corresponding to the Hydra app.
-            This `Run` instance is automatically created by Hydraflow.
-        cfg: Configuration for the experiment's run.
-            This `Config` instance is originally defined by Hydra, and then
-            automatically passed to the app by Hydraflow.
-    """
-    # Training loop
-    for epoch in range(cfg.epochs):
-        # Simulate training and validation
-        train_loss = 1.0 / (epoch + 1)
-        val_loss = 1.1 / (epoch + 1)
+Define reusable experiment workflows in YAML:
-        # Log metrics to MLflow
-        mlflow.log_metrics({
-            "train_loss": train_loss,
-            "val_loss": val_loss
-        }, step=epoch)
+```yaml
+jobs:
+  train_models:
+    run: python train.py
+    sets:
+      - each: model=small,medium,large
+        all: learning_rate=0.001,0.01,0.1
+```
-        print(f"Epoch {epoch}: train_loss={train_loss:.4f}, val_loss={val_loss:.4f}")
+### Analysis Tools
+Analyze experiment results with powerful APIs:
-if __name__ == "__main__":
-    app()
-```
+```python
+from hydraflow import Run, iter_run_dirs
-This example demonstrates:
+# Load runs
+runs = Run.load(iter_run_dirs("mlruns"))
-- Configuration management with Hydra
-- Automatic experiment tracking with MLflow
-- Parameter logging and metric tracking
-- Type-safe configuration with dataclasses
+# Filter and analyze
+best_runs = runs.filter(model_type="transformer").to_frame("learning_rate", "accuracy")
+```
 ## Documentation
-For detailed documentation, including advanced usage examples and API reference,
-visit our [documentation site](https://daizutabi.github.io/hydraflow/).
+For detailed documentation, visit our [documentation site](https://daizutabi.github.io/hydraflow/):
+- [Getting Started](https://daizutabi.github.io/hydraflow/getting-started/) - Installation and core concepts
+- [Practical Tutorials](https://daizutabi.github.io/hydraflow/practical-tutorials/) - Learn through hands-on examples
+- [User Guide](https://daizutabi.github.io/hydraflow/part1-applications/) - Detailed documentation of HydraFlow's capabilities
+- [API Reference](https://daizutabi.github.io/hydraflow/api/hydraflow/) - Complete API documentation
 ## Contributing
@@ -191,4 +200,4 @@ We welcome contributions! Please see our [contributing guide](CONTRIBUTING.md) f
 ## License
-This project is licensed under the MIT License - see the [LICENSE](LICENSE) file for details.
+This project is licensed under the MIT License - see the [LICENSE](LICENSE) file for details.
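
The Analysis Tools snippet added to the package description chains `filter(...)` and `to_frame(...)` to narrow a set of runs and project selected values into a table. A minimal sketch of that idea using plain Python dictionaries follows. It is not HydraFlow's implementation: `filter_runs`, `to_rows`, and the sample records are hypothetical and exist only for illustration.

```python
# Hypothetical run records; a real run would carry its logged configuration.
runs = [
    {"model_type": "transformer", "learning_rate": 0.001, "accuracy": 0.91},
    {"model_type": "transformer", "learning_rate": 0.01, "accuracy": 0.88},
    {"model_type": "lstm", "learning_rate": 0.001, "accuracy": 0.84},
]


def filter_runs(records: list[dict], **conditions) -> list[dict]:
    """Keep records whose values equal every keyword condition."""
    return [
        record
        for record in records
        if all(record.get(key) == value for key, value in conditions.items())
    ]


def to_rows(records: list[dict], *columns: str) -> list[dict]:
    """Project each record onto the requested columns."""
    return [{column: record[column] for column in columns} for record in records]


rows = to_rows(
    filter_runs(runs, model_type="transformer"),
    "learning_rate",
    "accuracy",
)
for row in rows:
    print(row)
```

Equality-based keyword filtering keeps the call site close to the `runs.filter(model_type="transformer")` form shown in the diff, which is the only behaviour this sketch tries to mirror.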

hydraflow-0.16.0/README.md ADDED

@@ -0,0 +1,150 @@
+# HydraFlow
+[![PyPI Version][pypi-v-image]][pypi-v-link]
+[![Build Status][GHAction-image]][GHAction-link]
+[![Coverage Status][codecov-image]][codecov-link]
+[![Documentation Status][docs-image]][docs-link]
+[![Python Version][python-v-image]][python-v-link]
+<!-- Badges -->
+[pypi-v-image]: https://img.shields.io/pypi/v/hydraflow.svg
+[pypi-v-link]: https://pypi.org/project/hydraflow/
+[GHAction-image]: https://github.com/daizutabi/hydraflow/actions/workflows/ci.yaml/badge.svg?branch=main&event=push
+[GHAction-link]: https://github.com/daizutabi/hydraflow/actions?query=event%3Apush+branch%3Amain
+[codecov-image]: https://codecov.io/github/daizutabi/hydraflow/coverage.svg?branch=main
+[codecov-link]: https://codecov.io/github/daizutabi/hydraflow?branch=main
+[docs-image]: https://img.shields.io/badge/docs-latest-blue.svg
+[docs-link]: https://daizutabi.github.io/hydraflow/
+[python-v-image]: https://img.shields.io/pypi/pyversions/hydraflow.svg
+[python-v-link]: https://pypi.org/project/hydraflow
+## Overview
+HydraFlow seamlessly integrates [Hydra](https://hydra.cc/) and [MLflow](https://mlflow.org/) to streamline machine learning experiment workflows. By combining Hydra's powerful configuration management with MLflow's robust experiment tracking, HydraFlow provides a comprehensive solution for defining, executing, and analyzing machine learning experiments.
+## Design Principles
+HydraFlow is built on the following design principles:
+1. **Type Safety** - Utilizing Python dataclasses for configuration type checking and IDE support
+2. **Reproducibility** - Automatically tracking all experiment configurations for fully reproducible experiments
+3. **Analysis Capabilities** - Providing powerful APIs for easily analyzing experiment results
+4. **Workflow Integration** - Creating a cohesive workflow by integrating Hydra's configuration management with MLflow's experiment tracking
+## Key Features
+- **Type-safe Configuration Management** - Define experiment parameters using Python dataclasses with full IDE support and validation
+- **Seamless Hydra-MLflow Integration** - Automatically register configurations with Hydra and track experiments with MLflow
+- **Advanced Parameter Sweeps** - Define complex parameter spaces using extended sweep syntax for numerical ranges, combinations, and SI prefixes
+- **Workflow Automation** - Create reusable experiment workflows with YAML-based job definitions
+- **Powerful Analysis Tools** - Filter, group, and analyze experiment results with type-aware APIs
+- **Custom Implementation Support** - Extend experiment analysis with domain-specific functionality
+## Installation
+```bash
+pip install hydraflow
+```
+**Requirements:** Python 3.13+
+## Quick Example
+```python
+from dataclasses import dataclass
+from mlflow.entities import Run
+import hydraflow
+@dataclass
+class Config:
+    width: int = 1024
+    height: int = 768
+@hydraflow.main(Config)
+def app(run: Run, cfg: Config) -> None:
+    # Your experiment code here
+    print(f"Running with width={cfg.width}, height={cfg.height}")
+    # Log metrics
+    hydraflow.log_metric("area", cfg.width * cfg.height)
+if __name__ == "__main__":
+    app()
+```
+Execute a parameter sweep with:
+```bash
+python app.py -m width=800,1200 height=600,900
+```
+## Core Components
+HydraFlow consists of the following key components:
+### Configuration Management
+Define type-safe configurations using Python dataclasses:
+```python
+@dataclass
+class Config:
+    learning_rate: float = 0.001
+    batch_size: int = 32
+    epochs: int = 10
+```
+### Main Decorator
+The `@hydraflow.main` decorator integrates Hydra and MLflow:
+```python
+@hydraflow.main(Config)
+def train(run: Run, cfg: Config) -> None:
+    # Your experiment code
+```
+### Workflow Automation
+Define reusable experiment workflows in YAML:
+```yaml
+jobs:
+  train_models:
+    run: python train.py
+    sets:
+      - each: model=small,medium,large
+        all: learning_rate=0.001,0.01,0.1
+```
+### Analysis Tools
+Analyze experiment results with powerful APIs:
+```python
+from hydraflow import Run, iter_run_dirs
+# Load runs
+runs = Run.load(iter_run_dirs("mlruns"))
+# Filter and analyze
+best_runs = runs.filter(model_type="transformer").to_frame("learning_rate", "accuracy")
+```
+## Documentation
+For detailed documentation, visit our [documentation site](https://daizutabi.github.io/hydraflow/):
+- [Getting Started](https://daizutabi.github.io/hydraflow/getting-started/) - Installation and core concepts
+- [Practical Tutorials](https://daizutabi.github.io/hydraflow/practical-tutorials/) - Learn through hands-on examples
+- [User Guide](https://daizutabi.github.io/hydraflow/part1-applications/) - Detailed documentation of HydraFlow's capabilities
+- [API Reference](https://daizutabi.github.io/hydraflow/api/hydraflow/) - Complete API documentation
+## Contributing
+We welcome contributions! Please see our [contributing guide](CONTRIBUTING.md) for details.
+## License
+This project is licensed under the MIT License - see the [LICENSE](LICENSE) file for details.
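
The sweep command in the new README's Quick Example, `python app.py -m width=800,1200 height=600,900`, relies on Hydra's multirun mode, which launches one job per combination of the comma-separated values. A minimal sketch of that expansion in plain Python follows. The helper `expand_sweep` is hypothetical and is not part of Hydra or HydraFlow.

```python
from itertools import product


def expand_sweep(overrides: list[str]) -> list[dict[str, str]]:
    """Expand comma-separated `key=v1,v2` overrides into one dict per job."""
    keys: list[str] = []
    choices: list[list[str]] = []
    for override in overrides:
        key, _, values = override.partition("=")
        keys.append(key)
        choices.append(values.split(","))
    # One job per element of the Cartesian product of all value lists.
    return [dict(zip(keys, combo)) for combo in product(*choices)]


jobs = expand_sweep(["width=800,1200", "height=600,900"])
for job in jobs:
    print(job)
```

With two values for each of the two parameters, the sweep in the README corresponds to four jobs.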

hydraflow-0.16.0/docs/getting-started/concepts.md ADDED

@@ -0,0 +1,174 @@
+# Core Concepts
+This page introduces the fundamental concepts of HydraFlow that form the foundation of the framework.
+## Design Principles
+HydraFlow is built on the following design principles:
+1. **Type Safety** - Utilizing Python dataclasses for configuration type checking and IDE support
+2. **Reproducibility** - Automatically tracking all experiment configurations for fully reproducible experiments
+3. **Workflow Integration** - Creating a cohesive workflow by integrating Hydra's configuration management with MLflow's experiment tracking
+4. **Analysis Capabilities** - Providing powerful APIs for easily analyzing experiment results
+## Key Components
+HydraFlow consists of the following key components:
+### Configuration Management
+HydraFlow uses a hierarchical configuration system based on OmegaConf and Hydra. This provides:
+- Type-safe configuration using Python dataclasses
+- Schema validation to ensure configuration correctness
+- Configuration composition from multiple sources
+- Command-line overrides
+Example configuration:
+```python
+from dataclasses import dataclass
+@dataclass
+class Config:
+    learning_rate: float = 0.001
+    batch_size: int = 32
+    epochs: int = 10
+```
+This configuration class defines the structure and default values for your experiment, enabling type checking and auto-completion.
+### Main Decorator
+The [`@hydraflow.main`][hydraflow.main] decorator defines the entry point for a HydraFlow application:
+```python
+import hydraflow
+from mlflow.entities import Run
+@hydraflow.main(Config)
+def train(run: Run, cfg: Config) -> None:
+    # Your experiment code
+    print(f"Training with lr={cfg.learning_rate}, batch_size={cfg.batch_size}")
+    # Log metrics
+    hydraflow.log_metric("accuracy", 0.95)
+```
+This decorator provides:
+- Automatic registration of your config class with Hydra's `ConfigStore`
+- Automatic setup of an MLflow experiment
+- Storage of Hydra configurations and logs as MLflow artifacts
+- Support for type-safe APIs and IDE integration
+### Workflow Automation
+HydraFlow allows you to automate experiment workflows using a YAML-based job definition system:
+```yaml
+jobs:
+  train_models:
+    run: python train.py
+    sets:
+      - each: model=small,medium,large
+        all: learning_rate=0.001,0.01,0.1
+```
+This enables:
+- Defining reusable experiment workflows
+- Efficient configuration of parameter sweeps
+- Organization of complex experiment campaigns
+You can also define more complex parameter spaces using extended sweep syntax:
+```bash
+# Ranges (start:end:step)
+python train.py -m "learning_rate=0.01:0.03:0.01"
+# SI prefixes
+python train.py -m "batch_size=1k,2k,4k"
+# 1000, 2000, 4000
+# Grid within a single parameter
+python train.py -m "model=(small,large)_(v1,v2)"
+# small_v1, small_v2, large_v1, large_v2
+```
+### Analysis Tools
+After running experiments, HydraFlow provides powerful tools for accessing and analyzing results. These tools help you track, compare, and derive insights from your experiments.
+#### Working with Individual Runs
+For individual experiment analysis, HydraFlow provides the `Run` class, which represents a single experiment run:
+```python
+from hydraflow import Run
+# Load an existing run
+run = Run.load("path/to/run")
+# Access configuration values
+learning_rate = run.get("learning_rate")
+```
+The `Run` class provides:
+- Access to experiment configurations used during the run
+- Methods for loading and analyzing experiment results
+- Support for custom implementations through the factory pattern
+- Type-safe access to configuration values
+You can use type parameters for more powerful IDE support:
+```python
+from dataclasses import dataclass
+from hydraflow import Run
+@dataclass
+class MyConfig:
+    learning_rate: float
+    batch_size: int
+# Load a Run with type information
+run = Run[MyConfig].load("path/to/run")
+print(run.cfg.learning_rate)  # IDE auto-completion works
+```
+#### Comparing Multiple Runs
+For comparing multiple runs, HydraFlow offers the `RunCollection` class, which enables efficient analysis across runs:
+```python
+# Load multiple runs
+runs = Run.load(["path/to/run1", "path/to/run2", "path/to/run3"])
+# Filter runs by parameter value
+filtered_runs = runs.filter(model_type="lstm")
+# Group runs by a parameter
+grouped_runs = runs.group_by("dataset_name")
+# Convert to DataFrame for analysis
+df = runs.to_frame("learning_rate", "batch_size", "accuracy")
+```
+Key features of experiment comparison:
+- Filtering runs based on configuration parameters
+- Grouping runs by common attributes
+- Aggregating data across runs
+- Converting to Polars DataFrames for advanced analysis
+## Summary
+These core concepts work together to provide a comprehensive framework for managing machine learning experiments:
+1. **Configuration Management** - Type-safe configuration with Python dataclasses
+2. **Main Decorator** - The entry point that integrates Hydra and MLflow
+3. **Workflow Automation** - Reusable experiment definitions and advanced parameter sweeps
+4. **Analysis Tools** - Access, filter, and analyze experiment results
+Understanding these fundamental concepts will help you leverage the full power of HydraFlow for your machine learning projects.

hydraflow-0.16.0/docs/getting-started/index.md ADDED

@@ -0,0 +1,80 @@
+# Getting Started with HydraFlow
+Welcome to HydraFlow, a framework designed to streamline machine learning
+workflows by integrating Hydra's configuration management with MLflow's
+experiment tracking capabilities.
+## Overview
+This section provides everything you need to begin using HydraFlow
+effectively:
+- [Installation](installation.md): Step-by-step instructions for installing
+  HydraFlow and its dependencies
+- [Core Concepts](concepts.md): An introduction to the fundamental concepts
+  that underpin HydraFlow's design and functionality
+## Why HydraFlow?
+Managing machine learning experiments involves numerous challenges, including:
+- **Configuration Management**: Tracking hyperparameters and settings across
+  multiple experiment runs
+- **Reproducibility**: Ensuring experiments can be reliably reproduced
+- **Result Analysis**: Efficiently comparing and analyzing experiment outcomes
+- **Workflow Automation**: Organizing and managing experiment workflows
+HydraFlow addresses these challenges by providing:
+1. **Type-safe Configuration**: Using Python's native dataclasses for
+   robust configuration management
+2. **Seamless Integration**: Bridging Hydra and MLflow to combine their
+   respective strengths
+3. **Analysis Tools**: Providing powerful APIs for filtering, grouping,
+   and analyzing results
+4. **Workflow Automation**: Simplifying the organization and execution of
+   machine learning experiments
+## Quick Example
+Here's a simple example to demonstrate HydraFlow's basic usage:
+```python
+from dataclasses import dataclass
+from mlflow.entities import Run
+import hydraflow
+@dataclass
+class Config:
+    learning_rate: float = 0.01
+    batch_size: int = 32
+    epochs: int = 10
+@hydraflow.main(Config)
+def train(run: Run, cfg: Config) -> None:
+    # Your training code here
+    print(f"Training with lr={cfg.learning_rate}, batch_size={cfg.batch_size}")
+    # Log metrics
+    hydraflow.log_metric("accuracy", 0.95)
+if __name__ == "__main__":
+    train()
+```
+Run this example with:
+```bash
+python train.py learning_rate=0.001 batch_size=64
+```
+## Next Steps
+After installing HydraFlow and understanding its core concepts, you're ready to:
+1. Follow our [Practical Tutorials](../practical-tutorials/index.md) to see HydraFlow in action
+2. Explore the detailed [User Guide](../part1-applications/index.md) to learn more about HydraFlow's capabilities
+3. Check the [API Reference](../api/hydraflow/README.md) for detailed documentation of HydraFlow's API
+Continue to the [Installation Guide](installation.md) to get started with
+HydraFlow.
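
The Quick Example in `index.md` is run with `python train.py learning_rate=0.001 batch_size=64`, where each `key=value` argument overrides a dataclass default. The sketch below shows one way such overrides could be applied with type coercion using only the standard library. It is a simplified illustration, not how Hydra or HydraFlow resolve overrides, and `apply_overrides` is a hypothetical helper.

```python
from dataclasses import dataclass, fields, replace


@dataclass
class Config:
    learning_rate: float = 0.01
    batch_size: int = 32
    epochs: int = 10


def apply_overrides(cfg: Config, overrides: list[str]) -> Config:
    """Return a copy of cfg with `key=value` overrides coerced to field types."""
    field_types = {f.name: f.type for f in fields(cfg)}
    updates = {}
    for override in overrides:
        key, _, raw = override.partition("=")
        if key not in field_types:
            raise KeyError(f"unknown config key: {key}")
        # Calling the annotated type converts the string, e.g. float("0.001").
        updates[key] = field_types[key](raw)
    return replace(cfg, **updates)


cfg = apply_overrides(Config(), ["learning_rate=0.001", "batch_size=64"])
print(cfg)  # Config(learning_rate=0.001, batch_size=64, epochs=10)
```

Because the conversion uses the field annotations, a value such as `batch_size=abc` fails with a `ValueError` instead of silently producing a string, which is the kind of check the "type-safe configuration" wording refers to.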
