hydraflow 0.14.4__tar.gz → 0.15.1__tar.gz
- {hydraflow-0.14.4 → hydraflow-0.15.1}/.github/workflows/ci.yaml +2 -2
- {hydraflow-0.14.4 → hydraflow-0.15.1}/.gitignore +3 -3
- {hydraflow-0.14.4 → hydraflow-0.15.1}/PKG-INFO +11 -9
- {hydraflow-0.14.4 → hydraflow-0.15.1}/README.md +6 -4
- hydraflow-0.15.1/docs/getting-started/concepts.md +174 -0
- hydraflow-0.15.1/docs/getting-started/index.md +80 -0
- hydraflow-0.15.1/docs/getting-started/installation.md +83 -0
- hydraflow-0.15.1/docs/index.md +91 -0
- hydraflow-0.15.1/docs/part1-applications/configuration.md +126 -0
- hydraflow-0.15.1/docs/part1-applications/execution.md +183 -0
- hydraflow-0.15.1/docs/part1-applications/index.md +89 -0
- hydraflow-0.15.1/docs/part1-applications/main-decorator.md +264 -0
- hydraflow-0.15.1/docs/part2-advanced/index.md +88 -0
- hydraflow-0.15.1/docs/part2-advanced/job-configuration.md +259 -0
- hydraflow-0.15.1/docs/part2-advanced/sweep-syntax.md +280 -0
- hydraflow-0.15.1/docs/part3-analysis/index.md +144 -0
- hydraflow-0.15.1/docs/part3-analysis/run-class.md +234 -0
- hydraflow-0.15.1/docs/part3-analysis/run-collection.md +341 -0
- hydraflow-0.15.1/docs/part3-analysis/updating-runs.md +142 -0
- hydraflow-0.15.1/docs/practical-tutorials/advanced.md +227 -0
- hydraflow-0.15.1/docs/practical-tutorials/analysis.md +332 -0
- hydraflow-0.15.1/docs/practical-tutorials/applications.md +171 -0
- hydraflow-0.15.1/docs/practical-tutorials/index.md +51 -0
- hydraflow-0.14.4/apps/quickstart.py → hydraflow-0.15.1/examples/example.py +1 -11
- hydraflow-0.15.1/examples/hydraflow.yaml +19 -0
- hydraflow-0.15.1/examples/submit.py +19 -0
- hydraflow-0.15.1/mkdocs.yaml +90 -0
- {hydraflow-0.14.4 → hydraflow-0.15.1}/pyproject.toml +8 -13
- {hydraflow-0.14.4 → hydraflow-0.15.1}/src/hydraflow/__init__.py +3 -13
- {hydraflow-0.14.4 → hydraflow-0.15.1}/src/hydraflow/core/context.py +12 -32
- hydraflow-0.15.1/src/hydraflow/core/io.py +150 -0
- {hydraflow-0.14.4 → hydraflow-0.15.1}/src/hydraflow/core/main.py +3 -3
- hydraflow-0.15.1/src/hydraflow/core/run.py +355 -0
- hydraflow-0.15.1/src/hydraflow/core/run_collection.py +525 -0
- hydraflow-0.15.1/src/hydraflow/core/run_info.py +84 -0
- {hydraflow-0.14.4 → hydraflow-0.15.1}/src/hydraflow/executor/conf.py +6 -6
- {hydraflow-0.14.4 → hydraflow-0.15.1}/src/hydraflow/executor/io.py +1 -17
- {hydraflow-0.14.4 → hydraflow-0.15.1}/src/hydraflow/executor/job.py +41 -14
- {hydraflow-0.14.4 → hydraflow-0.15.1}/src/hydraflow/executor/parser.py +9 -8
- {hydraflow-0.14.4 → hydraflow-0.15.1}/tests/cli/app.py +3 -3
- hydraflow-0.15.1/tests/cli/hydraflow.yaml +58 -0
- {hydraflow-0.14.4 → hydraflow-0.15.1}/tests/cli/test_run.py +13 -20
- {hydraflow-0.14.4 → hydraflow-0.15.1}/tests/conftest.py +20 -4
- {hydraflow-0.14.4 → hydraflow-0.15.1}/tests/core/context/chdir.py +1 -1
- {hydraflow-0.14.4 → hydraflow-0.15.1}/tests/core/context/log_run.py +1 -1
- {hydraflow-0.14.4 → hydraflow-0.15.1}/tests/core/context/start_run.py +2 -2
- hydraflow-0.15.1/tests/core/context/test_chdir.py +24 -0
- hydraflow-0.15.1/tests/core/context/test_log_run.py +45 -0
- hydraflow-0.15.1/tests/core/context/test_start_run.py +29 -0
- {hydraflow-0.14.4 → hydraflow-0.15.1}/tests/core/main/default.py +3 -3
- hydraflow-0.15.1/tests/core/main/test_default.py +60 -0
- hydraflow-0.15.1/tests/core/main/test_force_new_run.py +29 -0
- hydraflow-0.15.1/tests/core/main/test_main.py +13 -0
- hydraflow-0.15.1/tests/core/main/test_match_overrides.py +15 -0
- hydraflow-0.15.1/tests/core/main/test_rerun_finished.py +20 -0
- hydraflow-0.15.1/tests/core/main/test_skip_finished.py +71 -0
- hydraflow-0.15.1/tests/core/run/run.py +31 -0
- hydraflow-0.15.1/tests/core/run/test_run.py +260 -0
- hydraflow-0.15.1/tests/core/run/test_run_collection.py +290 -0
- hydraflow-0.15.1/tests/core/run/test_run_info.py +48 -0
- hydraflow-0.14.4/tests/core/io/test_iter_dirs.py → hydraflow-0.15.1/tests/core/test_io.py +37 -21
- {hydraflow-0.14.4 → hydraflow-0.15.1}/tests/executor/conftest.py +2 -2
- {hydraflow-0.14.4 → hydraflow-0.15.1}/tests/executor/test_conf.py +5 -5
- {hydraflow-0.14.4 → hydraflow-0.15.1}/tests/executor/test_job.py +19 -4
- {hydraflow-0.14.4 → hydraflow-0.15.1}/tests/executor/test_parser.py +41 -41
- hydraflow-0.14.4/docs/index.md +0 -10
- hydraflow-0.14.4/docs/usage/quickstart.md +0 -143
- hydraflow-0.14.4/mkdocs.yaml +0 -56
- hydraflow-0.14.4/src/hydraflow/core/config.py +0 -122
- hydraflow-0.14.4/src/hydraflow/core/io.py +0 -229
- hydraflow-0.14.4/src/hydraflow/core/mlflow.py +0 -174
- hydraflow-0.14.4/src/hydraflow/core/param.py +0 -165
- hydraflow-0.14.4/src/hydraflow/entities/run_collection.py +0 -583
- hydraflow-0.14.4/src/hydraflow/entities/run_data.py +0 -61
- hydraflow-0.14.4/src/hydraflow/entities/run_info.py +0 -36
- hydraflow-0.14.4/tests/cli/hydraflow.yaml +0 -42
- hydraflow-0.14.4/tests/core/config/test_config.py +0 -54
- hydraflow-0.14.4/tests/core/config/test_params.py +0 -176
- hydraflow-0.14.4/tests/core/context/test_chdir.py +0 -30
- hydraflow-0.14.4/tests/core/context/test_log_run.py +0 -58
- hydraflow-0.14.4/tests/core/context/test_start_run.py +0 -34
- hydraflow-0.14.4/tests/core/io/hydra_dir.py +0 -34
- hydraflow-0.14.4/tests/core/io/test_hydra_dir.py +0 -64
- hydraflow-0.14.4/tests/core/io/test_run.py +0 -51
- hydraflow-0.14.4/tests/core/main/__init__.py +0 -0
- hydraflow-0.14.4/tests/core/main/test_default.py +0 -63
- hydraflow-0.14.4/tests/core/main/test_force_new_run.py +0 -36
- hydraflow-0.14.4/tests/core/main/test_match_overrides.py +0 -31
- hydraflow-0.14.4/tests/core/main/test_rerun_finished.py +0 -27
- hydraflow-0.14.4/tests/core/main/test_skip_finished.py +0 -60
- hydraflow-0.14.4/tests/core/param/__init__.py +0 -0
- hydraflow-0.14.4/tests/core/param/params.py +0 -39
- hydraflow-0.14.4/tests/core/param/test_param.py +0 -158
- hydraflow-0.14.4/tests/core/param/test_params.py +0 -49
- hydraflow-0.14.4/tests/core/test_mlflow.py +0 -83
- hydraflow-0.14.4/tests/entities/__init__.py +0 -0
- hydraflow-0.14.4/tests/entities/filter.py +0 -35
- hydraflow-0.14.4/tests/entities/test_collection.py +0 -417
- hydraflow-0.14.4/tests/entities/test_data.py +0 -44
- hydraflow-0.14.4/tests/entities/test_filter.py +0 -44
- hydraflow-0.14.4/tests/entities/test_info.py +0 -47
- hydraflow-0.14.4/tests/entities/test_values.py +0 -37
- hydraflow-0.14.4/tests/entities/values.py +0 -34
- hydraflow-0.14.4/tests/executor/__init__.py +0 -0
- {hydraflow-0.14.4 → hydraflow-0.15.1}/.devcontainer/devcontainer.json +0 -0
- {hydraflow-0.14.4 → hydraflow-0.15.1}/.devcontainer/postCreate.sh +0 -0
- {hydraflow-0.14.4 → hydraflow-0.15.1}/.devcontainer/starship.toml +0 -0
- {hydraflow-0.14.4 → hydraflow-0.15.1}/.gitattributes +0 -0
- {hydraflow-0.14.4 → hydraflow-0.15.1}/.github/workflows/docs.yaml +0 -0
- {hydraflow-0.14.4 → hydraflow-0.15.1}/.github/workflows/publish.yaml +0 -0
- {hydraflow-0.14.4 → hydraflow-0.15.1}/LICENSE +0 -0
- {hydraflow-0.14.4 → hydraflow-0.15.1}/src/hydraflow/cli.py +0 -0
- {hydraflow-0.14.4 → hydraflow-0.15.1}/src/hydraflow/core/__init__.py +0 -0
- {hydraflow-0.14.4/src/hydraflow/entities → hydraflow-0.15.1/src/hydraflow/executor}/__init__.py +0 -0
- {hydraflow-0.14.4 → hydraflow-0.15.1}/src/hydraflow/executor/aio.py +0 -0
- {hydraflow-0.14.4 → hydraflow-0.15.1}/src/hydraflow/py.typed +0 -0
- {hydraflow-0.14.4/src/hydraflow/executor → hydraflow-0.15.1/tests}/__init__.py +0 -0
- {hydraflow-0.14.4/tests → hydraflow-0.15.1/tests/cli}/__init__.py +0 -0
- {hydraflow-0.14.4 → hydraflow-0.15.1}/tests/cli/conftest.py +0 -0
- {hydraflow-0.14.4 → hydraflow-0.15.1}/tests/cli/submit.py +0 -0
- {hydraflow-0.14.4 → hydraflow-0.15.1}/tests/cli/test_setup.py +0 -0
- {hydraflow-0.14.4 → hydraflow-0.15.1}/tests/cli/test_show.py +0 -0
- {hydraflow-0.14.4 → hydraflow-0.15.1}/tests/cli/test_version.py +0 -0
- {hydraflow-0.14.4/tests/cli → hydraflow-0.15.1/tests/core}/__init__.py +0 -0
- {hydraflow-0.14.4/tests/core → hydraflow-0.15.1/tests/core/context}/__init__.py +0 -0
- {hydraflow-0.14.4/tests/core/config → hydraflow-0.15.1/tests/core/main}/__init__.py +0 -0
- {hydraflow-0.14.4 → hydraflow-0.15.1}/tests/core/main/force_new_run.py +0 -0
- {hydraflow-0.14.4 → hydraflow-0.15.1}/tests/core/main/match_overrides.py +0 -0
- {hydraflow-0.14.4 → hydraflow-0.15.1}/tests/core/main/rerun_finished.py +0 -0
- {hydraflow-0.14.4 → hydraflow-0.15.1}/tests/core/main/skip_finished.py +0 -0
- {hydraflow-0.14.4/tests/core/context → hydraflow-0.15.1/tests/core/run}/__init__.py +0 -0
- {hydraflow-0.14.4/tests/core/io → hydraflow-0.15.1/tests/executor}/__init__.py +0 -0
- {hydraflow-0.14.4 → hydraflow-0.15.1}/tests/executor/echo.py +0 -0
- {hydraflow-0.14.4 → hydraflow-0.15.1}/tests/executor/read.py +0 -0
- {hydraflow-0.14.4 → hydraflow-0.15.1}/tests/executor/test_aio.py +0 -0
- {hydraflow-0.14.4 → hydraflow-0.15.1}/tests/executor/test_args.py +0 -0
- {hydraflow-0.14.4 → hydraflow-0.15.1}/tests/executor/test_io.py +0 -0
@@ -21,7 +21,7 @@ jobs:
       fail-fast: false
       matrix:
         os: [ubuntu-latest, windows-latest, macos-latest]
-        python-version: ["3.
+        python-version: ["3.13"]
 
     steps:
       - uses: actions/checkout@v4
@@ -36,7 +36,7 @@ jobs:
       - name: Ruff check
         run: ruff check
       - name: Run test
-        run: uv run pytest -v --junitxml=junit.xml
+        run: uv run pytest -v -n8 --junitxml=junit.xml
       - name: Upload Codecov Results
         if: success()
         uses: codecov/codecov-action@v4
@@ -1,6 +1,6 @@
 Metadata-Version: 2.4
 Name: hydraflow
-Version: 0.
+Version: 0.15.1
 Summary: HydraFlow seamlessly integrates Hydra and MLflow to streamline ML experiment management, combining Hydra's configuration management with MLflow's tracking capabilities.
 Project-URL: Documentation, https://daizutabi.github.io/hydraflow/
 Project-URL: Source, https://github.com/daizutabi/hydraflow
@@ -36,40 +36,40 @@ Classifier: Intended Audience :: Science/Research
 Classifier: License :: OSI Approved :: MIT License
 Classifier: Operating System :: OS Independent
 Classifier: Programming Language :: Python
-Classifier: Programming Language :: Python :: 3.10
-Classifier: Programming Language :: Python :: 3.11
-Classifier: Programming Language :: Python :: 3.12
 Classifier: Programming Language :: Python :: 3.13
 Classifier: Topic :: Scientific/Engineering :: Artificial Intelligence
 Classifier: Topic :: Software Development :: Libraries :: Python Modules
-Requires-Python: >=3.
+Requires-Python: >=3.13
 Requires-Dist: hydra-core>=1.3
+Requires-Dist: joblib>=1.4.0
 Requires-Dist: mlflow>=2.15
 Requires-Dist: omegaconf>=2.3
+Requires-Dist: polars>=1.26
 Requires-Dist: python-ulid>=3.0.0
 Requires-Dist: rich>=13.9
+Requires-Dist: ruff>=0.11
 Requires-Dist: typer>=0.15
 Description-Content-Type: text/markdown
 
 # Hydraflow
 
 [![PyPI Version][pypi-v-image]][pypi-v-link]
-[![Python Version][python-v-image]][python-v-link]
 [![Build Status][GHAction-image]][GHAction-link]
 [![Coverage Status][codecov-image]][codecov-link]
 [![Documentation Status][docs-image]][docs-link]
+[![Python Version][python-v-image]][python-v-link]
 
 <!-- Badges -->
 [pypi-v-image]: https://img.shields.io/pypi/v/hydraflow.svg
 [pypi-v-link]: https://pypi.org/project/hydraflow/
-[python-v-image]: https://img.shields.io/pypi/pyversions/hydraflow.svg
-[python-v-link]: https://pypi.org/project/hydraflow
 [GHAction-image]: https://github.com/daizutabi/hydraflow/actions/workflows/ci.yaml/badge.svg?branch=main&event=push
 [GHAction-link]: https://github.com/daizutabi/hydraflow/actions?query=event%3Apush+branch%3Amain
 [codecov-image]: https://codecov.io/github/daizutabi/hydraflow/coverage.svg?branch=main
 [codecov-link]: https://codecov.io/github/daizutabi/hydraflow?branch=main
-[docs-image]: https://
+[docs-image]: https://img.shields.io/badge/docs-latest-blue.svg
 [docs-link]: https://daizutabi.github.io/hydraflow/
+[python-v-image]: https://img.shields.io/pypi/pyversions/hydraflow.svg
+[python-v-link]: https://pypi.org/project/hydraflow
 
 ## Overview
 
@@ -101,6 +101,8 @@ You can install Hydraflow via pip:
 pip install hydraflow
 ```
 
+**Requirements:** Python 3.13+
+
 ## Quick Start
 
 Here is a simple example to get you started with Hydraflow:
@@ -1,22 +1,22 @@
 # Hydraflow
 
 [![PyPI Version][pypi-v-image]][pypi-v-link]
-[![Python Version][python-v-image]][python-v-link]
 [![Build Status][GHAction-image]][GHAction-link]
 [![Coverage Status][codecov-image]][codecov-link]
 [![Documentation Status][docs-image]][docs-link]
+[![Python Version][python-v-image]][python-v-link]
 
 <!-- Badges -->
 [pypi-v-image]: https://img.shields.io/pypi/v/hydraflow.svg
 [pypi-v-link]: https://pypi.org/project/hydraflow/
-[python-v-image]: https://img.shields.io/pypi/pyversions/hydraflow.svg
-[python-v-link]: https://pypi.org/project/hydraflow
 [GHAction-image]: https://github.com/daizutabi/hydraflow/actions/workflows/ci.yaml/badge.svg?branch=main&event=push
 [GHAction-link]: https://github.com/daizutabi/hydraflow/actions?query=event%3Apush+branch%3Amain
 [codecov-image]: https://codecov.io/github/daizutabi/hydraflow/coverage.svg?branch=main
 [codecov-link]: https://codecov.io/github/daizutabi/hydraflow?branch=main
-[docs-image]: https://
+[docs-image]: https://img.shields.io/badge/docs-latest-blue.svg
 [docs-link]: https://daizutabi.github.io/hydraflow/
+[python-v-image]: https://img.shields.io/pypi/pyversions/hydraflow.svg
+[python-v-link]: https://pypi.org/project/hydraflow
 
 ## Overview
 
@@ -48,6 +48,8 @@ You can install Hydraflow via pip:
 pip install hydraflow
 ```
 
+**Requirements:** Python 3.13+
+
 ## Quick Start
 
 Here is a simple example to get you started with Hydraflow:
@@ -0,0 +1,174 @@
+# Core Concepts
+
+This page introduces the fundamental concepts of HydraFlow that form the foundation of the framework.
+
+## Design Principles
+
+HydraFlow is built on the following design principles:
+
+1. **Type Safety** - Utilizing Python dataclasses for configuration type checking and IDE support
+2. **Reproducibility** - Automatically tracking all experiment configurations for fully reproducible experiments
+3. **Workflow Integration** - Creating a cohesive workflow by integrating Hydra's configuration management with MLflow's experiment tracking
+4. **Analysis Capabilities** - Providing powerful APIs for easily analyzing experiment results
+
+## Key Components
+
+HydraFlow consists of the following key components:
+
+### Configuration Management
+
+HydraFlow uses a hierarchical configuration system based on OmegaConf and Hydra. This provides:
+
+- Type-safe configuration using Python dataclasses
+- Schema validation to ensure configuration correctness
+- Configuration composition from multiple sources
+- Command-line overrides
+
+Example configuration:
+
+```python
+from dataclasses import dataclass
+
+@dataclass
+class Config:
+    learning_rate: float = 0.001
+    batch_size: int = 32
+    epochs: int = 10
+```
+
+This configuration class defines the structure and default values for your experiment, enabling type checking and auto-completion.
+
+### Main Decorator
+
+The [`@hydraflow.main`][hydraflow.main] decorator defines the entry point for a HydraFlow application:
+
+```python
+import hydraflow
+from mlflow.entities import Run
+
+@hydraflow.main(Config)
+def train(run: Run, cfg: Config) -> None:
+    # Your experiment code
+    print(f"Training with lr={cfg.learning_rate}, batch_size={cfg.batch_size}")
+
+    # Log metrics
+    hydraflow.log_metric("accuracy", 0.95)
+```
+
+This decorator provides:
+
+- Automatic registration of your config class with Hydra's `ConfigStore`
+- Automatic setup of an MLflow experiment
+- Storage of Hydra configurations and logs as MLflow artifacts
+- Support for type-safe APIs and IDE integration
+
+### Workflow Automation
+
+HydraFlow allows you to automate experiment workflows using a YAML-based job definition system:
+
+```yaml
+jobs:
+  train_models:
+    run: python train.py
+    sets:
+      - each: model=small,medium,large
+        all: learning_rate=0.001,0.01,0.1
+```
+
+This enables:
+
+- Defining reusable experiment workflows
+- Efficient configuration of parameter sweeps
+- Organization of complex experiment campaigns
+
+You can also define more complex parameter spaces using extended sweep syntax:
+
+```bash
+# Ranges (start:end:step)
+python train.py -m "learning_rate=0.01:0.03:0.01"
+
+# SI prefixes
+python train.py -m "batch_size=1k,2k,4k"
+# 1000, 2000, 4000
+
+# Grid within a single parameter
+python train.py -m "model=(small,large)_(v1,v2)"
+# small_v1, small_v2, large_v1, large_v2
+```
+
+### Analysis Tools
+
+After running experiments, HydraFlow provides powerful tools for accessing and analyzing results. These tools help you track, compare, and derive insights from your experiments.
+
+#### Working with Individual Runs
+
+For individual experiment analysis, HydraFlow provides the `Run` class, which represents a single experiment run:
+
+```python
+from hydraflow import Run
+
+# Load an existing run
+run = Run.load("path/to/run")
+
+# Access configuration values
+learning_rate = run.get("learning_rate")
+```
+
+The `Run` class provides:
+
+- Access to experiment configurations used during the run
+- Methods for loading and analyzing experiment results
+- Support for custom implementations through the factory pattern
+- Type-safe access to configuration values
+
+You can use type parameters for more powerful IDE support:
+
+```python
+from dataclasses import dataclass
+from hydraflow import Run
+
+@dataclass
+class MyConfig:
+    learning_rate: float
+    batch_size: int
+
+# Load a Run with type information
+run = Run[MyConfig].load("path/to/run")
+print(run.cfg.learning_rate)  # IDE auto-completion works
+```
+
+#### Comparing Multiple Runs
+
+For comparing multiple runs, HydraFlow offers the `RunCollection` class, which enables efficient analysis across runs:
+
+```python
+# Load multiple runs
+runs = Run.load(["path/to/run1", "path/to/run2", "path/to/run3"])
+
+# Filter runs by parameter value
+filtered_runs = runs.filter(model_type="lstm")
+
+# Group runs by a parameter
+grouped_runs = runs.group_by("dataset_name")
+
+# Convert to DataFrame for analysis
+df = runs.to_frame("learning_rate", "batch_size", "accuracy")
+```
+
+Key features of experiment comparison:
+
+- Filtering runs based on configuration parameters
+- Grouping runs by common attributes
+- Aggregating data across runs
+- Converting to Polars DataFrames for advanced analysis
+
+## Summary
+
+These core concepts work together to provide a comprehensive framework for managing machine learning experiments:
+
+1. **Configuration Management** - Type-safe configuration with Python dataclasses
+2. **Main Decorator** - The entry point that integrates Hydra and MLflow
+3. **Workflow Automation** - Reusable experiment definitions and advanced parameter sweeps
+4. **Analysis Tools** - Access, filter, and analyze experiment results
+
+Understanding these fundamental concepts will help you leverage the full power of HydraFlow for your machine learning projects.
@@ -0,0 +1,80 @@
+# Getting Started with HydraFlow
+
+Welcome to HydraFlow, a framework designed to streamline machine learning
+workflows by integrating Hydra's configuration management with MLflow's
+experiment tracking capabilities.
+
+## Overview
+
+This section provides everything you need to begin using HydraFlow
+effectively:
+
+- [Installation](installation.md): Step-by-step instructions for installing
+  HydraFlow and its dependencies
+- [Core Concepts](concepts.md): An introduction to the fundamental concepts
+  that underpin HydraFlow's design and functionality
+
+## Why HydraFlow?
+
+Managing machine learning experiments involves numerous challenges, including:
+
+- **Configuration Management**: Tracking hyperparameters and settings across
+  multiple experiment runs
+- **Reproducibility**: Ensuring experiments can be reliably reproduced
+- **Result Analysis**: Efficiently comparing and analyzing experiment outcomes
+- **Workflow Automation**: Organizing and managing experiment workflows
+
+HydraFlow addresses these challenges by providing:
+
+1. **Type-safe Configuration**: Using Python's native dataclasses for
+   robust configuration management
+2. **Seamless Integration**: Bridging Hydra and MLflow to combine their
+   respective strengths
+3. **Analysis Tools**: Providing powerful APIs for filtering, grouping,
+   and analyzing results
+4. **Workflow Automation**: Simplifying the organization and execution of
+   machine learning experiments
+
+## Quick Example
+
+Here's a simple example to demonstrate HydraFlow's basic usage:
+
+```python
+from dataclasses import dataclass
+from mlflow.entities import Run
+import hydraflow
+
+@dataclass
+class Config:
+    learning_rate: float = 0.01
+    batch_size: int = 32
+    epochs: int = 10
+
+@hydraflow.main(Config)
+def train(run: Run, cfg: Config) -> None:
+    # Your training code here
+    print(f"Training with lr={cfg.learning_rate}, batch_size={cfg.batch_size}")
+
+    # Log metrics
+    hydraflow.log_metric("accuracy", 0.95)
+
+if __name__ == "__main__":
+    train()
+```
+
+Run this example with:
+
+```bash
+python train.py learning_rate=0.001 batch_size=64
+```
+
+## Next Steps
+
+After installing HydraFlow and understanding its core concepts, you're ready to:
+
+1. Follow our [Practical Tutorials](../practical-tutorials/index.md) to see HydraFlow in action
+2. Explore the detailed [User Guide](../part1-applications/index.md) to learn more about HydraFlow's capabilities
+3. Check the [API Reference](../api/hydraflow/README.md) for detailed documentation of HydraFlow's API
+
+Continue to the [Installation Guide](installation.md) to get started with
+HydraFlow.
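The quick example above is run with `learning_rate=0.001 batch_size=64`, which replaces two of the dataclass defaults before `train` is called. The following standard-library sketch shows that idea in isolation. It is not Hydra's override grammar or HydraFlow code: `apply_overrides` is a hypothetical helper, and it assumes every field type is a plain callable such as `int` or `float`.

```python
from dataclasses import dataclass, fields, replace


@dataclass
class Config:
    learning_rate: float = 0.01
    batch_size: int = 32
    epochs: int = 10


def apply_overrides(cfg: Config, overrides: list[str]) -> Config:
    """Return a copy of cfg with 'key=value' overrides applied and type-coerced."""
    field_types = {f.name: f.type for f in fields(cfg)}
    updates = {}
    for item in overrides:
        key, _, raw = item.partition("=")
        if key not in field_types:
            raise KeyError(f"unknown configuration key: {key}")
        # Coerce the raw string with the annotated type, e.g. int("64").
        updates[key] = field_types[key](raw)
    return replace(cfg, **updates)


if __name__ == "__main__":
    print(apply_overrides(Config(), ["learning_rate=0.001", "batch_size=64"]))
```

Rejecting unknown keys and coercing through the annotated type mirrors, in a very reduced form, what a typed configuration schema gives you: a misspelled or mistyped override fails before any training starts.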
@@ -0,0 +1,83 @@
+# Installation
+
+This guide walks you through installing HydraFlow and its dependencies.
+
+## Requirements
+
+HydraFlow requires:
+
+- Python 3.13 or higher
+- A package manager (pip or uv)
+
+## Basic Installation
+
+You can install HydraFlow using your preferred package manager:
+
+### Using pip
+
+```bash
+pip install hydraflow
+```
+
+### Using uv
+
+[uv](https://github.com/astral-sh/uv) is a modern, fast Python package manager:
+
+```bash
+uv pip install hydraflow
+```
+
+These commands install the core framework with minimal dependencies.
+
+## Verifying Installation
+
+After installation, verify that HydraFlow is correctly installed by running
+the CLI command:
+
+```bash
+hydraflow --help
+```
+
+This should display the help message and available commands, confirming that
+HydraFlow is properly installed and accessible from your terminal.
+
+## Environment Setup
+
+While not required, we recommend using a virtual environment:
+
+### Using venv
+
+```bash
+python -m venv hydraflow-env
+source hydraflow-env/bin/activate  # On Windows: hydraflow-env\Scripts\activate
+pip install hydraflow  # or use uv pip
+```
+
+### Using uv
+
+```bash
+uv venv hydraflow-env
+source hydraflow-env/bin/activate  # On Windows: hydraflow-env\Scripts\activate
+uv pip install hydraflow
+```
+
+## Troubleshooting
+
+If you encounter issues during installation:
+
+1. Ensure your Python version is 3.13 or higher
+2. Update your package manager:
+   - For pip: `pip install --upgrade pip`
+   - For uv: `uv self update`
+3. If installing from source, ensure you have the necessary build tools
+   installed for your platform
+
+For persistent issues, please check the
+[GitHub issues](https://github.com/daizutabi/hydraflow/issues) or open a
+new issue with details about your environment and the error message.
+
+## Next Steps
+
+Now that you have installed HydraFlow, proceed to
+[Core Concepts](concepts.md) to understand the framework's fundamental
+principles.
@@ -0,0 +1,91 @@
+# HydraFlow: Streamline ML Experiment Workflows
+
+<div class="grid cards" markdown>
+
+- 🚀 **Define and Run Experiments**
+  Combine Hydra's configuration management with MLflow's experiment
+  tracking for streamlined experiment workflows
+- 🔄 **Automate Workflows**
+  Define reusable experiment workflows with YAML configuration and
+  leverage extended sweep syntax for parameter exploration
+- 📊 **Collect and Analyze Results**
+  Gather, filter, and analyze experiment results with type-safe APIs
+  for comprehensive insights
+
+</div>
+
+## What is HydraFlow?
+
+HydraFlow seamlessly integrates [Hydra](https://hydra.cc/) and
+[MLflow](https://mlflow.org/) to create a comprehensive machine learning
+experiment management framework. It provides a complete workflow from defining
+experiments to execution and analysis, streamlining machine learning projects
+from research to production.
+
+### Key Integration Features
+
+- **Automatic Configuration Tracking**: Hydra configurations are automatically
+  saved as MLflow artifacts, ensuring complete reproducibility of experiments
+- **Type-safe Configuration**: Leverage Python dataclasses for type-safe
+  experiment configuration with full IDE support
+- **Unified Workflow**: Connect configuration management and experiment tracking
+  in a single, coherent workflow
+- **Powerful Analysis Tools**: Analyze and compare experiments using
+  configuration parameters captured from Hydra
+
+### Hydra + MLflow = More Than the Sum of Parts
+
+HydraFlow goes beyond simply using Hydra and MLflow side by side:
+
+- **Parameter Sweep Integration**: Run Hydra multi-run sweeps with automatic
+  MLflow experiment organization
+- **Configuration-Aware Analysis**: Filter and group experiment results using
+  Hydra configuration parameters
+- **Reproducible Experiments**: Ensure experiments can be reliably reproduced
+  with configuration-based definitions
+- **Implementation Support**: Extend experiment analysis with custom
+  domain-specific implementations
+
+## Quick Installation
+
+```bash
+pip install hydraflow
+```
+
+**Requirements:** Python 3.13+
+
+## Documentation Structure
+
+The HydraFlow documentation is organized as follows:
+
+<div class="grid cards" markdown>
+
+- :material-book-open-variant: [**Getting Started**](getting-started/index.md)
+  Install HydraFlow and learn core concepts
+- :material-school: [**Practical Tutorials**](practical-tutorials/index.md)
+  Learn through hands-on examples and real use cases
+- :material-rocket-launch: [**Part 1: Running Applications**](part1-applications/index.md)
+  Define and execute HydraFlow applications
+- :material-cogs: [**Part 2: Automating Workflows**](part2-advanced/index.md)
+  Build advanced experiment workflows
+- :material-magnify: [**Part 3: Analyzing Results**](part3-analysis/index.md)
+  Collect and analyze experiment results
+- :material-code-tags: [**API Reference**](api/hydraflow/README.md)
+  Detailed documentation of classes and methods
+
+</div>
+
+## Getting Started
+
+Begin your journey with HydraFlow through our introductory guides:
+
+<div class="grid cards" markdown>
+
+- :material-book-open-variant: [**Installation Guide**](getting-started/installation.md)
+  Install and set up HydraFlow
+- :material-school: [**Core Concepts**](getting-started/concepts.md)
+  Learn the key concepts and design principles of HydraFlow
+- :material-file-code: [**Practical Tutorials**](practical-tutorials/index.md)
+  Hands-on examples to understand HydraFlow in practice
+
+</div>
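The "Configuration-Aware Analysis" feature above describes filtering and grouping results by configuration parameters. The sketch below shows the two operations on plain dictionaries so the semantics are visible without any library. It is a stand-in, not HydraFlow's `RunCollection`: `filter_runs`, `group_runs`, and the sample records are all hypothetical.

```python
from collections import defaultdict

# Hypothetical records: one dict of configuration values and metrics per run.
RUNS = [
    {"model_type": "lstm", "dataset_name": "a", "accuracy": 0.91},
    {"model_type": "lstm", "dataset_name": "b", "accuracy": 0.88},
    {"model_type": "cnn", "dataset_name": "a", "accuracy": 0.93},
]


def filter_runs(runs: list[dict], **conditions) -> list[dict]:
    """Keep runs whose values equal every keyword condition."""
    return [
        run for run in runs
        if all(run.get(key) == value for key, value in conditions.items())
    ]


def group_runs(runs: list[dict], key: str) -> dict[object, list[dict]]:
    """Bucket runs by the value of one configuration parameter."""
    groups: dict[object, list[dict]] = defaultdict(list)
    for run in runs:
        groups[run[key]].append(run)
    return dict(groups)


if __name__ == "__main__":
    print(len(filter_runs(RUNS, model_type="lstm")))
    print(sorted(group_runs(RUNS, "dataset_name")))
```

Because every run carries its own configuration values, filtering and grouping need no separate bookkeeping; that is the practical benefit of saving the configuration alongside the tracked results.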
@@ -0,0 +1,126 @@
|
|
1
|
+
# Configuration Management
|
2
|
+
|
3
|
+
HydraFlow uses a powerful configuration management system based on Python's
|
4
|
+
dataclasses and Hydra's composition capabilities. This approach provides
|
5
|
+
type safety, IDE auto-completion, and flexible parameter specification.

## Basic Configuration

The simplest way to define a configuration for a HydraFlow application is
using a Python dataclass:

```python
from dataclasses import dataclass
from mlflow.entities import Run
import hydraflow

@dataclass
class Config:
    learning_rate: float = 0.01
    batch_size: int = 32
    epochs: int = 10
    model_name: str = "transformer"

@hydraflow.main(Config)
def train(run: Run, cfg: Config) -> None:
    # Access configuration parameters
    print(f"Training {cfg.model_name} for {cfg.epochs} epochs")
    print(f"Learning rate: {cfg.learning_rate}, Batch size: {cfg.batch_size}")
```

## Type Hints

Adding type hints to your configuration class provides several benefits:

1. **Static Type Checking**: Tools like mypy can catch configuration errors
   before runtime.
2. **IDE Auto-completion**: Your IDE can provide suggestions as you work with
   configuration objects.
3. **Documentation**: Type hints serve as implicit documentation for your
   configuration parameters.
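
As a small illustration of the documentation benefit, the sketch below uses only the standard library to read the annotations back from a dataclass at runtime. The `Config` class mirrors the basic example; the helper name `describe` is hypothetical and not part of HydraFlow.

```python
from dataclasses import dataclass, fields


@dataclass
class Config:
    learning_rate: float = 0.01
    batch_size: int = 32
    epochs: int = 10
    model_name: str = "transformer"


def describe(config_cls: type) -> dict[str, tuple[type, object]]:
    """Map each field name to its (declared type, default value)."""
    return {f.name: (f.type, f.default) for f in fields(config_cls)}


summary = describe(Config)
for name, (declared_type, default) in summary.items():
    print(f"{name}: {declared_type.__name__} = {default!r}")
```

Plain dataclasses do not enforce these types at runtime, so `Config(batch_size="large")` would be accepted by Python itself; a static checker such as mypy is what reports the mismatch.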

## Nested Configurations

For more complex applications, you can use nested dataclasses to organize
related parameters:

```python
from dataclasses import dataclass, field
from mlflow.entities import Run
import hydraflow

@dataclass
class ModelConfig:
    name: str = "transformer"
    hidden_size: int = 512
    num_layers: int = 6
    dropout: float = 0.1

@dataclass
class OptimizerConfig:
    name: str = "adam"
    learning_rate: float = 0.001
    weight_decay: float = 0.0

@dataclass
class DataConfig:
    batch_size: int = 32
    num_workers: int = 4
    train_path: str = "data/train"
    val_path: str = "data/val"

@dataclass
class Config:
    # default_factory builds a fresh nested instance for each Config;
    # a bare ModelConfig() default raises ValueError on Python 3.11+.
    model: ModelConfig = field(default_factory=ModelConfig)
    optimizer: OptimizerConfig = field(default_factory=OptimizerConfig)
    data: DataConfig = field(default_factory=DataConfig)
    seed: int = 42
    max_epochs: int = 10

@hydraflow.main(Config)
def train(run: Run, cfg: Config) -> None:
    # Access nested configuration
    model_name = cfg.model.name
    lr = cfg.optimizer.learning_rate
    batch_size = cfg.data.batch_size
```
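
The following is a minimal standard-library sketch of how nested dataclass defaults behave. It assumes the nested defaults are declared with `field(default_factory=...)`, which gives every `Config` its own nested instances; the trimmed-down classes are illustrative only.

```python
from dataclasses import asdict, dataclass, field


@dataclass
class ModelConfig:
    name: str = "transformer"
    hidden_size: int = 512


@dataclass
class OptimizerConfig:
    name: str = "adam"
    learning_rate: float = 0.001


@dataclass
class Config:
    model: ModelConfig = field(default_factory=ModelConfig)
    optimizer: OptimizerConfig = field(default_factory=OptimizerConfig)
    seed: int = 42


first = Config()
second = Config()

# Each Config owns its nested objects, so mutating one leaves the other alone.
first.model.hidden_size = 1024

# asdict() converts the nested dataclasses into plain nested dictionaries.
flat = asdict(second)
print(flat)
```

The dictionary form can be handy when logging or comparing configurations, since it no longer depends on the dataclass types.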

## Hydra Integration

HydraFlow integrates closely with Hydra for configuration management. For detailed explanations of Hydra's capabilities, refer to the [Hydra documentation](https://hydra.cc/docs/intro/).

HydraFlow relies on the following Hydra features without modifying their behavior:

- **Configuration Files**: Organize configurations in YAML files
- **Command-line Overrides**: Change parameters without modifying code
- **Configuration Groups**: Swap entire configuration blocks
- **Configuration Composition**: Combine configurations from multiple sources
- **Interpolation**: Reference other configuration values
- **Multi-run Sweeps**: Run experiments with different parameter combinations

When using HydraFlow, remember that:

1. Your configuration structure comes from your dataclass definitions
2. HydraFlow automatically registers your top-level dataclass with Hydra
3. `@hydraflow.main` sets up the connection between your dataclass and Hydra

For advanced Hydra features and detailed usage examples, consult the official Hydra documentation once you are familiar with the basic HydraFlow concepts.
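
To make the idea of command-line overrides concrete, here is a toy sketch that applies `key.path=value` strings to nested dataclasses. It is not how Hydra parses overrides: `apply_override` is a hypothetical helper written for this illustration, and it only handles fields declared as `int`, `float`, or `str`.

```python
from dataclasses import dataclass, field, fields, is_dataclass


@dataclass
class OptimizerConfig:
    name: str = "adam"
    learning_rate: float = 0.001


@dataclass
class Config:
    optimizer: OptimizerConfig = field(default_factory=OptimizerConfig)
    max_epochs: int = 10


def apply_override(cfg: object, override: str) -> None:
    """Apply one 'a.b=value' override in place (hypothetical helper)."""
    path, _, raw = override.partition("=")
    *parents, leaf = path.split(".")
    target = cfg
    # Walk down the dotted path to the dataclass that owns the leaf field.
    for part in parents:
        target = getattr(target, part)
    if not is_dataclass(target):
        raise TypeError(f"{path!r} does not point into a dataclass")
    declared = {f.name: f.type for f in fields(target)}
    if leaf not in declared:
        raise KeyError(f"unknown configuration key: {path}")
    # Coerce the raw string with the declared field type (int, float, or str).
    setattr(target, leaf, declared[leaf](raw))


cfg = Config()
for item in ["optimizer.learning_rate=0.01", "max_epochs=20"]:
    apply_override(cfg, item)
print(cfg)
```

With Hydra itself, the same two overrides would be passed after the script name, for example `python train.py optimizer.learning_rate=0.01 max_epochs=20`, where `train.py` is a placeholder.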

## Best Practices

1. **Use Type Hints**: Always include type hints for all configuration parameters.

2. **Set Sensible Defaults**: Provide reasonable default values to make your
   application usable with minimal configuration.

3. **Group Related Parameters**: Use nested dataclasses to organize related
   parameters logically.

4. **Document Parameters**: Add docstrings to your dataclasses and parameters
   to explain their purpose and valid values.

5. **Validate Configurations**: Add validation logic to catch invalid
   configurations early.
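
Practices 4 and 5 can be sketched as follows. This assumes validation is done by an explicit function called at the start of a run; the `validate` helper and its rules are hypothetical. It relies only on attribute access, so it does not depend on how the configuration object was built.

```python
from dataclasses import dataclass


@dataclass
class Config:
    """Training configuration.

    Attributes:
        learning_rate: Step size for the optimizer; must be positive.
        batch_size: Number of samples per batch; must be at least 1.
        epochs: Number of passes over the training data; must be at least 1.
    """

    learning_rate: float = 0.01
    batch_size: int = 32
    epochs: int = 10


def validate(cfg: Config) -> None:
    """Raise ValueError listing every rule the configuration violates."""
    problems = []
    if cfg.learning_rate <= 0:
        problems.append(f"learning_rate must be > 0, got {cfg.learning_rate}")
    if cfg.batch_size < 1:
        problems.append(f"batch_size must be >= 1, got {cfg.batch_size}")
    if cfg.epochs < 1:
        problems.append(f"epochs must be >= 1, got {cfg.epochs}")
    if problems:
        raise ValueError("; ".join(problems))


validate(Config())  # the defaults pass silently

try:
    validate(Config(learning_rate=-1.0, batch_size=0))
except ValueError as err:
    error_message = str(err)
    print(error_message)
```

Calling `validate(cfg)` as the first statement of the decorated function would surface bad values before any training work starts.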

## Summary

HydraFlow's configuration system combines the type safety of Python dataclasses
with the flexibility of Hydra's composition and override capabilities. This
approach makes your machine learning experiments more maintainable,
reproducible, and easier to debug.