hydraflow 0.17.2__tar.gz → 0.18.1__tar.gz
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- {hydraflow-0.17.2 → hydraflow-0.18.1}/PKG-INFO +1 -5
- {hydraflow-0.17.2 → hydraflow-0.18.1}/README.md +0 -3
- {hydraflow-0.17.2 → hydraflow-0.18.1}/docs/getting-started/concepts.md +29 -14
- {hydraflow-0.17.2 → hydraflow-0.18.1}/docs/part1-applications/configuration.md +6 -2
- {hydraflow-0.17.2 → hydraflow-0.18.1}/docs/part1-applications/execution.md +18 -7
- {hydraflow-0.17.2 → hydraflow-0.18.1}/docs/part1-applications/index.md +2 -1
- {hydraflow-0.17.2 → hydraflow-0.18.1}/docs/part1-applications/main-decorator.md +11 -4
- {hydraflow-0.17.2 → hydraflow-0.18.1}/docs/part2-advanced/index.md +29 -14
- {hydraflow-0.17.2 → hydraflow-0.18.1}/docs/part2-advanced/job-configuration.md +16 -7
- {hydraflow-0.17.2 → hydraflow-0.18.1}/docs/part2-advanced/sweep-syntax.md +8 -3
- {hydraflow-0.17.2 → hydraflow-0.18.1}/docs/part3-analysis/index.md +12 -5
- {hydraflow-0.17.2 → hydraflow-0.18.1}/docs/part3-analysis/run-class.md +5 -2
- {hydraflow-0.17.2 → hydraflow-0.18.1}/docs/part3-analysis/run-collection.md +43 -41
- {hydraflow-0.17.2 → hydraflow-0.18.1}/docs/practical-tutorials/advanced.md +6 -6
- {hydraflow-0.17.2 → hydraflow-0.18.1}/docs/practical-tutorials/analysis.md +46 -24
- {hydraflow-0.17.2 → hydraflow-0.18.1}/docs/practical-tutorials/applications.md +20 -9
- {hydraflow-0.17.2 → hydraflow-0.18.1}/docs/practical-tutorials/index.md +5 -2
- {hydraflow-0.17.2 → hydraflow-0.18.1}/pyproject.toml +2 -2
- {hydraflow-0.17.2 → hydraflow-0.18.1}/src/hydraflow/core/collection.py +320 -16
- {hydraflow-0.17.2 → hydraflow-0.18.1}/src/hydraflow/core/main.py +18 -1
- {hydraflow-0.17.2 → hydraflow-0.18.1}/src/hydraflow/core/run.py +33 -6
- {hydraflow-0.17.2 → hydraflow-0.18.1}/src/hydraflow/core/run_collection.py +2 -2
- hydraflow-0.18.1/src/hydraflow/utils/progress.py +90 -0
- hydraflow-0.18.1/tests/core/main/test_dry_run.py +26 -0
- hydraflow-0.18.1/tests/core/main/update.py +35 -0
- {hydraflow-0.17.2 → hydraflow-0.18.1}/tests/core/run/test_run.py +13 -0
- {hydraflow-0.17.2 → hydraflow-0.18.1}/tests/core/run/test_run_collection.py +8 -1
- {hydraflow-0.17.2 → hydraflow-0.18.1}/tests/core/test_collection.py +118 -0
- hydraflow-0.18.1/tests/executor/__init__.py +0 -0
- hydraflow-0.18.1/tests/utils/__init__.py +0 -0
- hydraflow-0.18.1/tests/utils/test_progress.py +34 -0
- hydraflow-0.17.2/tests/core/main/update.py +0 -35
- {hydraflow-0.17.2 → hydraflow-0.18.1}/.devcontainer/devcontainer.json +0 -0
- {hydraflow-0.17.2 → hydraflow-0.18.1}/.devcontainer/postCreate.sh +0 -0
- {hydraflow-0.17.2 → hydraflow-0.18.1}/.devcontainer/starship.toml +0 -0
- {hydraflow-0.17.2 → hydraflow-0.18.1}/.gitattributes +0 -0
- {hydraflow-0.17.2 → hydraflow-0.18.1}/.github/workflows/ci.yaml +0 -0
- {hydraflow-0.17.2 → hydraflow-0.18.1}/.github/workflows/docs.yaml +0 -0
- {hydraflow-0.17.2 → hydraflow-0.18.1}/.github/workflows/publish.yaml +0 -0
- {hydraflow-0.17.2 → hydraflow-0.18.1}/.gitignore +0 -0
- {hydraflow-0.17.2 → hydraflow-0.18.1}/LICENSE +0 -0
- {hydraflow-0.17.2 → hydraflow-0.18.1}/docs/getting-started/index.md +0 -0
- {hydraflow-0.17.2 → hydraflow-0.18.1}/docs/getting-started/installation.md +0 -0
- {hydraflow-0.17.2 → hydraflow-0.18.1}/docs/index.md +0 -0
- {hydraflow-0.17.2 → hydraflow-0.18.1}/docs/part3-analysis/updating-runs.md +0 -0
- {hydraflow-0.17.2 → hydraflow-0.18.1}/examples/example.py +0 -0
- {hydraflow-0.17.2 → hydraflow-0.18.1}/examples/hydraflow.yaml +0 -0
- {hydraflow-0.17.2 → hydraflow-0.18.1}/examples/submit.py +0 -0
- {hydraflow-0.17.2 → hydraflow-0.18.1}/mkdocs.yaml +0 -0
- {hydraflow-0.17.2 → hydraflow-0.18.1}/src/hydraflow/__init__.py +0 -0
- {hydraflow-0.17.2 → hydraflow-0.18.1}/src/hydraflow/cli.py +0 -0
- {hydraflow-0.17.2 → hydraflow-0.18.1}/src/hydraflow/core/__init__.py +0 -0
- {hydraflow-0.17.2 → hydraflow-0.18.1}/src/hydraflow/core/context.py +0 -0
- {hydraflow-0.17.2 → hydraflow-0.18.1}/src/hydraflow/core/group_by.py +0 -0
- {hydraflow-0.17.2 → hydraflow-0.18.1}/src/hydraflow/core/io.py +0 -0
- {hydraflow-0.17.2 → hydraflow-0.18.1}/src/hydraflow/core/run_info.py +0 -0
- {hydraflow-0.17.2 → hydraflow-0.18.1}/src/hydraflow/executor/__init__.py +0 -0
- {hydraflow-0.17.2 → hydraflow-0.18.1}/src/hydraflow/executor/aio.py +0 -0
- {hydraflow-0.17.2 → hydraflow-0.18.1}/src/hydraflow/executor/conf.py +0 -0
- {hydraflow-0.17.2 → hydraflow-0.18.1}/src/hydraflow/executor/io.py +0 -0
- {hydraflow-0.17.2 → hydraflow-0.18.1}/src/hydraflow/executor/job.py +0 -0
- {hydraflow-0.17.2 → hydraflow-0.18.1}/src/hydraflow/executor/parser.py +0 -0
- {hydraflow-0.17.2 → hydraflow-0.18.1}/src/hydraflow/py.typed +0 -0
- {hydraflow-0.17.2/tests → hydraflow-0.18.1/src/hydraflow/utils}/__init__.py +0 -0
- {hydraflow-0.17.2/tests/cli → hydraflow-0.18.1/tests}/__init__.py +0 -0
- {hydraflow-0.17.2/tests/core → hydraflow-0.18.1/tests/cli}/__init__.py +0 -0
- {hydraflow-0.17.2 → hydraflow-0.18.1}/tests/cli/app.py +0 -0
- {hydraflow-0.17.2 → hydraflow-0.18.1}/tests/cli/conftest.py +0 -0
- {hydraflow-0.17.2 → hydraflow-0.18.1}/tests/cli/hydraflow.yaml +0 -0
- {hydraflow-0.17.2 → hydraflow-0.18.1}/tests/cli/submit.py +0 -0
- {hydraflow-0.17.2 → hydraflow-0.18.1}/tests/cli/test_run.py +0 -0
- {hydraflow-0.17.2 → hydraflow-0.18.1}/tests/cli/test_setup.py +0 -0
- {hydraflow-0.17.2 → hydraflow-0.18.1}/tests/cli/test_show.py +0 -0
- {hydraflow-0.17.2 → hydraflow-0.18.1}/tests/cli/test_version.py +0 -0
- {hydraflow-0.17.2 → hydraflow-0.18.1}/tests/conftest.py +0 -0
- {hydraflow-0.17.2/tests/core/context → hydraflow-0.18.1/tests/core}/__init__.py +0 -0
- {hydraflow-0.17.2/tests/core/main → hydraflow-0.18.1/tests/core/context}/__init__.py +0 -0
- {hydraflow-0.17.2 → hydraflow-0.18.1}/tests/core/context/chdir.py +0 -0
- {hydraflow-0.17.2 → hydraflow-0.18.1}/tests/core/context/log_run.py +0 -0
- {hydraflow-0.17.2 → hydraflow-0.18.1}/tests/core/context/start_run.py +0 -0
- {hydraflow-0.17.2 → hydraflow-0.18.1}/tests/core/context/test_chdir.py +0 -0
- {hydraflow-0.17.2 → hydraflow-0.18.1}/tests/core/context/test_log_run.py +0 -0
- {hydraflow-0.17.2 → hydraflow-0.18.1}/tests/core/context/test_start_run.py +0 -0
- {hydraflow-0.17.2/tests/core/run → hydraflow-0.18.1/tests/core/main}/__init__.py +0 -0
- {hydraflow-0.17.2 → hydraflow-0.18.1}/tests/core/main/default.py +0 -0
- {hydraflow-0.17.2 → hydraflow-0.18.1}/tests/core/main/force_new_run.py +0 -0
- {hydraflow-0.17.2 → hydraflow-0.18.1}/tests/core/main/match_overrides.py +0 -0
- {hydraflow-0.17.2 → hydraflow-0.18.1}/tests/core/main/rerun_finished.py +0 -0
- {hydraflow-0.17.2 → hydraflow-0.18.1}/tests/core/main/skip_finished.py +0 -0
- {hydraflow-0.17.2 → hydraflow-0.18.1}/tests/core/main/test_default.py +0 -0
- {hydraflow-0.17.2 → hydraflow-0.18.1}/tests/core/main/test_force_new_run.py +0 -0
- {hydraflow-0.17.2 → hydraflow-0.18.1}/tests/core/main/test_main.py +0 -0
- {hydraflow-0.17.2 → hydraflow-0.18.1}/tests/core/main/test_match_overrides.py +0 -0
- {hydraflow-0.17.2 → hydraflow-0.18.1}/tests/core/main/test_rerun_finished.py +0 -0
- {hydraflow-0.17.2 → hydraflow-0.18.1}/tests/core/main/test_skip_finished.py +0 -0
- {hydraflow-0.17.2 → hydraflow-0.18.1}/tests/core/main/test_update.py +0 -0
- {hydraflow-0.17.2/tests/executor → hydraflow-0.18.1/tests/core/run}/__init__.py +0 -0
- {hydraflow-0.17.2 → hydraflow-0.18.1}/tests/core/run/run.py +0 -0
- {hydraflow-0.17.2 → hydraflow-0.18.1}/tests/core/run/test_run_info.py +0 -0
- {hydraflow-0.17.2 → hydraflow-0.18.1}/tests/core/test_group_by.py +0 -0
- {hydraflow-0.17.2 → hydraflow-0.18.1}/tests/core/test_io.py +0 -0
- {hydraflow-0.17.2 → hydraflow-0.18.1}/tests/executor/conftest.py +0 -0
- {hydraflow-0.17.2 → hydraflow-0.18.1}/tests/executor/echo.py +0 -0
- {hydraflow-0.17.2 → hydraflow-0.18.1}/tests/executor/read.py +0 -0
- {hydraflow-0.17.2 → hydraflow-0.18.1}/tests/executor/test_aio.py +0 -0
- {hydraflow-0.17.2 → hydraflow-0.18.1}/tests/executor/test_args.py +0 -0
- {hydraflow-0.17.2 → hydraflow-0.18.1}/tests/executor/test_conf.py +0 -0
- {hydraflow-0.17.2 → hydraflow-0.18.1}/tests/executor/test_io.py +0 -0
- {hydraflow-0.17.2 → hydraflow-0.18.1}/tests/executor/test_job.py +0 -0
- {hydraflow-0.17.2 → hydraflow-0.18.1}/tests/executor/test_parser.py +0 -0
PKG-INFO

````diff
@@ -1,6 +1,6 @@
 Metadata-Version: 2.4
 Name: hydraflow
-Version: 0.17.2
+Version: 0.18.1
 Summary: HydraFlow seamlessly integrates Hydra and MLflow to streamline ML experiment management, combining Hydra's configuration management with MLflow's tracking capabilities.
 Project-URL: Documentation, https://daizutabi.github.io/hydraflow/
 Project-URL: Source, https://github.com/daizutabi/hydraflow
@@ -47,7 +47,6 @@ Requires-Dist: omegaconf>=2.3
 Requires-Dist: polars>=1.26
 Requires-Dist: python-ulid>=3.0.0
 Requires-Dist: rich>=13.9
-Requires-Dist: ruff>=0.11
 Requires-Dist: typer>=0.15
 Description-Content-Type: text/markdown
 
@@ -119,9 +118,6 @@ def app(run: Run, cfg: Config) -> None:
     # Your experiment code here
     print(f"Running with width={cfg.width}, height={cfg.height}")
 
-    # Log metrics
-    hydraflow.log_metric("area", cfg.width * cfg.height)
-
 if __name__ == "__main__":
     app()
 ```
````
README.md

````diff
@@ -66,9 +66,6 @@ def app(run: Run, cfg: Config) -> None:
     # Your experiment code here
     print(f"Running with width={cfg.width}, height={cfg.height}")
 
-    # Log metrics
-    hydraflow.log_metric("area", cfg.width * cfg.height)
-
 if __name__ == "__main__":
     app()
 ```
````
docs/getting-started/concepts.md

````diff
@@ -1,15 +1,20 @@
 # Core Concepts
 
-This page introduces the fundamental concepts of HydraFlow that form the foundation of the framework.
+This page introduces the fundamental concepts of HydraFlow that
+form the foundation of the framework.
 
 ## Design Principles
 
 HydraFlow is built on the following design principles:
 
-1. **Type Safety** - Utilizing Python dataclasses for configuration type checking and IDE support
-2. **Reproducibility** - Automatically tracking all experiment configurations for fully reproducible experiments
-3. **Workflow Integration** - Creating a cohesive workflow by integrating Hydra's configuration management with MLflow's experiment tracking
-4. **Analysis Capabilities** - Providing powerful APIs for easily analyzing experiment results
+1. **Type Safety** - Utilizing Python dataclasses for configuration
+   type checking and IDE support
+2. **Reproducibility** - Automatically tracking all experiment configurations
+   for fully reproducible experiments
+3. **Workflow Integration** - Creating a cohesive workflow by integrating
+   Hydra's configuration management with MLflow's experiment tracking
+4. **Analysis Capabilities** - Providing powerful APIs for easily
+   analyzing experiment results
 
 ## Key Components
 
@@ -17,7 +22,8 @@ HydraFlow consists of the following key components:
 
 ### Configuration Management
 
-HydraFlow uses a hierarchical configuration system based on OmegaConf and Hydra. This provides:
+HydraFlow uses a hierarchical configuration system based on
+OmegaConf and Hydra. This provides:
 
 - Type-safe configuration using Python dataclasses
 - Schema validation to ensure configuration correctness
@@ -36,11 +42,13 @@ class Config:
     epochs: int = 10
 ```
 
-This configuration class defines the structure and default values for your experiment, enabling type checking and auto-completion.
+This configuration class defines the structure and default values
+for your experiment, enabling type checking and auto-completion.
 
 ### Main Decorator
 
-The [`@hydraflow.main`][hydraflow.main] decorator defines the entry point for a HydraFlow application:
+The [`@hydraflow.main`][hydraflow.main] decorator defines the entry
+point for a HydraFlow application:
 
 ```python
 import hydraflow
@@ -64,7 +72,8 @@ This decorator provides:
 
 ### Workflow Automation
 
-HydraFlow allows you to automate experiment workflows using a YAML-based job definition system:
+HydraFlow allows you to automate experiment workflows using a
+YAML-based job definition system:
 
 ```yaml
 jobs:
@@ -98,11 +107,14 @@ python train.py -m "model=(small,large)_(v1,v2)"
 
 ### Analysis Tools
 
-After running experiments, HydraFlow provides powerful tools for accessing and analyzing results. These tools help you track, compare, and derive insights from your experiments.
+After running experiments, HydraFlow provides powerful tools for accessing
+and analyzing results. These tools help you track, compare, and derive
+insights from your experiments.
 
 #### Working with Individual Runs
 
-For individual experiment analysis, HydraFlow provides the `Run` class, which represents a single experiment run:
+For individual experiment analysis, HydraFlow provides the `Run` class,
+which represents a single experiment run:
 
 ```python
 from hydraflow import Run
@@ -139,7 +151,8 @@ print(run.cfg.learning_rate) # IDE auto-completion works
 
 #### Comparing Multiple Runs
 
-For comparing multiple runs, HydraFlow offers the `RunCollection` class, which enables efficient analysis across runs:
+For comparing multiple runs, HydraFlow offers the `RunCollection` class,
+which enables efficient analysis across runs:
 
 ```python
 # Load multiple runs
@@ -164,11 +177,13 @@ Key features of experiment comparison:
 
 ## Summary
 
-These core concepts work together to provide a comprehensive framework for managing machine learning experiments:
+These core concepts work together to provide a comprehensive framework
+for managing machine learning experiments:
 
 1. **Configuration Management** - Type-safe configuration with Python dataclasses
 2. **Main Decorator** - The entry point that integrates Hydra and MLflow
 3. **Workflow Automation** - Reusable experiment definitions and advanced parameter sweeps
 4. **Analysis Tools** - Access, filter, and analyze experiment results
 
-Understanding these fundamental concepts will help you leverage the full power of HydraFlow for your machine learning projects.
+Understanding these fundamental concepts will help you leverage the full power
+of HydraFlow for your machine learning projects.
````
docs/part1-applications/configuration.md

````diff
@@ -83,7 +83,9 @@ def train(run: Run, cfg: Config) -> None:
 
 ## Hydra Integration
 
-HydraFlow integrates closely with Hydra for configuration management. For detailed explanations of Hydra's capabilities, please refer to the [Hydra documentation](https://hydra.cc/docs/intro/).
+HydraFlow integrates closely with Hydra for configuration management.
+For detailed explanations of Hydra's capabilities, please refer to
+the [Hydra documentation](https://hydra.cc/docs/intro/).
 
 HydraFlow leverages the following Hydra features, but does not modify their behavior:
 
@@ -100,7 +102,9 @@ When using HydraFlow, remember that:
 2. HydraFlow automatically registers your top-level dataclass with Hydra
 3. `@hydraflow.main` sets up the connection between your dataclass and Hydra
 
-For advanced Hydra features and detailed usage examples, we recommend consulting the official Hydra documentation after you become familiar with the basic HydraFlow concepts.
+For advanced Hydra features and detailed usage examples, we recommend
+consulting the official Hydra documentation after you become familiar
+with the basic HydraFlow concepts.
 
 ## Best Practices
 
````
docs/part1-applications/execution.md

````diff
@@ -14,13 +14,18 @@ python train.py
 
 This will:
 
-1. Set up an MLflow experiment with the same name as the Hydra job name (using `mlflow.set_experiment`). If the experiment doesn't exist, it will be created automatically
+1. Set up an MLflow experiment with the same name as the Hydra job name
+   (using `mlflow.set_experiment`). If the experiment doesn't exist,
+   it will be created automatically
 2. Create a new MLflow run or reuse an existing one based on the configuration
 3. Save the Hydra configuration as an MLflow artifact
 4. Execute your function decorated with `@hydraflow.main`
 5. Save only `*.log` files from Hydra's output directory as MLflow artifacts
 
-Note that any other artifacts (models, data files, etc.) must be explicitly saved by your code using MLflow's logging functions. The `chdir` option in the `@hydraflow.main` decorator can help with this by changing the working directory to the run's artifact directory, making file operations more convenient.
+Note that any other artifacts (models, data files, etc.) must be explicitly
+saved by your code using MLflow's logging functions. The `chdir` option in
+the `@hydraflow.main` decorator can help with this by changing the working
+directory to the run's artifact directory, making file operations more convenient.
 
 ## Command-line Override Syntax
 
@@ -62,7 +67,8 @@ of the specified parameters (2 learning rates × 3 model types).
 
 ### Advanced Parameter Sweeps
 
-For more complex parameter spaces, HydraFlow provides an extended sweep syntax that goes beyond Hydra's basic capabilities:
+For more complex parameter spaces, HydraFlow provides an extended
+sweep syntax that goes beyond Hydra's basic capabilities:
 
 ```bash
 # Define numerical ranges with start:stop:step
@@ -75,11 +81,13 @@ python train.py -m learning_rate=1:5:m # 0.001 to 0.005
 python train.py -m model=(cnn,transformer)_(small,large)
 ```
 
-See [Extended Sweep Syntax](../part2-advanced/sweep-syntax.md) for a complete reference on these powerful features.
+See [Extended Sweep Syntax](../part2-advanced/sweep-syntax.md) for a
+complete reference on these powerful features.
 
 ### Managing Complex Experiment Workflows
 
-HydraFlow provides CLI tools to work with multirun mode more efficiently than using long command lines:
+HydraFlow provides CLI tools to work with multirun mode more efficiently
+than using long command lines:
 
 ```bash
 # Define jobs in hydraflow.yaml
@@ -94,11 +102,14 @@ jobs:
 hydraflow run train
 ```
 
-This approach helps you organize complex experiments, track execution history, and make experiments more reproducible. For details on these advanced capabilities, see [Job Configuration](../part2-advanced/job-configuration.md) in Part 2.
+This approach helps you organize complex experiments, track execution history,
+and make experiments more reproducible. For details on these advanced capabilities,
+see [Job Configuration](../part2-advanced/job-configuration.md) in Part 2.
 
 ## Output Organization
 
-By default, Hydra organizes outputs in the following directory structure for HydraFlow applications:
+By default, Hydra organizes outputs in the following directory structure
+for HydraFlow applications:
 
 ```
 ROOT_DIR/
````
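The extended range syntax shown in this file's examples (`start:stop:step`, plus SI-style suffixes such as the `m` in `learning_rate=1:5:m`, annotated as expanding to 0.001 to 0.005) can be illustrated with a small sketch. This is a hypothetical re-implementation for illustration only, not HydraFlow's actual parser; the suffix table is an assumption inferred from that one annotated example.

```python
from decimal import Decimal

# Assumed SI-style suffix multipliers, inferred from "1:5:m -> 0.001 to 0.005".
SUFFIXES = {"m": Decimal("0.001"), "k": Decimal("1000")}

def expand_range(expr: str) -> list[float]:
    """Expand a start:stop[:step-or-suffix] expression into explicit values.

    Hypothetical sketch: a trailing suffix scales the generated values,
    otherwise the third field is treated as the step (default 1).
    """
    parts = expr.split(":")
    start, stop = Decimal(parts[0]), Decimal(parts[1])
    scale, step = Decimal(1), Decimal(1)
    if len(parts) == 3:
        if parts[2] in SUFFIXES:
            scale = SUFFIXES[parts[2]]
        else:
            step = Decimal(parts[2])
    values = []
    v = start
    while v <= stop:  # inclusive range, as described in the docs
        values.append(float(v * scale))
        v += step
    return values
```

Under these assumptions, `expand_range("1:5:m")` yields `[0.001, 0.002, 0.003, 0.004, 0.005]`.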
docs/part1-applications/index.md

````diff
@@ -55,7 +55,8 @@ if __name__ == "__main__":
 
 ## Practical Examples
 
-If you prefer learning by example, check out our [Practical Tutorials](../practical-tutorials/index.md) section, which includes:
+If you prefer learning by example, check out our
+[Practical Tutorials](../practical-tutorials/index.md) section, which includes:
 
 - [Creating Your First HydraFlow Application](../practical-tutorials/applications.md): A step-by-step guide to building a basic application
 - [Automating Complex Workflows](../practical-tutorials/advanced.md): How to define and execute complex experiment workflows
````
docs/part1-applications/main-decorator.md

````diff
@@ -135,7 +135,10 @@ This default behavior improves efficiency by:
 
 ## Automatic Skipping of Completed Runs
 
-HydraFlow automatically skips runs that have already completed successfully. This is especially valuable in environments where jobs are automatically restarted after preemption. Without requiring any additional configuration, HydraFlow will:
+HydraFlow automatically skips runs that have already completed successfully.
+This is especially valuable in environments where jobs are automatically
+restarted after preemption. Without requiring any additional configuration,
+HydraFlow will:
 
 1. Identify already completed runs with the same configuration
 2. Skip re-execution of those runs
@@ -161,7 +164,9 @@ This automatic skipping behavior:
 
 ## Advanced Features
 
-The `hydraflow.main` decorator supports several keyword arguments that enhance its functionality. All these options are set to `False` by default and must be explicitly enabled when needed:
+The `hydraflow.main` decorator supports several keyword arguments that
+enhance its functionality. All these options are set to `False` by
+default and must be explicitly enabled when needed:
 
 ### Working Directory Management (`chdir`)
 
@@ -187,7 +192,8 @@ This option is beneficial when:
 
 ### Forcing New Runs (`force_new_run`)
 
-Override the default run identification and reuse behavior by always creating a new run, even when identical configurations exist:
+Override the default run identification and reuse behavior by always
+creating a new run, even when identical configurations exist:
 
 ```python
 @hydraflow.main(Config, force_new_run=True)
@@ -206,7 +212,8 @@ This option is useful when:
 
 ### Rerunning Finished Experiments (`rerun_finished`)
 
-Override the automatic skipping of completed runs by explicitly allowing rerunning of experiments that have already finished:
+Override the automatic skipping of completed runs by explicitly
+allowing rerunning of experiments that have already finished:
 
 ```python
 @hydraflow.main(Config, rerun_finished=True)
````
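The skip/force/rerun semantics described in this file's changes can be modeled with a toy decorator. This is not HydraFlow's implementation (the real library matches runs in the MLflow tracking store); the sketch simply keys completed "runs" on the config contents to show how the three flags interact.

```python
from functools import wraps

# Toy registry of finished "runs", keyed by config contents.
# Assumption: stands in for run matching against the MLflow tracking store.
_finished: dict[str, bool] = {}

def main_sketch(config: dict, force_new_run: bool = False, rerun_finished: bool = False):
    """Hypothetical model of @hydraflow.main's run reuse/skipping behavior."""
    def decorator(func):
        @wraps(func)
        def wrapper() -> str:
            key = repr(sorted(config.items()))
            if not force_new_run and not rerun_finished and _finished.get(key):
                # A completed run with an identical configuration exists.
                return "skipped"
            func(config)
            _finished[key] = True
            return "ran"
        return wrapper
    return decorator
```

Calling a decorated function twice with the same config returns `"ran"` then `"skipped"`, while `rerun_finished=True` or `force_new_run=True` executes it again.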
docs/part2-advanced/index.md

````diff
@@ -1,10 +1,13 @@
 # Automating Workflows
 
-This section covers advanced techniques for automating and structuring multiple experiments in HydraFlow. It provides tools for defining complex parameter spaces and reusable experiment definitions.
+This section covers advanced techniques for automating and structuring
+multiple experiments in HydraFlow. It provides tools for defining complex
+parameter spaces and reusable experiment definitions.
 
 ## Overview
 
-After creating your basic HydraFlow applications, the next step is to automate your experiment workflows. This includes:
+After creating your basic HydraFlow applications, the next step is to
+automate your experiment workflows. This includes:
 
 - Creating parameter sweeps across complex combinations
 - Defining reusable experiment configurations
@@ -14,20 +17,26 @@ After creating your basic HydraFlow applications, the next step is to automate y
 
 The main components for workflow automation in HydraFlow are:
 
-1. **Extended Sweep Syntax**: A powerful syntax for defining parameter spaces beyond simple comma-separated values.
-
-2. **Job Configuration**: A YAML-based definition system for creating reusable experiment workflows.
+1. **Extended Sweep Syntax**: A powerful syntax for defining parameter
+   spaces beyond simple comma-separated values.
+2. **Job Configuration**: A YAML-based definition system for creating
+   reusable experiment workflows.
 
 ## Practical Examples
 
-For hands-on examples of workflow automation, see our [Practical Tutorials](../practical-tutorials/index.md) section, specifically:
+For hands-on examples of workflow automation, see our
+[Practical Tutorials](../practical-tutorials/index.md) section, specifically:
 
-- [Automating Complex Workflows](../practical-tutorials/advanced.md): A tutorial that demonstrates how to use `hydraflow.yaml` to define and execute various types of workflows
-
+- [Automating Complex Workflows](../practical-tutorials/advanced.md): A tutorial
+  that demonstrates how to use `hydraflow.yaml` to define and execute
+  various types of workflows
+- [Analyzing Experiment Results](../practical-tutorials/analysis.md): Learn
+  how to work with results from automated experiment runs
 
 ## Extended Sweep Syntax
 
-HydraFlow extends Hydra's sweep syntax to provide more powerful ways to define parameter spaces:
+HydraFlow extends Hydra's sweep syntax to provide more powerful ways
+to define parameter spaces:
 
 ```bash
 # Range of values (inclusive)
@@ -44,7 +53,8 @@ Learn more about these capabilities in [Sweep Syntax](sweep-syntax.md).
 
 ## Job Configuration
 
-For more complex experiment workflows, you can use HydraFlow's job configuration system:
+For more complex experiment workflows, you can use HydraFlow's job
+configuration system:
 
 ```yaml
 jobs:
@@ -61,7 +71,9 @@ jobs:
     all: test_data=validation
 ```
 
-This approach allows you to define reusable experiment definitions that can be executed with a single command. Learn more in [Job Configuration](job-configuration.md).
+This approach allows you to define reusable experiment definitions that
+can be executed with a single command. Learn more in
+[Job Configuration](job-configuration.md).
 
 ## Executing Workflows
 
@@ -82,7 +94,10 @@ hydraflow run train_models seed=123
 
 In the following pages, we'll explore workflow automation in detail:
 
-- [Sweep Syntax](sweep-syntax.md): Learn about HydraFlow's extended syntax for defining parameter spaces.
-
+- [Sweep Syntax](sweep-syntax.md): Learn about HydraFlow's extended
+  syntax for defining parameter spaces.
+- [Job Configuration](job-configuration.md): Discover how to create
+  reusable job definitions for your experiments.
 
-After automating your experiments, you'll want to analyze the results using the tools covered in [Part 3: Analyzing Results](../part3-analysis/index.md).
+After automating your experiments, you'll want to analyze the results
+using the tools covered in [Part 3: Analyzing Results](../part3-analysis/index.md).
````
docs/part2-advanced/job-configuration.md

````diff
@@ -64,7 +64,8 @@ The specified function will be imported and called with the parameters.
 
 ### `submit`
 
-The `submit` command collects all parameter combinations into a text file and passes this file to the specified command:
+The `submit` command collects all parameter combinations into a text
+file and passes this file to the specified command:
 
 ```yaml
 jobs:
@@ -91,7 +92,9 @@ The key difference between `run` and `submit`:
 - `run`: Executes the command once per parameter combination
 - `submit`: Executes the command once, with all parameter combinations provided in a file
 
-This gives you complete flexibility in how parameter combinations are processed. Your handler script can implement any logic - from simple sequential processing to complex distributed execution across a cluster.
+This gives you complete flexibility in how parameter combinations are
+processed. Your handler script can implement any logic - from simple
+sequential processing to complex distributed execution across a cluster.
 
 ## Parameter Sets
 
@@ -214,14 +217,18 @@ jobs:
       add: hydra/launcher=submitit hydra.launcher.submitit.cpus_per_task=8
 ```
 
-When a set has its own `add` parameter, it is merged with the job-level `add` parameter. If the same parameter key exists in both the job-level and set-level `add`, the set-level value takes precedence.
+When a set has its own `add` parameter, it is merged with
+the job-level `add` parameter.
+If the same parameter key exists in both the job-level and set-level
+`add`, the set-level value takes precedence.
 
 For example, with the configuration above:
+
 - The first set uses: `hydra/launcher=joblib hydra.launcher.n_jobs=2`
 - The second set uses: `hydra/launcher=submitit hydra.launcher.n_jobs=2 hydra.launcher.submitit.cpus_per_task=8`
 
-Notice how `hydra/launcher` is overridden by the set-level value, while `hydra.launcher.n_jobs` from the job-level is retained.
+Notice how `hydra/launcher` is overridden by the set-level value,
+while `hydra.launcher.n_jobs` from the job-level is retained.
@@ -229,11 +236,13 @@ This behavior allows you to:
 2. Override or add specific parameters at the set level
 3. Keep all non-conflicting parameters from both levels
 
-This merging behavior makes it easy to maintain common configuration options while customizing specific aspects for different parameter sets.
+This merging behavior makes it easy to maintain common configuration
+options while customizing specific aspects for different parameter sets.
 
 ## Summary
 
-HydraFlow's job configuration system provides a powerful way to define and manage complex parameter sweeps:
+HydraFlow's job configuration system provides a powerful way to define
+and manage complex parameter sweeps:
 
 1. **Execution Commands**:
 
````
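The `add` merging rules described in the job-configuration changes (set-level keys override job-level keys; everything else is kept) can be sketched in a few lines. This is an illustrative re-implementation, not HydraFlow's code; the key of each override is assumed to be the part before `=`.

```python
def merge_add(job_add: str, set_add: str) -> str:
    """Merge space-separated override strings; set-level values win on key clash."""
    merged: dict[str, str] = {}
    for token in [*job_add.split(), *set_add.split()]:
        key = token.split("=", 1)[0]  # e.g. "hydra/launcher" from "hydra/launcher=joblib"
        merged[key] = token           # later (set-level) tokens overwrite earlier ones
    return " ".join(merged.values())
```

With the documented example, merging `hydra/launcher=joblib hydra.launcher.n_jobs=2` (job level) with `hydra/launcher=submitit hydra.launcher.submitit.cpus_per_task=8` (set level) reproduces the second set's documented result.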
docs/part2-advanced/sweep-syntax.md

````diff
@@ -24,7 +24,8 @@ python train.py -m model=medium
 python train.py -m model=large
 ```
 
-When using multiple parameters with `each`, all possible combinations (cartesian product) will be generated:
+When using multiple parameters with `each`, all possible
+combinations (cartesian product) will be generated:
 
 ```yaml
 jobs:
@@ -275,6 +276,10 @@ HydraFlow's extended sweep syntax provides several powerful features for paramet
 5. **Parentheses grouping** - Create combinations of values and nested structures
 6. **Pipe operator** - Run multiple independent parameter sweeps in the same job
 
-All of these can be combined to create complex, expressive parameter sweeps with minimal configuration.
+All of these can be combined to create complex, expressive parameter sweeps
+with minimal configuration. Remember that using the `each` keyword creates a cartesian
+product of all parameters (all possible combinations), while the pipe
+operator (`|`) creates separate, independent parameter sweeps.
 
-When using these features, HydraFlow will automatically generate the appropriate Hydra multirun commands with the `-m` flag.
+When using these features, HydraFlow will automatically generate the appropriate
+Hydra multirun commands with the `-m` flag.
````
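The `each`-vs-pipe distinction drawn in the sweep-syntax changes above can be illustrated with plain `itertools`. This is a hypothetical sketch of the semantics, not HydraFlow's parser: `each` takes the cartesian product of all parameters, while `|` expands each sweep independently.

```python
from itertools import product

def expand_each(params: dict[str, list[str]]) -> list[dict[str, str]]:
    """Cartesian product of `each` parameters: one dict per combination."""
    keys = list(params)
    return [dict(zip(keys, combo)) for combo in product(*params.values())]

def expand_pipe(sweeps: list[dict[str, list[str]]]) -> list[list[dict[str, str]]]:
    """Pipe operator: each sweep is expanded on its own, never crossed."""
    return [expand_each(sweep) for sweep in sweeps]
```

Two parameters with two values each yield four combinations under `each`, whereas two piped sweeps stay separate.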
docs/part3-analysis/index.md

````diff
@@ -33,9 +33,12 @@ The main components of HydraFlow's analysis tools are:
 
 ## Practical Examples
 
-For hands-on examples of experiment analysis, check out our [Practical Tutorials](../practical-tutorials/index.md) section, specifically:
+For hands-on examples of experiment analysis, check out our
+[Practical Tutorials](../practical-tutorials/index.md) section, specifically:
 
-- [Analyzing Experiment Results](../practical-tutorials/analysis.md): A detailed tutorial demonstrating how to load, filter, group, and analyze experiment data using HydraFlow's APIs
+- [Analyzing Experiment Results](../practical-tutorials/analysis.md): A
+  detailed tutorial demonstrating how to load, filter, group, and analyze
+  experiment data using HydraFlow's APIs
 
 ## Basic Analysis Workflow
 
@@ -66,7 +69,8 @@ best_run = df.sort("accuracy", descending=True).first()
 
 ## Finding and Loading Runs
 
-HydraFlow provides utilities to easily find and load runs from your MLflow tracking directory:
+HydraFlow provides utilities to easily find and load runs from your
+MLflow tracking directory:
 
 ```python
 from hydraflow import Run
@@ -83,7 +87,8 @@ runs = Run.load(iter_run_dirs(tracking_dir, "my_experiment"))
 runs = Run.load(iter_run_dirs(tracking_dir, ["training_*", "finetuning_*"]))
 ```
 
-This approach makes it easy to gather all relevant runs for analysis without having to manually specify each run directory.
+This approach makes it easy to gather all relevant runs for analysis
+without having to manually specify each run directory.
 
 ## Type-Safe Analysis
 
@@ -137,7 +142,9 @@ model = run.impl.load_model()
 results = run.impl.analyze_performance()
 ```
 
-The analysis capabilities covered in Part 3 are designed to work seamlessly with the experiment definitions from [Part 1](../part1-applications/index.md) and the advanced workflow automation from [Part 2](../part2-advanced/index.md).
+The analysis capabilities covered in Part 3 are designed to work
+seamlessly with the experiment definitions from [Part 1](../part1-applications/index.md)
+and the advanced workflow automation from [Part 2](../part2-advanced/index.md).
 
 ## What's Next
 
````
@@ -210,7 +210,8 @@ runs = Run.load(run_dirs, n_jobs=-1) # Use all available CPU cores
 
 ### Finding Runs with `iter_run_dirs`
 
-HydraFlow provides the [`iter_run_dirs`][hydraflow.core.io.iter_run_dirs]
+HydraFlow provides the [`iter_run_dirs`][hydraflow.core.io.iter_run_dirs]
+function to easily discover runs in your MLflow tracking directory:
 
 ```python
 from hydraflow.core.io import iter_run_dirs
@@ -235,7 +236,9 @@ def filter_experiments(name: str) -> bool:
 runs = Run.load(iter_run_dirs(tracking_dir, filter_experiments))
 ```
 
-The `iter_run_dirs` function yields paths to run directories that can be
+The `iter_run_dirs` function yields paths to run directories that can be
+directly passed to `Run.load`. This makes it easy to find and load runs
+based on experiment names or custom filtering criteria.
 
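The discovery pattern described in this hunk can be approximated with the standard library alone. This is an illustrative sketch, not the real `hydraflow.core.io.iter_run_dirs`: the name `iter_run_dirs_sketch` is hypothetical, and it assumes a simplified layout where each experiment is a directory under the tracking directory named after the experiment, with one subdirectory per run.

```python
from __future__ import annotations

from fnmatch import fnmatch
from pathlib import Path
from typing import Callable, Iterator


def iter_run_dirs_sketch(
    tracking_dir: str | Path,
    experiment_names: str | list[str] | Callable[[str], bool] | None = None,
) -> Iterator[Path]:
    """Hypothetical stand-in for run discovery (not HydraFlow's API).

    Simplifying assumption: <tracking_dir>/<experiment_name>/<run_dir>.
    """
    for exp_dir in sorted(Path(tracking_dir).iterdir()):
        if not exp_dir.is_dir():
            continue
        name = exp_dir.name
        if experiment_names is None:
            matched = True  # no filter: accept every experiment
        elif callable(experiment_names):
            matched = experiment_names(name)  # custom predicate on the name
        elif isinstance(experiment_names, str):
            matched = fnmatch(name, experiment_names)  # single glob pattern
        else:
            matched = any(fnmatch(name, p) for p in experiment_names)
        if matched:
            # Yield each run directory inside the matched experiment
            yield from sorted(d for d in exp_dir.iterdir() if d.is_dir())
```

As in the documented API, a glob string, a list of glob patterns, or a name predicate all select experiments; the yielded paths could then be handed to a loader in one expression.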
 ## Best Practices
 
@@ -7,19 +7,27 @@ instances, making it easy to compare and extract insights from your experiments.
 
 ## Architecture
 
-`RunCollection` is built on top of the more general
+`RunCollection` is built on top of the more general
+[`Collection`][hydraflow.core.collection.Collection]
+class, which provides a flexible foundation for working with sequences
+of items. This architecture offers several benefits:
+
+1. **Consistent Interface**: All collection-based classes in HydraFlow
+   share a common interface and behavior
+2. **Code Reuse**: Core functionality is implemented once in the base
+   class and inherited by specialized collections
+3. **Extensibility**: New collection types can easily be created
+   for different item types
+4. **Type Safety**: Generic type parameters ensure type checking
+   throughout the collection hierarchy
+
+The `Collection` class implements the Python `Sequence` protocol,
+allowing it to be used like standard Python collections (lists, tuples)
+while providing specialized methods for filtering, grouping, and data extraction.
+
+`RunCollection` extends this foundation with run-specific functionality,
+particularly for working with MLflow experiment data. This layered
+design separates generic collection behavior from domain-specific operations.
 
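The layered design this hunk describes — a generic `Sequence`-based base class with specialized subclasses — can be sketched in miniature. The class names `MiniCollection` and `MiniRunCollection` are hypothetical stand-ins for illustration, not HydraFlow's actual implementation:

```python
from collections.abc import Callable, Sequence
from typing import TypeVar

T = TypeVar("T")


class MiniCollection(Sequence[T]):
    """Illustrative generic base: a Sequence plus a chainable filter."""

    def __init__(self, items: list[T]) -> None:
        self._items = list(items)

    def __len__(self) -> int:
        return len(self._items)

    def __getitem__(self, index):
        return self._items[index]

    def filter(self, predicate: Callable[[T], bool]) -> "MiniCollection[T]":
        # Return the caller's own type so subclasses stay specialized
        return type(self)([i for i in self._items if predicate(i)])


class MiniRunCollection(MiniCollection[dict]):
    """Illustrative specialization adding a run-specific helper."""

    def unique(self, key: str) -> list:
        seen: list = []
        for run in self._items:
            value = run.get(key)
            if value not in seen:
                seen.append(value)
        return seen


runs = MiniRunCollection([{"model": "cnn"}, {"model": "rnn"}, {"model": "cnn"}])
filtered = runs.filter(lambda r: r["model"] == "cnn")
```

Because the base implements `Sequence`, iteration, indexing, and `len` behave like a list, while `filter` returns another `MiniRunCollection` — the same chaining property the documentation attributes to the real `Collection`/`RunCollection` pair.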
 ## Creating a Run Collection
 
@@ -243,17 +251,19 @@ model_types = runs.unique("model_type")
 num_model_types = runs.n_unique("model_type")
 ```
 
-All data extraction methods (`to_list`, `to_numpy`, `to_series`, etc.)
+All data extraction methods (`to_list`, `to_numpy`, `to_series`, etc.)
+support both static and callable default values,
+matching the behavior of the `Run.get` method. When using a callable default,
+the function receives the Run instance as an argument, allowing you to:
 
 - Implement fallback logic for missing parameters
 - Create derived values based on multiple parameters
 - Handle varying configuration schemas across different experiments
 - Apply transformations to the raw parameter values
 
-This makes it much easier to work with heterogeneous collections of
-parameter sets or evolving configuration
+This makes it much easier to work with heterogeneous collections of
+runs that might have different parameter sets or evolving configuration
+schemas.
 
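The static-versus-callable default behavior described in this hunk can be sketched with a plain function over dicts. `get_param` is a hypothetical helper mimicking the documented semantics, not HydraFlow's `Run.get`:

```python
from typing import Any, Callable


def get_param(run: dict, key: str, default: Any = None) -> Any:
    """Illustrative lookup: static defaults are returned as-is;
    callable defaults receive the run itself (hypothetical helper)."""
    if key in run:
        return run[key]
    if callable(default):
        return default(run)  # derive a fallback from the run's other values
    return default


runs = [
    {"model": "cnn", "lr": 0.01, "batch_size": 32},
    {"model": "rnn", "lr": 0.001},  # older run: no batch_size recorded
]

# Static default fills in the missing parameter
sizes = [get_param(r, "batch_size", 64) for r in runs]

# Callable default derives a value from parameters the run does have
derived = [get_param(r, "effective_lr", lambda r: r["lr"] * 2) for r in runs]
```

Here `sizes` becomes `[32, 64]`, and the callable default computes `effective_lr` per run — the same pattern that lets heterogeneous or evolving configurations be extracted uniformly.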
 ## Converting to DataFrame
 
@@ -293,23 +303,6 @@ filled_df = missing_values_df.with_columns(
 )
 ```
 
-The `to_frame` method provides several ways to handle missing data:
-
-1. **defaults parameter**: Provide static or callable default values for specific keys
-   - Static values: `defaults={"param": value}`
-   - Callable values: `defaults={"param": lambda run: computed_value}`
-
-2. **None values**: Parameters without defaults are represented as `None` (null) in the DataFrame
-   - This lets you use Polars operations for handling null values:
-   - Filter: `df.filter(pl.col("param").is_not_null())`
-   - Fill nulls: `df.with_columns(pl.col("param").fill_null(value))`
-   - Aggregations: Most aggregation functions handle nulls appropriately
-
-3. **Special object keys**: Use the special keys `"run"`, `"cfg"`, and `"impl"` to include the actual
-   Run objects, configuration objects, or implementation objects in the DataFrame
-   - This allows direct access to the original objects for further operations
-   - You can combine regular data columns with object columns as needed
-
 ## Grouping Runs
 
 The `group_by` method allows you to organize runs based on parameter values:
@@ -343,16 +336,25 @@ model_avg_loss = model_groups.agg(
 )
 ```
 
-The `group_by` method returns a `GroupBy` instance that maps keys to
+The `group_by` method returns a `GroupBy` instance that maps keys to
+`RunCollection` instances. This design allows you to:
 
-- Work with each group as a separate `RunCollection` with all the
+- Work with each group as a separate `RunCollection` with all the
+  filtering, sorting, and analysis capabilities
+- Perform custom operations on each group that might not be expressible
+  as simple aggregation functions
 - Chain additional operations on specific groups that interest you
-- Implement multi-stage analysis workflows where you need to maintain
+- Implement multi-stage analysis workflows where you need to maintain
+  the full run information at each step
 
-To perform aggregations on the grouped data, use the `agg` method on
+To perform aggregations on the grouped data, use the `agg` method on
+the GroupBy instance. This transforms the grouped data into a DataFrame
+with aggregated results.
+You can define multiple aggregation functions to compute different
+metrics across each group.
 
-This approach preserves all information in each group, giving
+This approach preserves all information in each group, giving
+you maximum flexibility for downstream analysis.
 
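The group-then-aggregate flow this hunk describes — keys mapping to sub-collections, then named aggregation functions producing one row per group — can be sketched with plain dictionaries. `group_by` and `agg` here are hypothetical stand-ins for illustration, not HydraFlow's `GroupBy` API:

```python
from collections import defaultdict
from statistics import mean
from typing import Any, Callable

Run = dict[str, Any]


def group_by(runs: list[Run], key: str) -> dict[Any, list[Run]]:
    """Map each key value to the sub-list of runs sharing it."""
    groups: dict[Any, list[Run]] = defaultdict(list)
    for run in runs:
        groups[run[key]].append(run)
    return dict(groups)


def agg(groups: dict[Any, list[Run]], **fns: Callable[[list[Run]], Any]) -> list[dict]:
    """One output row per group; each keyword names an aggregation column."""
    return [
        {"key": k, **{name: fn(g) for name, fn in fns.items()}}
        for k, g in groups.items()
    ]


runs = [
    {"model": "cnn", "loss": 0.2},
    {"model": "cnn", "loss": 0.4},
    {"model": "rnn", "loss": 0.5},
]
groups = group_by(runs, "model")
rows = agg(groups, avg_loss=lambda g: mean(r["loss"] for r in g), n=len)
```

Because `group_by` returns whole sub-lists rather than pre-aggregated values, arbitrary per-group operations remain possible before (or instead of) calling `agg` — the flexibility the documentation attributes to keeping full `RunCollection` instances per group.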
 ## Type-Safe Run Collections
 