
vpop-calibration 2.0.0__tar.gz


Files changed (19)

vpop_calibration-2.0.0/LICENSE +21 -0
vpop_calibration-2.0.0/PKG-INFO +78 -0
vpop_calibration-2.0.0/README.md +54 -0
vpop_calibration-2.0.0/pyproject.toml +64 -0
vpop_calibration-2.0.0/vpop_calibration/__init__.py +19 -0
vpop_calibration-2.0.0/vpop_calibration/data_generation.py +180 -0
vpop_calibration-2.0.0/vpop_calibration/model/__init__.py +3 -0
vpop_calibration-2.0.0/vpop_calibration/model/data.py +421 -0
vpop_calibration-2.0.0/vpop_calibration/model/gp.py +497 -0
vpop_calibration-2.0.0/vpop_calibration/model/plot.py +235 -0
vpop_calibration-2.0.0/vpop_calibration/nlme.py +661 -0
vpop_calibration-2.0.0/vpop_calibration/ode.py +193 -0
vpop_calibration-2.0.0/vpop_calibration/saem.py +747 -0
vpop_calibration-2.0.0/vpop_calibration/structural_model.py +185 -0
vpop_calibration-2.0.0/vpop_calibration/test/__init__.py +0 -0
vpop_calibration-2.0.0/vpop_calibration/test/test_data.py +20 -0
vpop_calibration-2.0.0/vpop_calibration/test/test_gp_flavors.py +77 -0
vpop_calibration-2.0.0/vpop_calibration/test/test_gp_saem.py +172 -0
vpop_calibration-2.0.0/vpop_calibration/vpop.py +50 -0

vpop_calibration-2.0.0/LICENSE ADDED

@@ -0,0 +1,21 @@
+MIT License
+Copyright (c) 2024 Nova
+Permission is hereby granted, free of charge, to any person obtaining a copy
+of this software and associated documentation files (the "Software"), to deal
+in the Software without restriction, including without limitation the rights
+to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+copies of the Software, and to permit persons to whom the Software is
+furnished to do so, subject to the following conditions:
+The above copyright notice and this permission notice shall be included in all
+copies or substantial portions of the Software.
+THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+SOFTWARE.

vpop_calibration-2.0.0/PKG-INFO ADDED

@@ -0,0 +1,78 @@
+Metadata-Version: 2.4
+Name: vpop-calibration
+Version: 2.0.0
+Summary:
+License-File: LICENSE
+Author: Paul Lemarre
+Author-email: paul.lemarre@novainsilico.ai
+Requires-Python: >=3.12,<4.0
+Classifier: Programming Language :: Python :: 3
+Classifier: Programming Language :: Python :: 3.12
+Classifier: Programming Language :: Python :: 3.13
+Classifier: Programming Language :: Python :: 3.14
+Requires-Dist: gpytorch (>=1.14.2,<2.0.0)
+Requires-Dist: matplotlib (>=3.10.7,<4.0.0)
+Requires-Dist: numpy (>=2.3.4,<3.0.0)
+Requires-Dist: pandas (>=2.3.3,<3.0.0)
+Requires-Dist: plotly (>=6.4.0,<7.0.0)
+Requires-Dist: scipy (>=1.16.3,<2.0.0)
+Requires-Dist: torch
+Requires-Dist: tqdm (>=4.67.1,<5.0.0)
+Requires-Dist: uuid (>=1.30,<2.0)
+Description-Content-Type: text/markdown
+# Vpop calibration
+## Description
+A set of Python tools to allow for virtual population calibration, using a non-linear mixed effects (NLME) model approach, combined with surrogate models in order to speed up the simulation of QSP models.
+### Currently available features
+- Surrogate modeling using Gaussian processes, implemented using [GPyTorch](https://github.com/cornellius-gp/gpytorch)
+- Synthetic data generation using ODE models. The current implementation uses [scipy.integrate.solve_ivp](https://docs.scipy.org/doc/scipy/reference/generated/scipy.integrate.solve_ivp.html), parallelized with [multiprocessing](https://docs.python.org/3/library/multiprocessing.html)
+- Non-linear mixed effect models:
+  - Log-distributed parameters
+  - Additive or multiplicative error model
+  - Covariates handling
+  - Known individual patient descriptors (i.e. covariates with no effect on other descriptors outside of the structural model)
+- SAEM: see the [dedicated doc](./docs/saem_implementation.md) for more details
+## Getting started
+- [Tutorial](./examples/saem_gp_model.ipynb): this notebook demonstrates step-by-step how to create and train a surrogate model, using a reference ODE model and a GP surrogate model. It then showcases how to optimize the surrogate model on synthetic data using SAEM
+- Other available examples:
+  - [Data generation using Sobol sequences](./examples/generate_data_ranges.ipynb)
+  - [Data generation using a reference NLME model](./examples/generate_data_nlme.ipynb)
+  - [Training and exporting a GP using synthetic data](./examples/train_gp.ipynb)
+  - [Running SAEM on a reference ODE model](./examples/saem_ode_model.ipynb). Note: the current implementation is notably under-optimized for running SAEM directly on an ODE structural model. This is implemented for testing purposes mostly
+  - [Training a GP with a deep kernel](./examples/train_deep_kernel.ipynb)
+## Support
+For any issue or comments, please reach out to paul.lemarre@novainsilico.ai, or feel free to open an issue in the repo directly.
+## Authors and acknowledgment
+- Paul Lemarre
+- Eléonore Dravet
+- Adeline Leclerq-Sampson
+## Roadmap
+- NLME:
+  - Support additional error models (additive-multiplicative, power, etc.)
+  - Support additional covariate models (categorical covariates)
+  - Add residual diagnostic methods (weighted residuals computation and visualization)
+- Structural models:
+  - Integrate with SBML models (Roadrunner)
+- Surrogate models:
+  - Support additional surrogate models in PyTorch
+- Optimizer:
+  - Add SVGP for surrogate model optimization
+## References
+- [Delyon et al. 99](https://doi.org/10.1214/aos/1018031103): Bernard Delyon. Marc Lavielle. Eric Moulines. "Convergence of a stochastic approximation version of the EM algorithm." Ann. Statist. 27 (1) 94 - 128, February 1999. https://doi.org/10.1214/aos/1018031103
+- [Grenier et al. 2018](https://doi.org/10.1007/s40314-016-0337-5): Grenier, E., Helbert, C., Louvet, V. et al. Population parametrization of costly black box models using iterations between SAEM algorithm and kriging. Comp. Appl. Math. 37, 161–173 (2018). https://doi.org/10.1007/s40314-016-0337-5

vpop_calibration-2.0.0/README.md ADDED

@@ -0,0 +1,54 @@
+# Vpop calibration
+## Description
+A set of Python tools to allow for virtual population calibration, using a non-linear mixed effects (NLME) model approach, combined with surrogate models in order to speed up the simulation of QSP models.
+### Currently available features
+- Surrogate modeling using Gaussian processes, implemented using [GPyTorch](https://github.com/cornellius-gp/gpytorch)
+- Synthetic data generation using ODE models. The current implementation uses [scipy.integrate.solve_ivp](https://docs.scipy.org/doc/scipy/reference/generated/scipy.integrate.solve_ivp.html), parallelized with [multiprocessing](https://docs.python.org/3/library/multiprocessing.html)
+- Non-linear mixed effect models:
+  - Log-distributed parameters
+  - Additive or multiplicative error model
+  - Covariates handling
+  - Known individual patient descriptors (i.e. covariates with no effect on other descriptors outside of the structural model)
+- SAEM: see the [dedicated doc](./docs/saem_implementation.md) for more details
+## Getting started
+- [Tutorial](./examples/saem_gp_model.ipynb): this notebook demonstrates step-by-step how to create and train a surrogate model, using a reference ODE model and a GP surrogate model. It then showcases how to optimize the surrogate model on synthetic data using SAEM
+- Other available examples:
+  - [Data generation using Sobol sequences](./examples/generate_data_ranges.ipynb)
+  - [Data generation using a reference NLME model](./examples/generate_data_nlme.ipynb)
+  - [Training and exporting a GP using synthetic data](./examples/train_gp.ipynb)
+  - [Running SAEM on a reference ODE model](./examples/saem_ode_model.ipynb). Note: the current implementation is notably under-optimized for running SAEM directly on an ODE structural model. This is implemented for testing purposes mostly
+  - [Training a GP with a deep kernel](./examples/train_deep_kernel.ipynb)
+## Support
+For any issue or comments, please reach out to paul.lemarre@novainsilico.ai, or feel free to open an issue in the repo directly.
+## Authors and acknowledgment
+- Paul Lemarre
+- Eléonore Dravet
+- Adeline Leclerq-Sampson
+## Roadmap
+- NLME:
+  - Support additional error models (additive-multiplicative, power, etc.)
+  - Support additional covariate models (categorical covariates)
+  - Add residual diagnostic methods (weighted residuals computation and visualization)
+- Structural models:
+  - Integrate with SBML models (Roadrunner)
+- Surrogate models:
+  - Support additional surrogate models in PyTorch
+- Optimizer:
+  - Add SVGP for surrogate model optimization
+## References
+- [Delyon et al. 99](https://doi.org/10.1214/aos/1018031103): Bernard Delyon. Marc Lavielle. Eric Moulines. "Convergence of a stochastic approximation version of the EM algorithm." Ann. Statist. 27 (1) 94 - 128, February 1999. https://doi.org/10.1214/aos/1018031103
+- [Grenier et al. 2018](https://doi.org/10.1007/s40314-016-0337-5): Grenier, E., Helbert, C., Louvet, V. et al. Population parametrization of costly black box models using iterations between SAEM algorithm and kriging. Comp. Appl. Math. 37, 161–173 (2018). https://doi.org/10.1007/s40314-016-0337-5
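
Note on the README above: it lists data generation using Sobol sequences, and the `simulate_dataset_from_ranges` docstring describes ranges given as `low`/`high`/`log` entries with 2^`log_nb_individuals` patients. The sketch below shows one way such sampling could work with `scipy.stats.qmc.Sobol`. It is an illustration under assumptions, not the package's `generate_vpop_from_ranges`: the helper name `sample_params_from_ranges`, the parameter names `k_el` and `volume`, and the choice of an unscrambled sequence are all hypothetical.

```python
import numpy as np
from scipy.stats import qmc


def sample_params_from_ranges(log_nb_individuals, param_ranges):
    """Hypothetical helper: draw 2**log_nb_individuals parameter sets from a Sobol sequence."""
    names = list(param_ranges)
    # Unscrambled sequence so the sketch is deterministic (an assumption of this example)
    sampler = qmc.Sobol(d=len(names), scramble=False)
    unit = sampler.random_base2(m=log_nb_individuals)  # shape (2**m, d), values in [0, 1)
    samples = {}
    for j, name in enumerate(names):
        spec = param_ranges[name]
        low, high = float(spec["low"]), float(spec["high"])
        if spec.get("log", False):
            # Interpolate between the bounds in log space
            log_low, log_high = np.log(low), np.log(high)
            samples[name] = np.exp(log_low + unit[:, j] * (log_high - log_low))
        else:
            # Interpolate between the bounds on the natural scale
            samples[name] = low + unit[:, j] * (high - low)
    return samples


# Hypothetical parameter names and bounds, for illustration only
ranges = {
    "k_el": {"low": 0.1, "high": 10.0, "log": True},
    "volume": {"low": 1.0, "high": 5.0, "log": False},
}
vpop = sample_params_from_ranges(3, ranges)
```

Using a power-of-two sample count keeps the balance properties of the Sobol sequence, which is presumably why the package takes the number of individuals as an exponent.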

vpop_calibration-2.0.0/pyproject.toml ADDED

@@ -0,0 +1,64 @@
+[project]
+name = "vpop-calibration"
+version = "2.0.0"
+description = ""
+authors = [{ name = "Paul Lemarre", email = "paul.lemarre@novainsilico.ai" }]
+readme = "README.md"
+requires-python = ">=3.12,<4.0"
+dependencies = [
+    "torch",
+    "gpytorch (>=1.14.2,<2.0.0)",
+    "scipy (>=1.16.3,<2.0.0)",
+    "uuid (>=1.30,<2.0)",
+    "matplotlib (>=3.10.7,<4.0.0)",
+    "plotly (>=6.4.0,<7.0.0)",
+    "pandas (>=2.3.3,<3.0.0)",
+    "numpy (>=2.3.4,<3.0.0)",
+    "tqdm (>=4.67.1,<5.0.0)",
+]
+[tool.poetry.group.dev.dependencies]
+torch = { url = "https://download.pytorch.org/whl/cpu/torch-2.9.0%2Bcpu-cp313-cp313-manylinux_2_28_x86_64.whl#sha256=6c9b217584400963d5b4daddb3711ec7a3778eab211e18654fba076cce3b8682" }
+ipykernel = "^7.1.0"
+jupyter = "^1.1.1"
+ipython = "^9.7.0"
+jupyterlab = "^4.4.10"
+jupytext = "^1.18.1"
+plotnine = "^0.15.1"
+pytest = "^9.0.1"
+pytest-cov = "^7.0.0"
+[build-system]
+requires = ["poetry-core>=2.0.0,<3.0.0"]
+build-backend = "poetry.core.masonry.api"
+[tool.pytest]
+minversion = "9.0"
+addopts = ["-ra", "-q", "--cov=vpop_calibration", "--cov-report=markdown"]
+testpaths = ["vpop_calibration/test"]
+filterwarnings = ["error", "ignore::UserWarning", "ignore::DeprecationWarning"]
+[tool.coverage.run]
+branch = true
+omit = ["vpop_calibration/test/*"]
+[tool.coverage.report]
+# Regexes for lines to exclude from consideration
+exclude_also = [
+    # Don't complain about missing debug-only code:
+    "def __repr__",
+    "if self\\.debug",
+    # Don't complain if tests don't hit defensive assertion code:
+    "raise AssertionError",
+    "raise NotImplementedError",
+    # Don't complain if non-runnable code isn't run:
+    "if 0:",
+    "if __name__ == .__main__.:",
+    # Don't complain about abstract methods, they aren't run:
+    "@(abc\\.)?abstractmethod",
+]
+show_missing = true
+ignore_errors = true
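
Note on the `filterwarnings` entry above: pytest applies the last matching filter in the list, so `"error"` turns every warning into an exception except the two categories that are ignored afterwards. The sketch below reproduces that behaviour with the standard `warnings` module only; the helper `outcome_for` is hypothetical and is not part of the package or of pytest.

```python
import warnings


def outcome_for(category):
    """Hypothetical helper: report how a warning of `category` is handled."""
    with warnings.catch_warnings():
        warnings.resetwarnings()
        # Each call inserts its filter at the front of the list, so later calls
        # take precedence, like later entries in pytest's filterwarnings list.
        warnings.simplefilter("error")                       # "error"
        warnings.simplefilter("ignore", UserWarning)         # "ignore::UserWarning"
        warnings.simplefilter("ignore", DeprecationWarning)  # "ignore::DeprecationWarning"
        try:
            warnings.warn("example", category)
        except category:
            return "raised"
        return "ignored"


results = {
    cat.__name__: outcome_for(cat)
    for cat in (UserWarning, DeprecationWarning, RuntimeWarning, FutureWarning)
}
```

With this configuration, a `RuntimeWarning` or `FutureWarning` raised during the tests fails the run, while `UserWarning` and `DeprecationWarning` are silenced.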

vpop_calibration-2.0.0/vpop_calibration/__init__.py ADDED

@@ -0,0 +1,19 @@
+from .nlme import NlmeModel
+from .saem import PySaem
+from .structural_model import StructuralGp, StructuralOdeModel
+from .model import *
+from .ode import OdeModel
+from .vpop import generate_vpop_from_ranges
+from .data_generation import simulate_dataset_from_omega, simulate_dataset_from_ranges
+__all__ = [
+    "GP",
+    "OdeModel",
+    "StructuralGp",
+    "StructuralOdeModel",
+    "NlmeModel",
+    "PySaem",
+    "simulate_dataset_from_omega",
+    "simulate_dataset_from_ranges",
+    "generate_vpop_from_ranges",
+]

vpop_calibration-2.0.0/vpop_calibration/data_generation.py ADDED

@@ -0,0 +1,180 @@
+import numpy as np
+import pandas as pd
+from typing import Optional
+from .ode import OdeModel
+from .vpop import generate_vpop_from_ranges
+from .structural_model import StructuralOdeModel
+from .nlme import NlmeModel
+def simulate_dataset_from_ranges(
+    ode_model: OdeModel,
+    log_nb_individuals: int,
+    param_ranges: dict[str, dict[str, float | bool]],
+    initial_conditions: np.ndarray,
+    protocol_design: Optional[pd.DataFrame],
+    residual_error_variance: Optional[np.ndarray],
+    error_model: Optional[str],  # "additive" or "proportional"
+    time_steps: np.ndarray,
+) -> pd.DataFrame:
+    """Generate a simulated data set with an ODE model
+    Simulates a dataset for training a surrogate model. Timesteps can be different for each output.
+    The parameter space is explored with Sobol sequences.
+    Args:
+        ode_model (OdeModel): The equations to be simulated
+        log_nb_individuals (int): Base-2 logarithm of the number of simulated patients, i.e. 2^log_nb_individuals patients are generated
+        param_ranges (dict[str, dict]): For each parameter to explore, a dict describing the search space: 'low' (lower bound), 'high' (upper bound), and 'log' (True if the search space is log-scaled)
+        initial_conditions (array): set of initial conditions, one for each variable
+        protocol_design (optional): a DataFrame with a `protocol_arm` column, and one column per parameter override
+        residual_error_variance (np.array): A 1D array of residual error variances for each output.
+        error_model (str): the type of error model ("additive" or "proportional").
+        time_steps (np.array): an array with the time points
+    Returns:
+        pd.DataFrame: A DataFrame with columns 'id', parameter names, 'time', 'output_name', and 'value'.
+    Notes:
+        If a parameter appears both in the ranges and in the protocol design, the ranges take precedence.
+    """
+    # Validate input data
+    params_to_explore = list(param_ranges.keys())
+    if protocol_design is None:
+        print("No protocol")
+        params = params_to_explore
+        params_in_protocol = []
+        protocol_design_filt = pd.DataFrame({"protocol_arm": ["identity"]})
+    else:
+        params_in_protocol = protocol_design.drop(
+            "protocol_arm", axis=1
+        ).columns.tolist()
+        # Find the parameters that appear both in the ranges and the protocol
+        overlap = set(params_to_explore) & set(params_in_protocol)
+        if overlap != set():
+            protocol_design_filt = protocol_design.drop(list(overlap), axis=1)
+            print(
+                f"Warning: ignoring entries {overlap} from the protocol design (already defined in the ranges)."
+            )
+        else:
+            protocol_design_filt = protocol_design
+        params = params_to_explore + params_in_protocol
+    if set(params) != set(ode_model.param_names):
+        raise ValueError(
+            f"Under-defined system: missing {set(ode_model.param_names) - set(params)}"
+        )
+    # Generate the vpop using sobol sequences
+    patients_df = generate_vpop_from_ranges(log_nb_individuals, param_ranges)
+    # Add a choice of protocol arm for each patient
+    protocol_arms = pd.DataFrame(protocol_design_filt["protocol_arm"].drop_duplicates())
+    patients_df = patients_df.merge(protocol_arms, how="cross")
+    # Add the outputs for each patient
+    outputs = pd.DataFrame({"output_name": ode_model.variable_names})
+    patients_df = patients_df.merge(outputs, how="cross")
+    # Simulate the ODE model
+    output_df = ode_model.run_trial(
+        patients_df, initial_conditions, protocol_design_filt, time_steps
+    )
+    # Pivot to wide to add noise per model output
+    wide_output = output_df.pivot_table(
+        index=["id", *ode_model.param_names, "time", "protocol_arm"],
+        columns="output_name",
+        values="predicted_value",
+    ).reset_index()
+    if error_model is None:
+        pass
+    else:
+        if residual_error_variance is None:
+            raise ValueError("Undefined residual error variance.")
+        else:
+            # Add noise to the data
+            noise = np.random.normal(
+                np.zeros_like(residual_error_variance),
+                np.sqrt(residual_error_variance),
+                (wide_output.shape[0], ode_model.nb_outputs),
+            )
+            if error_model == "additive":
+                wide_output[ode_model.variable_names] += noise
+            elif error_model == "proportional":
+                wide_output[ode_model.variable_names] += (
+                    noise * wide_output[ode_model.variable_names]
+                )
+            else:
+                raise ValueError(f"Incorrect error_model choice: {error_model}")
+    # Pivot back to long format
+    long_output = wide_output.melt(
+        id_vars=[
+            "id",
+            "protocol_arm",
+            "time",
+            *ode_model.param_names,
+        ],
+        value_vars=ode_model.variable_names,
+        var_name="output_name",
+        value_name="value",
+    )
+    # Remove the protocol arm overrides from the data set; they are described by the protocol_arm column now
+    long_output = long_output.drop(params_in_protocol, axis=1)
+    return long_output
+def simulate_dataset_from_omega(
+    ode_model: OdeModel,
+    protocol_design: pd.DataFrame,
+    time_steps: np.ndarray,
+    init_conditions: np.ndarray,
+    log_mi: dict[str, float],
+    log_pdu: dict[str, dict[str, float]],
+    error_model: str,
+    res_var: list[float],
+    covariate_map: dict[str, dict[str, dict[str, str | float]]],
+    patient_covariates: pd.DataFrame,
+) -> pd.DataFrame:
+    """Generate synthetic data set using an ODE model and population distributions of parameters
+    Args:
+        ode_model (OdeModel): The equations to be simulated
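+        protocol_design (pd.DataFrame): a DataFrame with a `protocol_arm` column, and one column per parameter override
+        time_steps (np.ndarray): an array with the time points
+        init_conditions (np.ndarray): set of initial conditions, one for each variable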
+        log_mi (dict[str, float]): _description_
+        log_pdu (dict[str, dict[str, float]]): _description_
+        error_model (str): _description_
+        res_var (list[float]): _description_
+        covariate_map (dict[str, dict[str, dict[str, str  |  float]]]): _description_
+        patient_covariates (pd.DataFrame): _description_
+    Returns:
+        pd.DataFrame: _description_
+    """
+    structural_model = StructuralOdeModel(ode_model, protocol_design, init_conditions)
+    nlme_model = NlmeModel(
+        structural_model,
+        patient_covariates,
+        log_mi,
+        log_pdu,
+        res_var,
+        covariate_map,
+        error_model,
+    )
+    etas = nlme_model.sample_individual_etas()
+    theta = nlme_model.individual_parameters(etas, nlme_model.patients)
+    vpop = pd.DataFrame(
+        data=theta.numpy(), columns=nlme_model.structural_model.parameter_names
+    )
+    vpop["id"] = nlme_model.patients
+    protocol_arms = patient_covariates[["id", "protocol_arm"]]
+    vpop = vpop.merge(protocol_arms, on=["id"], how="left")
+    vpop = vpop.merge(
+        pd.DataFrame(data=nlme_model.outputs_names, columns=["output_name"]),
+        how="cross",
+    )
+    out = ode_model.run_trial(
+        vpop, init_conditions, protocol_design, time_steps
+    ).rename({"predicted_value": "value"}, axis=1)
+    return out
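
Note on the noise step in `simulate_dataset_from_ranges` above: the function draws zero-mean Gaussian noise with one variance per output, then adds it directly (additive) or scaled by the simulated value (proportional). The sketch below isolates that step on a plain NumPy array. The helper name `add_residual_error` and the example values are hypothetical, and it uses a seeded `numpy.random.Generator` instead of the global `np.random.normal` call in the diff, purely to keep the example reproducible.

```python
import numpy as np


def add_residual_error(values, residual_error_variance, error_model, rng):
    """Hypothetical helper mirroring the noise step shown in the diff.

    values: array of shape (n_rows, n_outputs)
    residual_error_variance: array of shape (n_outputs,)
    """
    values = np.asarray(values, dtype=float)
    sd = np.sqrt(np.asarray(residual_error_variance, dtype=float))
    # One standard deviation per output column, broadcast across all rows
    noise = rng.normal(0.0, sd, size=values.shape)
    if error_model == "additive":
        return values + noise
    if error_model == "proportional":
        # Noise scales with the simulated value, so zeros stay exactly zero
        return values + noise * values
    raise ValueError(f"Incorrect error_model choice: {error_model}")


rng = np.random.default_rng(0)
clean = np.array([[1.0, 10.0], [2.0, 20.0], [0.0, 30.0]])
variance = np.array([0.01, 0.04])
additive = add_residual_error(clean, variance, "additive", rng)
proportional = add_residual_error(clean, variance, "proportional", rng)
```

One consequence worth noting: under the proportional model an output that is exactly zero receives no noise, whereas the additive model perturbs every value by the same scale regardless of magnitude.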

vpop_calibration-2.0.0/vpop_calibration/model/__init__.py ADDED

@@ -0,0 +1,3 @@
+from .gp import GP
+__all__ = ["GP"]