PyPI - pySEQTarget - Versions diffs - 0.9.0__tar.gz → 0.10.1__tar.gz - Mend

pySEQTarget 0.9.0__tar.gz → 0.10.1__tar.gz

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (61)

{pyseqtarget-0.9.0 → pyseqtarget-0.10.1}/PKG-INFO RENAMED

@@ -1,6 +1,6 @@
 Metadata-Version: 2.4
 Name: pySEQTarget
-Version: 0.9.0
+Version: 0.10.1
 Summary: Sequentially Nested Target Trial Emulation
 Author-email: Ryan O'Dea <ryan.odea@psi.ch>, Alejandro Szmulewicz <aszmulewicz@hsph.harvard.edu>, Tom Palmer <tom.palmer@bristol.ac.uk>, Miguel Hernan <mhernan@hsph.harvard.edu>
 Maintainer-email: Ryan O'Dea <ryan.odea@psi.ch>
@@ -21,6 +21,8 @@ Classifier: Programming Language :: Python :: 3
 Classifier: Programming Language :: Python :: 3.10
 Classifier: Programming Language :: Python :: 3.11
 Classifier: Programming Language :: Python :: 3.12
+Classifier: Programming Language :: Python :: 3.13
+Classifier: Programming Language :: Python :: 3.14
 Requires-Python: >=3.10
 Description-Content-Type: text/markdown
 License-File: LICENSE
@@ -34,11 +36,16 @@ Requires-Dist: lifelines
 Dynamic: license-file
 # pySEQTarget - Sequentially Nested Target Trial Emulation
+[![PyPI version](https://badge.fury.io/py/pySEQTarget.svg)](https://pypi.org/project/pySEQTarget)
+[![Downloads](https://static.pepy.tech/badge/pySEQTarget)](https://pepy.tech/project/pySEQTarget)
+[![codecov](https://codecov.io/gh/CausalInference/pySEQTarget/graph/badge.svg?token=DMOVJJUWXP)](https://codecov.io/gh/CausalInference/pySEQTarget)[![License: MIT](https://img.shields.io/badge/License-MIT-blue.svg)](https://opensource.org/licenses/MIT)
+![versions](https://img.shields.io/pypi/pyversions/pySEQTarget.svg)
+[![Documentation Status](https://readthedocs.org/projects/pySEQTarget/badge/?version=latest)](https://pySEQTarget.readthedocs.io)
 Implementation of sequential trial emulation for the analysis of
-observational databases. The ‘SEQTaRget’ software accommodates
+observational databases. The `SEQTaRget` software accommodates
 time-varying treatments and confounders, as well as binary and failure
-time outcomes. ‘SEQTaRget’ allows to compare both static and dynamic
+time outcomes. `SEQTaRget` allows comparison of both static and dynamic
 strategies, can be used to estimate observational analogs of
 intention-to-treat and per-protocol effects, and can adjust for
 potential selection bias.
@@ -61,8 +68,9 @@ From the user side, this amounts to creating a dataclass, `SEQopts`, and then fe
 ```python
 import polars as pl
 from pySEQTarget import SEQuential, SEQopts
+from pySEQTarget.data import load_data
-data = pl.from_pandas(SEQdata)
+data = load_data("SEQdata")
 options = SEQopts(km_curves = True)
 # Initiate the class
@@ -70,22 +78,22 @@ model = SEQuential(data,
                    id_col = "ID",
                    time_col = "time",
                    eligible_col = "eligible",
+                   treatment_col = "tx_init",
+                   outcome_col = "outcome",
                    time_varying_cols = ["N", "L", "P"],
                    fixed_cols = ["sex"],
                    method = "ITT",
-                   options = options)
+                   parameters = options)
 model.expand()  # Construct the nested structure
 model.bootstrap(bootstrap_nboot = 20) # Run 20 bootstrap samples
 model.fit() # Fit the model
 model.survival() # Create survival curves
 model.plot() # Create and show a plot of the survival curves
 model.collect() # Collection of important information
 ```
 ## Assumptions
 There are several key assumptions in this package -
 1. User provided `time_col` begins at 0 per unique `id_col`, we also assume this column contains only integers and continues by 1 for every time step, e.g. (0, 1, 2, 3, 4, ...) is allowed and (0, 1, 2, 2.5, ...) or (0, 1, 4, 5) are not
     1. Provided `time_col` entries may be out of order at intake as a sort is enforced at expansion.
-2. `eligible_col`, `excused_column_names` and [TODO] are once 1, only 1 (with respect to `time_col`) flag variables.
+2. `eligible_col` and elements of `excused_colnames` are once 1, only 1 (with respect to `time_col`) flag variables.
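
The first assumption above (per-ID time values of exactly 0, 1, 2, ... in any input order) can be checked before analysis. The following is a minimal standard-library sketch, not part of pySEQTarget; the function name `find_bad_time_ids` and the `(id, time)` tuple layout are assumptions made for illustration.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def find_bad_time_ids(rows: Iterable[Tuple[str, float]]) -> List[str]:
    """Return IDs whose time values are not exactly 0, 1, 2, ... after sorting."""
    times_by_id: Dict[str, List[float]] = defaultdict(list)
    for subject_id, t in rows:
        times_by_id[subject_id].append(t)

    bad = []
    for subject_id, times in times_by_id.items():
        # Input order does not matter, mirroring the sort enforced at expansion.
        if sorted(times) != list(range(len(times))):
            bad.append(subject_id)
    return sorted(bad)


rows = [
    ("a", 0), ("a", 2), ("a", 1),    # out of order but valid
    ("b", 0), ("b", 1), ("b", 2.5),  # non-integer step
    ("c", 0), ("c", 1), ("c", 4),    # gap in the sequence
]
print(find_bad_time_ids(rows))  # ['b', 'c']
```

Note that this sketch accepts floats with integer values (1.0 compares equal to 1); a stricter check would also test the type.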

{pyseqtarget-0.9.0 → pyseqtarget-0.10.1}/README.md RENAMED

@@ -1,9 +1,14 @@
 # pySEQTarget - Sequentially Nested Target Trial Emulation
+[![PyPI version](https://badge.fury.io/py/pySEQTarget.svg)](https://pypi.org/project/pySEQTarget)
+[![Downloads](https://static.pepy.tech/badge/pySEQTarget)](https://pepy.tech/project/pySEQTarget)
+[![codecov](https://codecov.io/gh/CausalInference/pySEQTarget/graph/badge.svg?token=DMOVJJUWXP)](https://codecov.io/gh/CausalInference/pySEQTarget)[![License: MIT](https://img.shields.io/badge/License-MIT-blue.svg)](https://opensource.org/licenses/MIT)
+![versions](https://img.shields.io/pypi/pyversions/pySEQTarget.svg)
+[![Documentation Status](https://readthedocs.org/projects/pySEQTarget/badge/?version=latest)](https://pySEQTarget.readthedocs.io)
 Implementation of sequential trial emulation for the analysis of
-observational databases. The ‘SEQTaRget’ software accommodates
+observational databases. The `SEQTaRget` software accommodates
 time-varying treatments and confounders, as well as binary and failure
-time outcomes. ‘SEQTaRget’ allows to compare both static and dynamic
+time outcomes. `SEQTaRget` allows comparison of both static and dynamic
 strategies, can be used to estimate observational analogs of
 intention-to-treat and per-protocol effects, and can adjust for
 potential selection bias.
@@ -26,8 +31,9 @@ From the user side, this amounts to creating a dataclass, `SEQopts`, and then fe
 ```python
 import polars as pl
 from pySEQTarget import SEQuential, SEQopts
+from pySEQTarget.data import load_data
-data = pl.from_pandas(SEQdata)
+data = load_data("SEQdata")
 options = SEQopts(km_curves = True)
 # Initiate the class
@@ -35,22 +41,22 @@ model = SEQuential(data,
                    id_col = "ID",
                    time_col = "time",
                    eligible_col = "eligible",
+                   treatment_col = "tx_init",
+                   outcome_col = "outcome",
                    time_varying_cols = ["N", "L", "P"],
                    fixed_cols = ["sex"],
                    method = "ITT",
-                   options = options)
+                   parameters = options)
 model.expand()  # Construct the nested structure
 model.bootstrap(bootstrap_nboot = 20) # Run 20 bootstrap samples
 model.fit() # Fit the model
 model.survival() # Create survival curves
 model.plot() # Create and show a plot of the survival curves
 model.collect() # Collection of important information
 ```
 ## Assumptions
 There are several key assumptions in this package -
 1. User provided `time_col` begins at 0 per unique `id_col`, we also assume this column contains only integers and continues by 1 for every time step, e.g. (0, 1, 2, 3, 4, ...) is allowed and (0, 1, 2, 2.5, ...) or (0, 1, 4, 5) are not
     1. Provided `time_col` entries may be out of order at intake as a sort is enforced at expansion.
-2. `eligible_col`, `excused_column_names` and [TODO] are once 1, only 1 (with respect to `time_col`) flag variables.
+2. `eligible_col` and elements of `excused_colnames` are once 1, only 1 (with respect to `time_col`) flag variables.
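
The README does not spell out what "once 1, only 1" means. One plausible reading, assumed here and not confirmed by the source, is that once a flag becomes 1 it stays 1 for all later time points of that ID. Under that assumption, a per-ID check might look like this sketch (the function name is hypothetical):

```python
from typing import Sequence


def is_once_one_only_one(flags: Sequence[int]) -> bool:
    """True if, once a 1 appears, every later value is also 1.

    `flags` must already be ordered by time for a single ID.
    This encodes an assumed reading of the README, not documented behavior.
    """
    seen_one = False
    for value in flags:
        if value == 1:
            seen_one = True
        elif seen_one:
            # A non-1 value after the first 1 violates the assumed rule.
            return False
    return True


print(is_once_one_only_one([0, 0, 1, 1]))  # True
print(is_once_one_only_one([0, 1, 0]))     # False
```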

pyseqtarget-0.10.1/pySEQTarget/SEQopts.py ADDED

@@ -0,0 +1,197 @@
+import multiprocessing
+from dataclasses import dataclass, field
+from typing import List, Literal, Optional
+@dataclass
+class SEQopts:
+    """
+    Parameter builder for ``pySEQTarget.SEQuential`` analysis
+    :param bootstrap_nboot: Number of bootstraps to perform
+    :type bootstrap_nboot: int
+    :param bootstrap_sample: Subsampling proportion of ID-Trials gathered for each bootstrapping iteration
+    :type bootstrap_sample: float
+    :param bootstrap_CI: If bootstrapped, confidence interval level
+    :type bootstrap_CI: float
+    :param bootstrap_CI_method: If bootstrapped, confidence interval generation method ['se' or 'percentile']
+    :type bootstrap_CI_method: str
+    :param cense_colname: Column name for censoring effect (LTFU, etc.)
+    :type cense_colname: str
+    :param cense_denominator: Override to specify denominator patsy formula for censoring models
+    :type cense_denominator: Optional[str] or None
+    :param cense_numerator: Override to specify numerator patsy formula for censoring models
+    :type cense_numerator: Optional[str] or None
+    :param cense_eligible_colname: Column name to identify which rows are eligible for censoring model fitting
+    :type cense_eligible_colname: Optional[str] or None
+    :param compevent_colname: Column name specifying a competing event to the outcome
+    :type compevent_colname: str
+    :param covariates: Override to specify the outcome patsy formula for outcome model fitting
+    :type covariates: Optional[str] or None
+    :param denominator: Override to specify the outcome patsy formula for denominator model fitting
+    :type denominator: Optional[str] or None
+    :param excused: Boolean to allow excused conditions when method is censoring
+    :type excused: bool
+    :param excused_colnames: Column names (at the same length of treatment_level) specifying excused conditions
+    :type excused_colnames: List[str] or []
+    :param followup_class: Boolean to force followup values to be treated as classes
+    :type followup_class: bool
+    :param followup_include: Boolean to force regular followup values into model covariates
+    :type followup_include: bool
+    :param followup_spline: Boolean to force followup values to be fit to cubic spline
+    :type followup_spline: bool
+    :param followup_max: Maximum allowed followup in analysis
+    :type followup_max: int or None
+    :param followup_min: Minimum allowed followup in analysis
+    :type followup_min: int
+    :param hazard_estimate: Boolean to create hazard estimates
+    :type hazard_estimate: bool
+    :param indicator_baseline: How to indicate baseline columns in models
+    :type indicator_baseline: str
+    :param indicator_squared: How to indicate squared columns in models
+    :type indicator_squared: str
+    :param km_curves: Boolean to create survival, risk, and incidence (if applicable) estimates
+    :type km_curves: bool
+    :param ncores: Number of cores to use if running in parallel
+    :type ncores: int
+    :param numerator: Override to specify the outcome patsy formula for numerator models
+    :type numerator: str
+    :param parallel: Boolean to run model fitting in parallel
+    :type parallel: bool
+    :param plot_colors: List of colors for KM plots, if applicable
+    :type plot_colors: List[str]
+    :param plot_labels: List of length treat_level to specify treatment labeling
+    :type plot_labels: List[str]
+    :param plot_title: Plot title
+    :type plot_title: str
+    :param plot_type: Type of plot to show ["risk", "survival" or "incidence" if compevent is specified]
+    :type plot_type: str
+    :param seed: RNG seed
+    :type seed: int
+    :param selection_first_trial: Boolean to only use first trial for analysis (similar to non-expanded)
+    :type selection_first_trial: bool
+    :param selection_sample: Subsampling proportion of ID-trials which did not initiate a treatment
+    :type selection_sample: float
+    :param selection_random: Boolean to randomly downsample ID-trials which did not initiate a treatment
+    :type selection_random: bool
+    :param subgroup_colname: Column name for subgroups to share the same weighting but different outcome model fits
+    :type subgroup_colname: str
+    :param treatment_level: List of eligible treatment levels within treatment_col
+    :type treatment_level: List[int]
+    :param trial_include: Boolean to force trial values into model covariates
+    :type trial_include: bool
+    :param weight_eligible_colnames: List of column names of length treatment_level to identify which rows are eligible for weight fitting
+    :type weight_eligible_colnames: List[str]
+    :param weight_min: Minimum weight
+    :type weight_min: float
+    :param weight_max: Maximum weight
+    :type weight_max: float or None
+    :param weight_lag_condition: Boolean to fit weights based on their treatment lag
+    :type weight_lag_condition: bool
+    :param weight_p99: Boolean to force weight min and max to be 1st and 99th percentile respectively
+    :type weight_p99: bool
+    :param weight_preexpansion: Boolean to fit weights on preexpanded data
+    :type weight_preexpansion: bool
+    :param weighted: Boolean to weight analysis
+    :type weighted: bool
+    """
+    bootstrap_nboot: int = 0
+    bootstrap_sample: float = 0.8
+    bootstrap_CI: float = 0.95
+    bootstrap_CI_method: Literal["se", "percentile"] = "se"
+    cense_colname: Optional[str] = None
+    cense_denominator: Optional[str] = None
+    cense_numerator: Optional[str] = None
+    cense_eligible_colname: Optional[str] = None
+    compevent_colname: Optional[str] = None
+    covariates: Optional[str] = None
+    denominator: Optional[str] = None
+    excused: bool = False
+    excused_colnames: List[str] = field(default_factory=lambda: [])
+    followup_class: bool = False
+    followup_include: bool = True
+    followup_max: int = None
+    followup_min: int = 0
+    followup_spline: bool = False
+    hazard_estimate: bool = False
+    indicator_baseline: str = "_bas"
+    indicator_squared: str = "_sq"
+    km_curves: bool = False
+    ncores: int = multiprocessing.cpu_count()
+    numerator: Optional[str] = None
+    parallel: bool = False
+    plot_colors: List[str] = field(
+        default_factory=lambda: ["#F8766D", "#00BFC4", "#555555"]
+    )
+    plot_labels: List[str] = field(default_factory=lambda: [])
+    plot_title: str = None
+    plot_type: Literal["risk", "survival", "incidence"] = "risk"
+    seed: Optional[int] = None
+    selection_first_trial: bool = False
+    selection_sample: float = 0.8
+    selection_random: bool = False
+    subgroup_colname: str = None
+    treatment_level: List[int] = field(default_factory=lambda: [0, 1])
+    trial_include: bool = True
+    visit_colname: str = None
+    weight_eligible_colnames: List[str] = field(default_factory=lambda: [])
+    weight_min: float = 0.0
+    weight_max: float = None
+    weight_lag_condition: bool = True
+    weight_p99: bool = False
+    weight_preexpansion: bool = False
+    weighted: bool = False
+    def __post_init__(self):
+        bools = [
+            "excused",
+            "followup_class",
+            "followup_include",
+            "followup_spline",
+            "hazard_estimate",
+            "km_curves",
+            "parallel",
+            "selection_first_trial",
+            "selection_random",
+            "trial_include",
+            "weight_lag_condition",
+            "weight_p99",
+            "weight_preexpansion",
+            "weighted",
+        ]
+        for i in bools:
+            if not isinstance(getattr(self, i), bool):
+                raise TypeError(f"{i} must be a boolean value.")
+        if not isinstance(self.bootstrap_nboot, int) or self.bootstrap_nboot < 0:
+            raise ValueError("bootstrap_nboot must be a non-negative integer.")
+        if not isinstance(self.ncores, int) or self.ncores < 1:
+            raise ValueError("ncores must be a positive integer.")
+        if not (0.0 <= self.bootstrap_sample <= 1.0):
+            raise ValueError("bootstrap_sample must be between 0 and 1.")
+        if not (0.0 < self.bootstrap_CI < 1.0):
+            raise ValueError("bootstrap_CI must be between 0 and 1.")
+        if not (0.0 <= self.selection_sample <= 1.0):
+            raise ValueError("selection_sample must be between 0 and 1.")
+        if self.plot_type not in ["risk", "survival", "incidence"]:
+            raise ValueError(
+                "plot_type must be either 'risk', 'survival', or 'incidence'."
+            )
+        if self.bootstrap_CI_method not in ["se", "percentile"]:
+            raise ValueError("bootstrap_CI_method must be one of 'se' or 'percentile'")
+        for i in (
+            "covariates",
+            "numerator",
+            "denominator",
+            "cense_numerator",
+            "cense_denominator",
+        ):
+            attr = getattr(self, i)
+            if attr is not None and not isinstance(attr, list):
+                setattr(self, i, "".join(attr.split()))
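
The validation pattern in `__post_init__` can be tried in isolation. The sketch below uses a hypothetical three-field stand-in, `MiniOpts`, rather than the real `SEQopts`; it shows the type-before-range ordering of the checks and the whitespace stripping applied to formula strings.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class MiniOpts:
    """Hypothetical three-field stand-in for SEQopts (illustration only)."""

    bootstrap_nboot: int = 0
    km_curves: bool = False
    covariates: Optional[str] = None

    def __post_init__(self):
        if not isinstance(self.km_curves, bool):
            raise TypeError("km_curves must be a boolean value.")
        # Type check first so a non-integer never reaches the comparison.
        if not isinstance(self.bootstrap_nboot, int) or self.bootstrap_nboot < 0:
            raise ValueError("bootstrap_nboot must be a non-negative integer.")
        if self.covariates is not None:
            # Same normalization as in the diff: drop all whitespace.
            self.covariates = "".join(self.covariates.split())


opts = MiniOpts(bootstrap_nboot=20, covariates="sex + N_bas + followup")
print(opts.covariates)  # sex+N_bas+followup
```

Putting the `isinstance` test before the comparison matters: with the order reversed, a string value would raise a `TypeError` from `<` instead of the intended `ValueError`.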

pyseqtarget-0.10.1/pySEQTarget/SEQoutput.py ADDED

@@ -0,0 +1,163 @@
+import tempfile
+from dataclasses import dataclass
+from pathlib import Path
+from typing import List, Literal, Optional
+import matplotlib.figure
+import polars as pl
+from statsmodels.base.wrapper import ResultsWrapper
+from .helpers import _build_md, _build_pdf
+from .SEQopts import SEQopts
+@dataclass
+class SEQoutput:
+    """
+    Collector class for results from ``SEQuential``
+    :param options: Options used in the SEQuential process
+    :type options: SEQopts or None
+    :param method: Method of analysis ['ITT', 'dose-response', or 'censoring']
+    :type method: str
+    :param numerator_models: Numerator models, if applicable, from the weighting process
+    :type numerator_models: List[ResultsWrapper] or None
+    :param denominator_models: Denominator models, if applicable, from the weighting process
+    :type denominator_models: List[ResultsWrapper] or None
+    :param compevent_models: Competing event models, if applicable
+    :type compevent_models: List[ResultsWrapper] or None
+    :param weight_statistics: Weight statistics once returned back to the expanded dataset
+    :type weight_statistics: dict or None
+    :param hazard: Hazard ratio if applicable
+    :type hazard: pl.DataFrame or None
+    :param km_data: Dataframe of risk, survival, and incidence data if applicable at all followups
+    :type km_data: pl.DataFrame or None
+    :param km_graph: Figure of survival, risk, or incidence over followup times
+    :type km_graph: matplotlib.figure.Figure or None
+    :param risk_ratio: Dataframe of risk ratios, compared between treatments and subgroups
+    :type risk_ratio: pl.DataFrame or None
+    :param risk_difference: Dataframe of risk differences, compared between treatments and subgroups
+    :type risk_difference: pl.DataFrame or None
+    :param time: Timings for every step of the process completed thus far
+    :type time: dict or None
+    :param diagnostic_tables: Diagnostic tables for unique and nonunique outcome events and treatment switches
+    :type diagnostic_tables: dict or None
+    """
+    options: SEQopts = None
+    method: str = None
+    numerator_models: List[ResultsWrapper] = None
+    denominator_models: List[ResultsWrapper] = None
+    outcome_models: List[List[ResultsWrapper]] = None
+    compevent_models: List[List[ResultsWrapper]] = None
+    weight_statistics: pl.DataFrame = None
+    hazard: pl.DataFrame = None
+    km_data: pl.DataFrame = None
+    km_graph: matplotlib.figure.Figure = None
+    risk_ratio: pl.DataFrame = None
+    risk_difference: pl.DataFrame = None
+    time: dict = None
+    diagnostic_tables: dict = None
+    def plot(self) -> None:
+        """
+        Prints the Kaplan-Meier graph
+        """
+        print(self.km_graph)
+    def summary(
+        self, type: Optional[Literal["numerator", "denominator", "outcome", "compevent"]] = None
+    ) -> List:
+        """
+        Returns a list of model summaries of either the numerator, denominator, outcome, or competing event models
+        :param type: Indicator for which model list you would like returned
+        :type type: str
+        """
+        match type:
+            case "numerator":
+                models = self.numerator_models
+            case "denominator":
+                models = self.denominator_models
+            case "compevent":
+                models = self.compevent_models
+            case _:
+                models = self.outcome_models
+        return [model.summary() for model in models]
+    def retrieve_data(
+        self,
+        type: Optional[
+            Literal[
+                "km_data",
+                "hazard",
+                "risk_ratio",
+                "risk_difference",
+                "unique_outcomes",
+                "nonunique_outcomes",
+                "unique_switches",
+                "nonunique_switches",
+            ]
+        ] = None,
+    ) -> pl.DataFrame:
+        """
+        Getter for data stored within ``SEQoutput``
+        :param type: Data which you would like to access, ['km_data', 'hazard', 'risk_ratio', 'risk_difference', 'unique_outcomes', 'nonunique_outcomes', 'unique_switches', 'nonunique_switches']
+        :type type: str
+        """
+        match type:
+            case "hazard":
+                data = self.hazard
+            case "risk_ratio":
+                data = self.risk_ratio
+            case "risk_difference":
+                data = self.risk_difference
+            case "unique_outcomes":
+                data = self.diagnostic_tables["unique_outcomes"]
+            case "nonunique_outcomes":
+                data = self.diagnostic_tables["nonunique_outcomes"]
+            case "unique_switches":
+                if "unique_switches" in self.diagnostic_tables:
+                    data = self.diagnostic_tables["unique_switches"]
+                else:
+                    data = None
+            case "nonunique_switches":
+                if "nonunique_switches" in self.diagnostic_tables:
+                    data = self.diagnostic_tables["nonunique_switches"]
+                else:
+                    data = None
+            case _:
+                data = self.km_data
+        if data is None:
+            raise ValueError(f"Data {type} was not created in the SEQuential process")
+        return data
+    def to_md(self, filename="SEQuential_results.md") -> None:
+        """Generates a markdown report of the SEQuential analysis results."""
+        img_path = None
+        if self.options.km_curves and self.km_graph is not None:
+            img_path = Path(filename).with_suffix(".png")
+            self.km_graph.savefig(img_path, dpi=300, bbox_inches="tight")
+            img_path = img_path.name
+        with open(filename, "w") as f:
+            f.write(_build_md(self, img_path))
+        print(f"Results saved to {filename}")
+    def to_pdf(self, filename="SEQuential_results.pdf") -> None:
+        """Generates a PDF report of the SEQuential analysis results."""
+        with tempfile.TemporaryDirectory() as tmpdir:
+            tmp_md = Path(tmpdir) / "report.md"
+            self.to_md(str(tmp_md))
+            with open(tmp_md, "r") as f:
+                md_content = f.read()
+            tmp_img = tmp_md.with_suffix(".png")
+            img_abs_path = str(tmp_img.absolute()) if tmp_img.exists() else None
+            _build_pdf(md_content, filename, img_abs_path)
+        print(f"Results saved to {filename}")
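
The lookup-or-raise behavior of `retrieve_data` can be sketched without the package. This is a simplified illustration, not the package's implementation: the function `retrieve` and the plain dictionary are assumptions, and `dict.get` stands in for the per-key branches (Python 3 dictionaries have no `has_key` method).

```python
from typing import Any, Dict, Optional


def retrieve(tables: Dict[str, Optional[Any]], name: str = "km_data") -> Any:
    """Look up a stored result by name, failing loudly if it was never created."""
    data = tables.get(name)  # missing keys and explicit None are treated alike
    if data is None:
        # f-string so the requested name actually appears in the message
        raise ValueError(f"Data {name} was not created in the SEQuential process")
    return data


results = {"km_data": [0.1, 0.2], "hazard": None}
print(retrieve(results))  # [0.1, 0.2]
```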

{pyseqtarget-0.9.0 → pyseqtarget-0.10.1}/pySEQTarget/SEQuential.py RENAMED

@@ -7,9 +7,10 @@ from typing import List, Literal, Optional
 import numpy as np
 import polars as pl
-from .analysis import (_calculate_hazard, _calculate_survival, _outcome_fit,
-                       _pred_risk, _risk_estimates, _subgroup_fit)
-from .error import _datachecker, _param_checker
+from .analysis import (_calculate_hazard, _calculate_survival, _clamp,
+                       _outcome_fit, _pred_risk, _risk_estimates,
+                       _subgroup_fit)
+from .error import _data_checker, _param_checker
 from .expansion import _binder, _diagnostics, _dynamic, _random_selection
 from .helpers import _col_string, _format_time, bootstrap_loop
 from .initialization import (_cense_denominator, _cense_numerator,
@@ -18,11 +19,36 @@ from .plot import _survival_plot
 from .SEQopts import SEQopts
 from .SEQoutput import SEQoutput
 from .weighting import (_fit_denominator, _fit_LTFU, _fit_numerator,
-                        _weight_bind, _weight_predict, _weight_setup,
-                        _weight_stats)
+                        _fit_visit, _weight_bind, _weight_predict,
+                        _weight_setup, _weight_stats)
 class SEQuential:
+    """
+    Primary class initializer for SEQuentially nested target trial emulation
+    :param data: Data for analysis
+    :type data: pl.DataFrame
+    :param id_col: Column name for unique patient IDs
+    :type id_col: str
+    :param time_col: Column name for observational time points
+    :type time_col: str
+    :param eligible_col: Column name for analytical eligibility
+    :type eligible_col: str
+    :param treatment_col: Column name specifying treatment per time_col
+    :type treatment_col: str
+    :param outcome_col: Column name specifying outcome per time_col
+    :type outcome_col: str
+    :param time_varying_cols: Time-varying column names as covariates (BMI, Age, etc.)
+    :type time_varying_cols: Optional[List[str]] or None
+    :param fixed_cols: Fixed column names as covariates (Sex, YOB, etc.)
+    :type fixed_cols: Optional[List[str]] or None
+    :param method: Method for analysis ['ITT', 'dose-response', or 'censoring']
+    :type method: str
+    :param parameters: Parameters to augment analysis, specified with ``pySEQTarget.SEQopts``
+    :type parameters: Optional[SEQopts] or None
+    """
     def __init__(
         self,
         data: pl.DataFrame,
@@ -68,7 +94,7 @@ class SEQuential:
             if self.denominator is None:
                 self.denominator = _denominator(self)
-            if self.cense_colname is not None:
+            if self.cense_colname is not None or self.visit_colname is not None:
                 if self.cense_numerator is None:
                     self.cense_numerator = _cense_numerator(self)
@@ -76,14 +102,18 @@ class SEQuential:
                     self.cense_denominator = _cense_denominator(self)
         _param_checker(self)
-        _datachecker(self)
+        _data_checker(self)
-    def expand(self):
+    def expand(self) -> None:
+        """
+        Creates the sequentially nested, emulated target trial structure
+        """
         start = time.perf_counter()
         kept = [
             self.cense_colname,
             self.cense_eligible_colname,
             self.compevent_colname,
+            self.visit_colname,
             *self.weight_eligible_colnames,
             *self.excused_colnames,
         ]
@@ -136,7 +166,10 @@ class SEQuential:
         end = time.perf_counter()
         self._expansion_time = _format_time(start, end)
-    def bootstrap(self, **kwargs):
+    def bootstrap(self, **kwargs) -> None:
+        """
+        Internally sets up bootstrapping - creating a list of IDs to use per iteration
+        """
         allowed = {
             "bootstrap_nboot",
             "bootstrap_sample",
@@ -148,7 +181,6 @@ class SEQuential:
                 setattr(self, key, value)
             else:
                 raise ValueError(f"Unknown argument: {key}")
         UIDs = self.DT.select(pl.col(self.id_col)).unique().to_series().to_list()
         NIDs = len(UIDs)
@@ -159,10 +191,12 @@ class SEQuential:
             )
             id_counts = Counter(sampled_IDs)
             self._boot_samples.append(id_counts)
-        return self
     @bootstrap_loop
-    def fit(self):
+    def fit(self) -> None:
+        """
+        Fits weight models (numerator, denominator, censoring) and outcome models (outcome, competing event)
+        """
         if self.bootstrap_nboot > 0 and not hasattr(self, "_boot_samples"):
             raise ValueError(
                 "Bootstrap sampling not found. Please run the 'bootstrap' method before fitting with bootstrapping."
@@ -179,6 +213,7 @@ class SEQuential:
                     WDT[col] = WDT[col].astype("category")
             _fit_LTFU(self, WDT)
+            _fit_visit(self, WDT)
             _fit_numerator(self, WDT)
             _fit_denominator(self, WDT)
@@ -211,7 +246,17 @@ class SEQuential:
             )
         return models
-    def survival(self):
+    def survival(self, **kwargs) -> None:
+        """
+        Uses fit outcome models (outcome, competing event) to estimate risk, survival, and incidence curves
+        """
+        allowed = {"bootstrap_CI", "bootstrap_CI_method"}
+        for key, val in kwargs.items():
+            if key in allowed:
+                setattr(self, key, val)
+            else:
+                raise ValueError(f"Unknown or misplaced argument: {key}")
         if not hasattr(self, "outcome_model") or not self.outcome_model:
             raise ValueError(
                 "Outcome model not found. Please run the 'fit' method before calculating survival."
@@ -221,13 +266,16 @@ class SEQuential:
         risk_data = _pred_risk(self)
         surv_data = _calculate_survival(self, risk_data)
-        self.km_data = pl.concat([risk_data, surv_data])
+        self.km_data = _clamp(pl.concat([risk_data, surv_data]))
         self.risk_estimates = _risk_estimates(self)
         end = time.perf_counter()
         self._survival_time = _format_time(start, end)
-    def hazard(self):
+    def hazard(self) -> None:
+        """
+        Uses fit outcome models (outcome, competing event) to estimate hazard ratios
+        """
         start = time.perf_counter()
         if not hasattr(self, "outcome_model") or not self.outcome_model:
@@ -239,10 +287,22 @@ class SEQuential:
         end = time.perf_counter()
         self._hazard_time = _format_time(start, end)
-    def plot(self):
+    def plot(self, **kwargs) -> None:
+        """
+        Shows a plot specific to plot_type
+        """
+        allowed = {"plot_type", "plot_colors", "plot_title", "plot_labels"}
+        for key, val in kwargs.items():
+            if key in allowed:
+                setattr(self, key, val)
+            else:
+                raise ValueError(f"Unknown or misplaced argument: {key}")
         self.km_graph = _survival_plot(self)
-    def collect(self):
+    def collect(self) -> SEQoutput:
+        """
+        Collects all results currently created into ``SEQoutput`` class
+        """
         self._time_collected = datetime.datetime.now()
         generated = [
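
The `bootstrap`, `survival`, and `plot` methods in this diff share one pattern: keyword arguments are checked against an allow-list and then stored as attributes. A minimal sketch of that pattern follows; the `Settings` class and its `update` method are hypothetical and exist only to make the example runnable.

```python
class Settings:
    """Hypothetical holder for plot options (not the package's class)."""

    def __init__(self):
        self.plot_type = "risk"
        self.plot_title = None

    def update(self, allowed, **kwargs):
        """Set only allow-listed attributes; reject anything else by name."""
        for key, val in kwargs.items():
            if key not in allowed:
                raise ValueError(f"Unknown or misplaced argument: {key}")
            setattr(self, key, val)


s = Settings()
s.update({"plot_type", "plot_title"}, plot_type="survival")
print(s.plot_type)  # survival
```

As in the diff, keys are applied one at a time, so any valid keys that precede an invalid one have already been set when the error is raised.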
