PyPI - validmind - Versions diffs - 2.5.8__py3-none-any.whl → 2.5.15__py3-none-any.whl - Mend



Files changed (212)

validmind/tests/model_validation/statsmodels/JarqueBera.py CHANGED

@@ -11,36 +11,41 @@ class JarqueBera(Metric):
     """
     Assesses normality of dataset features in an ML model using the Jarque-Bera test.
-    **Purpose**: The purpose of the Jarque-Bera test as implemented in this metric is to determine if the features in
-    the dataset of a given Machine Learning model follows a normal distribution. This is crucial for understanding the
-    distribution and behavior of the model's features, as numerous statistical methods assume normal distribution of
-    the data.
-    **Test Mechanism**: The test mechanism involves computing the Jarque-Bera statistic, p-value, skew, and kurtosis
-    for each feature in the dataset. It utilizes the 'jarque_bera' function from the 'statsmodels' library in Python,
-    storing the results in a dictionary. The test evaluates the skewness and kurtosis to ascertain whether the dataset
-    follows a normal distribution. A significant p-value (typically less than 0.05) implies that the data does not
-    possess normal distribution.
-    **Signs of High Risk**:
-    - A high Jarque-Bera statistic and a low p-value (usually less than 0.05) indicates high-risk conditions.
+    ### Purpose
+    The purpose of the Jarque-Bera test as implemented in this metric is to determine if the features in the dataset of
+    a given Machine Learning model follow a normal distribution. This is crucial for understanding the distribution and
+    behavior of the model's features, as numerous statistical methods assume normal distribution of the data.
+    ### Test Mechanism
+    The test mechanism involves computing the Jarque-Bera statistic, p-value, skew, and kurtosis for each feature in
+    the dataset. It utilizes the 'jarque_bera' function from the 'statsmodels' library in Python, storing the results
+    in a dictionary. The test evaluates the skewness and kurtosis to ascertain whether the dataset follows a normal
+    distribution. A significant p-value (typically less than 0.05) implies that the data does not possess normal
+    distribution.
+    ### Signs of High Risk
+    - A high Jarque-Bera statistic and a low p-value (usually less than 0.05) indicate high-risk conditions.
     - Such results suggest the data significantly deviates from a normal distribution. If a machine learning model
     expects feature data to be normally distributed, these findings imply that it may not function as intended.
-    **Strengths**:
-    - This test provides insights into the shape of the data distribution, helping determine whether a given set of
-    data follows a normal distribution.
-    - This is particularly useful for risk assessment for models that assume a normal distribution of data.
+    ### Strengths
+    - Provides insights into the shape of the data distribution, helping determine whether a given set of data follows
+    a normal distribution.
+    - Particularly useful for risk assessment for models that assume a normal distribution of data.
     - By measuring skewness and kurtosis, it provides additional insights into the nature and magnitude of a
     distribution's deviation.
-    **Limitations**:
-    - The Jarque-Bera test only checks for normality in the data distribution. It cannot provide insights into other
-    types of distributions.
+    ### Limitations
+    - Only checks for normality in the data distribution. It cannot provide insights into other types of distributions.
     - Datasets that aren't normally distributed but follow some other distribution might lead to inaccurate risk
     assessments.
-    - The test is highly sensitive to large sample sizes, often rejecting the null hypothesis (that data is normally
-    distributed) even for minor deviations in larger datasets.
+    - Highly sensitive to large sample sizes, often rejecting the null hypothesis (that data is normally distributed)
+    even for minor deviations in larger datasets.
     """
     name = "jarque_bera"
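
To make the mechanism described in the JarqueBera docstring concrete, here is a minimal standard-library sketch of the statistic, not the package's code. The function names and the column-dictionary input are assumptions made for illustration. The return order (statistic, p-value, skew, kurtosis) mirrors what the docstring lists, and kurtosis is reported on the scale where a normal distribution gives 3.

```python
import math


def jarque_bera(values):
    """Return (statistic, p_value, skew, kurtosis) for a list of numbers."""
    n = len(values)
    mean = sum(values) / n
    # Central moments use the biased (divide-by-n) estimators.
    m2 = sum((x - mean) ** 2 for x in values) / n
    m3 = sum((x - mean) ** 3 for x in values) / n
    m4 = sum((x - mean) ** 4 for x in values) / n
    skew = m3 / m2 ** 1.5
    kurtosis = m4 / m2 ** 2  # equals 3 for a normal distribution
    statistic = n / 6.0 * (skew ** 2 + (kurtosis - 3.0) ** 2 / 4.0)
    # Under the null hypothesis the statistic is asymptotically chi-squared
    # with 2 degrees of freedom, whose survival function is exp(-x / 2).
    p_value = math.exp(-statistic / 2.0)
    return statistic, p_value, skew, kurtosis


def jarque_bera_by_feature(columns):
    """Hypothetical helper: map each feature name to its test results."""
    keys = ("stat", "pvalue", "skew", "kurtosis")
    return {name: dict(zip(keys, jarque_bera(vals))) for name, vals in columns.items()}


results = jarque_bera_by_feature({"age": [-2.0, -1.0, 0.0, 1.0, 2.0]})
```

A feature whose `pvalue` falls below the chosen significance level (0.05 in the docstring) would be flagged as deviating from normality.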

validmind/tests/model_validation/statsmodels/KolmogorovSmirnov.py CHANGED

@@ -13,40 +13,39 @@ from validmind.vm_models import Metric, ResultSummary, ResultTable, ResultTableM
 @dataclass
 class KolmogorovSmirnov(Metric):
     """
-    Executes a feature-wise Kolmogorov-Smirnov test to evaluate alignment with normal distribution in datasets.
-    **Purpose**: This metric employs the Kolmogorov-Smirnov (KS) test to evaluate the distribution of a dataset's
-    features. It specifically gauges whether the data from each feature aligns with a normal distribution, a common
-    presumption in many statistical methods and machine learning models.
-    **Test Mechanism**: This KS test calculates the KS statistic and the corresponding p-value for each column in a
-    dataset. It achieves this by contrasting the cumulative distribution function of the dataset's feature with an
-    ideal normal distribution. Subsequently, a feature-by-feature KS statistic and p-value are stored in a dictionary.
-    The specific threshold p-value (the value below which we reject the hypothesis that the data is drawn from a normal
-    distribution) is not firmly set within this implementation, allowing for definitional flexibility depending on the
-    specific application.
-    **Signs of High Risk**:
-    - Elevated KS statistic for a feature combined with a low p-value. This suggests a significant divergence between
-    the feature's distribution and a normal one.
-    - Features with notable deviations. These could create problems if the applicable model makes assumptions about
-    normal data distribution, thereby representing a risk.
-    **Strengths**:
-    - The KS test is acutely sensitive to differences in the location and shape of the empirical cumulative
-    distribution functions of two samples.
-    - It is non-parametric and does not presuppose any specific data distribution, making it adaptable to a range of
-    datasets.
-    - With its focus on individual features, it offers detailed insights into data distribution.
-    **Limitations**:
-    - The sensitivity of the KS test to disparities in data distribution tails can be excessive. Such sensitivity might
-    prompt false alarms about non-normal distributions, particularly in situations where these tail tendencies are
-    irrelevant to the model.
-    - It could become less effective when applied to multivariate distributions, considering that it's primarily
-    configured for univariate distributions.
-    - As a goodness-of-fit test, the KS test does not identify specific types of non-normality, such as skewness or
-    kurtosis, that could directly impact model fitting.
+    Assesses whether each feature in the dataset aligns with a normal distribution using the Kolmogorov-Smirnov test.
+    ### Purpose
+    The Kolmogorov-Smirnov (KS) test evaluates the distribution of features in a dataset to determine their alignment
+    with a normal distribution. This is important because many statistical methods and machine learning models assume
+    normality in the data distribution.
+    ### Test Mechanism
+    This test calculates the KS statistic and corresponding p-value for each feature in the dataset. It does so by
+    comparing the cumulative distribution function of the feature with an ideal normal distribution. The KS statistic
+    and p-value for each feature are then stored in a dictionary. The p-value threshold to reject the normal
+    distribution hypothesis is not preset, providing flexibility for different applications.
+    ### Signs of High Risk
+    - Elevated KS statistic for a feature combined with a low p-value, indicating a significant divergence from a
+    normal distribution.
+    - Features with notable deviations that could create problems if the model assumes normality in data distribution.
+    ### Strengths
+    - The KS test is sensitive to differences in the location and shape of empirical cumulative distribution functions.
+    - It is non-parametric and adaptable to various datasets, as it does not assume any specific data distribution.
+    - Provides detailed insights into the distribution of individual features.
+    ### Limitations
+    - The test's sensitivity to disparities in the tails of data distribution might cause false alarms about
+    non-normality.
+    - Less effective for multivariate distributions, as it is designed for univariate distributions.
+    - Does not identify specific types of non-normality, such as skewness or kurtosis, which could impact model fitting.
     """
     name = "kolmogorov_smirnov"
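
The comparison of a feature's empirical cumulative distribution function with a normal one, as the KolmogorovSmirnov docstring describes it, can be sketched as follows. This uses only the standard library and is not the package's implementation. The function names are hypothetical, the reference normal is assumed to be fully specified by the caller, and the p-value uses the asymptotic Kolmogorov series, which is only a rough guide for small samples.

```python
import math


def normal_cdf(x, mean=0.0, std=1.0):
    """Cumulative distribution function of a normal distribution."""
    return 0.5 * (1.0 + math.erf((x - mean) / (std * math.sqrt(2.0))))


def ks_statistic(values, mean=0.0, std=1.0):
    """Largest gap between the empirical CDF and the reference normal CDF."""
    data = sorted(values)
    n = len(data)
    d = 0.0
    for i, x in enumerate(data):
        cdf = normal_cdf(x, mean, std)
        # The empirical CDF jumps from i/n to (i+1)/n at x; check both sides.
        d = max(d, (i + 1) / n - cdf, cdf - i / n)
    return d


def ks_pvalue_asymptotic(d, n, terms=100):
    """Asymptotic Kolmogorov p-value for statistic d and sample size n."""
    lam = math.sqrt(n) * d
    if lam < 0.2:
        # The series converges slowly here and the p-value is effectively 1.
        return 1.0
    total = sum((-1) ** (k - 1) * math.exp(-2.0 * k * k * lam * lam)
                for k in range(1, terms + 1))
    return min(1.0, max(0.0, 2.0 * total))
```

Because the docstring leaves the rejection threshold open, a caller would compare the returned p-value against whatever significance level suits the application.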

validmind/tests/model_validation/statsmodels/LJungBox.py CHANGED

@@ -11,36 +11,40 @@ class LJungBox(Metric):
     """
     Assesses autocorrelations in dataset features by performing a Ljung-Box test on each feature.
-    **Purpose**: The Ljung-Box test is a type of statistical test utilized to ascertain whether there are
-    autocorrelations within a given dataset that differ significantly from zero. In the context of a machine learning
-    model, this test is primarily used to evaluate data utilized in regression tasks, especially those involving time
-    series and forecasting.
-    **Test Mechanism**: The test operates by iterating over each feature within the training dataset and applying the
-    `acorr_ljungbox` function from the `statsmodels.stats.diagnostic` library. This function calculates the Ljung-Box
-    statistic and p-value for each feature. These results are then stored in a dictionary where the keys are the
-    feature names and the values are dictionaries containing the statistic and p-value respectively. Generally, a lower
-    p-value indicates a higher likelihood of significant autocorrelations within the feature.
-    **Signs of High Risk**:
-    - A high risk or failure in the model's performance relating to this test might be indicated by high Ljung-Box
-    statistic values or low p-values.
-    - These outcomes suggest the presence of significant autocorrelations in the respective features. If not properly
-    consider or handle in the machine learning model, these can negatively affect model performance or bias.
-    **Strengths**:
-    - The Ljung-Box test is a powerful tool for detecting autocorrelations within datasets, especially in time series
-    data.
-    - It provides quantitative measures (statistic and p-value) that allow for precise evaluation of autocorrelation.
-    - This test can be instrumental in avoiding issues related to autoregressive residuals and other challenges in
-    regression models.
-    **Limitations**:
-    - The Ljung-Box test cannot detect all types of non-linearity or complex interrelationships among variables.
+    ### Purpose
+    The Ljung-Box test is a type of statistical test utilized to ascertain whether there are autocorrelations within a
+    given dataset that differ significantly from zero. In the context of a machine learning model, this test is
+    primarily used to evaluate data utilized in regression tasks, especially those involving time series and
+    forecasting.
+    ### Test Mechanism
+    The test operates by iterating over each feature within the training dataset and applying the `acorr_ljungbox`
+    function from the `statsmodels.stats.diagnostic` library. This function calculates the Ljung-Box statistic and
+    p-value for each feature. These results are then stored in a dictionary where the keys are the feature names and
+    the values are dictionaries containing the statistic and p-value respectively. Generally, a lower p-value indicates
+    a higher likelihood of significant autocorrelations within the feature.
+    ### Signs of High Risk
+    - High Ljung-Box statistic values or low p-values.
+    - Presence of significant autocorrelations in the respective features.
+    - Potential for negative impact on model performance or bias if autocorrelations are not properly handled.
+    ### Strengths
+    - Powerful tool for detecting autocorrelations within datasets, especially in time series data.
+    - Provides quantitative measures (statistic and p-value) for precise evaluation.
+    - Helps avoid issues related to autoregressive residuals and other challenges in regression models.
+    ### Limitations
+    - Cannot detect all types of non-linearity or complex interrelationships among variables.
     - Testing individual features may not fully encapsulate the dynamics of the data if features interact with each
     other.
-    - It is designed more for traditional statistical models and may not be fully compatible with certain types of
-    complex machine learning models.
+    - Designed more for traditional statistical models and may not be fully compatible with certain types of complex
+    machine learning models.
     """
     name = "ljung_box"
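
For orientation, the statistic that `acorr_ljungbox` reports can be written out by hand. The sketch below is a standard-library illustration of the textbook formula Q = n(n+2) Σ ρ_k² / (n−k), not the statsmodels code. The function names are hypothetical, and the p-value helper is restricted to an even number of lags, where the chi-squared survival function has a closed form.

```python
import math


def ljung_box_statistic(series, lags):
    """Ljung-Box Q statistic over autocorrelations at lags 1..lags."""
    n = len(series)
    mean = sum(series) / n
    centered = [x - mean for x in series]
    denom = sum(c * c for c in centered)
    total = 0.0
    for k in range(1, lags + 1):
        # Sample autocorrelation at lag k.
        rho_k = sum(centered[t] * centered[t - k] for t in range(k, n)) / denom
        total += rho_k ** 2 / (n - k)
    return n * (n + 2) * total


def chi2_sf_even_df(x, df):
    """Chi-squared survival function, valid only for even degrees of freedom."""
    if df % 2 != 0:
        raise ValueError("this sketch only handles an even number of lags")
    half = x / 2.0
    return math.exp(-half) * sum(half ** i / math.factorial(i) for i in range(df // 2))


# A strictly alternating series is strongly autocorrelated at every lag.
q = ljung_box_statistic([1, -1, 1, -1, 1, -1, 1, -1], lags=2)
p_value = chi2_sf_even_df(q, df=2)
```

A low `p_value` here matches the docstring's reading: the autocorrelations differ significantly from zero.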

validmind/tests/model_validation/statsmodels/Lilliefors.py CHANGED

@@ -14,44 +14,47 @@ class Lilliefors(Metric):
     """
     Assesses the normality of feature distributions in an ML model's training dataset using the Lilliefors test.
-    **Purpose**: The purpose of this metric is to utilize the Lilliefors test, named in honor of the Swedish
-    statistician Hubert Lilliefors, in order to assess whether the features of the machine learning model's training
-    dataset conform to a normal distribution. This is done because the assumption of normal distribution plays a vital
-    role in numerous statistical procedures as well as numerous machine learning models. Should the features fail to
-    follow a normal distribution, some model types may not operate at optimal efficiency. This can potentially lead to
-    inaccurate predictions.
-    **Test Mechanism**: The application of this test happens across all feature columns within the training dataset.
-    For each feature, the Lilliefors test returns a test statistic and p-value. The test statistic quantifies how far
-    the feature's distribution is from an ideal normal distribution, whereas the p-value aids in determining the
-    statistical relevance of this deviation. The final results are stored within a dictionary, the keys of which
-    correspond to the name of the feature column, and the values being another dictionary which houses the test
-    statistic and p-value.
-    **Signs of High Risk**:
+    ### Purpose
+    The purpose of this metric is to utilize the Lilliefors test, named in honor of the Swedish statistician Hubert
+    Lilliefors, in order to assess whether the features of the machine learning model's training dataset conform to a
+    normal distribution. This is done because the assumption of normal distribution plays a vital role in numerous
+    statistical procedures as well as numerous machine learning models. Should the features fail to follow a normal
+    distribution, some model types may not operate at optimal efficiency. This can potentially lead to inaccurate
+    predictions.
+    ### Test Mechanism
+    The application of this test happens across all feature columns within the training dataset. For each feature, the
+    Lilliefors test returns a test statistic and p-value. The test statistic quantifies how far the feature's
+    distribution is from an ideal normal distribution, whereas the p-value aids in determining the statistical
+    relevance of this deviation. The final results are stored within a dictionary, the keys of which correspond to the
+    name of the feature column, and the values being another dictionary which houses the test statistic and p-value.
+    ### Signs of High Risk
     - If the p-value corresponding to a specific feature sinks below a pre-established significance level, generally
     set at 0.05, then it can be deduced that the distribution of that feature significantly deviates from a normal
     distribution. This can present a high risk for models that assume normality, as these models may perform
     inaccurately or inefficiently in the presence of such a feature.
-    **Strengths**:
+    ### Strengths
     - One advantage of the Lilliefors test is its utility irrespective of whether the mean and variance of the normal
     distribution are known in advance. This makes it a more robust option in real-world situations where these values
     might not be known.
-    - Second, the test has the ability to screen every feature column, offering a holistic view of the dataset.
+    - The test has the ability to screen every feature column, offering a holistic view of the dataset.
-    **Limitations**:
+    ### Limitations
     - Despite the practical applications of the Lilliefors test in validating normality, it does come with some
     limitations.
-    - Firstly, it is only capable of testing unidimensional data, thus rendering it ineffective for datasets with
-    interactions between features or multi-dimensional phenomena.
-    - Additionally, the test might not be as sensitive as some other tests (like the Anderson-Darling test) in
-    detecting deviations from a normal distribution.
-    - Lastly, like any other statistical test, Lilliefors test may also produce false positives or negatives. Hence,
-    banking solely on this test, without considering other characteristics of the data, may give rise to risks.
+    - It is only capable of testing unidimensional data, thus rendering it ineffective for datasets with interactions
+    between features or multi-dimensional phenomena.
+    - The test might not be as sensitive as some other tests (like the Anderson-Darling test) in detecting deviations
+    from a normal distribution.
+    - Like any other statistical test, Lilliefors test may also produce false positives or negatives. Hence, banking
+    solely on this test, without considering other characteristics of the data, may give rise to risks.
     """
     name = "lilliefors_test"
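
The strength noted in the Lilliefors docstring, that the mean and variance need not be known in advance, comes from estimating both from the sample before measuring the distance to the normal CDF. The sketch below illustrates that idea with the standard library only. It is not the package's implementation: the function names are hypothetical, and the p-value is approximated by Monte Carlo simulation rather than by the tabulated or fitted values a statistics library would typically use.

```python
import math
import random
import statistics


def lilliefors_statistic(values):
    """KS distance to a normal whose mean and std are estimated from the sample."""
    mean = statistics.fmean(values)
    std = statistics.stdev(values)  # sample standard deviation (n - 1 denominator)
    data = sorted(values)
    n = len(data)
    d = 0.0
    for i, x in enumerate(data):
        cdf = 0.5 * (1.0 + math.erf((x - mean) / (std * math.sqrt(2.0))))
        d = max(d, (i + 1) / n - cdf, cdf - i / n)
    return d


def lilliefors_pvalue_mc(values, simulations=1000, seed=0):
    """Return (statistic, Monte Carlo p-value) under the normality hypothesis."""
    rng = random.Random(seed)
    n = len(values)
    observed = lilliefors_statistic(values)
    # Count simulated normal samples whose statistic is at least as extreme.
    exceed = sum(
        lilliefors_statistic([rng.gauss(0.0, 1.0) for _ in range(n)]) >= observed
        for _ in range(simulations)
    )
    return observed, (exceed + 1) / (simulations + 1)


skewed = [math.exp(i / 3.0) for i in range(30)]
stat, p_value = lilliefors_pvalue_mc(skewed)
```

The simulation works because the statistic's null distribution does not depend on the true mean or variance once both are estimated from the data.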

validmind/tests/model_validation/statsmodels/PredictionProbabilitiesHistogram.py CHANGED

@@ -2,134 +2,102 @@
 # See the LICENSE file in the root of this repository for details.
 # SPDX-License-Identifier: AGPL-3.0 AND ValidMind Commercial
-from dataclasses import dataclass
 import plotly.graph_objects as go
 from matplotlib import cm
-from validmind.vm_models import Figure, Metric
+from validmind import tags, tasks
-@dataclass
-class PredictionProbabilitiesHistogram(Metric):
+@tags("visualization", "credit_risk", "logistic_regression")
+@tasks("classification")
+def PredictionProbabilitiesHistogram(
+    dataset, model, title="Histogram of Predictive Probabilities"
+):
     """
-    Generates and visualizes histograms of the Probability of Default predictions for both positive and negative
-    classes in training and testing datasets.
-    **Purpose**: This code is designed to generate histograms that display the Probability of Default (PD) predictions
-    for positive and negative classes in both the training and testing datasets. By doing so, it evaluates the
-    performance of a logistic regression model, particularly in the context of credit risk prediction.
-    **Test Mechanism**: The metric executes these steps to run the test:
-    - Firstly, it extracts the target column from both the train and test datasets.
-    - The model's predict function is then used to calculate probabilities.
-    - These probabilities are added as a new column to the training and testing dataframes.
-    - Histograms are generated for each class (0 or 1 in binary classification scenarios) within the training and
-    testing datasets.
-    - To enhance visualization, the histograms are set to have different opacities.
-    - The four histograms (two for training data and two for testing) are overlaid on two different subplot frames (one
-    for training and one for testing data).
-    - The test returns a plotly graph object displaying the visualization.
-    **Signs of High Risk**: Several indicators could suggest a high risk or failure in the model's performance. These
-    include:
-    - Significant discrepancies observed between the histograms of training and testing data.
+    Assesses the predictive probability distribution for binary classification to evaluate model performance and
+    potential overfitting or bias.
+    ### Purpose
+    The Prediction Probabilities Histogram test is designed to generate histograms displaying the Probability of
+    Default (PD) predictions for both positive and negative classes in training and testing datasets. This helps in
+    evaluating the performance of a logistic regression model, particularly for credit risk prediction.
+    ### Test Mechanism
+    The metric follows these steps to execute the test:
+    - Extracts the target column from both the train and test datasets.
+    - Uses the model's predict function to calculate probabilities.
+    - Adds these probabilities as a new column to the training and testing dataframes.
+    - Generates histograms for each class (0 or 1) within the training and testing datasets.
+    - Sets different opacities for the histograms to enhance visualization.
+    - Overlays the four histograms (two for training and two for testing) on two different subplot frames.
+    - Returns a plotly graph object displaying the visualization.
+    ### Signs of High Risk
+    - Significant discrepancies between the histograms of training and testing data.
     - Large disparities between the histograms for the positive and negative classes.
-    - These issues could signal potential overfitting or bias in the model.
-    - Unevenly distributed probabilities may also indicate that the model does not accurately predict outcomes.
-    **Strengths**: This metric and test offer several benefits, including:
-    - The visual representation of the PD predictions made by the model, which aids in understanding the model's
-    behaviour.
-    - The ability to assess both the training and testing datasets, adding depth to the validation of the model.
-    - Highlighting disparities between multiple classes, providing potential insights into class imbalance or data
-    skewness issues.
-    - Particularly beneficial for credit risk prediction, it effectively visualizes the spread of risk across different
-    classes.
-    **Limitations**: Despite its strengths, the test has several limitations:
-    - It is specifically tailored for binary classification scenarios, where the target variable only has two classes;
-    as such, it isn't suited for multi-class classification tasks.
-    - This metric is mainly applicable for logistic regression models. It might not be effective or accurate when used
-    on other model types.
-    - While the test provides a robust visual representation of the model's PD predictions, it does not provide a
-    quantifiable measure or score to assess model performance.
+    - Potential overfitting or bias indicated by significant issues.
+    - Unevenly distributed probabilities suggesting inaccurate model predictions.
+    ### Strengths
+    - Offers a visual representation of the PD predictions made by the model, aiding in understanding its behavior.
+    - Assesses both the training and testing datasets, adding depth to model validation.
+    - Highlights disparities between classes, providing insights into class imbalance or data skewness.
+    - Effectively visualizes risk spread, which is particularly beneficial for credit risk prediction.
+    ### Limitations
+    - Specifically tailored for binary classification scenarios and not suited for multi-class classification tasks.
+    - Mainly applicable to logistic regression models, and may not be effective for other model types.
+    - Provides a robust visual representation but lacks a quantifiable measure to assess model performance.
     """
-    name = "prediction_probabilities_histogram"
-    required_inputs = ["model", "datasets"]
-    tasks = ["classification"]
-    tags = ["tabular_data", "visualization", "credit_risk", "logistic_regression"]
-    default_params = {"title": "Histogram of Predictive Probabilities"}
-    @staticmethod
-    def plot_prob_histogram(dataframes, dataset_titles, target_col, title):
-        figures = []
-        # Generate a colormap and convert to Plotly-accepted color format
-        # Adjust 'viridis' to any other matplotlib colormap if desired
-        colormap = cm.get_cmap("viridis")
-        for i, (df, dataset_title) in enumerate(zip(dataframes, dataset_titles)):
-            fig = go.Figure()
-            # Get unique classes and assign colors
-            classes = sorted(df[target_col].unique())
-            colors = [
-                colormap(i / len(classes))[:3] for i in range(len(classes))
-            ]  # RGB
-            color_dict = {
-                cls: f"rgb({int(rgb[0]*255)}, {int(rgb[1]*255)}, {int(rgb[2]*255)})"
-                for cls, rgb in zip(classes, colors)
-            }
-            # Ensure classes are plotted in the specified order
-            for class_value in sorted(df[target_col].unique()):
-                fig.add_trace(
-                    go.Histogram(
-                        x=df[df[target_col] == class_value]["probabilities"],
-                        opacity=0.75,
-                        name=f"{dataset_title} {target_col} = {class_value}",
-                        marker=dict(
-                            color=color_dict[class_value],
-                        ),
-                    )
-                )
-            fig.update_layout(
-                barmode="overlay",
-                title_text=f"{title} - {dataset_title}",
-                xaxis_title="Probability",
-                yaxis_title="Frequency",
-            )
-            figures.append(fig)
-        return figures
-    def run(self):
-        dataset_titles = [dataset.input_id for dataset in self.inputs.datasets]
-        target_column = self.inputs.datasets[0].target_column
-        title = self.params.get("title", self.default_params["title"])
-        dataframes = []
-        metric_value = {"prob_histogram": {}}
-        for _, dataset in enumerate(self.inputs.datasets):
-            df = dataset.df.copy()
-            y_prob = dataset.y_prob(self.inputs.model)
-            df["probabilities"] = y_prob
-            dataframes.append(df)
-            metric_value["prob_histogram"][dataset.input_id] = list(df["probabilities"])
-        figures = self.plot_prob_histogram(
-            dataframes, dataset_titles, target_column, title
-        )
+    df = dataset.df
+    df["probabilities"] = dataset.y_prob(model)
-        figures_list = [
-            Figure(
-                for_object=self,
-                key=f"prob_histogram_{title.replace(' ', '_')}_{i+1}",
-                figure=fig,
-            )
-            for i, fig in enumerate(figures)
-        ]
+    fig = _plot_prob_histogram(df, dataset.target_column, title)
+    return fig
+def _plot_prob_histogram(df, target_col, title):
-        return self.cache_results(metric_value=metric_value, figures=figures_list)
+    # Generate a colormap and convert to Plotly-accepted color format
+    # Adjust 'viridis' to any other matplotlib colormap if desired
+    colormap = cm.get_cmap("viridis")
+    fig = go.Figure()
+    # Get unique classes and assign colors
+    classes = sorted(df[target_col].unique())
+    colors = [colormap(i / len(classes))[:3] for i in range(len(classes))]  # RGB
+    color_dict = {
+        cls: f"rgb({int(rgb[0]*255)}, {int(rgb[1]*255)}, {int(rgb[2]*255)})"
+        for cls, rgb in zip(classes, colors)
+    }
+    # Ensure classes are plotted in the specified order
+    for class_value in sorted(df[target_col].unique()):
+        fig.add_trace(
+            go.Histogram(
+                x=df[df[target_col] == class_value]["probabilities"],
+                opacity=0.75,
+                name=f"{target_col} = {class_value}",
+                marker=dict(
+                    color=color_dict[class_value],
+                ),
+            )
+        )
+    fig.update_layout(
+        barmode="overlay",
+        title_text=f"{title}",
+        xaxis_title="Probability",
+        yaxis_title="Frequency",
+    )
+    return fig
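
To show what each per-class `go.Histogram` trace in this hunk summarizes, the sketch below groups predicted probabilities by class label and counts them in equal-width bins. It uses only the standard library. The function name, the fixed [0, 1] range, and the default of 10 bins are assumptions for illustration; Plotly chooses its own binning.

```python
from collections import defaultdict


def histogram_by_class(probabilities, labels, bins=10):
    """Count predicted probabilities per class in equal-width bins over [0, 1]."""
    counts = defaultdict(lambda: [0] * bins)
    for p, label in zip(probabilities, labels):
        # Clamp the index so that p == 1.0 lands in the last bin.
        index = min(int(p * bins), bins - 1)
        counts[label][index] += 1
    # Return classes in sorted order, matching the plotting loop's ordering.
    return {label: counts[label] for label in sorted(counts)}


hist = histogram_by_class(
    probabilities=[0.05, 0.15, 0.95, 1.0, 0.5],
    labels=[0, 0, 1, 1, 0],
)
```

Well-separated classes put most of class 0's mass in the low bins and class 1's in the high bins. Heavy overlap between the two lists is the kind of disparity the docstring asks a reviewer to look at.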

validmind/tests/model_validation/statsmodels/RegressionCoeffs.py ADDED

@@ -0,0 +1,100 @@
+# Copyright © 2023-2024 ValidMind Inc. All rights reserved.
+# See the LICENSE file in the root of this repository for details.
+# SPDX-License-Identifier: AGPL-3.0 AND ValidMind Commercial
+import pandas as pd
+import plotly.graph_objects as go
+from scipy import stats
+from validmind.errors import SkipTestError
+from validmind import tags, tasks
+@tags("tabular_data", "visualization", "model_training")
+@tasks("regression")
+def RegressionCoeffs(model):
+    """
+    Assesses the significance and uncertainty of predictor variables in a regression model through visualization of
+    coefficients and their 95% confidence intervals.
+    ### Purpose
+    The `RegressionCoeffs` metric visualizes the estimated regression coefficients alongside their 95% confidence intervals,
+    providing insights into the impact and significance of predictor variables on the response variable. This visualization
+    helps to understand the variability and uncertainty in the model's estimates, aiding in the evaluation of the
+    significance of each predictor.
+    ### Test Mechanism
+    The function operates by extracting the estimated coefficients and their standard errors from the regression model.
+    Using these, it calculates the confidence intervals at a 95% confidence level, which indicates the range within which
+    the true coefficient value is expected to fall 95% of the time. The confidence intervals are computed using the
+    Z-value associated with the 95% confidence level. The coefficients and their confidence intervals are then visualized
+    in a bar plot. The x-axis represents the predictor variables, the y-axis represents the estimated coefficients, and
+    the error bars depict the confidence intervals.
+    ### Signs of High Risk
+    - The confidence interval for a coefficient contains the zero value, suggesting that the predictor may not significantly
+    contribute to the model.
+    - Multiple coefficients with confidence intervals that include zero, potentially indicating issues with model reliability.
+    - Very wide confidence intervals, which may suggest high uncertainty in the coefficient estimates and potential model
+    instability.
+    ### Strengths
+    - Provides a clear visualization that allows for easy interpretation of the significance and impact of predictor
+    variables.
+    - Includes confidence intervals, which provide additional information about the uncertainty surrounding each coefficient
+    estimate.
+    ### Limitations
+    - The method assumes normality of residuals and independence of observations, assumptions that may not always hold true
+    in practice.
+    - It does not address issues related to multi-collinearity among predictor variables, which can affect the interpretation
+    of coefficients.
+    - This metric is limited to regression tasks using tabular data and is not applicable to other types of machine learning
+    tasks or data structures.
+    """
+    if model.library != "statsmodels":
+        raise SkipTestError("Only statsmodels are supported for this metric")
+    # Extract estimated coefficients and standard errors
+    coefficients = model.regression_coefficients()
+    coef = pd.to_numeric(coefficients["coef"])
+    std_err = pd.to_numeric(coefficients["std err"])
+    # Calculate confidence intervals
+    confidence_level = 0.95  # 95% confidence interval
+    z_value = stats.norm.ppf((1 + confidence_level) / 2)  # Calculate Z-value
+    lower_ci = coef - z_value * std_err
+    upper_ci = coef + z_value * std_err
+    # Create a bar plot with confidence intervals
+    fig = go.Figure()
+    fig.add_trace(
+        go.Bar(
+            x=list(coefficients["Feature"].values),
+            y=coef,
+            name="Estimated Coefficients",
+            error_y=dict(
+                type="data",
+                symmetric=False,
+                arrayminus=lower_ci,
+                array=upper_ci,
+                visible=True,
+            ),
+        )
+    )
+    fig.update_layout(
+        title=f"{model.input_id} Coefficients with Confidence Intervals",
+        xaxis_title="Predictor Variables",
+        yaxis_title="Coefficients",
+    )
+    return (fig, coefficients)
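
The interval arithmetic in this hunk can be checked with a small standard-library sketch. The function name and the dictionary inputs are hypothetical. One deliberate difference from the hunk: Plotly documents `error_y.array` and `error_y.arrayminus` as lengths plotted relative to the underlying value, so the sketch keeps the half-width `z * std_err` as the offset in both directions rather than the absolute lower and upper bounds.

```python
from statistics import NormalDist


def coefficient_intervals(coefs, std_errs, confidence_level=0.95):
    """Return one row per predictor with its confidence interval details."""
    # Two-sided Z-value, about 1.96 for a 95% confidence level.
    z = NormalDist().inv_cdf((1 + confidence_level) / 2)
    rows = []
    for name, coef in coefs.items():
        half_width = z * std_errs[name]
        lower, upper = coef - half_width, coef + half_width
        rows.append({
            "feature": name,
            "coef": coef,
            "lower": lower,
            "upper": upper,
            # Distance from the bar height to either bound.
            "error_offset": half_width,
            # An interval containing zero is the docstring's sign of high risk.
            "contains_zero": lower <= 0.0 <= upper,
        })
    return rows


rows = coefficient_intervals(
    coefs={"x1": 2.0, "x2": 0.1},
    std_errs={"x1": 0.5, "x2": 0.2},
)
```

Here `x1` has an interval of roughly 1.02 to 2.98 and would read as significant, while `x2` spans zero and would be flagged.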
