churnkit 0.75.0a1__py3-none-any.whl

Files changed (302)

churnkit-0.75.0a1.data/data/share/churnkit/exploration_notebooks/08_baseline_experiments.ipynb ADDED

@@ -0,0 +1,979 @@
+{
+ "cells": [
+  {
+   "cell_type": "markdown",
+   "id": "74ed4553",
+   "metadata": {
+    "papermill": {
+     "duration": 0.003306,
+     "end_time": "2026-02-02T13:03:41.579279",
+     "exception": false,
+     "start_time": "2026-02-02T13:03:41.575973",
+     "status": "completed"
+    },
+    "tags": []
+   },
+   "source": [
+    "# Chapter 8: Baseline Experiments\n",
+    "\n",
+    "**Purpose:** Train baseline models to understand data predictability and establish performance benchmarks.\n",
+    "\n",
+    "**What you'll learn:**\n",
+    "- How to prepare data for ML with proper train/test splitting\n",
+    "- How to handle class imbalance with class weights\n",
+    "- How to evaluate models with appropriate metrics (not just accuracy!)\n",
+    "- How to interpret feature importance\n",
+    "\n",
+    "**Outputs:**\n",
+    "- Baseline model performance (AUC, Precision, Recall, F1)\n",
+    "- Feature importance rankings\n",
+    "- ROC and Precision-Recall curves\n",
+    "- Performance benchmarks for comparison\n",
+    "\n",
+    "---\n",
+    "\n",
+    "## Evaluation Metrics for Imbalanced Data\n",
+    "\n",
+    "| Metric | What It Measures | When to Use |\n",
+    "|--------|-----------------|-------------|\n",
+    "| **AUC-ROC** | Ranking quality across thresholds | General model comparison |\n",
+    "| **Precision** | \"Of predicted churned, how many are correct?\" | When false positives are costly |\n",
+    "| **Recall** | \"Of actual churned, how many did we catch?\" | When missing churners is costly |\n",
+    "| **F1-Score** | Balance of precision and recall | When both matter equally |\n",
+    "| **PR-AUC** | Area under the Precision-Recall curve | Better for imbalanced data |"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "28b04efa",
+   "metadata": {
+    "papermill": {
+     "duration": 0.002464,
+     "end_time": "2026-02-02T13:03:41.584582",
+     "exception": false,
+     "start_time": "2026-02-02T13:03:41.582118",
+     "status": "completed"
+    },
+    "tags": []
+   },
+   "source": [
+    "## 8.1 Setup"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "65e6e683",
+   "metadata": {
+    "execution": {
+     "iopub.execute_input": "2026-02-02T13:03:41.590074Z",
+     "iopub.status.busy": "2026-02-02T13:03:41.589907Z",
+     "iopub.status.idle": "2026-02-02T13:03:43.387358Z",
+     "shell.execute_reply": "2026-02-02T13:03:43.386225Z"
+    },
+    "papermill": {
+     "duration": 1.801175,
+     "end_time": "2026-02-02T13:03:43.388139",
+     "exception": false,
+     "start_time": "2026-02-02T13:03:41.586964",
+     "status": "completed"
+    },
+    "tags": []
+   },
+   "outputs": [],
+   "source": [
+    "from customer_retention.analysis.notebook_progress import track_and_export_previous\n",
+    "track_and_export_previous(\"08_baseline_experiments.ipynb\")\n",
+    "\n",
+    "from customer_retention.analysis.auto_explorer import ExplorationFindings\n",
+    "from customer_retention.analysis.visualization import ChartBuilder, display_figure, display_table\n",
+    "from customer_retention.core.config.column_config import ColumnType\n",
+    "import pandas as pd\n",
+    "import numpy as np\n",
+    "\n",
+    "from sklearn.model_selection import train_test_split, cross_val_score\n",
+    "from sklearn.preprocessing import StandardScaler, LabelEncoder\n",
+    "from sklearn.ensemble import RandomForestClassifier, GradientBoostingClassifier\n",
+    "from sklearn.linear_model import LogisticRegression\n",
+    "from sklearn.metrics import (roc_auc_score, classification_report, confusion_matrix,\n",
+    "                             roc_curve, precision_recall_curve, average_precision_score,\n",
+    "                             f1_score, precision_score, recall_score)\n",
+    "\n",
+    "import plotly.graph_objects as go\n",
+    "from plotly.subplots import make_subplots\n",
+    "from customer_retention.core.config.experiments import FINDINGS_DIR, EXPERIMENTS_DIR, OUTPUT_DIR, setup_experiments_structure"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "f8e78a64",
+   "metadata": {
+    "execution": {
+     "iopub.execute_input": "2026-02-02T13:03:43.394005Z",
+     "iopub.status.busy": "2026-02-02T13:03:43.393872Z",
+     "iopub.status.idle": "2026-02-02T13:03:43.711417Z",
+     "shell.execute_reply": "2026-02-02T13:03:43.710865Z"
+    },
+    "papermill": {
+     "duration": 0.322219,
+     "end_time": "2026-02-02T13:03:43.712580",
+     "exception": false,
+     "start_time": "2026-02-02T13:03:43.390361",
+     "status": "completed"
+    },
+    "tags": []
+   },
+   "outputs": [],
+   "source": [
+    "# === CONFIGURATION ===\n",
+    "from pathlib import Path\n",
+    "\n",
+    "# FINDINGS_DIR imported from customer_retention.core.config.experiments\n",
+    "\n",
+    "findings_files = [f for f in FINDINGS_DIR.glob(\"*_findings.yaml\") if \"multi_dataset\" not in f.name]\n",
+    "if not findings_files:\n",
+    "    raise FileNotFoundError(f\"No findings files found in {FINDINGS_DIR}. Run notebook 01 first.\")\n",
+    "\n",
+    "# Prefer aggregated findings (from 01d) over event-level findings\n",
+    "# Pattern: *_aggregated* in filename indicates aggregated data\n",
+    "aggregated_files = [f for f in findings_files if \"_aggregated\" in f.name]\n",
+    "non_aggregated_files = [f for f in findings_files if \"_aggregated\" not in f.name]\n",
+    "\n",
+    "if aggregated_files:\n",
+    "    # Use most recent aggregated file\n",
+    "    aggregated_files.sort(key=lambda f: f.stat().st_mtime, reverse=True)\n",
+    "    FINDINGS_PATH = str(aggregated_files[0])\n",
+    "    print(f\"Found {len(aggregated_files)} aggregated findings file(s)\")\n",
+    "    print(f\"Using: {FINDINGS_PATH}\")\n",
+    "    if non_aggregated_files:\n",
+    "        print(f\"   (Skipping {len(non_aggregated_files)} event-level findings)\")\n",
+    "else:\n",
+    "    # Fall back to most recent non-aggregated file\n",
+    "    non_aggregated_files.sort(key=lambda f: f.stat().st_mtime, reverse=True)\n",
+    "    FINDINGS_PATH = str(non_aggregated_files[0])\n",
+    "    print(f\"Found {len(findings_files)} findings file(s)\")\n",
+    "    print(f\"Using: {FINDINGS_PATH}\")\n",
+    "\n",
+    "findings = ExplorationFindings.load(FINDINGS_PATH)\n",
+    "\n",
+    "# Load data - handle aggregated vs standard paths\n",
+    "from customer_retention.stages.temporal import load_data_with_snapshot_preference, TEMPORAL_METADATA_COLS\n",
+    "\n",
+    "# For aggregated data, load directly from the parquet source\n",
+    "if \"_aggregated\" in FINDINGS_PATH and findings.source_path.endswith('.parquet'):\n",
+    "    source_path = Path(findings.source_path)\n",
+    "    # Handle relative path from notebook directory\n",
+    "    if not source_path.is_absolute():\n",
+    "        # The source_path in findings is relative to project root\n",
+    "        if str(source_path).startswith(\"experiments\"):\n",
+    "            source_path = Path(\"..\") / source_path\n",
+    "        else:\n",
+    "            source_path = FINDINGS_DIR / source_path.name\n",
+    "    df = pd.read_parquet(source_path)\n",
+    "    data_source = f\"aggregated:{source_path.name}\"\n",
+    "else:\n",
+    "    # Standard loading for event-level or entity-level data\n",
+    "    df, data_source = load_data_with_snapshot_preference(findings, output_dir=str(FINDINGS_DIR))\n",
+    "\n",
+    "charts = ChartBuilder()\n",
+    "\n",
+    "print(f\"\\nLoaded {len(df):,} rows from: {data_source}\")"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "1dac203f",
+   "metadata": {
+    "papermill": {
+     "duration": 0.001718,
+     "end_time": "2026-02-02T13:03:43.716671",
+     "exception": false,
+     "start_time": "2026-02-02T13:03:43.714953",
+     "status": "completed"
+    },
+    "tags": []
+   },
+   "source": [
+    "## 8.2 Prepare Data for Modeling\n",
+    "\n",
+    "**📖 Feature Source:**\n",
+    "\n",
+    "Features used in this notebook come from the **ExplorationFindings** generated in earlier notebooks:\n",
+    "- Column types are **auto-detected** in notebook 01 (Data Discovery)\n",
+    "- Target column is identified from the findings\n",
+    "- Identifier columns are **excluded** to prevent data leakage\n",
+    "- Text columns are **excluded** (require specialized NLP processing)\n",
+    "\n",
+    "**📖 Best Practices:**\n",
+    "1. **Stratified Split**: Maintains class ratios in train/test sets\n",
+    "2. **Scale After Split**: Fit scaler on train only (prevents data leakage)\n",
+    "3. **Handle Missing**: Impute or drop before scaling\n",
+    "\n",
+    "**📖 Transformations Applied:**\n",
+    "- Categorical variables → Label Encoded\n",
+    "- Missing values → Median (numeric) or Mode (categorical)\n",
+    "- Features → StandardScaler (fit on train only)"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "b0882dff",
+   "metadata": {
+    "execution": {
+     "iopub.execute_input": "2026-02-02T13:03:43.720878Z",
+     "iopub.status.busy": "2026-02-02T13:03:43.720755Z",
+     "iopub.status.idle": "2026-02-02T13:03:43.725238Z",
+     "shell.execute_reply": "2026-02-02T13:03:43.724436Z"
+    },
+    "papermill": {
+     "duration": 0.007805,
+     "end_time": "2026-02-02T13:03:43.726115",
+     "exception": false,
+     "start_time": "2026-02-02T13:03:43.718310",
+     "status": "completed"
+    },
+    "tags": []
+   },
+   "outputs": [],
+   "source": [
+    "if not findings.target_column:\n",
+    "    raise ValueError(\"No target column set. Please define one in exploration notebooks.\")\n",
+    "\n",
+    "target = findings.target_column\n",
+    "y = df[target]\n",
+    "\n",
+    "# Features are selected based on column types from ExplorationFindings\n",
+    "# This ensures consistency with earlier notebooks and prevents data leakage\n",
+    "feature_cols = [\n",
+    "    name for name, col in findings.columns.items()\n",
+    "    if col.inferred_type not in [ColumnType.IDENTIFIER, ColumnType.TARGET, ColumnType.TEXT]\n",
+    "    and name not in TEMPORAL_METADATA_COLS  # Exclude point-in-time columns from features\n",
+    "]\n",
+    "\n",
+    "print(\"=\" * 70)\n",
+    "print(\"FEATURE SELECTION FROM FINDINGS\")\n",
+    "print(\"=\" * 70)\n",
+    "print(f\"\\n🎯 Target Column: {target}\")\n",
+    "print(f\"📊 Features Selected: {len(feature_cols)}\")\n",
+    "\n",
+    "# Show feature breakdown by type\n",
+    "type_counts = {}\n",
+    "for name in feature_cols:\n",
+    "    col_type = findings.columns[name].inferred_type.value\n",
+    "    type_counts[col_type] = type_counts.get(col_type, 0) + 1\n",
+    "\n",
+    "print(\"\\n📋 Features by Type:\")\n",
+    "for col_type, count in sorted(type_counts.items()):\n",
+    "    print(f\"   {col_type}: {count}\")\n",
+    "\n",
+    "# Show excluded columns\n",
+    "excluded = [name for name, col in findings.columns.items() \n",
+    "            if col.inferred_type in [ColumnType.IDENTIFIER, ColumnType.TARGET, ColumnType.TEXT]]\n",
+    "if excluded:\n",
+    "    print(f\"\\n⛔ Excluded Columns: {', '.join(excluded)}\")"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "47ad17e5",
+   "metadata": {
+    "execution": {
+     "iopub.execute_input": "2026-02-02T13:03:43.735220Z",
+     "iopub.status.busy": "2026-02-02T13:03:43.735017Z",
+     "iopub.status.idle": "2026-02-02T13:03:43.753043Z",
+     "shell.execute_reply": "2026-02-02T13:03:43.748353Z"
+    },
+    "papermill": {
+     "duration": 0.026325,
+     "end_time": "2026-02-02T13:03:43.754902",
+     "exception": false,
+     "start_time": "2026-02-02T13:03:43.728577",
+     "status": "completed"
+    },
+    "tags": []
+   },
+   "outputs": [],
+   "source": [
+    "# Check feature availability and remove problematic features\n",
+    "from customer_retention.stages.features.feature_selector import FeatureSelector\n",
+    "\n",
+    "print(\"=\" * 70)\n",
+    "print(\"FEATURE AVAILABILITY CHECK\")\n",
+    "print(\"=\" * 70)\n",
+    "\n",
+    "unavailable_features = []\n",
+    "if findings.has_availability_issues:\n",
+    "    selector = FeatureSelector(target_column=findings.target_column)\n",
+    "    availability_recs = selector.get_availability_recommendations(findings.feature_availability)\n",
+    "    unavailable_features = [rec.column for rec in availability_recs]\n",
+    "    \n",
+    "    print(f\"\\n⚠️  {len(availability_recs)} feature(s) have availability issues:\\n\")\n",
+    "    for rec in availability_recs:\n",
+    "        print(f\"   • {rec.column} ({rec.issue_type}, {rec.coverage_pct:.0f}% coverage)\")\n",
+    "    \n",
+    "    print(\"\\n📋 Alternative approaches (for investigation):\")\n",
+    "    print(\"   • segment_by_cohort: Train separate models per availability period\")\n",
+    "    print(\"   • add_indicator: Create availability flags and impute missing\")\n",
+    "    print(\"   • filter_window: Restrict data to feature's available period\")\n",
+    "    \n",
+    "    original_count = len(feature_cols)\n",
+    "    feature_cols = [f for f in feature_cols if f not in unavailable_features]\n",
+    "    \n",
+    "    print(f\"\\n🗑️  Removed {original_count - len(feature_cols)} unavailable features\")\n",
+    "    print(f\"📊 Features remaining: {len(feature_cols)}\")\n",
+    "else:\n",
+    "    print(\"\\n✅ All features have full temporal coverage.\")"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "5cc3c542",
+   "metadata": {
+    "execution": {
+     "iopub.execute_input": "2026-02-02T13:03:43.767637Z",
+     "iopub.status.busy": "2026-02-02T13:03:43.767512Z",
+     "iopub.status.idle": "2026-02-02T13:03:43.790766Z",
+     "shell.execute_reply": "2026-02-02T13:03:43.790207Z"
+    },
+    "papermill": {
+     "duration": 0.026743,
+     "end_time": "2026-02-02T13:03:43.791461",
+     "exception": false,
+     "start_time": "2026-02-02T13:03:43.764718",
+     "status": "completed"
+    },
+    "tags": []
+   },
+   "outputs": [],
+   "source": [
+    "X = df[feature_cols].copy()\n",
+    "\n",
+    "# Handle missing values first (median for numeric, mode for others);\n",
+    "# encoding first would turn NaN into the literal category \"nan\"\n",
+    "for col in X.columns:\n",
+    "    if X[col].isnull().any():\n",
+    "        if pd.api.types.is_numeric_dtype(X[col]):\n",
+    "            X[col] = X[col].fillna(X[col].median())\n",
+    "        else:\n",
+    "            X[col] = X[col].fillna(X[col].mode()[0])\n",
+    "\n",
+    "# Encode categorical variables\n",
+    "for col in X.select_dtypes(include=['object']).columns:\n",
+    "    le = LabelEncoder()\n",
+    "    X[col] = le.fit_transform(X[col].astype(str))\n",
+    "\n",
+    "# Stratified train/test split (maintains class distribution)\n",
+    "X_train, X_test, y_train, y_test = train_test_split(\n",
+    "    X, y, test_size=0.2, random_state=42, stratify=y\n",
+    ")\n",
+    "\n",
+    "# Scale features (fit on train only!)\n",
+    "scaler = StandardScaler()\n",
+    "X_train_scaled = pd.DataFrame(scaler.fit_transform(X_train), columns=X_train.columns)\n",
+    "X_test_scaled = pd.DataFrame(scaler.transform(X_test), columns=X_test.columns)\n",
+    "\n",
+    "print(f\"Train size: {len(X_train):,} ({len(X_train)/len(X)*100:.0f}%)\")\n",
+    "print(f\"Test size: {len(X_test):,} ({len(X_test)/len(X)*100:.0f}%)\")\n",
+    "print(f\"\\nTrain class distribution:\")\n",
+    "print(f\"  Retained (1): {(y_train == 1).sum():,} ({(y_train == 1).sum()/len(y_train)*100:.1f}%)\")\n",
+    "print(f\"  Churned (0): {(y_train == 0).sum():,} ({(y_train == 0).sum()/len(y_train)*100:.1f}%)\")"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "4abd8fd2",
+   "metadata": {
+    "papermill": {
+     "duration": 0.001721,
+     "end_time": "2026-02-02T13:03:43.795234",
+     "exception": false,
+     "start_time": "2026-02-02T13:03:43.793513",
+     "status": "completed"
+    },
+    "tags": []
+   },
+   "source": [
+    "## 8.3 Baseline Models (with Class Weights)\n",
+    "\n",
+    "**📖 Using Class Weights:**\n",
+    "- `class_weight='balanced'` automatically adjusts weights inversely proportional to class frequencies\n",
+    "- This helps models pay more attention to the minority class (churned customers)\n",
+    "- Without weights, models may just predict \"retained\" for everyone"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "0c4db3a7",
+   "metadata": {
+    "execution": {
+     "iopub.execute_input": "2026-02-02T13:03:43.800931Z",
+     "iopub.status.busy": "2026-02-02T13:03:43.800775Z",
+     "iopub.status.idle": "2026-02-02T13:03:51.241619Z",
+     "shell.execute_reply": "2026-02-02T13:03:51.241076Z"
+    },
+    "papermill": {
+     "duration": 7.444712,
+     "end_time": "2026-02-02T13:03:51.242480",
+     "exception": false,
+     "start_time": "2026-02-02T13:03:43.797768",
+     "status": "completed"
+    },
+    "tags": []
+   },
+   "outputs": [],
+   "source": [
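+    "# class_weight='balanced' handles imbalance for Logistic Regression and Random Forest;\n",
+    "# GradientBoostingClassifier has no class_weight parameter, so it is trained unweighted\n",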
+    "models = {\n",
+    "    \"Logistic Regression\": LogisticRegression(max_iter=1000, random_state=42, class_weight='balanced'),\n",
+    "    \"Random Forest\": RandomForestClassifier(n_estimators=100, random_state=42, n_jobs=-1, class_weight='balanced'),\n",
+    "    \"Gradient Boosting\": GradientBoostingClassifier(n_estimators=100, random_state=42)\n",
+    "}\n",
+    "\n",
+    "results = []\n",
+    "model_predictions = {}\n",
+    "\n",
+    "for name, model in models.items():\n",
+    "    print(f\"Training {name}...\")\n",
+    "    \n",
+    "    # Use scaled data for Logistic Regression, unscaled for tree-based\n",
+    "    if \"Logistic\" in name:\n",
+    "        model.fit(X_train_scaled, y_train)\n",
+    "        y_pred = model.predict(X_test_scaled)\n",
+    "        y_pred_proba = model.predict_proba(X_test_scaled)[:, 1]\n",
+    "    else:\n",
+    "        model.fit(X_train, y_train)\n",
+    "        y_pred = model.predict(X_test)\n",
+    "        y_pred_proba = model.predict_proba(X_test)[:, 1]\n",
+    "    \n",
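+    "    # Calculate metrics. sklearn defaults to pos_label=1, so with the 1 = Retained encoding\n",
+    "    # printed above, F1, precision, recall and PR-AUC score the Retained class, not Churned (0)\n",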
+    "    auc = roc_auc_score(y_test, y_pred_proba)\n",
+    "    pr_auc = average_precision_score(y_test, y_pred_proba)\n",
+    "    f1 = f1_score(y_test, y_pred)\n",
+    "    precision = precision_score(y_test, y_pred)\n",
+    "    recall = recall_score(y_test, y_pred)\n",
+    "    \n",
+    "    # Cross-validation\n",
+    "    if \"Logistic\" in name:\n",
+    "        cv_scores = cross_val_score(model, X_train_scaled, y_train, cv=5, scoring='roc_auc')\n",
+    "    else:\n",
+    "        cv_scores = cross_val_score(model, X_train, y_train, cv=5, scoring='roc_auc')\n",
+    "    \n",
+    "    results.append({\n",
+    "        \"Model\": name,\n",
+    "        \"Test AUC\": auc,\n",
+    "        \"PR-AUC\": pr_auc,\n",
+    "        \"F1-Score\": f1,\n",
+    "        \"Precision\": precision,\n",
+    "        \"Recall\": recall,\n",
+    "        \"CV AUC Mean\": cv_scores.mean(),\n",
+    "        \"CV AUC Std\": cv_scores.std()\n",
+    "    })\n",
+    "    \n",
+    "    model_predictions[name] = {\n",
+    "        'y_pred': y_pred,\n",
+    "        'y_pred_proba': y_pred_proba,\n",
+    "        'model': model\n",
+    "    }\n",
+    "\n",
+    "results_df = pd.DataFrame(results)\n",
+    "results_df = results_df.round(4)\n",
+    "\n",
+    "print(\"\\n\" + \"=\" * 80)\n",
+    "print(\"MODEL COMPARISON\")\n",
+    "print(\"=\" * 80)\n",
+    "display_table(results_df)"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "134b2fd5",
+   "metadata": {
+    "papermill": {
+     "duration": 0.002387,
+     "end_time": "2026-02-02T13:03:51.247546",
+     "exception": false,
+     "start_time": "2026-02-02T13:03:51.245159",
+     "status": "completed"
+    },
+    "tags": []
+   },
+   "source": [
+    "## 8.4 Feature Importance (Random Forest)"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "c0adf3a4",
+   "metadata": {
+    "execution": {
+     "iopub.execute_input": "2026-02-02T13:03:51.252747Z",
+     "iopub.status.busy": "2026-02-02T13:03:51.252629Z",
+     "iopub.status.idle": "2026-02-02T13:03:51.281884Z",
+     "shell.execute_reply": "2026-02-02T13:03:51.281477Z"
+    },
+    "papermill": {
+     "duration": 0.032888,
+     "end_time": "2026-02-02T13:03:51.282770",
+     "exception": false,
+     "start_time": "2026-02-02T13:03:51.249882",
+     "status": "completed"
+    },
+    "tags": []
+   },
+   "outputs": [],
+   "source": [
+    "rf_model = models[\"Random Forest\"]\n",
+    "importance_df = pd.DataFrame({\n",
+    "    \"Feature\": feature_cols,\n",
+    "    \"Importance\": rf_model.feature_importances_\n",
+    "}).sort_values(\"Importance\", ascending=False)\n",
+    "\n",
+    "top_n = 15\n",
+    "top_features = importance_df.head(top_n)\n",
+    "\n",
+    "fig = charts.bar_chart(\n",
+    "    top_features[\"Feature\"].tolist(),\n",
+    "    top_features[\"Importance\"].tolist(),\n",
+    "    title=f\"Top {top_n} Feature Importances\"\n",
+    ")\n",
+    "display_figure(fig)"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "7ced338c",
+   "metadata": {
+    "papermill": {
+     "duration": 0.003509,
+     "end_time": "2026-02-02T13:03:51.290387",
+     "exception": false,
+     "start_time": "2026-02-02T13:03:51.286878",
+     "status": "completed"
+    },
+    "tags": []
+   },
+   "source": [
+    "## 8.5 Classification Report (Best Model)"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "28691087",
+   "metadata": {
+    "execution": {
+     "iopub.execute_input": "2026-02-02T13:03:51.298626Z",
+     "iopub.status.busy": "2026-02-02T13:03:51.298513Z",
+     "iopub.status.idle": "2026-02-02T13:03:51.306713Z",
+     "shell.execute_reply": "2026-02-02T13:03:51.306286Z"
+    },
+    "papermill": {
+     "duration": 0.013151,
+     "end_time": "2026-02-02T13:03:51.307450",
+     "exception": false,
+     "start_time": "2026-02-02T13:03:51.294299",
+     "status": "completed"
+    },
+    "tags": []
+   },
+   "outputs": [],
+   "source": [
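+    "# Pick the best model by test AUC instead of hard-coding it, and reuse its stored\n",
+    "# predictions (Logistic Regression was fit on scaled features)\n",
+    "best_name = results_df.loc[results_df[\"Test AUC\"].idxmax(), \"Model\"]\n",
+    "y_pred = model_predictions[best_name][\"y_pred\"]\n",
+    "\n",
+    "print(f\"Classification Report ({best_name}):\")\n",
+    "print(classification_report(y_test, y_pred))"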
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "de19bff3",
+   "metadata": {
+    "papermill": {
+     "duration": 0.003638,
+     "end_time": "2026-02-02T13:03:51.315003",
+     "exception": false,
+     "start_time": "2026-02-02T13:03:51.311365",
+     "status": "completed"
+    },
+    "tags": []
+   },
+   "source": [
+    "## 8.6 Model Comparison Grid\n",
+    "\n",
+    "This visualization shows all models side-by-side with:\n",
+    "- **Row 1**: Confusion matrices (counts and percentages)\n",
+    "- **Row 2**: ROC curves with AUC scores\n",
+    "- **Row 3**: Precision-Recall curves with PR-AUC scores\n",
+    "\n",
+    "**📖 How to Read:**\n",
+    "- **Confusion Matrix**: Diagonal = correct predictions. Off-diagonal = errors.\n",
+    "- **ROC Curve**: Higher curve = better. AUC > 0.8 is good, > 0.9 is excellent.\n",
+    "- **PR Curve**: Higher curve = better at finding positives without false alarms."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "ab41b3bb",
+   "metadata": {
+    "execution": {
+     "iopub.execute_input": "2026-02-02T13:03:51.322228Z",
+     "iopub.status.busy": "2026-02-02T13:03:51.322098Z",
+     "iopub.status.idle": "2026-02-02T13:03:51.405939Z",
+     "shell.execute_reply": "2026-02-02T13:03:51.405558Z"
+    },
+    "papermill": {
+     "duration": 0.088886,
+     "end_time": "2026-02-02T13:03:51.407134",
+     "exception": false,
+     "start_time": "2026-02-02T13:03:51.318248",
+     "status": "completed"
+    },
+    "tags": []
+   },
+   "outputs": [],
+   "source": [
+    "# Create comprehensive model comparison grid\n",
+    "# Uses the framework's ChartBuilder.model_comparison_grid method\n",
+    "\n",
+    "# Prepare model results in the expected format\n",
+    "grid_results = {\n",
+    "    name: {\n",
+    "        \"y_pred\": data[\"y_pred\"],\n",
+    "        \"y_pred_proba\": data[\"y_pred_proba\"]\n",
+    "    }\n",
+    "    for name, data in model_predictions.items()\n",
+    "}\n",
+    "\n",
+    "# Create the visualization grid\n",
+    "fig = charts.model_comparison_grid(\n",
+    "    grid_results,\n",
+    "    y_test,\n",
+    "    class_labels=[\"Churned (0)\", \"Retained (1)\"],\n",
+    "    title=\"Model Comparison: Confusion Matrix | ROC Curve | Precision-Recall\"\n",
+    ")\n",
+    "\n",
+    "display_figure(fig)\n",
+    "\n",
+    "# Summary metrics table\n",
+    "print(\"\\n\" + \"=\" * 80)\n",
+    "print(\"METRICS SUMMARY\")\n",
+    "print(\"=\" * 80)\n",
+    "display_table(results_df[[\"Model\", \"Test AUC\", \"PR-AUC\", \"F1-Score\", \"Precision\", \"Recall\"]])"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "8644b753",
+   "metadata": {
+    "papermill": {
+     "duration": 0.006319,
+     "end_time": "2026-02-02T13:03:51.420670",
+     "exception": false,
+     "start_time": "2026-02-02T13:03:51.414351",
+     "status": "completed"
+    },
+    "tags": []
+   },
+   "source": [
+    "### 8.6.1 Individual Model Analysis\n",
+    "\n",
+    "The grid above shows all models together. Below is a detailed analysis per model."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "912a2e57",
+   "metadata": {
+    "execution": {
+     "iopub.execute_input": "2026-02-02T13:03:51.436032Z",
+     "iopub.status.busy": "2026-02-02T13:03:51.435905Z",
+     "iopub.status.idle": "2026-02-02T13:03:51.446884Z",
+     "shell.execute_reply": "2026-02-02T13:03:51.446378Z"
+    },
+    "papermill": {
+     "duration": 0.019499,
+     "end_time": "2026-02-02T13:03:51.447385",
+     "exception": false,
+     "start_time": "2026-02-02T13:03:51.427886",
+     "status": "completed"
+    },
+    "tags": []
+   },
+   "outputs": [],
+   "source": [
+    "# Individual model classification reports\n",
+    "print(\"=\" * 70)\n",
+    "print(\"CLASSIFICATION REPORTS BY MODEL\")\n",
+    "print(\"=\" * 70)\n",
+    "\n",
+    "for name, data in model_predictions.items():\n",
+    "    print(f\"\\n{'='*40}\")\n",
+    "    print(f\"📊 {name}\")\n",
+    "    print('='*40)\n",
+    "    print(classification_report(y_test, data['y_pred'], target_names=[\"Churned\", \"Retained\"]))"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "84d4fa6c",
+   "metadata": {
+    "papermill": {
+     "duration": 0.007249,
+     "end_time": "2026-02-02T13:03:51.461923",
+     "exception": false,
+     "start_time": "2026-02-02T13:03:51.454674",
+     "status": "completed"
+    },
+    "tags": []
+   },
+   "source": [
+    "### 8.6.2 Precision-Recall Curves\n",
+    "\n",
+    "**📖 Why PR Curves for Imbalanced Data:**\n",
+    "- ROC curves can look optimistic for imbalanced data\n",
+    "- PR curves focus on the minority class (churners)\n",
+    "- Better at showing how well we detect actual churners\n",
+    "\n",
+    "**📖 How to Read:**\n",
+    "- **Baseline** (dashed line) = proportion of positives in the data\n",
+    "- Higher curve = better at finding churners without too many false alarms"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "61320a60",
+   "metadata": {
+    "papermill": {
+     "duration": 0.00658,
+     "end_time": "2026-02-02T13:03:51.475265",
+     "exception": false,
+     "start_time": "2026-02-02T13:03:51.468685",
+     "status": "completed"
+    },
+    "tags": []
+   },
+   "source": [
+    "## 8.7 Key Takeaways\n",
+    "\n",
+    "**📖 Interpreting Results:**"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "f89502a1",
+   "metadata": {
+    "execution": {
+     "iopub.execute_input": "2026-02-02T13:03:51.490585Z",
+     "iopub.status.busy": "2026-02-02T13:03:51.490464Z",
+     "iopub.status.idle": "2026-02-02T13:03:51.494835Z",
+     "shell.execute_reply": "2026-02-02T13:03:51.494379Z"
+    },
+    "papermill": {
+     "duration": 0.012494,
+     "end_time": "2026-02-02T13:03:51.495294",
+     "exception": false,
+     "start_time": "2026-02-02T13:03:51.482800",
+     "status": "completed"
+    },
+    "tags": []
+   },
+   "outputs": [],
+   "source": [
+    "best_model = results_df.loc[results_df['Test AUC'].idxmax()]\n",
+    "\n",
+    "print(\"=\" * 70)\n",
+    "print(\"KEY TAKEAWAYS\")\n",
+    "print(\"=\" * 70)\n",
+    "\n",
+    "print(f\"\\n🏆 BEST MODEL: {best_model['Model']}\")\n",
+    "print(f\"   Test AUC: {best_model['Test AUC']:.4f}\")\n",
+    "print(f\"   PR-AUC: {best_model['PR-AUC']:.4f}\")\n",
+    "print(f\"   F1-Score: {best_model['F1-Score']:.4f}\")\n",
+    "\n",
+    "print(f\"\\n📊 TOP 3 IMPORTANT FEATURES:\")\n",
+    "for i, feat in enumerate(importance_df.head(3)['Feature'].tolist(), 1):\n",
+    "    imp = importance_df[importance_df['Feature'] == feat]['Importance'].values[0]\n",
+    "    print(f\"   {i}. {feat} ({imp:.3f})\")\n",
+    "\n",
+    "print(f\"\\n📈 MODEL PERFORMANCE ASSESSMENT:\")\n",
+    "if best_model['Test AUC'] > 0.90:\n",
+    "    print(\"   Excellent predictive signal - likely production-ready with tuning\")\n",
+    "elif best_model['Test AUC'] > 0.80:\n",
+    "    print(\"   Strong predictive signal - good baseline for improvement\")\n",
+    "elif best_model['Test AUC'] > 0.70:\n",
+    "    print(\"   Moderate signal - consider more feature engineering\")\n",
+    "else:\n",
+    "    print(\"   Weak signal - may need more data or different features\")\n",
+    "\n",
+    "print(f\"\\n💡 NEXT STEPS:\")\n",
+    "print(\"   1. Feature engineering with derived features (notebook 05)\")\n",
+    "print(\"   2. Hyperparameter tuning (GridSearchCV)\")\n",
+    "print(\"   3. Threshold optimization for business metrics\")\n",
+    "print(\"   4. A/B testing in production\")"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "4351d95d",
+   "metadata": {
+    "papermill": {
+     "duration": 0.006771,
+     "end_time": "2026-02-02T13:03:51.509716",
+     "exception": false,
+     "start_time": "2026-02-02T13:03:51.502945",
+     "status": "completed"
+    },
+    "tags": []
+   },
+   "source": [
+    "---\n",
+    "\n",
+    "## Summary: What We Learned\n",
+    "\n",
+    "In this notebook, we trained baseline models and established performance benchmarks:\n",
+    "\n",
+    "1. **Data Preparation** - Proper train/test split with stratification and scaling\n",
+    "2. **Class Imbalance Handling** - Used balanced class weights\n",
+    "3. **Model Comparison** - Compared Logistic Regression, Random Forest, and Gradient Boosting\n",
+    "4. **Multiple Metrics** - Evaluated with AUC, PR-AUC, F1, Precision, Recall\n",
+    "5. **Feature Importance** - Identified the most predictive features\n",
+    "\n",
+    "## Key Results for This Dataset\n",
+    "\n",
+    "| Metric | Value | Interpretation |\n",
+    "|--------|-------|----------------|\n",
+    "| Best AUC | ~0.98 | Excellent discrimination |\n",
+    "| Top Feature | esent | Email engagement is critical |\n",
+    "| Imbalance | ~4:1 | Moderate, handled with class weights |\n",
+    "\n",
+    "---\n",
+    "\n",
+    "## Next Steps\n",
+    "\n",
+    "Continue to **09_business_alignment.ipynb** to:\n",
+    "- Align model performance with business objectives\n",
+    "- Define intervention strategies by risk level\n",
+    "- Calculate expected ROI from the model\n",
+    "- Set deployment requirements"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "87d90782",
+   "metadata": {
+    "execution": {
+     "iopub.execute_input": "2026-02-02T13:03:51.524206Z",
+     "iopub.status.busy": "2026-02-02T13:03:51.524094Z",
+     "iopub.status.idle": "2026-02-02T13:03:51.526600Z",
+     "shell.execute_reply": "2026-02-02T13:03:51.526165Z"
+    },
+    "papermill": {
+     "duration": 0.010354,
+     "end_time": "2026-02-02T13:03:51.527118",
+     "exception": false,
+     "start_time": "2026-02-02T13:03:51.516764",
+     "status": "completed"
+    },
+    "tags": []
+   },
+   "outputs": [],
+   "source": [
+    "best_auc = max(float(r[\"Test AUC\"]) for r in results)\n",
+    "\n",
+    "print(\"Key Takeaways:\")\n",
+    "print(\"=\"*50)\n",
+    "print(f\"Best baseline AUC: {best_auc:.4f}\")\n",
+    "print(f\"Top 3 important features: {', '.join(importance_df.head(3)['Feature'].tolist())}\")\n",
+    "\n",
+    "if best_auc > 0.85:\n",
+    "    print(\"\\nStrong predictive signal detected. Data is well-suited for modeling.\")\n",
+    "elif best_auc > 0.70:\n",
+    "    print(\"\\nModerate predictive signal. Consider feature engineering for improvement.\")\n",
+    "else:\n",
+    "    print(\"\\nWeak predictive signal. May need more features or data.\")"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "1a13b5b8",
+   "metadata": {
+    "papermill": {
+     "duration": 0.007176,
+     "end_time": "2026-02-02T13:03:51.541652",
+     "exception": false,
+     "start_time": "2026-02-02T13:03:51.534476",
+     "status": "completed"
+    },
+    "tags": []
+   },
+   "source": [
+    "---\n",
+    "\n",
+    "## Next Steps\n",
+    "\n",
+    "Continue to **09_business_alignment.ipynb** to align with business objectives."
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "18e9c769",
+   "metadata": {
+    "papermill": {
+     "duration": 0.007026,
+     "end_time": "2026-02-02T13:03:51.555187",
+     "exception": false,
+     "start_time": "2026-02-02T13:03:51.548161",
+     "status": "completed"
+    },
+    "tags": []
+   },
+   "source": [
+    "> **Save Reminder:** Save this notebook (Ctrl+S / Cmd+S) before running the next one.\n",
+    "> The next notebook will automatically export this notebook's HTML documentation from the saved file."
+   ]
+  }
+ ],
+ "metadata": {
+  "kernelspec": {
+   "display_name": "Python 3",
+   "language": "python",
+   "name": "python3"
+  },
+  "language_info": {
+   "codemirror_mode": {
+    "name": "ipython",
+    "version": 3
+   },
+   "file_extension": ".py",
+   "mimetype": "text/x-python",
+   "name": "python",
+   "nbconvert_exporter": "python",
+   "pygments_lexer": "ipython3",
+   "version": "3.12.4"
+  },
+  "papermill": {
+   "default_parameters": {},
+   "duration": 13.137985,
+   "end_time": "2026-02-02T13:03:54.179035",
+   "environment_variables": {},
+   "exception": null,
+   "input_path": "/Users/Vital/python/CustomerRetention/exploration_notebooks/08_baseline_experiments.ipynb",
+   "output_path": "/Users/Vital/python/CustomerRetention/exploration_notebooks/08_baseline_experiments.ipynb",
+   "parameters": {},
+   "start_time": "2026-02-02T13:03:41.041050",
+   "version": "2.6.0"
+  }
+ },
+ "nbformat": 4,
+ "nbformat_minor": 5
+}
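
The metric calls in section 8.3 rely on scikit-learn's default positive label of 1, which the notebook prints as Retained. As a minimal, standard-library-only sketch (the helper `precision_recall` and the toy labels are hypothetical and not part of churnkit), the following shows how precision and recall can shift when the churned class (0) is treated as the positive class instead:

```python
def precision_recall(y_true, y_pred, pos_label):
    """Return (precision, recall) treating `pos_label` as the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == pos_label and p == pos_label)
    fp = sum(1 for t, p in pairs if t != pos_label and p == pos_label)
    fn = sum(1 for t, p in pairs if t == pos_label and p != pos_label)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# Hypothetical toy data (assumption): 8 retained (1) and 2 churned (0) customers
y_true = [1, 1, 1, 1, 1, 1, 1, 1, 0, 0]
y_pred = [1, 1, 1, 1, 1, 1, 1, 0, 0, 1]

# Same predictions, scored once per choice of positive class
retained_precision, retained_recall = precision_recall(y_true, y_pred, pos_label=1)
churned_precision, churned_recall = precision_recall(y_true, y_pred, pos_label=0)

print(f"Retained as positive: precision={retained_precision:.3f}, recall={retained_recall:.3f}")
print(f"Churned as positive:  precision={churned_precision:.3f}, recall={churned_recall:.3f}")
```

In scikit-learn, `precision_score`, `recall_score`, and `f1_score` accept a `pos_label` argument that selects the positive class in the same way, so churn-focused numbers can be reported without changing the target encoding.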