npm - @synsci/cli-darwin-x64-baseline - Versions diffs - 1.1.77 → 1.1.78 - Mend

@synsci/cli-darwin-x64-baseline 1.1.77 → 1.1.78

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (830)

package/bin/skills/scikit-survival/references/ensemble-models.md ADDED

@@ -0,0 +1,327 @@
+# Ensemble Models for Survival Analysis
+## Random Survival Forests
+### Overview
+Random Survival Forests extend the random forest algorithm to survival analysis with censored data. They build multiple decision trees on bootstrap samples and aggregate predictions.
+### How They Work
+1. **Bootstrap Sampling**: Each tree is built on a different bootstrap sample of the training data
+2. **Feature Randomness**: At each node, only a random subset of features is considered for splitting
+3. **Survival Function Estimation**: At terminal nodes, the Kaplan-Meier estimator gives the survival function and the Nelson-Aalen estimator gives the cumulative hazard function
+4. **Ensemble Aggregation**: Final predictions average survival functions across all trees
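The terminal-node step can be illustrated without any library. The following is a minimal sketch, assuming a hypothetical leaf that holds five training samples; the function name `km_and_nelson_aalen` and the data are illustrative and not part of scikit-survival.

```python
from collections import Counter

def km_and_nelson_aalen(times, events):
    """Return (time, survival, cumulative_hazard) at each distinct event time."""
    n_at_risk = len(times)
    deaths = Counter(t for t, e in zip(times, events) if e)
    leaving = Counter(times)  # every sample observed at t leaves the risk set after t
    surv, chf, rows = 1.0, 0.0, []
    for t in sorted(leaving):
        d = deaths.get(t, 0)
        if d:
            surv *= 1.0 - d / n_at_risk  # Kaplan-Meier product step
            chf += d / n_at_risk         # Nelson-Aalen sum step
            rows.append((t, surv, chf))
        n_at_risk -= leaving[t]
    return rows

# Hypothetical leaf with five samples; event=False means censored.
leaf_times = [2, 3, 3, 5, 8]
leaf_events = [True, True, False, True, False]
for t, s, h in km_and_nelson_aalen(leaf_times, leaf_events):
    print(f"t={t}: S(t)={s:.3f}, H(t)={h:.3f}")
```

A forest prediction would then average such per-leaf curves over all trees.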
+### When to Use
+- Complex non-linear relationships between features and survival
+- No assumptions about functional form needed
+- Want robust predictions with minimal tuning
+- Need feature importance estimates
+- Have sufficient sample size (typically n > 100)
+### Key Parameters
+- `n_estimators`: Number of trees (default: 100)
+  - More trees = more stable predictions but slower
+  - Typical range: 100-1000
+- `max_depth`: Maximum depth of trees
+  - Controls tree complexity
+  - None = nodes are expanded until all leaves are pure or contain fewer than `min_samples_split` samples
+- `min_samples_split`: Minimum samples to split a node (default: 6)
+  - Larger values = more regularization
+- `min_samples_leaf`: Minimum samples at leaf nodes (default: 3)
+  - Prevents overfitting to small groups
+- `max_features`: Number of features to consider at each split
+  - 'sqrt': sqrt(n_features) - good default
+  - 'log2': log2(n_features)
+  - None: all features
+- `n_jobs`: Number of parallel jobs (-1 uses all processors)
+### Example Usage
+```python
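+from sksurv.ensemble import RandomSurvivalForest
+from sksurv.datasets import load_breast_cancer
+from sksurv.preprocessing import OneHotEncoder
+# Load data; y is a structured array with fields 'e.tdm' (event) and 't.tdm' (time)
+X, y = load_breast_cancer()
+# One-hot encode the categorical columns so the trees receive numeric input
+X = OneHotEncoder().fit_transform(X)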
+# Fit Random Survival Forest
+rsf = RandomSurvivalForest(n_estimators=1000,
+                           min_samples_split=10,
+                           min_samples_leaf=15,
+                           max_features="sqrt",
+                           n_jobs=-1,
+                           random_state=42)
+rsf.fit(X, y)
+# Predict risk scores
+risk_scores = rsf.predict(X)
+# Predict survival functions
+surv_funcs = rsf.predict_survival_function(X)
+# Predict cumulative hazard functions
+chf_funcs = rsf.predict_cumulative_hazard_function(X)
+```
+### Feature Importance
+**Important**: `RandomSurvivalForest` does not provide impurity-based feature importance; accessing `feature_importances_` is not supported. Use permutation-based feature importance instead.
+```python
+from sklearn.inspection import permutation_importance
+from sksurv.metrics import concordance_index_censored
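+# Define scoring function; the structured array y stores the event indicator
+# first and the observed time second (field names vary by dataset)
+def score_survival_model(model, X, y):
+    event_field, time_field = y.dtype.names
+    prediction = model.predict(X)
+    result = concordance_index_censored(y[event_field], y[time_field], prediction)
+    return result[0]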
+# Compute permutation importance
+perm_importance = permutation_importance(
+    rsf, X, y,
+    n_repeats=10,
+    random_state=42,
+    scoring=score_survival_model
+)
+# Get feature importance
+feature_importance = perm_importance.importances_mean
+```
+## Gradient Boosting Survival Analysis
+### Overview
+Gradient boosting builds an ensemble by sequentially adding weak learners that correct errors of previous learners. The model is: **f(x) = Σ β_m g(x; θ_m)**
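To make the additive form concrete, here is a minimal sketch that only evaluates an already-fitted ensemble. The decision stumps, thresholds, and weights are hypothetical; actual boosting would fit each learner to the gradient of the loss.

```python
def stump(threshold, feature):
    """Base learner g(x; theta): +1 if x[feature] > threshold, else -1."""
    return lambda x: 1.0 if x[feature] > threshold else -1.0

def additive_model(learners, betas):
    """f(x) = sum over m of beta_m * g(x; theta_m)."""
    return lambda x: sum(b * g(x) for b, g in zip(betas, learners))

# Hypothetical ensemble of three stumps; in boosting, each beta_m would be
# the fitted step size multiplied by the learning rate.
learners = [stump(50, 0), stump(2.5, 1), stump(65, 0)]
betas = [0.5, 0.3, 0.2]
f = additive_model(learners, betas)
print(f([70, 1.0]))  # evaluates 0.5 - 0.3 + 0.2
print(f([40, 3.0]))  # evaluates -0.5 + 0.3 - 0.2
```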
+### Model Types
+#### GradientBoostingSurvivalAnalysis
+Uses regression trees as base learners. Can capture complex non-linear relationships.
+**When to Use:**
+- Need to model complex non-linear relationships
+- Want high predictive performance
+- Have sufficient data to avoid overfitting
+- Can tune hyperparameters carefully
+#### ComponentwiseGradientBoostingSurvivalAnalysis
+Uses component-wise least squares as base learners. Produces linear models with automatic feature selection.
+**When to Use:**
+- Want interpretable linear model
+- Need automatic feature selection (like Lasso)
+- Have high-dimensional data
+- Prefer sparse models
+### Loss Functions
+#### Cox's Partial Likelihood (default)
+Maintains proportional hazards framework but replaces linear model with additive ensemble model.
+**Appropriate for:**
+- Standard survival analysis settings
+- When proportional hazards is reasonable
+- Most use cases
+#### Accelerated Failure Time (AFT)
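+Assumes features accelerate or decelerate survival time by a constant factor. Selected with `loss='ipcwls'`, which minimizes **(1/n) Σ ω_i (log y_i - f(x_i))²**, where ω_i is the inverse probability of censoring weight of sample i (zero for censored samples).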
+**Appropriate for:**
+- AFT framework preferred over proportional hazards
+- Want to model time directly
+- Need to interpret effects on survival time
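The AFT loss can be computed by hand for a few samples. This is a minimal sketch, assuming the weights have already been estimated; the helper name `ipcw_squared_loss` and all numbers are hypothetical.

```python
import math

def ipcw_squared_loss(times, weights, predictions):
    """(1/n) * sum over i of w_i * (log y_i - f(x_i))^2."""
    n = len(times)
    return sum(w * (math.log(y) - f) ** 2
               for y, w, f in zip(times, weights, predictions)) / n

# Hypothetical values: censored samples get weight 0, events get 1 / G(y_i),
# where G is the estimated probability of remaining uncensored.
times = [10.0, 20.0, 40.0]
weights = [1.0, 0.0, 1.25]
predictions = [math.log(10.0), 2.0, math.log(40.0) - 0.2]
print(ipcw_squared_loss(times, weights, predictions))
```

Only the third sample contributes here: the first is predicted exactly and the second is censored, so its weight is zero.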
+### Regularization Strategies
+Three main techniques prevent overfitting:
+1. **Learning Rate** (`learning_rate < 1`)
+   - Shrinks contribution of each base learner
+   - Smaller values need more iterations but usually generalize better
+   - Typical range: 0.01 - 0.1
+2. **Dropout** (`dropout_rate > 0`)
+   - Randomly drops previous learners during training
+   - Forces learners to be more robust
+   - Typical range: 0.01 - 0.2
+3. **Subsampling** (`subsample < 1`)
+   - Uses random subset of data for each iteration
+   - Adds randomness and reduces overfitting
+   - Typical range: 0.5 - 0.9
+**Recommendation**: Combine small learning rate with early stopping for best performance.
+### Key Parameters
+- `loss`: Loss function ('coxph', 'squared', or 'ipcwls')
+- `learning_rate`: Shrinks contribution of each tree (default: 0.1)
+- `n_estimators`: Number of boosting iterations (default: 100)
+- `subsample`: Fraction of samples for each iteration (default: 1.0)
+- `dropout_rate`: Dropout rate for learners (default: 0.0)
+- `max_depth`: Maximum depth of trees (default: 3)
+- `min_samples_split`: Minimum samples to split node (default: 2)
+- `min_samples_leaf`: Minimum samples at leaf (default: 1)
+- `max_features`: Features to consider at each split
+### Example Usage
+```python
+from sksurv.ensemble import GradientBoostingSurvivalAnalysis
+from sklearn.model_selection import train_test_split
+# Split data
+X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
+# Fit gradient boosting model
+gbs = GradientBoostingSurvivalAnalysis(
+    loss='coxph',
+    learning_rate=0.05,
+    n_estimators=200,
+    subsample=0.8,
+    dropout_rate=0.1,
+    max_depth=3,
+    random_state=42
+)
+gbs.fit(X_train, y_train)
+# Predict risk scores
+risk_scores = gbs.predict(X_test)
+# Predict survival functions
+surv_funcs = gbs.predict_survival_function(X_test)
+# Predict cumulative hazard functions
+chf_funcs = gbs.predict_cumulative_hazard_function(X_test)
+```
+### Early Stopping
+Hold out part of the training data internally and stop once the validation loss stops improving:
+```python
+from sksurv.ensemble import GradientBoostingSurvivalAnalysis
+# Fit with early stopping: validation_fraction reserves 20% of X_train
+# internally, so no manual train/validation split is needed
+gbs = GradientBoostingSurvivalAnalysis(
+    n_estimators=1000,
+    learning_rate=0.01,
+    max_depth=3,
+    validation_fraction=0.2,
+    n_iter_no_change=10,
+    random_state=42
+)
+gbs.fit(X_train, y_train)
+# Number of iterations used
+print(f"Used {gbs.n_estimators_} iterations")
+```
+### Hyperparameter Tuning
+```python
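+from sklearn.model_selection import GridSearchCV
+from sksurv.metrics import as_concordance_index_ipcw_scorer
+# The scorer wraps the estimator, so parameters take the 'estimator__' prefix
+param_grid = {
+    'estimator__learning_rate': [0.01, 0.05, 0.1],
+    'estimator__n_estimators': [100, 200, 300],
+    'estimator__max_depth': [3, 5, 7],
+    'estimator__subsample': [0.8, 1.0]
+}
+# The wrapper's score method returns Uno's C-index, so no scoring argument is needed;
+# pass tau=... to the wrapper if test-fold times exceed the training follow-up
+cv = GridSearchCV(
+    as_concordance_index_ipcw_scorer(GradientBoostingSurvivalAnalysis()),
+    param_grid,
+    cv=5,
+    n_jobs=-1
+)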
+cv.fit(X, y)
+best_model = cv.best_estimator_
+```
+## ComponentwiseGradientBoostingSurvivalAnalysis
+### Overview
+Uses component-wise least squares, producing sparse linear models with automatic feature selection similar to Lasso.
+### When to Use
+- Want interpretable linear model
+- Need automatic feature selection
+- Have high-dimensional data with many irrelevant features
+- Prefer coefficient-based interpretation
+### Example Usage
+```python
+from sksurv.ensemble import ComponentwiseGradientBoostingSurvivalAnalysis
+# Fit componentwise boosting
+cgbs = ComponentwiseGradientBoostingSurvivalAnalysis(
+    loss='coxph',
+    learning_rate=0.1,
+    n_estimators=100
+)
+cgbs.fit(X, y)
+# Get selected features and coefficients; coef_[0] is the intercept,
+# the remaining entries correspond to the features
+coef = cgbs.coef_[1:]
+selected_features = [i for i, c in enumerate(coef) if c != 0]
+```
+## ExtraSurvivalTrees
+Extremely randomized survival trees - similar to Random Survival Forest but with additional randomness in split selection.
+### When to Use
+- Want even more regularization than Random Survival Forest
+- Have limited data
+- Need faster training
+### Key Difference
+Instead of finding the best split for selected features, it randomly selects split points, adding more diversity to the ensemble.
+```python
+from sksurv.ensemble import ExtraSurvivalTrees
+est = ExtraSurvivalTrees(n_estimators=100, random_state=42)
+est.fit(X, y)
+```
+## Model Comparison
+| Model | Complexity | Interpretability | Performance | Speed |
+|-------|-----------|------------------|-------------|-------|
+| Random Survival Forest | Medium | Low | High | Medium |
+| GradientBoostingSurvivalAnalysis | High | Low | Highest | Slow |
+| ComponentwiseGradientBoostingSurvivalAnalysis | Low | High | Medium | Fast |
+| ExtraSurvivalTrees | Medium | Low | Medium-High | Fast |
+**General Recommendations:**
+- **Best overall performance**: GradientBoostingSurvivalAnalysis with tuning
+- **Best balance**: RandomSurvivalForest
+- **Best interpretability**: ComponentwiseGradientBoostingSurvivalAnalysis
+- **Fastest training**: ExtraSurvivalTrees

package/bin/skills/scikit-survival/references/evaluation-metrics.md ADDED

@@ -0,0 +1,378 @@
+# Evaluation Metrics for Survival Models
+## Overview
+Evaluating survival models requires specialized metrics that account for censored data. scikit-survival provides three main categories of metrics:
+1. Concordance Index (C-index)
+2. Time-dependent ROC and AUC
+3. Brier Score
+## Concordance Index (C-index)
+### What It Measures
+The concordance index measures the rank correlation between predicted risk scores and observed event times. It estimates the probability that, for a random pair of comparable subjects, the subject with the higher predicted risk experiences the event first.
+**Range**: 0 to 1
+- 0.5 = random predictions
+- 1.0 = perfect concordance
+- Typical good performance: 0.7-0.8
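The pairwise definition above can be written out directly. The following is a simplified sketch of Harrell's estimator in plain Python; it ignores tied event times, and the function name and data are hypothetical rather than the scikit-survival implementation.

```python
def harrell_c_index(events, times, risk_scores):
    """Fraction of comparable pairs ordered correctly; tied risks count 1/2."""
    concordant = tied = comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # a pair is comparable only if the earlier time is an event
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1
                elif risk_scores[i] == risk_scores[j]:
                    tied += 1
    return (concordant + 0.5 * tied) / comparable

# Hypothetical data: the second subject is censored at time 4.
events = [True, False, True, True]
times = [2, 4, 6, 8]
risk = [0.9, 0.8, 0.3, 0.5]
print(harrell_c_index(events, times, risk))
```

Four pairs are comparable here, and only the pair formed by the last two subjects is ordered incorrectly.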
+### Two Implementations
+#### Harrell's C-index (concordance_index_censored)
+The traditional estimator, simpler but has limitations.
+**When to Use:**
+- Low censoring rates (< 40%)
+- Quick evaluation during development
+- Comparing models on same dataset
+**Limitations:**
+- Becomes increasingly biased with high censoring rates
+- Overestimates performance starting at approximately 49% censoring
+```python
+from sksurv.metrics import concordance_index_censored
+# Compute Harrell's C-index
+result = concordance_index_censored(y_test['event'], y_test['time'], risk_scores)
+c_index = result[0]
+print(f"Harrell's C-index: {c_index:.3f}")
+```
+#### Uno's C-index (concordance_index_ipcw)
+Inverse probability of censoring weighted (IPCW) estimator that corrects for censoring bias.
+**When to Use:**
+- Moderate to high censoring rates (> 40%)
+- Need unbiased estimates
+- Comparing models across different datasets
+- Publishing results (more robust)
+**Advantages:**
+- Remains stable even with high censoring
+- More reliable estimates
+- Less biased
+```python
+from sksurv.metrics import concordance_index_ipcw
+# Compute Uno's C-index
+# Requires training data to estimate the censoring distribution for IPCW
+c_index, concordant, discordant, tied_risk, tied_time = concordance_index_ipcw(
+    y_train, y_test, risk_scores
+)
+print(f"Uno's C-index: {c_index:.3f}")
+```
+### Choosing Between Harrell's and Uno's
+**Use Uno's C-index when:**
+- Censoring rate > 40%
+- Need most accurate estimates
+- Comparing models from different studies
+- Publishing research
+**Use Harrell's C-index when:**
+- Low censoring rates
+- Quick model comparisons during development
+- Computational efficiency is critical
+### Example Comparison
+```python
+from sksurv.metrics import concordance_index_censored, concordance_index_ipcw
+# Harrell's C-index
+harrell = concordance_index_censored(y_test['event'], y_test['time'], risk_scores)[0]
+# Uno's C-index
+uno = concordance_index_ipcw(y_train, y_test, risk_scores)[0]
+print(f"Harrell's C-index: {harrell:.3f}")
+print(f"Uno's C-index: {uno:.3f}")
+```
+## Time-Dependent ROC and AUC
+### What It Measures
+Time-dependent AUC evaluates model discrimination at specific time points. It distinguishes subjects who experience events by time *t* from those who don't.
+**Question answered**: "How well does the model predict who will have an event by time t?"
+### When to Use
+- Predicting event occurrence within specific time windows
+- Clinical decision-making at specific timepoints (e.g., 5-year survival)
+- Want to evaluate performance across different time horizons
+- Need both discrimination and timing information
+### Key Function: cumulative_dynamic_auc
+```python
+import matplotlib.pyplot as plt
+from sksurv.metrics import cumulative_dynamic_auc
+# Define evaluation times (these must lie within the follow-up range of the test data)
+times = [365, 730, 1095, 1460, 1825]  # 1, 2, 3, 4, 5 years
+# Compute time-dependent AUC
+auc, mean_auc = cumulative_dynamic_auc(
+    y_train, y_test, risk_scores, times
+)
+# Plot AUC over time
+plt.plot(times, auc, marker='o')
+plt.xlabel('Time (days)')
+plt.ylabel('Time-dependent AUC')
+plt.title('Model Discrimination Over Time')
+plt.show()
+print(f"Mean AUC: {mean_auc:.3f}")
+```
+### Interpretation
+- **AUC at time t**: Probability model correctly ranks a subject who had event by time t above one who didn't
+- **Varying AUC over time**: Indicates model performance changes with time horizon
+- **Mean AUC**: Overall summary of discrimination across all time points
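The case/control definition behind the AUC at time t can be sketched in plain Python. This simplified version drops subjects censored before t instead of reweighting them by the inverse probability of censoring, so it only illustrates the idea; the function name and data are hypothetical.

```python
def cumulative_dynamic_auc_simple(events, times, risk_scores, t):
    """AUC at time t: cases had an event by t, controls are still event-free after t."""
    cases = [r for e, y, r in zip(events, times, risk_scores) if e and y <= t]
    controls = [r for y, r in zip(times, risk_scores) if y > t]
    pairs = len(cases) * len(controls)
    if pairs == 0:
        raise ValueError("need at least one case and one control at time t")
    # A case ranked above a control counts 1, a tie counts 1/2.
    wins = sum((c > k) + 0.5 * (c == k) for c in cases for k in controls)
    return wins / pairs

# Hypothetical data: the third and fifth subjects are censored.
events = [True, True, False, True, False]
times = [1, 3, 4, 6, 9]
risk = [0.9, 0.4, 0.6, 0.7, 0.1]
for t in (3, 6):
    print(t, cumulative_dynamic_auc_simple(events, times, risk, t))
```

The set of cases grows and the set of controls shrinks as t increases, which is why the AUC can change with the time horizon.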
+### Example: Comparing Models
+```python
+# Compare two models
+auc1, mean_auc1 = cumulative_dynamic_auc(y_train, y_test, risk_scores1, times)
+auc2, mean_auc2 = cumulative_dynamic_auc(y_train, y_test, risk_scores2, times)
+plt.plot(times, auc1, marker='o', label='Model 1')
+plt.plot(times, auc2, marker='s', label='Model 2')
+plt.xlabel('Time (days)')
+plt.ylabel('Time-dependent AUC')
+plt.legend()
+plt.show()
+```
+## Brier Score
+### What It Measures
+Brier score extends mean squared error to survival data with censoring. It measures both discrimination (ranking) and calibration (accuracy of predicted probabilities).
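+**Formula** (no censoring): **BS(t) = (1/n) Σ (S(t|x_i) - I(T_i > t))²**
+where S(t|x_i) is the predicted survival probability at time t for subject i and T_i is the observed event time. With censored data, each term is additionally weighted by the inverse probability of censoring, which is why the functions below also take the training outcomes.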
+**Range**: 0 to 1
+- 0 = perfect predictions
+- Lower is better
+- Typical good performance: < 0.2
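As a reference point for these values, the sketch below evaluates the formula on hypothetical data without censoring. It is not the scikit-survival implementation, which additionally applies censoring weights.

```python
def brier_score_uncensored(event_times, surv_probs, t):
    """(1/n) * sum over i of (S(t|x_i) - I(T_i > t))^2 when no sample is censored."""
    n = len(event_times)
    return sum((s - (1.0 if T > t else 0.0)) ** 2
               for T, s in zip(event_times, surv_probs)) / n

# Hypothetical event times; two subjects survive beyond t = 6.
event_times = [2, 5, 7, 10]
t = 6
confident = brier_score_uncensored(event_times, [0.1, 0.2, 0.8, 0.9], t)
uninformative = brier_score_uncensored(event_times, [0.5, 0.5, 0.5, 0.5], t)
print(f"confident: {confident:.3f}, uninformative: {uninformative:.3f}")
```

Predicting 0.5 for every subject yields 0.25 regardless of the outcomes, so a useful model should score clearly below that.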
+### When to Use
+- Need calibration assessment (not just ranking)
+- Want to evaluate predicted probabilities, not just risk scores
+- Comparing models that output survival functions
+- Clinical applications requiring probability estimates
+### Key Functions
+#### brier_score: Single Time Point
+```python
+from sksurv.metrics import brier_score
+# Compute Brier score at specific time
+time_point = 1825  # 5 years
+surv_probs = model.predict_survival_function(X_test)
+# Extract survival probability at time_point for each subject
+surv_at_t = [fn(time_point) for fn in surv_probs]
+# brier_score returns (times, scores) as arrays; take the single score as a float
+bs = brier_score(y_train, y_test, surv_at_t, time_point)[1][0]
+print(f"Brier score at {time_point} days: {bs:.3f}")
+```
+#### integrated_brier_score: Summary Across Time
+```python
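+import numpy as np
+from sksurv.metrics import integrated_brier_score
+# Compute integrated Brier score
+times = [365, 730, 1095, 1460, 1825]
+surv_funcs = model.predict_survival_function(X_test)
+# Evaluate each step function on the time grid: shape (n_samples, n_times)
+surv_probs = np.asarray([[fn(t) for t in times] for fn in surv_funcs])
+ibs = integrated_brier_score(y_train, y_test, surv_probs, times)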
+print(f"Integrated Brier Score: {ibs:.3f}")
+```
+### Interpretation
+- **Brier score at time t**: Expected squared difference between predicted and actual survival at time t
+- **Integrated Brier Score**: Weighted average of Brier scores across time
+- **Lower values = better predictions**
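The integration step can be sketched with the trapezoidal rule, normalized by the length of the time interval. The per-time Brier scores below are hypothetical, and the helper is an illustration rather than the library routine.

```python
def integrated_brier(times, scores):
    """Trapezoidal integral of BS(t) over the grid, divided by the time span."""
    area = sum((t1 - t0) * (s0 + s1) / 2.0
               for t0, t1, s0, s1 in zip(times, times[1:], scores, scores[1:]))
    return area / (times[-1] - times[0])

# Hypothetical Brier scores at 1, 2, and 3 years.
times = [365, 730, 1095]
scores = [0.10, 0.20, 0.15]
print(f"{integrated_brier(times, scores):.4f}")
```

Because of the normalization, the result stays on the same 0 to 1 scale as a single Brier score.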
+### Comparison with Null Model
+Always compare against a baseline (e.g., Kaplan-Meier):
+```python
+import numpy as np
+from sksurv.nonparametric import kaplan_meier_estimator
+# Compute Kaplan-Meier baseline
+time_km, surv_km = kaplan_meier_estimator(y_train['event'], y_train['time'])
+# The KM estimate ignores features, so every test subject gets the same probability
+km_at_t = surv_km[time_km <= time_point][-1] if np.any(time_km <= time_point) else 1.0
+surv_km_test = np.full(len(X_test), km_at_t)
+# brier_score returns (times, scores) as arrays; take the single score as a float
+bs_km = brier_score(y_train, y_test, surv_km_test, time_point)[1][0]
+bs_model = brier_score(y_train, y_test, surv_at_t, time_point)[1][0]
+print(f"Kaplan-Meier Brier Score: {bs_km:.3f}")
+print(f"Model Brier Score: {bs_model:.3f}")
+print(f"Improvement: {(bs_km - bs_model) / bs_km * 100:.1f}%")
+```
+## Using Metrics with Cross-Validation
+### Concordance Index Scorer
+```python
+from sklearn.model_selection import cross_val_score
+from sksurv.metrics import as_concordance_index_ipcw_scorer
+# Wrap the model so that its score method returns Uno's C-index
+wrapped = as_concordance_index_ipcw_scorer(model)
+# Perform cross-validation; with no scoring argument, the wrapper's score is used
+scores = cross_val_score(wrapped, X, y, cv=5)
+print(f"Mean C-index: {scores.mean():.3f} (±{scores.std():.3f})")
+```
+### Integrated Brier Score Scorer
+```python
+import numpy as np
+from sklearn.model_selection import cross_val_score
+from sksurv.metrics import as_integrated_brier_score_scorer
+# Define time points for evaluation
+times = np.percentile(y['time'][y['event']], [25, 50, 75])
+# Wrap the model; its score is the negative IBS so that higher is better
+wrapped = as_integrated_brier_score_scorer(model, times)
+# Perform cross-validation
+scores = cross_val_score(wrapped, X, y, cv=5)
+print(f"Mean IBS: {-scores.mean():.3f} (±{scores.std():.3f})")
+```
+## Model Selection with GridSearchCV
+```python
+from sklearn.model_selection import GridSearchCV
+from sksurv.ensemble import RandomSurvivalForest
+from sksurv.metrics import as_concordance_index_ipcw_scorer
+# Define parameter grid; the scorer wraps the forest, hence the 'estimator__' prefix
+param_grid = {
+    'estimator__n_estimators': [100, 200, 300],
+    'estimator__min_samples_split': [10, 20, 30],
+    'estimator__max_depth': [None, 10, 20]
+}
+# Perform grid search on the wrapped estimator, whose score is Uno's C-index
+cv = GridSearchCV(
+    as_concordance_index_ipcw_scorer(RandomSurvivalForest(random_state=42)),
+    param_grid,
+    cv=5,
+    n_jobs=-1
+)
+cv.fit(X, y)
+print(f"Best parameters: {cv.best_params_}")
+print(f"Best C-index: {cv.best_score_:.3f}")
+```
+## Comprehensive Model Evaluation
+### Recommended Evaluation Pipeline
+```python
+import numpy as np
+from sksurv.metrics import (
+    concordance_index_censored,
+    concordance_index_ipcw,
+    cumulative_dynamic_auc,
+    integrated_brier_score
+)
+def evaluate_survival_model(model, X_train, X_test, y_train, y_test):
+    """Comprehensive evaluation of survival model"""
+    # Get predictions
+    risk_scores = model.predict(X_test)
+    surv_funcs = model.predict_survival_function(X_test)
+    # 1. Concordance Index (both versions)
+    c_harrell = concordance_index_censored(y_test['event'], y_test['time'], risk_scores)[0]
+    c_uno = concordance_index_ipcw(y_train, y_test, risk_scores)[0]
+    # 2. Time-dependent AUC
+    times = np.percentile(y_test['time'][y_test['event']], [25, 50, 75])
+    auc, mean_auc = cumulative_dynamic_auc(y_train, y_test, risk_scores, times)
+    # 3. Integrated Brier Score on survival probabilities of shape (n_samples, n_times)
+    surv_probs = np.asarray([[fn(t) for t in times] for fn in surv_funcs])
+    ibs = integrated_brier_score(y_train, y_test, surv_probs, times)
+    # Print results
+    print("=" * 50)
+    print("Model Evaluation Results")
+    print("=" * 50)
+    print(f"Harrell's C-index:  {c_harrell:.3f}")
+    print(f"Uno's C-index:      {c_uno:.3f}")
+    print(f"Mean AUC:           {mean_auc:.3f}")
+    print(f"Integrated Brier:   {ibs:.3f}")
+    print("=" * 50)
+    return {
+        'c_harrell': c_harrell,
+        'c_uno': c_uno,
+        'mean_auc': mean_auc,
+        'ibs': ibs,
+        'time_auc': dict(zip(times, auc))
+    }
+# Use the evaluation function
+results = evaluate_survival_model(model, X_train, X_test, y_train, y_test)
+```
+## Choosing the Right Metric
+### Decision Guide
+**Use C-index (Uno's) when:**
+- Primary goal is ranking/discrimination
+- Don't need calibrated probabilities
+- Standard survival analysis setting
+- Most common choice
+**Use Time-dependent AUC when:**
+- Need discrimination at specific time points
+- Clinical decisions at specific horizons
+- Want to understand how performance varies over time
+**Use Brier Score when:**
+- Need calibrated probability estimates
+- Both discrimination AND calibration important
+- Clinical decision-making requiring probabilities
+- Want comprehensive assessment
+**Best Practice**: Report multiple metrics for comprehensive evaluation. At minimum, report:
+- Uno's C-index (discrimination)
+- Integrated Brier Score (discrimination + calibration)
+- Time-dependent AUC at clinically relevant time points