PyPI - deskit - Versions diffs - 0.3.0__tar.gz → 1.0.0__tar.gz - Mend

deskit 0.3.0__tar.gz → 1.0.0__tar.gz

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (32)

{deskit-0.3.0/src/deskit.egg-info → deskit-1.0.0}/PKG-INFO RENAMED

@@ -1,6 +1,6 @@
 Metadata-Version: 2.4
 Name: deskit
-Version: 0.3.0
+Version: 1.0.0
 Summary: A Python library for Dynamic Ensemble Selection
 Author: Tikhon Vodyanov
 License-Expression: MIT
@@ -95,7 +95,7 @@ NumPy (>= 1.21)
 ## Quick start
-Full explanation of the algorithms, syntax, and parameters is available in the [documentation](https://TikaaVo.github.io/deskit/).
+For a more detailed understanding of how to use the library, consult the [documentation](https://TikaaVo.github.io/deskit/).
 ```python
 from deskit.des.knorau  import KNORAU
@@ -150,14 +150,22 @@ weights = router.predict(X_test[i])
 ## Algorithms
-| Method    | Best for | Notes                                                                                                    |
-|-----------|---|----------------------------------------------------------------------------------------------------------|
-| `DEWSU`  | Regression | Softmax over neighbourhood-averaged scores. Temperature controls sharpness.                              |
-| `DEWSI` | Regression | Like DEWS-U but scores are inverse-distance weighted.                                                   |
-| `KNORAU`  | Classification | Vote-count weighting. Each model earns one vote per neighbour it correctly classifies.                   |
-| `KNORAE`  | Classification | Intersection-based. Only models correct on all neighbours survive; falls back to smaller neighbourhoods. |
-| `KNORAIU` | Classification | Like KNORA-U but votes are inverse-distance weighted.                                                    |
-| `OLA`     | Both | Hard selection: only the single best model in the neighbourhood contributes.                             |
+Full explanation of the algorithms, syntax, and parameters is available in the [documentation](https://TikaaVo.github.io/deskit/).
+If you're struggling to decide on which algorithm to use, see the [algorithm selection guide](https://TikaaVo.github.io/deskit/selection).
+| Method     | Best for       | Notes                                                                                                  |
+|------------|----------------|--------------------------------------------------------------------------------------------------------|
+| `DEWS-U`   | Regression     | Softmax over neighborhood-averaged scores. Temperature controls sharpness.                             |
+| `DEWS-I`   | Regression     | Like DEWS-U but scores are inverse-distance weighted.                                                  |
+| `DEWS-T`   | Both           | Like DEWS-I but fits a weighted trend line over neighbor scores.                                       |
+| `DEWS-V`   | Regression     | Like DEWS-U but scores are variance-penalized.                                                                             |
+| `DEWS-IV`  | Regression     | Like DEWS-V but scores are also inverse-distance weighted.                                             |
+| `LWSE-U`   | Both           | Per-sample NNLS weight estimation over the local neighbourhood.                                        |
+| `LWSE-I`   | Both           | Like LWSE-U but rows are inverse-distance weighted.                                                    |
+| `KNORA-U`  | Classification | Each model earns one vote per neighbor it correctly classifies.                  |
+| `KNORA-E`  | Classification | Only models correct on all neighbors survive; falls back to smaller neighborhoods. |
+| `KNORA-IU` | Classification | Like KNORA-U but votes are inverse-distance weighted.                                                  |
+| `OLA`      | Both           | Hard selection: only the single best model in the neighborhood contributes.                            |
 ---
@@ -218,74 +226,76 @@ passed features either need to be run through a feature extractor beforehand, su
 ## Benchmark results
-100-seed benchmark (seeds 0–99) on standard sklearn and OpenML datasets. "Best Single" is the best
+20-seed benchmark (seeds 0–19) on standard sklearn and OpenML datasets. "Best Single" is the best
 individual model selected on the validation set. "Simple Average" is uniform
 equal-weight blending, included as a baseline.
 It is important to consider that these experiments were run with the default hyperparameters, meaning that
 they could vary greatly with different values, and results could improve with tuning.
-For a more detailed benchmark breakdown, see the [documentation](https://TikaaVo.github.io/deskit/).
+For a more detailed benchmark breakdown, see the [benchmark in the documentation](https://TikaaVo.github.io/deskit/benchmark).
 To see the full results, see `results.txt` in the `tests` folder.
-Pool: KNN, Decision Tree, SVR, Ridge, Bayesian Ridge.
 This pool was selected for having variability in architectures while avoiding a single dominant model.
-deskit algorithms tested: OLA, DEWS-U, DEWS-I, KNORA-U, KNORA-E, KNORA-IU.
+deskit algorithms tested: OLA, DEWS-U, DEWS-I, DEWS-T, DEWS-V, DEWS-IV, LWSE-U, LWSE-I, KNORA-U, KNORA-E, KNORA-IU.
 ### Regression (MAE, lower is better)
-% shown as delta vs Best Single. 100-seed mean.
+Pool: KNN, Decision Tree, SVR, Ridge, Bayesian Ridge.
+% shown as delta vs Best Single. 20-seed mean.
-| Dataset                      | Best Single | Simple Avg | deskit best             |
-|------------------------------|-------------|------------|-------------------------|
-| California Housing (sklearn) | 0.3955      | +7.93%     | **−2.68%** (DEWS-I)  |
-| Bike Sharing (OpenML)        | 51.604      | +48.39%    | **−6.25%** (DEWS-I)  |
-| Abalone (OpenML)             | **1.4923**  | +1.29%     | +1.61% (KNORA-IU)       |
-| Diabetes (sklearn)           | **44.986**  | +2.98%     | +0.88% (DEWS-I)      |
-| Concrete Strength (OpenML)   | 5.3934      | +21.30%    | **−2.85%** (KNORA-IU)   |
+| Dataset                      | Best Single | Simple Avg | deskit best               |
+|------------------------------|-------------|------------|---------------------------|
+| California Housing (sklearn) | 0.3956      | +7.99%     | **−2.54%** (DEWS-I)       |
+| Bike Sharing (OpenML)        | 51.678      | +47.77%    | **−6.86%** (DEWS-I)       |
+| Abalone (OpenML)             | **1.4981**  | +1.14%     | +1.47% (KNORA-U/KNORA-IU) |
+| Diabetes (sklearn)           | **44.504**  | +3.18%     | +0.86% (DEWS-IV)          |
+| Concrete Strength (OpenML)   | 5.2686      | +23.66%    | **−5.41%** (LWSE-I)       |
 deskit beats best single and simple averaging on 3/5 regression datasets. This shows how DES can provide a
 strong boost if used on the right dataset, but it might be counterproductive if used blindly.
 KNORA variants are designed for classification, which explains the poor performance
-on regression datasets; However, some exception can occur in certain datasets, either where
-feature space is has hard clusters (like in Concrete Strength) or when the target is discrete
+on regression datasets; However, some exceptions can occur in certain datasets when the target is discrete
 and classification-like (like in Abalone).
+DEWS-I and LWSE-I show the largest improvements on their respective datasets.
 ### Classification (Accuracy, higher is better)
-% shown as delta vs Best Single. 100-seed mean.
+Pool: KNN, Decision Tree, Gaussian NB, SVM-RBF, Logistic Regression.
+% shown as delta vs Best Single. 20-seed mean.
-| Dataset                | Best Single | Simple Avg | deskit best             |
-|------------------------|-------------|------------|-------------------------|
-| HAR (OpenML)           | 98.24%      | −0.32%     | **+0.14%** (DEWS-I)  |
-| Yeast (OpenML)         | 59.19%      | +0.46%     | **+1.48%** (KNORA-IU)   |
-| Image Segment (OpenML) | 93.65%      | +1.70%     | **+2.33%** (KNORA-IU)   |
-| Waveform (OpenML)      | **86.28%**  | −1.04%     | −0.55% (DEWS-I)      |
-| Vowel (OpenML)         | 90.54%      | −1.81%     | **+0.93%** (KNORA-IU)   |
+| Dataset                | Best Single | Simple Avg | deskit best                    |
+|------------------------|-------------|------------|--------------------------------|
+| HAR (OpenML)           | 98.24%      | −0.33%     | **+0.16%** (DEWS-T)            |
+| Yeast (OpenML)         | 58.87%      | +0.77%     | **+1.66%** (KNORA-IU)          |
+| Image Segment (OpenML) | 93.70%      | +1.40%     | **+2.25%** (DEWS-T / DEWS-IV)  |
+| Waveform (OpenML)      | **85.91%**  | −0.98%     | −0.39% (DEWS-T)                |
+| Vowel (OpenML)         | 89.95%      | −2.05%     | **+2.95%** (LWSE-I)            |
-deskit beats or matches best single and simple averaging on 4/5 classification datasets. As seen on regression, DES
-can improve or hurt performance, so it must be used wisely, but if used correctly it can show promising results.
+deskit beats or matches best single and simple averaging on 4/5 classification datasets.
 ### Speed (mean ms fit + predict, 20 seeds, all tested algorithms combined)
-Consider that usually it is recommended to only use one algorithm at a time, this benchmark ran six of them at the
-same time, so with a single one runtime is expected to be about 6x faster. For this benchmark, `preset='balanced'` was used,
+Consider that usually it is recommended to only use one algorithm at a time, this benchmark ran eleven of them at the
+same time, so with a single one runtime is expected to be about 11x faster. For this benchmark, `preset='balanced'` was used,
 so the backend was an ANN algorithm with FAISS IVF.
-| Dataset            | deskit    |
-|--------------------|-----------|
-| California Housing | 159.8 ms  |
-| Bike Sharing       | 130.3 ms  |
-| Abalone            | 32.9 ms   |
-| Diabetes           | 8.2 ms    |
-| Conrete Strength   | 10.8 ms   |
-| HAR                | 352.0 ms  |
-| Yeast              | 18.6 ms   |
-| Image Segment      | 32.4 ms   |
-| Waveform           | 58.7 ms   |
-| Vowel              | 19.6 ms   |
+| Dataset            | deskit (11 algorithms) |
+|--------------------|------------------------|
+| California Housing | 351.0 ms               |
+| Bike Sharing       | 283.5 ms               |
+| Abalone            | 72.9 ms                |
+| Diabetes           | 14.0 ms                |
+| Concrete Strength  | 22.5 ms                |
+| HAR                | 693.1 ms               |
+| Yeast              | 44.7 ms                |
+| Image Segment      | 69.9 ms                |
+| Waveform           | 124.5 ms               |
+| Vowel              | 39.0 ms                |
 deskit caches all model predictions on the validation set at fit time and reads
 from that matrix at inference.

{deskit-0.3.0 → deskit-1.0.0}/README.md RENAMED

@@ -64,7 +64,7 @@ NumPy (>= 1.21)
 ## Quick start
-Full explanation of the algorithms, syntax, and parameters is available in the [documentation](https://TikaaVo.github.io/deskit/).
+For a more detailed understanding of how to use the library, consult the [documentation](https://TikaaVo.github.io/deskit/).
 ```python
 from deskit.des.knorau  import KNORAU
@@ -119,14 +119,22 @@ weights = router.predict(X_test[i])
 ## Algorithms
-| Method    | Best for | Notes                                                                                                    |
-|-----------|---|----------------------------------------------------------------------------------------------------------|
-| `DEWSU`  | Regression | Softmax over neighbourhood-averaged scores. Temperature controls sharpness.                              |
-| `DEWSI` | Regression | Like DEWS-U but scores are inverse-distance weighted.                                                   |
-| `KNORAU`  | Classification | Vote-count weighting. Each model earns one vote per neighbour it correctly classifies.                   |
-| `KNORAE`  | Classification | Intersection-based. Only models correct on all neighbours survive; falls back to smaller neighbourhoods. |
-| `KNORAIU` | Classification | Like KNORA-U but votes are inverse-distance weighted.                                                    |
-| `OLA`     | Both | Hard selection: only the single best model in the neighbourhood contributes.                             |
+Full explanation of the algorithms, syntax, and parameters is available in the [documentation](https://TikaaVo.github.io/deskit/).
+If you're struggling to decide on which algorithm to use, see the [algorithm selection guide](https://TikaaVo.github.io/deskit/selection).
+| Method     | Best for       | Notes                                                                                                  |
+|------------|----------------|--------------------------------------------------------------------------------------------------------|
+| `DEWS-U`   | Regression     | Softmax over neighborhood-averaged scores. Temperature controls sharpness.                             |
+| `DEWS-I`   | Regression     | Like DEWS-U but scores are inverse-distance weighted.                                                  |
+| `DEWS-T`   | Both           | Like DEWS-I but fits a weighted trend line over neighbor scores.                                       |
+| `DEWS-V`   | Regression     | Like DEWS-U but scores are variance-penalized.                                                                             |
+| `DEWS-IV`  | Regression     | Like DEWS-V but scores are also inverse-distance weighted.                                             |
+| `LWSE-U`   | Both           | Per-sample NNLS weight estimation over the local neighbourhood.                                        |
+| `LWSE-I`   | Both           | Like LWSE-U but rows are inverse-distance weighted.                                                    |
+| `KNORA-U`  | Classification | Each model earns one vote per neighbor it correctly classifies.                  |
+| `KNORA-E`  | Classification | Only models correct on all neighbors survive; falls back to smaller neighborhoods. |
+| `KNORA-IU` | Classification | Like KNORA-U but votes are inverse-distance weighted.                                                  |
+| `OLA`      | Both           | Hard selection: only the single best model in the neighborhood contributes.                            |
 ---
@@ -187,74 +195,76 @@ passed features either need to be run through a feature extractor beforehand, su
 ## Benchmark results
-100-seed benchmark (seeds 0–99) on standard sklearn and OpenML datasets. "Best Single" is the best
+20-seed benchmark (seeds 0–19) on standard sklearn and OpenML datasets. "Best Single" is the best
 individual model selected on the validation set. "Simple Average" is uniform
 equal-weight blending, included as a baseline.
 It is important to consider that these experiments were run with the default hyperparameters, meaning that
 they could vary greatly with different values, and results could improve with tuning.
-For a more detailed benchmark breakdown, see the [documentation](https://TikaaVo.github.io/deskit/).
+For a more detailed benchmark breakdown, see the [benchmark in the documentation](https://TikaaVo.github.io/deskit/benchmark).
 To see the full results, see `results.txt` in the `tests` folder.
-Pool: KNN, Decision Tree, SVR, Ridge, Bayesian Ridge.
 This pool was selected for having variability in architectures while avoiding a single dominant model.
-deskit algorithms tested: OLA, DEWS-U, DEWS-I, KNORA-U, KNORA-E, KNORA-IU.
+deskit algorithms tested: OLA, DEWS-U, DEWS-I, DEWS-T, DEWS-V, DEWS-IV, LWSE-U, LWSE-I, KNORA-U, KNORA-E, KNORA-IU.
 ### Regression (MAE, lower is better)
-% shown as delta vs Best Single. 100-seed mean.
+Pool: KNN, Decision Tree, SVR, Ridge, Bayesian Ridge.
+% shown as delta vs Best Single. 20-seed mean.
-| Dataset                      | Best Single | Simple Avg | deskit best             |
-|------------------------------|-------------|------------|-------------------------|
-| California Housing (sklearn) | 0.3955      | +7.93%     | **−2.68%** (DEWS-I)  |
-| Bike Sharing (OpenML)        | 51.604      | +48.39%    | **−6.25%** (DEWS-I)  |
-| Abalone (OpenML)             | **1.4923**  | +1.29%     | +1.61% (KNORA-IU)       |
-| Diabetes (sklearn)           | **44.986**  | +2.98%     | +0.88% (DEWS-I)      |
-| Concrete Strength (OpenML)   | 5.3934      | +21.30%    | **−2.85%** (KNORA-IU)   |
+| Dataset                      | Best Single | Simple Avg | deskit best               |
+|------------------------------|-------------|------------|---------------------------|
+| California Housing (sklearn) | 0.3956      | +7.99%     | **−2.54%** (DEWS-I)       |
+| Bike Sharing (OpenML)        | 51.678      | +47.77%    | **−6.86%** (DEWS-I)       |
+| Abalone (OpenML)             | **1.4981**  | +1.14%     | +1.47% (KNORA-U/KNORA-IU) |
+| Diabetes (sklearn)           | **44.504**  | +3.18%     | +0.86% (DEWS-IV)          |
+| Concrete Strength (OpenML)   | 5.2686      | +23.66%    | **−5.41%** (LWSE-I)       |
 deskit beats best single and simple averaging on 3/5 regression datasets. This shows how DES can provide a
 strong boost if used on the right dataset, but it might be counterproductive if used blindly.
 KNORA variants are designed for classification, which explains the poor performance
-on regression datasets; However, some exception can occur in certain datasets, either where
-feature space is has hard clusters (like in Concrete Strength) or when the target is discrete
+on regression datasets; However, some exceptions can occur in certain datasets when the target is discrete
 and classification-like (like in Abalone).
+DEWS-I and LWSE-I show the largest improvements on their respective datasets.
 ### Classification (Accuracy, higher is better)
-% shown as delta vs Best Single. 100-seed mean.
+Pool: KNN, Decision Tree, Gaussian NB, SVM-RBF, Logistic Regression.
+% shown as delta vs Best Single. 20-seed mean.
-| Dataset                | Best Single | Simple Avg | deskit best             |
-|------------------------|-------------|------------|-------------------------|
-| HAR (OpenML)           | 98.24%      | −0.32%     | **+0.14%** (DEWS-I)  |
-| Yeast (OpenML)         | 59.19%      | +0.46%     | **+1.48%** (KNORA-IU)   |
-| Image Segment (OpenML) | 93.65%      | +1.70%     | **+2.33%** (KNORA-IU)   |
-| Waveform (OpenML)      | **86.28%**  | −1.04%     | −0.55% (DEWS-I)      |
-| Vowel (OpenML)         | 90.54%      | −1.81%     | **+0.93%** (KNORA-IU)   |
+| Dataset                | Best Single | Simple Avg | deskit best                    |
+|------------------------|-------------|------------|--------------------------------|
+| HAR (OpenML)           | 98.24%      | −0.33%     | **+0.16%** (DEWS-T)            |
+| Yeast (OpenML)         | 58.87%      | +0.77%     | **+1.66%** (KNORA-IU)          |
+| Image Segment (OpenML) | 93.70%      | +1.40%     | **+2.25%** (DEWS-T / DEWS-IV)  |
+| Waveform (OpenML)      | **85.91%**  | −0.98%     | −0.39% (DEWS-T)                |
+| Vowel (OpenML)         | 89.95%      | −2.05%     | **+2.95%** (LWSE-I)            |
-deskit beats or matches best single and simple averaging on 4/5 classification datasets. As seen on regression, DES
-can improve or hurt performance, so it must be used wisely, but if used correctly it can show promising results.
+deskit beats or matches best single and simple averaging on 4/5 classification datasets.
 ### Speed (mean ms fit + predict, 20 seeds, all tested algorithms combined)
-Consider that usually it is recommended to only use one algorithm at a time, this benchmark ran six of them at the
-same time, so with a single one runtime is expected to be about 6x faster. For this benchmark, `preset='balanced'` was used,
+Consider that usually it is recommended to only use one algorithm at a time, this benchmark ran eleven of them at the
+same time, so with a single one runtime is expected to be about 11x faster. For this benchmark, `preset='balanced'` was used,
 so the backend was an ANN algorithm with FAISS IVF.
-| Dataset            | deskit    |
-|--------------------|-----------|
-| California Housing | 159.8 ms  |
-| Bike Sharing       | 130.3 ms  |
-| Abalone            | 32.9 ms   |
-| Diabetes           | 8.2 ms    |
-| Conrete Strength   | 10.8 ms   |
-| HAR                | 352.0 ms  |
-| Yeast              | 18.6 ms   |
-| Image Segment      | 32.4 ms   |
-| Waveform           | 58.7 ms   |
-| Vowel              | 19.6 ms   |
+| Dataset            | deskit (11 algorithms) |
+|--------------------|------------------------|
+| California Housing | 351.0 ms               |
+| Bike Sharing       | 283.5 ms               |
+| Abalone            | 72.9 ms                |
+| Diabetes           | 14.0 ms                |
+| Concrete Strength  | 22.5 ms                |
+| HAR                | 693.1 ms               |
+| Yeast              | 44.7 ms                |
+| Image Segment      | 69.9 ms                |
+| Waveform           | 124.5 ms               |
+| Vowel              | 39.0 ms                |
 deskit caches all model predictions on the validation set at fit time and reads
 from that matrix at inference.
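
The README's quick start calls `router.predict(X_test[i])` and receives per-model weights, which the new `DEWSIV.predict` docstring describes as a `{model_name: weight}` dict. The sketch below shows one plausible way to turn those weights into a blended regression prediction. The `blend` helper, the model names, and the numbers are hypothetical and are not part of deskit.

```python
def blend(weights, predictions):
    """Weighted sum of per-model predictions for a single sample.

    weights     : {model_name: weight}, assumed to sum to 1
    predictions : {model_name: prediction for the same sample}
    """
    return sum(weights[name] * predictions[name] for name in weights)


# Hypothetical router output and base-model predictions for one test point.
weights = {"knn": 0.7, "ridge": 0.3}
predictions = {"knn": 2.0, "ridge": 4.0}

y_hat = blend(weights, predictions)  # 0.7 * 2.0 + 0.3 * 4.0
```

For classification, the same weights would more naturally be applied to class-probability vectors than to hard labels; that choice is left to the caller here.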

{deskit-0.3.0 → deskit-1.0.0}/pyproject.toml RENAMED

@@ -4,7 +4,7 @@ build-backend = "setuptools.build_meta"
 [project]
 name = "deskit"
-version = "0.3.0"
+version = "1.0.0"
 description = "A Python library for Dynamic Ensemble Selection"
 readme = "README.md"
 license = "MIT"

deskit-1.0.0/src/deskit/des/dewsiv.py ADDED

@@ -0,0 +1,195 @@
+"""
+DEWS-IV: Distance-weighted Ensemble with Softmax — Inverse-distance + Variance-penalised.
+"""
+from deskit.base.knnbase import KNNBase
+from deskit._config import make_finder, resolve_metric, prep_fit_inputs
+from deskit.utils import to_numpy
+import numpy as np
+_SIGNED_METRICS = {'mae', 'mse'}
+def _signed_residual(y_true, y_pred):
+    return float(y_true) - float(y_pred)
+class DEWSIV(KNNBase):
+    """
+    DEWS-IV: Distance-weighted Ensemble with Softmax — Inverse-distance + Variance-penalised.
+    Combines DEWS-I and DEWS-V. The mean score is inverse-distance weighted
+    (closer neighbours contribute more), and the variance penalty is also
+    computed with the same inverse-distance weights, so erratic behaviour
+    close to the test point is penalised more heavily than erratic behaviour
+    among distant neighbours.
+    Both scores and variance are normalised to [0, 1] within each neighbourhood
+    before the penalty is applied, making the adjustment dimensionless and
+    consistent regardless of metric scale or task type.
+    For MAE and MSE, variance is computed from signed residuals
+    (y_true - y_pred) rather than raw metric values, so that a model
+    oscillating between equal positive and negative errors is correctly
+    identified as inconsistent. The mean score used for routing still comes
+    from the standard metric (MAE/MSE), only the variance term uses signed
+    residuals.
+    For all other metrics, variance is computed directly from the score matrix.
+    Parameters
+    ----------
+    task : str
+        'classification' or 'regression'.
+    metric : str or callable
+        Scoring function. 'mae' or 'mse' activate signed-residual variance;
+        all other metrics use the score matrix directly for variance.
+    mode : str
+        'max' if higher scores are better, 'min' if lower.
+    k : int
+        Neighbourhood size. Default: 10.
+    threshold : float
+        Competence gate. After per-neighbourhood normalisation (best=1.0,
+        worst=0.0), models below this fraction are excluded from softmax.
+        0.0 disables the gate; 1.0 reduces to OLA behaviour. Default: 0.5.
+    temperature : float, optional
+        Softmax sharpness. Lower = sharper routing toward the local best model.
+        Defaults to 0.1 for min-metrics, 1.0 otherwise.
+    preset : str
+        Neighbour search preset. Default: 'balanced'. See list_presets().
+    """
+    def __init__(self, task, metric='mae', mode='min', k=10,
+                 threshold=0.5, temperature=None, preset='balanced', **kwargs):
+        metric_name, metric_fn = resolve_metric(metric)
+        finder = make_finder(preset, k, **kwargs)
+        self._use_signed  = metric_name in _SIGNED_METRICS
+        self._metric_name = metric_name
+        super().__init__(metric=metric_fn, mode=mode, neighbor_finder=finder)
+        self.task         = task
+        self.threshold    = threshold
+        self._temperature = temperature
+        self._var_matrix  = None   # (n_val, n_models) signed residuals, MAE/MSE only
+    def fit(self, features, y, preds_dict):
+        """
+        Fit the routing model on validation data.
+        Parameters
+        ----------
+        features : array-like, shape (n_val, n_features)
+            Validation features. Must not overlap with train or test data.
+        y : array-like, shape (n_val,)
+            Validation ground-truth labels or values.
+        preds_dict : dict[str, array-like]
+            Validation predictions keyed by model name.
+        """
+        features, y, preds_dict = prep_fit_inputs(
+            features, y, preds_dict, self._metric_name
+        )
+        super().fit(features, y, preds_dict)
+        # Build signed residual matrix for variance computation (MAE/MSE only).
+        if self._use_signed:
+            n_val    = len(y)
+            n_models = len(self.models)
+            self._var_matrix = np.zeros((n_val, n_models))
+            for j, name in enumerate(self.models):
+                preds = np.asarray(preds_dict[name])
+                self._var_matrix[:, j] = np.vectorize(_signed_residual)(y, preds)
+    def predict(self, x, temperature=None, threshold=None):
+        """
+        Return per-sample model weights.
+        Parameters
+        ----------
+        x : array-like, shape (n_features,) or (n_samples, n_features)
+        temperature : float, optional
+            Overrides the instance temperature for this call.
+        threshold : float, optional
+            Overrides the instance threshold for this call.
+        Returns
+        -------
+        dict or list of dict
+            Single sample: {model_name: weight}. Batch: list of such dicts.
+        """
+        t  = temperature if temperature is not None else (
+             self._temperature if self._temperature is not None else
+             (0.1 if self.mode == 'min' else 1.0))
+        th = threshold if threshold is not None else self.threshold
+        x          = np.atleast_2d(to_numpy(x))
+        batch_size = x.shape[0]
+        distances, indices = self.model.kneighbors(x)     # both (batch, k)
+        # Inverse-distance weights — same as DEWS-I.
+        inv_dist   = 1.0 / np.maximum(distances, 1e-8)            # (batch, k)
+        inv_dist_w = inv_dist / inv_dist.sum(axis=1, keepdims=True)  # normalised, (batch, k)
+        # Inverse-distance weighted mean of each model's scores over K neighbours.
+        neighbor_scores = self.matrix[indices]                     # (batch, k, n_models)
+        avg_scores = (neighbor_scores * inv_dist_w[:, :, np.newaxis]).sum(axis=1)  # (batch, n_models)
+        # Select source matrix for variance computation.
+        if self._use_signed:
+            var_source = self._var_matrix[indices]                 # (batch, k, n_models)
+        else:
+            var_source = neighbor_scores
+        # Inverse-distance weighted variance: σ²_w = Σ w_i * (x_i - μ_w)²
+        # For signed residuals the mean is computed from var_source, not avg_scores,
+        # so that the variance is internally consistent with its own mean.
+        w = inv_dist_w[:, :, np.newaxis]                           # (batch, k, 1)
+        var_mean   = (var_source * w).sum(axis=1)                  # (batch, n_models)
+        residuals  = var_source - var_mean[:, np.newaxis, :]       # (batch, k, n_models)
+        local_var  = (w * residuals ** 2).sum(axis=1)              # (batch, n_models)
+        # Normalize scores to [0, 1] before applying variance penalty so that
+        # the penalty is dimensionless and consistent across metrics and scales.
+        local_min   = avg_scores.min(axis=1, keepdims=True)
+        local_max   = avg_scores.max(axis=1, keepdims=True)
+        local_range = local_max - local_min
+        norm_scores = (avg_scores - local_min) / np.where(local_range > 0, local_range, 1.0)
+        # Normalize variance to [0, 1] across models within each sample so the
+        # penalty magnitude is also scale-independent.
+        var_min   = local_var.min(axis=1, keepdims=True)
+        var_max   = local_var.max(axis=1, keepdims=True)
+        var_range = var_max - var_min
+        norm_var  = (local_var - var_min) / np.where(var_range > 0, var_range, 1.0)
+        # Penalise inconsistent models: divide normalised score by (1 + normalised variance).
+        norm_scores = norm_scores / (1.0 + norm_var)
+        # Re-normalise after penalty so the gate threshold remains meaningful.
+        local_min   = norm_scores.min(axis=1, keepdims=True)
+        local_max   = norm_scores.max(axis=1, keepdims=True)
+        local_range = local_max - local_min
+        norm_scores = (norm_scores - local_min) / np.where(local_range > 0, local_range, 1.0)
+        # Zero out models below threshold.
+        if th > 0:
+            gate        = norm_scores >= th
+            any_pass    = gate.any(axis=1, keepdims=True)
+            gate        = np.where(any_pass, gate, norm_scores == 1.0)
+            norm_scores = norm_scores * gate
+        # Softmax.
+        max_scores = norm_scores.max(axis=1, keepdims=True)
+        exp_scores = np.exp((norm_scores - max_scores) / t)
+        if th > 0:
+            exp_scores = exp_scores * gate
+        total   = exp_scores.sum(axis=1, keepdims=True)
+        weights = np.where(total > 0,
+                           exp_scores / np.where(total > 0, total, 1.0),
+                           np.full_like(exp_scores, 1.0 / len(self.models)))
+        if batch_size == 1:
+            return dict(zip(self.models, weights[0]))
+        return [dict(zip(self.models, w)) for w in weights]
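
The core of `DEWSIV.predict` is an inverse-distance weighted mean and variance per model, followed by a min-max normalisation and a division by `1 + normalised variance`. The standalone sketch below reproduces that arithmetic for a single neighbourhood with plain NumPy. It is an illustration only: the helper names and toy values are assumptions, it assumes higher scores are better, and it omits the threshold gate and softmax.

```python
import numpy as np


def weighted_mean_and_variance(values, distances, eps=1e-8):
    """Inverse-distance weighted mean and variance over k neighbours.

    values    : (k, n_models) per-neighbour scores for each model
    distances : (k,) distance from the query point to each neighbour
    """
    inv = 1.0 / np.maximum(distances, eps)        # closer neighbours weigh more
    w = (inv / inv.sum())[:, np.newaxis]          # (k, 1), weights sum to 1
    mean = (values * w).sum(axis=0)               # (n_models,)
    var = (w * (values - mean) ** 2).sum(axis=0)  # weighted variance around that mean
    return mean, var


def min_max(v):
    """Scale a vector to [0, 1]; a constant vector maps to all zeros."""
    rng = v.max() - v.min()
    return (v - v.min()) / (rng if rng > 0 else 1.0)


# Toy neighbourhood: 3 neighbours, 2 models (hypothetical scores).
# Model 0 is steady; model 1 is erratic.
scores = np.array([[0.9, 0.8],
                   [0.9, 0.2],
                   [0.9, 1.0]])
distances = np.array([0.5, 1.0, 2.0])

mean, var = weighted_mean_and_variance(scores, distances)

# Penalise inconsistent models: normalised score / (1 + normalised variance).
penalised = min_max(mean) / (1.0 + min_max(var))
```

In this toy case the steady model keeps the top normalised score while the erratic one is pushed down, which is the behaviour the class docstring describes.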
