deskit 0.1.0__tar.gz → 0.3.0__tar.gz

Files changed (27)
  1. {deskit-0.1.0/src/deskit.egg-info → deskit-0.3.0}/PKG-INFO +53 -45
  2. {deskit-0.1.0 → deskit-0.3.0}/README.md +53 -45
  3. {deskit-0.1.0 → deskit-0.3.0}/pyproject.toml +1 -1
  4. {deskit-0.1.0 → deskit-0.3.0}/src/deskit/__init__.py +4 -4
  5. {deskit-0.1.0 → deskit-0.3.0}/src/deskit/des/__init__.py +2 -2
  6. deskit-0.3.0/src/deskit/des/dewsi.py +130 -0
  7. deskit-0.1.0/src/deskit/des/knndws.py → deskit-0.3.0/src/deskit/des/dewsu.py +3 -3
  8. {deskit-0.1.0 → deskit-0.3.0}/src/deskit/router.py +9 -9
  9. {deskit-0.1.0 → deskit-0.3.0/src/deskit.egg-info}/PKG-INFO +53 -45
  10. {deskit-0.1.0 → deskit-0.3.0}/src/deskit.egg-info/SOURCES.txt +2 -1
  11. {deskit-0.1.0 → deskit-0.3.0}/LICENSE +0 -0
  12. {deskit-0.1.0 → deskit-0.3.0}/setup.cfg +0 -0
  13. {deskit-0.1.0 → deskit-0.3.0}/src/deskit/_config.py +0 -0
  14. {deskit-0.1.0 → deskit-0.3.0}/src/deskit/analysis.py +0 -0
  15. {deskit-0.1.0 → deskit-0.3.0}/src/deskit/base/__init__.py +0 -0
  16. {deskit-0.1.0 → deskit-0.3.0}/src/deskit/base/base.py +0 -0
  17. {deskit-0.1.0 → deskit-0.3.0}/src/deskit/base/knnbase.py +0 -0
  18. {deskit-0.1.0 → deskit-0.3.0}/src/deskit/des/knorae.py +0 -0
  19. {deskit-0.1.0 → deskit-0.3.0}/src/deskit/des/knoraiu.py +0 -0
  20. {deskit-0.1.0 → deskit-0.3.0}/src/deskit/des/knorau.py +0 -0
  21. {deskit-0.1.0 → deskit-0.3.0}/src/deskit/des/ola.py +0 -0
  22. {deskit-0.1.0 → deskit-0.3.0}/src/deskit/metrics.py +0 -0
  23. {deskit-0.1.0 → deskit-0.3.0}/src/deskit/neighbors.py +0 -0
  24. {deskit-0.1.0 → deskit-0.3.0}/src/deskit/utils.py +0 -0
  25. {deskit-0.1.0 → deskit-0.3.0}/src/deskit.egg-info/dependency_links.txt +0 -0
  26. {deskit-0.1.0 → deskit-0.3.0}/src/deskit.egg-info/requires.txt +0 -0
  27. {deskit-0.1.0 → deskit-0.3.0}/src/deskit.egg-info/top_level.txt +0 -0
@@ -1,6 +1,6 @@
1
1
  Metadata-Version: 2.4
2
2
  Name: deskit
3
- Version: 0.1.0
3
+ Version: 0.3.0
4
4
  Summary: A Python library for Dynamic Ensemble Selection
5
5
  Author: Tikhon Vodyanov
6
6
  License-Expression: MIT
@@ -31,18 +31,20 @@ Dynamic: license-file
31
31
 
32
32
  # deskit
33
33
 
34
- [deskit](https://TikaaVo.github.io/deskit/) is a flexible, light, and easy-to-use ensembling library that implements
34
+ deskit is a flexible, lightweight, and easy-to-use ensembling library that implements
35
35
  Dynamic Ensemble Selection (DES) algorithms for ensembling multiple ML models
36
- on a singular dataset.
36
+ on a given dataset.
37
37
 
38
38
  The library works entirely with data, taking as input a validation dataset
39
- along with pre-computed predictions and outputting a dictionary of weights
39
+ along with precomputed predictions and outputting a dictionary of weights
40
40
  per model. This means that it can be used with any library or model without
41
41
  requiring any wrappers, including custom models, popular ML libraries, and APIs.
42
42
 
43
- deskit contains multiple different DES algorithms, and it works with both classification
43
+ deskit includes several DES algorithms, and it works with both classification
44
44
  and regression.
45
45
 
46
+ See the full documentation [here](https://TikaaVo.github.io/deskit/).
47
+
46
48
  # Dynamic Ensemble Selection
47
49
 
48
50
  Ensemble learning in machine learning refers to when multiple models trained on a
@@ -55,7 +57,7 @@ concept that there are regions of feature space where certain models perform par
55
57
  so every base model can be an expert in a different region.
56
58
  Only the most competent, or an ensemble of the most competent models is selected for the prediction.
57
59
 
58
- Through empirical studies, DES has been shown to perform best with small-sized, imbalanced, or
60
+ Through empirical studies, DES has been shown to perform best on small-sized, imbalanced, or
59
61
  heterogeneous datasets, as well as non-stationary data (concept drift), models that haven't perfected a dataset,
60
62
  and when used on an ensemble of models with differing architectures and perspectives.
61
63
 
@@ -148,13 +150,14 @@ weights = router.predict(X_test[i])
148
150
 
149
151
  ## Algorithms
150
152
 
151
- | Method | Best for | Notes |
152
- |---|---|---|
153
- | `KNNDWS` | Regression | Softmax over neighbourhood-averaged scores. Temperature controls sharpness. |
154
- | `KNORAU` | Classification | Vote-count weighting. Each model earns one vote per neighbour it correctly classifies. |
155
- | `KNORAE` | Classification | Intersection-based. Only models correct on all neighbours survive; falls back to smaller neighbourhoods. |
156
- | `KNORAIU` | Classification | Like KNORA-U but votes are inverse-distance weighted. |
157
- | `OLA` | Both | Hard selection: only the single best model in the neighbourhood contributes. |
153
+ | Method | Best for | Notes |
154
+ |-----------|---|----------------------------------------------------------------------------------------------------------|
155
+ | `DEWSU` | Regression | Softmax over neighbourhood-averaged scores. Temperature controls sharpness. |
156
+ | `DEWSI` | Regression | Like DEWS-U but scores are inverse-distance weighted. |
157
+ | `KNORAU` | Classification | Vote-count weighting. Each model earns one vote per neighbour it correctly classifies. |
158
+ | `KNORAE` | Classification | Intersection-based. Only models correct on all neighbours survive; falls back to smaller neighbourhoods. |
159
+ | `KNORAIU` | Classification | Like KNORA-U but votes are inverse-distance weighted. |
160
+ | `OLA` | Both | Hard selection: only the single best model in the neighbourhood contributes. |
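The vote-based rows in the table above can be illustrated with a small NumPy sketch. This is not deskit's implementation: the function names, the boolean `correct` matrix, and the uniform fallback when no model earns a vote are assumptions made for illustration only.

```python
import numpy as np


def knora_u_weights(correct):
    """Vote-count weighting (hypothetical helper, not deskit's API).

    correct: (k, n_models) boolean array, True where a model classified
    that neighbour correctly. Each correct neighbour earns one vote.
    """
    votes = correct.sum(axis=0).astype(float)
    total = votes.sum()
    if total == 0:  # assumed fallback: no votes at all, blend uniformly
        return np.full(correct.shape[1], 1.0 / correct.shape[1])
    return votes / total


def knora_iu_weights(correct, distances, eps=1e-8):
    """Same idea, but each vote is scaled by 1 / distance to the neighbour."""
    inv = 1.0 / np.maximum(distances, eps)
    votes = (correct * inv[:, None]).sum(axis=0)
    total = votes.sum()
    if total == 0:
        return np.full(correct.shape[1], 1.0 / correct.shape[1])
    return votes / total


# Three neighbours, two models; model 0 is right on the closest neighbour.
correct = np.array([[True, False],
                    [True, True],
                    [False, True]])
distances = np.array([0.1, 1.0, 2.0])

w_u = knora_u_weights(correct)              # both models earn two votes
w_iu = knora_iu_weights(correct, distances)  # the close neighbour dominates
```

With plain vote counting the two models tie, while inverse-distance weighting favours the model that is correct on the nearest neighbour.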
158
161
 
159
162
  ---
160
163
 
@@ -201,16 +204,21 @@ def pinball(y_true, y_pred, alpha=0.9):
201
204
  e = y_true - y_pred
202
205
  return alpha * e if e >= 0 else (alpha - 1) * e
203
206
 
204
- router = KNNDWS(task="regression", metric=pinball, mode="min", k=20)
207
+ router = DEWSU(task="regression", metric=pinball, mode="min", k=20)
205
208
  ```
206
209
 
207
210
  Built-in metric strings: `accuracy`, `mae`, `mse`, `rmse`, `log_loss`, `prob_correct`.
208
211
 
209
212
  ---
210
213
 
214
+ ## Data types
215
+
216
+ deskit can be used with non-tabular data such as images and time series. In that case, the
217
+ features must first be run through a feature extractor, such as a CNN backbone for images.
218
+
211
219
  ## Benchmark results
212
220
 
213
- 20-seed benchmark (seeds 0–19) on standard sklearn and OpenML datasets. "Best Single" is the best
221
+ 100-seed benchmark (seeds 0–99) on standard sklearn and OpenML datasets. "Best Single" is the best
214
222
  individual model selected on the validation set. "Simple Average" is uniform
215
223
  equal-weight blending, included as a baseline.
216
224
 
@@ -223,19 +231,19 @@ Pool: KNN, Decision Tree, SVR, Ridge, Bayesian Ridge.
223
231
 
224
232
  This pool was selected for having variability in architectures while avoiding a single dominant model.
225
233
 
226
- deskit algorithms tested: OLA, KNN-DWS, KNORA-U, KNORA-E, KNORA-IU.
234
+ deskit algorithms tested: OLA, DEWS-U, DEWS-I, KNORA-U, KNORA-E, KNORA-IU.
227
235
 
228
236
  ### Regression (MAE, lower is better)
229
237
 
230
- % shown as delta vs Best Single. 10-seed mean.
238
+ % shown as delta vs Best Single. 100-seed mean.
231
239
 
232
- | Dataset | Best Single | Simple Avg | deskit best |
233
- |------------------------------|-----------|---|-----------------------|
234
- | California Housing (sklearn) | 0.3956 | +7.99% | **-2.24%** (KNN-DWS) |
235
- | Bike Sharing (OpenML) | 51.6779 | +47.77% | **-5.34%** (KNN-DWS) |
236
- | Abalone (OpenML) | **1.4981** | +1.14% | +1.47% (KNORA-U) |
237
- | Diabetes (sklearn) | **44.5042** | +3.18% | +1.17% (KNN-DWS) |
238
- | Conrete Strength (OpenML) | 5.2686 | +23.66% | **-1.05%** (KNORA-IU) |
240
+ | Dataset | Best Single | Simple Avg | deskit best |
241
+ |------------------------------|-------------|------------|-------------------------|
242
+ | California Housing (sklearn) | 0.3955 | +7.93% | **−2.68%** (DEWS-I) |
243
+ | Bike Sharing (OpenML) | 51.604 | +48.39% | **−6.25%** (DEWS-I) |
244
+ | Abalone (OpenML) | **1.4923** | +1.29% | +1.61% (KNORA-IU) |
245
+ | Diabetes (sklearn) | **44.986** | +2.98% | +0.88% (DEWS-I) |
246
+ | Concrete Strength (OpenML) | 5.3934 | +21.30% | **−2.85%** (KNORA-IU) |
239
247
 
240
248
  deskit beats best single and simple averaging on 3/5 regression datasets. This shows how DES can provide a
241
249
  strong boost if used on the right dataset, but it might be counterproductive if used blindly.
@@ -247,37 +255,37 @@ and classification-like (like in Abalone).
247
255
 
248
256
  ### Classification (Accuracy, higher is better)
249
257
 
250
- % shown as delta vs Best Single. 10-seed mean.
258
+ % shown as delta vs Best Single. 100-seed mean.
251
259
 
252
- | Dataset | Best Single | Simple Avg | deskit best |
253
- |------------------------|-------------|--------|-----------------------|
254
- | HAR (OpenML) | 98.24% | -0.33% | **+0.14%** (KNN-DWS) |
255
- | Yeast (OpenML) | 58.87% | +0.77% | **+1.66%** (KNORA-IU) |
256
- | Image Segment (OpenML) | 93.70% | +1.40% | **+2.09%** (KNORA-IU) |
257
- | Waveform (OpenML) | 89.95% | -2.05% | **+0.93%** (KNORA-E) |
258
- | Vowel (OpenML) | **85.91%** | -0.98% | -0.40% (KNN-DWS) |
260
+ | Dataset | Best Single | Simple Avg | deskit best |
261
+ |------------------------|-------------|------------|-------------------------|
262
+ | HAR (OpenML) | 98.24% | 0.32% | **+0.14%** (DEWS-I) |
263
+ | Yeast (OpenML) | 59.19% | +0.46% | **+1.48%** (KNORA-IU) |
264
+ | Image Segment (OpenML) | 93.65% | +1.70% | **+2.33%** (KNORA-IU) |
265
+ | Waveform (OpenML) | **86.28%** | −1.04% | 0.55% (DEWS-I) |
266
+ | Vowel (OpenML) | 90.54% | −1.81% | **+0.93%** (KNORA-IU) |
259
267
 
260
268
  deskit beats or matches best single and simple averaging on 4/5 classification datasets. As seen on regression, DES
261
269
  can improve or hurt performance, so it must be used wisely, but if used correctly it can show promising results.
262
270
 
263
271
  ### Speed (mean ms fit + predict, 20 seeds, all tested algorithms combined)
264
272
 
265
- Consider that usually it is recommended to only use one algorithm at a time, this benchmark ran five of them at the
266
- same time, so with a single one runtime is expected to be about 5x faster. For this benchmark, `preset='balanced'` was used,
273
+ Normally only one algorithm is used at a time, but this benchmark ran six of them together,
274
+ so runtime with a single algorithm is expected to be roughly 6x lower. For this benchmark, `preset='balanced'` was used,
267
275
  so the backend was an ANN algorithm with FAISS IVF.
268
276
 
269
277
  | Dataset | deskit |
270
- |--------------------|----------|
271
- | California Housing | 136.6 ms |
272
- | Bike Sharing | 115.5 ms |
273
- | Abalone | 28.5 ms |
274
- | Diabetes | 8.1 ms |
275
- | Conrete Strength | 9.4 ms |
276
- | HAR | 297.5 ms |
277
- | Yeast | 16.3 ms |
278
- | Image Segment | 27.2 ms |
279
- | Waveform | 48.9 ms |
280
- | Vowel | 16.5 ms |
278
+ |--------------------|-----------|
279
+ | California Housing | 159.8 ms |
280
+ | Bike Sharing | 130.3 ms |
281
+ | Abalone | 32.9 ms |
282
+ | Diabetes | 8.2 ms |
283
+ | Concrete Strength | 10.8 ms |
284
+ | HAR | 352.0 ms |
285
+ | Yeast | 18.6 ms |
286
+ | Image Segment | 32.4 ms |
287
+ | Waveform | 58.7 ms |
288
+ | Vowel | 19.6 ms |
281
289
 
282
290
  deskit caches all model predictions on the validation set at fit time and reads
283
291
  from that matrix at inference.
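The caching described above can be pictured as a per-sample score matrix built once and indexed by neighbour positions at query time. The sketch below is an illustration under assumptions, not deskit's code: `build_score_matrix`, the toy data, and the hard-coded neighbour indices are all hypothetical.

```python
import numpy as np


def build_score_matrix(y_val, preds_dict, metric):
    """Score every model on every validation sample once (the 'fit' step).

    Returns the model names and an (n_val, n_models) matrix of scores.
    """
    names = list(preds_dict)
    matrix = np.column_stack([
        [metric(t, p) for t, p in zip(y_val, preds_dict[name])]
        for name in names
    ])
    return names, matrix


def abs_error(y_true, y_pred):
    return abs(y_true - y_pred)


y_val = [1.0, 2.0, 3.0]
preds = {"a": [1.0, 2.5, 3.0], "b": [0.0, 2.0, 5.0]}
names, matrix = build_score_matrix(y_val, preds, abs_error)

# At inference, a neighbour search would return validation indices;
# here they are hard-coded. No model is called again, only rows are read.
neighbour_idx = np.array([0, 2])
local_error = matrix[neighbour_idx].mean(axis=0)  # mean error per model
```

Because the lookup is a plain array index, the per-query cost depends on the neighbour search and on `k`, not on how expensive the base models are.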
@@ -1,17 +1,19 @@
1
1
  # deskit
2
2
 
3
- [deskit](https://TikaaVo.github.io/deskit/) is a flexible, light, and easy-to-use ensembling library that implements
3
+ deskit is a flexible, lightweight, and easy-to-use ensembling library that implements
4
4
  Dynamic Ensemble Selection (DES) algorithms for ensembling multiple ML models
5
- on a singular dataset.
5
+ on a given dataset.
6
6
 
7
7
  The library works entirely with data, taking as input a validation dataset
8
- along with pre-computed predictions and outputting a dictionary of weights
8
+ along with precomputed predictions and outputting a dictionary of weights
9
9
  per model. This means that it can be used with any library or model without
10
10
  requiring any wrappers, including custom models, popular ML libraries, and APIs.
11
11
 
12
- deskit contains multiple different DES algorithms, and it works with both classification
12
+ deskit includes several DES algorithms, and it works with both classification
13
13
  and regression.
14
14
 
15
+ See the full documentation [here](https://TikaaVo.github.io/deskit/).
16
+
15
17
  # Dynamic Ensemble Selection
16
18
 
17
19
  Ensemble learning in machine learning refers to when multiple models trained on a
@@ -24,7 +26,7 @@ concept that there are regions of feature space where certain models perform par
24
26
  so every base model can be an expert in a different region.
25
27
  Only the most competent, or an ensemble of the most competent models is selected for the prediction.
26
28
 
27
- Through empirical studies, DES has been shown to perform best with small-sized, imbalanced, or
29
+ Through empirical studies, DES has been shown to perform best on small-sized, imbalanced, or
28
30
  heterogeneous datasets, as well as non-stationary data (concept drift), models that haven't perfected a dataset,
29
31
  and when used on an ensemble of models with differing architectures and perspectives.
30
32
 
@@ -117,13 +119,14 @@ weights = router.predict(X_test[i])
117
119
 
118
120
  ## Algorithms
119
121
 
120
- | Method | Best for | Notes |
121
- |---|---|---|
122
- | `KNNDWS` | Regression | Softmax over neighbourhood-averaged scores. Temperature controls sharpness. |
123
- | `KNORAU` | Classification | Vote-count weighting. Each model earns one vote per neighbour it correctly classifies. |
124
- | `KNORAE` | Classification | Intersection-based. Only models correct on all neighbours survive; falls back to smaller neighbourhoods. |
125
- | `KNORAIU` | Classification | Like KNORA-U but votes are inverse-distance weighted. |
126
- | `OLA` | Both | Hard selection: only the single best model in the neighbourhood contributes. |
122
+ | Method | Best for | Notes |
123
+ |-----------|---|----------------------------------------------------------------------------------------------------------|
124
+ | `DEWSU` | Regression | Softmax over neighbourhood-averaged scores. Temperature controls sharpness. |
125
+ | `DEWSI` | Regression | Like DEWS-U but scores are inverse-distance weighted. |
126
+ | `KNORAU` | Classification | Vote-count weighting. Each model earns one vote per neighbour it correctly classifies. |
127
+ | `KNORAE` | Classification | Intersection-based. Only models correct on all neighbours survive; falls back to smaller neighbourhoods. |
128
+ | `KNORAIU` | Classification | Like KNORA-U but votes are inverse-distance weighted. |
129
+ | `OLA` | Both | Hard selection: only the single best model in the neighbourhood contributes. |
127
130
 
128
131
  ---
129
132
 
@@ -170,16 +173,21 @@ def pinball(y_true, y_pred, alpha=0.9):
170
173
  e = y_true - y_pred
171
174
  return alpha * e if e >= 0 else (alpha - 1) * e
172
175
 
173
- router = KNNDWS(task="regression", metric=pinball, mode="min", k=20)
176
+ router = DEWSU(task="regression", metric=pinball, mode="min", k=20)
174
177
  ```
175
178
 
176
179
  Built-in metric strings: `accuracy`, `mae`, `mse`, `rmse`, `log_loss`, `prob_correct`.
177
180
 
178
181
  ---
179
182
 
183
+ ## Data types
184
+
185
+ deskit can be used with non-tabular data such as images and time series. In that case, the
186
+ features must first be run through a feature extractor, such as a CNN backbone for images.
187
+
180
188
  ## Benchmark results
181
189
 
182
- 20-seed benchmark (seeds 0–19) on standard sklearn and OpenML datasets. "Best Single" is the best
190
+ 100-seed benchmark (seeds 0–99) on standard sklearn and OpenML datasets. "Best Single" is the best
183
191
  individual model selected on the validation set. "Simple Average" is uniform
184
192
  equal-weight blending, included as a baseline.
185
193
 
@@ -192,19 +200,19 @@ Pool: KNN, Decision Tree, SVR, Ridge, Bayesian Ridge.
192
200
 
193
201
  This pool was selected for having variability in architectures while avoiding a single dominant model.
194
202
 
195
- deskit algorithms tested: OLA, KNN-DWS, KNORA-U, KNORA-E, KNORA-IU.
203
+ deskit algorithms tested: OLA, DEWS-U, DEWS-I, KNORA-U, KNORA-E, KNORA-IU.
196
204
 
197
205
  ### Regression (MAE, lower is better)
198
206
 
199
- % shown as delta vs Best Single. 10-seed mean.
207
+ % shown as delta vs Best Single. 100-seed mean.
200
208
 
201
- | Dataset | Best Single | Simple Avg | deskit best |
202
- |------------------------------|-----------|---|-----------------------|
203
- | California Housing (sklearn) | 0.3956 | +7.99% | **-2.24%** (KNN-DWS) |
204
- | Bike Sharing (OpenML) | 51.6779 | +47.77% | **-5.34%** (KNN-DWS) |
205
- | Abalone (OpenML) | **1.4981** | +1.14% | +1.47% (KNORA-U) |
206
- | Diabetes (sklearn) | **44.5042** | +3.18% | +1.17% (KNN-DWS) |
207
- | Conrete Strength (OpenML) | 5.2686 | +23.66% | **-1.05%** (KNORA-IU) |
209
+ | Dataset | Best Single | Simple Avg | deskit best |
210
+ |------------------------------|-------------|------------|-------------------------|
211
+ | California Housing (sklearn) | 0.3955 | +7.93% | **−2.68%** (DEWS-I) |
212
+ | Bike Sharing (OpenML) | 51.604 | +48.39% | **−6.25%** (DEWS-I) |
213
+ | Abalone (OpenML) | **1.4923** | +1.29% | +1.61% (KNORA-IU) |
214
+ | Diabetes (sklearn) | **44.986** | +2.98% | +0.88% (DEWS-I) |
215
+ | Concrete Strength (OpenML) | 5.3934 | +21.30% | **−2.85%** (KNORA-IU) |
208
216
 
209
217
  deskit beats best single and simple averaging on 3/5 regression datasets. This shows how DES can provide a
210
218
  strong boost if used on the right dataset, but it might be counterproductive if used blindly.
@@ -216,37 +224,37 @@ and classification-like (like in Abalone).
216
224
 
217
225
  ### Classification (Accuracy, higher is better)
218
226
 
219
- % shown as delta vs Best Single. 10-seed mean.
227
+ % shown as delta vs Best Single. 100-seed mean.
220
228
 
221
- | Dataset | Best Single | Simple Avg | deskit best |
222
- |------------------------|-------------|--------|-----------------------|
223
- | HAR (OpenML) | 98.24% | -0.33% | **+0.14%** (KNN-DWS) |
224
- | Yeast (OpenML) | 58.87% | +0.77% | **+1.66%** (KNORA-IU) |
225
- | Image Segment (OpenML) | 93.70% | +1.40% | **+2.09%** (KNORA-IU) |
226
- | Waveform (OpenML) | 89.95% | -2.05% | **+0.93%** (KNORA-E) |
227
- | Vowel (OpenML) | **85.91%** | -0.98% | -0.40% (KNN-DWS) |
229
+ | Dataset | Best Single | Simple Avg | deskit best |
230
+ |------------------------|-------------|------------|-------------------------|
231
+ | HAR (OpenML) | 98.24% | 0.32% | **+0.14%** (DEWS-I) |
232
+ | Yeast (OpenML) | 59.19% | +0.46% | **+1.48%** (KNORA-IU) |
233
+ | Image Segment (OpenML) | 93.65% | +1.70% | **+2.33%** (KNORA-IU) |
234
+ | Waveform (OpenML) | **86.28%** | −1.04% | 0.55% (DEWS-I) |
235
+ | Vowel (OpenML) | 90.54% | −1.81% | **+0.93%** (KNORA-IU) |
228
236
 
229
237
  deskit beats or matches best single and simple averaging on 4/5 classification datasets. As seen on regression, DES
230
238
  can improve or hurt performance, so it must be used wisely, but if used correctly it can show promising results.
231
239
 
232
240
  ### Speed (mean ms fit + predict, 20 seeds, all tested algorithms combined)
233
241
 
234
- Consider that usually it is recommended to only use one algorithm at a time, this benchmark ran five of them at the
235
- same time, so with a single one runtime is expected to be about 5x faster. For this benchmark, `preset='balanced'` was used,
242
+ Normally only one algorithm is used at a time, but this benchmark ran six of them together,
243
+ so runtime with a single algorithm is expected to be roughly 6x lower. For this benchmark, `preset='balanced'` was used,
236
244
  so the backend was an ANN algorithm with FAISS IVF.
237
245
 
238
246
  | Dataset | deskit |
239
- |--------------------|----------|
240
- | California Housing | 136.6 ms |
241
- | Bike Sharing | 115.5 ms |
242
- | Abalone | 28.5 ms |
243
- | Diabetes | 8.1 ms |
244
- | Conrete Strength | 9.4 ms |
245
- | HAR | 297.5 ms |
246
- | Yeast | 16.3 ms |
247
- | Image Segment | 27.2 ms |
248
- | Waveform | 48.9 ms |
249
- | Vowel | 16.5 ms |
247
+ |--------------------|-----------|
248
+ | California Housing | 159.8 ms |
249
+ | Bike Sharing | 130.3 ms |
250
+ | Abalone | 32.9 ms |
251
+ | Diabetes | 8.2 ms |
252
+ | Concrete Strength | 10.8 ms |
253
+ | HAR | 352.0 ms |
254
+ | Yeast | 18.6 ms |
255
+ | Image Segment | 32.4 ms |
256
+ | Waveform | 58.7 ms |
257
+ | Vowel | 19.6 ms |
250
258
 
251
259
  deskit caches all model predictions on the validation set at fit time and reads
252
260
  from that matrix at inference.
@@ -255,4 +263,4 @@ from that matrix at inference.
255
263
 
256
264
  ## Contributing
257
265
 
258
- Issues and PRs welcome.
266
+ Issues and PRs welcome.
@@ -4,7 +4,7 @@ build-backend = "setuptools.build_meta"
4
4
 
5
5
  [project]
6
6
  name = "deskit"
7
- version = "0.1.0"
7
+ version = "0.3.0"
8
8
  description = "A Python library for Dynamic Ensemble Selection"
9
9
  readme = "README.md"
10
10
  license = "MIT"
@@ -5,13 +5,13 @@ Metrics
5
5
  -------
6
6
  Pass a metric name string:
7
7
 
8
- KNNDWS(task='classification', metric='log_loss', mode='min')
8
+ DEWSU(task='classification', metric='log_loss', mode='min')
9
9
 
10
10
  Or import a metric function directly:
11
11
 
12
12
  from deskit.metrics import log_loss, mae
13
13
 
14
- KNNDWS(task='classification', metric=log_loss, mode='min')
14
+ DEWSU(task='classification', metric=log_loss, mode='min')
15
15
 
16
16
  Available built-in metrics:
17
17
  Scalar predictions (pass predict() output):
@@ -21,7 +21,7 @@ Available built-in metrics:
21
21
  'log_loss', 'prob_correct'
22
22
  """
23
23
 
24
- from deskit.des.knndws import KNNDWS
24
+ from deskit.des.dewsu import DEWSU
25
25
  from deskit.des.ola import OLA
26
26
  from deskit.des.knorau import KNORAU
27
27
  from deskit.des.knorae import KNORAE
@@ -31,7 +31,7 @@ from deskit._config import SPEED_PRESETS, list_presets
31
31
  from deskit.analysis import analyze
32
32
 
33
33
  __all__ = [
34
- 'KNNDWS',
34
+ 'DEWSU',
35
35
  'OLA',
36
36
  'KNORAU',
37
37
  'KNORAE',
@@ -1,7 +1,7 @@
1
- from deskit.des.knndws import KNNDWS
1
+ from deskit.des.dewsu import DEWSU
2
2
  from deskit.des.ola import OLA
3
3
  from deskit.des.knorau import KNORAU
4
4
  from deskit.des.knorae import KNORAE
5
5
  from deskit.des.knoraiu import KNORAIU
6
6
 
7
- __all__ = ['KNNDWS', 'OLA', 'KNORAU', 'KNORAE', 'KNORAIU']
7
+ __all__ = ['DEWSU', 'OLA', 'KNORAU', 'KNORAE', 'KNORAIU']
@@ -0,0 +1,130 @@
1
+ """
2
+ DEWS-I: K-Nearest Neighbors with Distance-Weighted Softmax, inverse-distance-weighted variant.
3
+ """
4
+ from deskit.base.knnbase import KNNBase
5
+ from deskit._config import make_finder, resolve_metric, prep_fit_inputs
6
+ from deskit.utils import to_numpy
7
+ import numpy as np
8
+
9
+
10
+ class DEWSI(KNNBase):
11
+ """
12
+ DEWS-I: K-Nearest Neighbors with Distance-Weighted Softmax, inverse-distance-weighted variant.
13
+
14
+ Extends DEWS-U by replacing the simple average of neighbor scores with an
15
+ inverse-distance-weighted average, so closer neighbors have a stronger
16
+ influence on the softmax routing — analogous to how KNORA-IU extends KNORA-U.
17
+
18
+ Parameters
19
+ ----------
20
+ task : str
21
+ 'classification' or 'regression'.
22
+ metric : str or callable
23
+ Scoring function. Use 'log_loss' or 'prob_correct' with predict_proba()
24
+ output for classification; 'mae', 'mse', or 'rmse' for regression.
25
+ mode : str
26
+ 'max' if higher scores are better, 'min' if lower.
27
+ k : int
28
+ Neighborhood size. Default: 10.
29
+ threshold : float
30
+ After per-neighborhood normalization (best=1.0, worst=0.0), models
31
+ below this fraction are excluded from softmax. 0.0 disables the gate;
32
+ 1.0 reduces to OLA behavior. Default: 0.5.
33
+ temperature : float, optional
34
+ Softmax sharpness. Lower = sharper routing toward the local best model;
35
+ higher = softer blending. If not set, defaults to 0.1 for regression
36
+ (min-metrics) and 1.0 for classification (max-metrics) at predict time.
37
+ preset : str
38
+ Neighbor search preset. Default: 'balanced'. See list_presets().
39
+ """
40
+
41
+ def __init__(self, task, metric='mae', mode='min', k=10,
42
+ threshold=0.5, temperature=None, preset='balanced', **kwargs):
43
+ metric_name, metric_fn = resolve_metric(metric)
44
+ finder = make_finder(preset, k, **kwargs)
45
+ super().__init__(metric=metric_fn, mode=mode, neighbor_finder=finder)
46
+ self.task = task
47
+ self.threshold = threshold
48
+ self._temperature = temperature
49
+ self._metric_name = metric_name
50
+
51
+ def fit(self, features, y, preds_dict):
52
+ """
53
+ Fit the routing model on validation data.
54
+
55
+ Parameters
56
+ ----------
57
+ features : array-like, shape (n_val, n_features)
58
+ Validation features. Must not overlap with train or test data.
59
+ y : array-like, shape (n_val,)
60
+ Validation ground-truth labels or values.
61
+ preds_dict : dict[str, array-like]
62
+ Validation predictions keyed by model name.
63
+ Shape (n_val,) for scalar metrics; (n_val, n_classes) for probability metrics.
64
+ """
65
+ features, y, preds_dict = prep_fit_inputs(
66
+ features, y, preds_dict, self._metric_name
67
+ )
68
+ super().fit(features, y, preds_dict)
69
+
70
+ def predict(self, x, temperature=None, threshold=None):
71
+ """
72
+ Return per-sample model weights.
73
+
74
+ Parameters
75
+ ----------
76
+ x : array-like, shape (n_features,) or (n_samples, n_features)
77
+ temperature : float, optional
78
+ Overrides the instance temperature for this call.
79
+ threshold : float, optional
80
+ Overrides the instance threshold for this call.
81
+
82
+ Returns
83
+ -------
84
+ dict or list of dict
85
+ Single sample: {model_name: weight}. Batch: list of such dicts.
86
+ """
87
+ t = temperature if temperature is not None else (
88
+ self._temperature if self._temperature is not None else
89
+ (0.1 if self.mode == 'min' else 1.0))
90
+ th = threshold if threshold is not None else self.threshold
91
+
92
+ x = np.atleast_2d(to_numpy(x))
93
+ batch_size = x.shape[0]
94
+
95
+ distances, indices = self.model.kneighbors(x) # both (batch, k)
96
+
97
+ # Inverse-distance-weighted average of each model's scores over the K neighbors.
98
+ # Closer neighbors exert stronger influence on routing.
99
+ inv_dist = 1.0 / np.maximum(distances, 1e-8) # (batch, k)
100
+ inv_dist_w = inv_dist / inv_dist.sum(axis=1, keepdims=True) # normalised weights
101
+ neighbor_scores = self.matrix[indices] # (batch, k, n_models)
102
+ avg_scores = (neighbor_scores * inv_dist_w[:, :, np.newaxis]).sum(axis=1) # (batch, n_models)
103
+
104
+ # Normalize per neighborhood: best model = 1.0, worst = 0.0
105
+ local_min = avg_scores.min(axis=1, keepdims=True)
106
+ local_max = avg_scores.max(axis=1, keepdims=True)
107
+ local_range = local_max - local_min
108
+ norm_scores = (avg_scores - local_min) / np.where(local_range > 0, local_range, 1.0)
109
+
110
+ # Zero out models below threshold.
111
+ # If nothing passes: fall back to single best.
112
+ if th > 0:
113
+ gate = norm_scores >= th
114
+ any_pass = gate.any(axis=1, keepdims=True)
115
+ gate = np.where(any_pass, gate, norm_scores == 1.0)
116
+ norm_scores = norm_scores * gate
117
+
118
+ # Softmax
119
+ max_scores = norm_scores.max(axis=1, keepdims=True)
120
+ exp_scores = np.exp((norm_scores - max_scores) / t)
121
+ if th > 0:
122
+ exp_scores = exp_scores * gate
123
+ total = exp_scores.sum(axis=1, keepdims=True)
124
+ weights = np.where(total > 0,
125
+ exp_scores / np.where(total > 0, total, 1.0),
126
+ np.full_like(exp_scores, 1.0 / len(self.models)))
127
+
128
+ if batch_size == 1:
129
+ return dict(zip(self.models, weights[0]))
130
+ return [dict(zip(self.models, w)) for w in weights]
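To see what the inverse-distance step in the file above changes, the following standalone sketch compares a plain neighbourhood mean with an inverse-distance-weighted mean for a single query. It is a simplified illustration, not the library code: it assumes scores where higher is better, omits the threshold gate, and the helper names are hypothetical.

```python
import numpy as np


def neighbourhood_scores(scores, distances=None, eps=1e-8):
    """Average (k, n_models) scores over the k neighbours.

    distances=None gives the plain mean; otherwise each neighbour is
    weighted by its normalised inverse distance.
    """
    if distances is None:
        return scores.mean(axis=0)
    inv = 1.0 / np.maximum(distances, eps)
    return (scores * (inv / inv.sum())[:, None]).sum(axis=0)


def softmax_weights(avg, temperature=1.0):
    """Min-max normalise the averaged scores, then apply a softmax."""
    rng = avg.max() - avg.min()
    norm = (avg - avg.min()) / (rng if rng > 0 else 1.0)
    exp = np.exp((norm - norm.max()) / temperature)
    return exp / exp.sum()


# Two neighbours, two models (higher score = better, by assumption).
# Model 0 is better on the near neighbour, model 1 on the far one.
scores = np.array([[1.0, 0.0],
                   [0.0, 1.0]])
distances = np.array([0.1, 10.0])

w_uniform = softmax_weights(neighbourhood_scores(scores))
w_inverse = softmax_weights(neighbourhood_scores(scores, distances))
```

With the plain mean the two models are indistinguishable, whereas the inverse-distance mean shifts weight toward the model that does well on the nearby neighbour.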
@@ -1,5 +1,5 @@
1
1
  """
2
- KNN-DWS: K-Nearest Neighbors with Distance-Weighted Softmax.
2
+ DEWS-U: K-Nearest Neighbors with Distance-Weighted Softmax.
3
3
  """
4
4
  from deskit.base.knnbase import KNNBase
5
5
  from deskit._config import make_finder, resolve_metric, prep_fit_inputs
@@ -7,9 +7,9 @@ from deskit.utils import to_numpy
7
7
  import numpy as np
8
8
 
9
9
 
10
- class KNNDWS(KNNBase):
10
+ class DEWSU(KNNBase):
11
11
  """
12
- KNN-DWS: K-Nearest Neighbors with Distance-Weighted Softmax.
12
+ DEWS-U: K-Nearest Neighbors with Distance-Weighted Softmax.
13
13
 
14
14
  Parameters
15
15
  ----------
@@ -3,7 +3,7 @@ DynamicRouter — string-based factory for programmatic algorithm selection.
3
3
 
4
4
  Use DynamicRouter when you need to choose an algorithm via a string at runtime.
5
5
  """
6
- from deskit.des.knndws import KNNDWS
6
+ from deskit.des.dewsu import DEWSU
7
7
  from deskit.des.ola import OLA
8
8
  from deskit.des.knorau import KNORAU
9
9
  from deskit.des.knorae import KNORAE
@@ -12,7 +12,7 @@ from deskit._config import SPEED_PRESETS, list_presets
12
12
  from deskit.utils import to_numpy, add_batch_dim
13
13
 
14
14
  _METHOD_CLASSES = {
15
- 'knn-dws': KNNDWS,
15
+ 'DEWS-U': DEWSU,
16
16
  'ola': OLA,
17
17
  'knora-u': KNORAU,
18
18
  'knora-e': KNORAE,
@@ -29,7 +29,7 @@ class DynamicRouter:
29
29
  task : str
30
30
  'classification' or 'regression'.
31
31
  method : str
32
- 'knn-dws', 'ola', 'knora-u', or 'knora-e'.
32
+ 'DEWS-U', 'ola', 'knora-u', or 'knora-e'.
33
33
  metric : str or callable
34
34
  Per-sample scoring function. Built-in names: 'accuracy', 'mae', 'mse',
35
35
  'rmse', 'log_loss', 'prob_correct'. Or any callable (y_true, y_pred) -> float.
@@ -40,7 +40,7 @@ class DynamicRouter:
40
40
  threshold : float
41
41
  Competence gate applied after per-neighborhood normalization.
42
42
  temperature : float, optional
43
- Softmax sharpness for knn-dws. Ignored by other algorithms.
43
+ Softmax sharpness for DEWS-U. Ignored by other algorithms.
44
44
  preset : str
45
45
  Speed/accuracy preset. Call list_presets() for options.
46
46
  feature_extractor : callable, optional
@@ -51,7 +51,7 @@ class DynamicRouter:
51
51
  Forwarded to the neighbor finder constructor.
52
52
  """
53
53
 
54
- def __init__(self, task, method='knn-dws', metric='accuracy', mode='max',
54
+ def __init__(self, task, method='DEWS-U', metric='accuracy', mode='max',
55
55
  k=10, threshold=0.5, temperature=None, preset='balanced',
56
56
  feature_extractor=None, finder=None, **kwargs):
57
57
 
@@ -71,8 +71,8 @@ class DynamicRouter:
71
71
  # Pass finder through as a kwarg when using preset='custom'.
72
72
  extra = {'finder': finder} if finder is not None else {}
73
73
 
74
- # KNNDWS accepts temperature; the others don't.
75
- if method == 'knn-dws':
74
+ # DEWSU accepts temperature; the others don't.
75
+ if method == 'DEWS-U':
76
76
  self._des = cls(
77
77
  task=task, metric=metric, mode=mode, k=k,
78
78
  threshold=threshold, temperature=temperature,
@@ -108,7 +108,7 @@ class DynamicRouter:
108
108
  ----------
109
109
  x : array-like, shape (n_features,) or (n_samples, n_features)
110
110
  temperature : float, optional
111
- knn-dws only. Overrides the instance temperature for this call.
111
+ DEWS-U only. Overrides the instance temperature for this call.
112
112
  threshold : float, optional
113
113
  Overrides the instance threshold for this call.
114
114
 
@@ -125,7 +125,7 @@ class DynamicRouter:
125
125
  # Class methods
126
126
 
127
127
  @classmethod
128
- def from_data_size(cls, n_samples, n_features, task, method='knn-dws',
128
+ def from_data_size(cls, n_samples, n_features, task, method='DEWS-U',
129
129
  metric='accuracy', mode='max', k=10, threshold=0.5,
130
130
  n_queries=None, **extra_kwargs):
131
131
  """
@@ -1,6 +1,6 @@
1
1
  Metadata-Version: 2.4
2
2
  Name: deskit
3
- Version: 0.1.0
3
+ Version: 0.3.0
4
4
  Summary: A Python library for Dynamic Ensemble Selection
5
5
  Author: Tikhon Vodyanov
6
6
  License-Expression: MIT
@@ -31,18 +31,20 @@ Dynamic: license-file
31
31
 
32
32
  # deskit
33
33
 
34
- [deskit](https://TikaaVo.github.io/deskit/) is a flexible, light, and easy-to-use ensembling library that implements
34
+ deskit is a flexible, lightweight, and easy-to-use ensembling library that implements
35
35
  Dynamic Ensemble Selection (DES) algorithms for ensembling multiple ML models
36
- on a singular dataset.
36
+ on a given dataset.
37
37
 
38
38
  The library works entirely with data, taking as input a validation dataset
39
- along with pre-computed predictions and outputting a dictionary of weights
39
+ along with precomputed predictions and outputting a dictionary of weights
40
40
  per model. This means that it can be used with any library or model without
41
41
  requiring any wrappers, including custom models, popular ML libraries, and APIs.
42
42
 
43
- deskit contains multiple different DES algorithms, and it works with both classification
43
+ deskit includes several DES algorithms, and it works with both classification
44
44
  and regression.
45
45
 
46
+ See the full documentation [here](https://TikaaVo.github.io/deskit/).
47
+
46
48
  # Dynamic Ensemble Selection
47
49
 
48
50
  Ensemble learning in machine learning refers to when multiple models trained on a
@@ -55,7 +57,7 @@ concept that there are regions of feature space where certain models perform par
55
57
  so every base model can be an expert in a different region.
56
58
  Only the most competent, or an ensemble of the most competent models is selected for the prediction.
57
59
 
58
- Through empirical studies, DES has been shown to perform best with small-sized, imbalanced, or
60
+ Through empirical studies, DES has been shown to perform best on small-sized, imbalanced, or
59
61
  heterogeneous datasets, as well as non-stationary data (concept drift), models that haven't perfected a dataset,
60
62
  and when used on an ensemble of models with differing architectures and perspectives.
61
63
 
@@ -148,13 +150,14 @@ weights = router.predict(X_test[i])
148
150
 
149
151
  ## Algorithms
150
152
 
151
- | Method | Best for | Notes |
152
- |---|---|---|
153
- | `KNNDWS` | Regression | Softmax over neighbourhood-averaged scores. Temperature controls sharpness. |
154
- | `KNORAU` | Classification | Vote-count weighting. Each model earns one vote per neighbour it correctly classifies. |
155
- | `KNORAE` | Classification | Intersection-based. Only models correct on all neighbours survive; falls back to smaller neighbourhoods. |
156
- | `KNORAIU` | Classification | Like KNORA-U but votes are inverse-distance weighted. |
157
- | `OLA` | Both | Hard selection: only the single best model in the neighbourhood contributes. |
153
+ | Method | Best for | Notes |
154
+ |-----------|---|----------------------------------------------------------------------------------------------------------|
155
+ | `DEWSU` | Regression | Softmax over neighbourhood-averaged scores. Temperature controls sharpness. |
156
+ | `DEWSI` | Regression | Like DEWS-U but scores are inverse-distance weighted. |
157
+ | `KNORAU` | Classification | Vote-count weighting. Each model earns one vote per neighbour it correctly classifies. |
158
+ | `KNORAE` | Classification | Intersection-based. Only models correct on all neighbours survive; falls back to smaller neighbourhoods. |
159
+ | `KNORAIU` | Classification | Like KNORA-U but votes are inverse-distance weighted. |
160
+ | `OLA` | Both | Hard selection: only the single best model in the neighbourhood contributes. |
158
161
 
159
162
  ---
160
163
 
@@ -201,16 +204,21 @@ def pinball(y_true, y_pred, alpha=0.9):
201
204
  e = y_true - y_pred
202
205
  return alpha * e if e >= 0 else (alpha - 1) * e
203
206
 
204
- router = KNNDWS(task="regression", metric=pinball, mode="min", k=20)
207
+ router = DEWSU(task="regression", metric=pinball, mode="min", k=20)
205
208
  ```
206
209
 
207
210
  Built-in metric strings: `accuracy`, `mae`, `mse`, `rmse`, `log_loss`, `prob_correct`.
208
211
 
209
212
  ---
210
213
 
214
+ ## Data types
215
+
216
+ deskit can be used with non-tabular data types like images, time series, and more. However, when used, the
217
+ passed features either need to be run through a feature extractor beforehand, such as a CNN backbone for images.
218
+
211
219
  ## Benchmark results
212
220
 
213
- 20-seed benchmark (seeds 0–19) on standard sklearn and OpenML datasets. "Best Single" is the best
221
+ 100-seed benchmark (seeds 0–99) on standard sklearn and OpenML datasets. "Best Single" is the best
214
222
  individual model selected on the validation set. "Simple Average" is uniform
215
223
  equal-weight blending, included as a baseline.
216
224
 
@@ -223,19 +231,19 @@ Pool: KNN, Decision Tree, SVR, Ridge, Bayesian Ridge.
223
231
 
224
232
  This pool was selected for having variability in architectures while avoiding a single dominant model.
225
233
 
226
- deskit algorithms tested: OLA, KNN-DWS, KNORA-U, KNORA-E, KNORA-IU.
234
+ deskit algorithms tested: OLA, DEWS-U, DEWS-I, KNORA-U, KNORA-E, KNORA-IU.
227
235
 
228
236
  ### Regression (MAE, lower is better)
229
237
 
230
- % shown as delta vs Best Single. 10-seed mean.
238
+ % shown as delta vs Best Single. 100-seed mean.
231
239
 
232
- | Dataset | Best Single | Simple Avg | deskit best |
233
- |------------------------------|-----------|---|-----------------------|
234
- | California Housing (sklearn) | 0.3956 | +7.99% | **-2.24%** (KNN-DWS) |
235
- | Bike Sharing (OpenML) | 51.6779 | +47.77% | **-5.34%** (KNN-DWS) |
236
- | Abalone (OpenML) | **1.4981** | +1.14% | +1.47% (KNORA-U) |
237
- | Diabetes (sklearn) | **44.5042** | +3.18% | +1.17% (KNN-DWS) |
238
- | Conrete Strength (OpenML) | 5.2686 | +23.66% | **-1.05%** (KNORA-IU) |
240
+ | Dataset | Best Single | Simple Avg | deskit best |
241
+ |------------------------------|-------------|------------|-------------------------|
242
+ | California Housing (sklearn) | 0.3955 | +7.93% | **−2.68%** (DEWS-I) |
243
+ | Bike Sharing (OpenML) | 51.604 | +48.39% | **−6.25%** (DEWS-I) |
244
+ | Abalone (OpenML) | **1.4923** | +1.29% | +1.61% (KNORA-IU) |
245
+ | Diabetes (sklearn) | **44.986** | +2.98% | +0.88% (DEWS-I) |
246
+ | Concrete Strength (OpenML) | 5.3934 | +21.30% | **−2.85%** (KNORA-IU) |
239
247
 
240
248
  deskit beats best single and simple averaging on 3/5 regression datasets. This shows how DES can provide a
241
249
  strong boost if used on the right dataset, but it might be counterproductive if used blindly.
@@ -247,37 +255,37 @@ and classification-like (like in Abalone).
247
255
 
248
256
  ### Classification (Accuracy, higher is better)
249
257
 
250
- % shown as delta vs Best Single. 10-seed mean.
258
+ % shown as delta vs Best Single. 100-seed mean.
251
259
 
252
- | Dataset | Best Single | Simple Avg | deskit best |
253
- |------------------------|-------------|--------|-----------------------|
254
- | HAR (OpenML) | 98.24% | -0.33% | **+0.14%** (KNN-DWS) |
255
- | Yeast (OpenML) | 58.87% | +0.77% | **+1.66%** (KNORA-IU) |
256
- | Image Segment (OpenML) | 93.70% | +1.40% | **+2.09%** (KNORA-IU) |
257
- | Waveform (OpenML) | 89.95% | -2.05% | **+0.93%** (KNORA-E) |
258
- | Vowel (OpenML) | **85.91%** | -0.98% | -0.40% (KNN-DWS) |
260
+ | Dataset | Best Single | Simple Avg | deskit best |
261
+ |------------------------|-------------|------------|-------------------------|
262
+ | HAR (OpenML) | 98.24% | 0.32% | **+0.14%** (DEWS-I) |
263
+ | Yeast (OpenML) | 59.19% | +0.46% | **+1.48%** (KNORA-IU) |
264
+ | Image Segment (OpenML) | 93.65% | +1.70% | **+2.33%** (KNORA-IU) |
265
+ | Waveform (OpenML) | **86.28%** | −1.04% | 0.55% (DEWS-I) |
266
+ | Vowel (OpenML) | 90.54% | −1.81% | **+0.93%** (KNORA-IU) |
259
267
 
260
268
  deskit beats or matches best single and simple averaging on 4/5 classification datasets. As seen on regression, DES
261
269
  can improve or hurt performance, so it must be used wisely, but if used correctly it can show promising results.
262
270
 
263
271
  ### Speed (mean ms fit + predict, 20 seeds, all tested algorithms combined)
264
272
 
265
- Consider that usually it is recommended to only use one algorithm at a time, this benchmark ran five of them at the
266
- same time, so with a single one runtime is expected to be about 5x faster. For this benchmark, `preset='balanced'` was used,
273
+ Consider that usually it is recommended to only use one algorithm at a time, this benchmark ran six of them at the
274
+ same time, so with a single one runtime is expected to be about 6x faster. For this benchmark, `preset='balanced'` was used,
267
275
  so the backend was an ANN algorithm with FAISS IVF.
268
276
 
269
277
  | Dataset | deskit |
270
- |--------------------|----------|
271
- | California Housing | 136.6 ms |
272
- | Bike Sharing | 115.5 ms |
273
- | Abalone | 28.5 ms |
274
- | Diabetes | 8.1 ms |
275
- | Conrete Strength | 9.4 ms |
276
- | HAR | 297.5 ms |
277
- | Yeast | 16.3 ms |
278
- | Image Segment | 27.2 ms |
279
- | Waveform | 48.9 ms |
280
- | Vowel | 16.5 ms |
278
+ |--------------------|-----------|
279
+ | California Housing | 159.8 ms |
280
+ | Bike Sharing | 130.3 ms |
281
+ | Abalone | 32.9 ms |
282
+ | Diabetes | 8.2 ms |
283
+ | Conrete Strength | 10.8 ms |
284
+ | HAR | 352.0 ms |
285
+ | Yeast | 18.6 ms |
286
+ | Image Segment | 32.4 ms |
287
+ | Waveform | 58.7 ms |
288
+ | Vowel | 19.6 ms |
281
289
 
282
290
  deskit caches all model predictions on the validation set at fit time and reads
283
291
  from that matrix at inference.
@@ -17,7 +17,8 @@ src/deskit/base/__init__.py
17
17
  src/deskit/base/base.py
18
18
  src/deskit/base/knnbase.py
19
19
  src/deskit/des/__init__.py
20
- src/deskit/des/knndws.py
20
+ src/deskit/des/dewsi.py
21
+ src/deskit/des/dewsu.py
21
22
  src/deskit/des/knorae.py
22
23
  src/deskit/des/knoraiu.py
23
24
  src/deskit/des/knorau.py
File without changes