hdgpso 0.1.0__tar.gz
- hdgpso-0.1.0/.gitignore +95 -0
- hdgpso-0.1.0/CHANGELOG.md +18 -0
- hdgpso-0.1.0/CITATION.cff +27 -0
- hdgpso-0.1.0/LICENSE +21 -0
- hdgpso-0.1.0/PKG-INFO +234 -0
- hdgpso-0.1.0/README.md +170 -0
- hdgpso-0.1.0/benchmarks/benchmark.py +648 -0
- hdgpso-0.1.0/benchmarks/deep_objectives.py +362 -0
- hdgpso-0.1.0/benchmarks/multifidelity_objectives.py +224 -0
- hdgpso-0.1.0/benchmarks/run_budget_sweep.py +79 -0
- hdgpso-0.1.0/benchmarks/run_claim_check_v5.py +177 -0
- hdgpso-0.1.0/benchmarks/tuners.py +533 -0
- hdgpso-0.1.0/examples/00_walkthrough.ipynb +1193 -0
- hdgpso-0.1.0/examples/01_quickstart.py +45 -0
- hdgpso-0.1.0/examples/02_multifidelity.py +65 -0
- hdgpso-0.1.0/examples/03_paper_reproduction.ipynb +399 -0
- hdgpso-0.1.0/examples/05_RNNTest.ipynb +1451 -0
- hdgpso-0.1.0/pyproject.toml +83 -0
- hdgpso-0.1.0/src/hdgpso/__init__.py +64 -0
- hdgpso-0.1.0/src/hdgpso/_version.py +2 -0
- hdgpso-0.1.0/src/hdgpso/core.py +676 -0
- hdgpso-0.1.0/src/hdgpso/multifidelity.py +315 -0
- hdgpso-0.1.0/src/hdgpso/plots.py +166 -0
- hdgpso-0.1.0/src/hdgpso/stats.py +419 -0
- hdgpso-0.1.0/tests/test_hdgpso.py +289 -0
hdgpso-0.1.0/.gitignore
ADDED
|
@@ -0,0 +1,95 @@
|
|
|
1
|
+
# Python
|
|
2
|
+
__pycache__/
|
|
3
|
+
*.py[cod]
|
|
4
|
+
*$py.class
|
|
5
|
+
*.so
|
|
6
|
+
*.egg-info/
|
|
7
|
+
.eggs/
|
|
8
|
+
.pytest_cache/
|
|
9
|
+
.coverage
|
|
10
|
+
htmlcov/
|
|
11
|
+
.tox/
|
|
12
|
+
build/
|
|
13
|
+
dist/
|
|
14
|
+
*.egg
|
|
15
|
+
.python-version
|
|
16
|
+
|
|
17
|
+
# Virtual envs
|
|
18
|
+
.venv/
|
|
19
|
+
venv/
|
|
20
|
+
env/
|
|
21
|
+
|
|
22
|
+
# IDE
|
|
23
|
+
.vscode/
|
|
24
|
+
.idea/
|
|
25
|
+
*.swp
|
|
26
|
+
*.swo
|
|
27
|
+
|
|
28
|
+
# OS
|
|
29
|
+
.DS_Store
|
|
30
|
+
Thumbs.db
|
|
31
|
+
desktop.ini
|
|
32
|
+
|
|
33
|
+
# Benchmark outputs (large, regenerable)
|
|
34
|
+
results/
|
|
35
|
+
results_*/
|
|
36
|
+
report.log
|
|
37
|
+
|
|
38
|
+
# Smoke / preflight / probe scripts and their logs (temporary research artifacts).
|
|
39
|
+
# Anchored to repo root with a leading slash so we don't accidentally exclude
|
|
40
|
+
# package internals like src/hdgpso/__init__.py or src/hdgpso/_version.py.
|
|
41
|
+
/_*.py
|
|
42
|
+
/_*.log
|
|
43
|
+
/_*.png
|
|
44
|
+
/_smoke_*.png
|
|
45
|
+
|
|
46
|
+
# Research-internal scripts and reference notebooks kept locally only.
|
|
47
|
+
# These are not part of the published reproduction surface.
|
|
48
|
+
benchmarks/_*.py
|
|
49
|
+
benchmarks/Experiment details.ipynb
|
|
50
|
+
benchmarks/run_claim_check.py
|
|
51
|
+
benchmarks/run_claim_check_v2.py
|
|
52
|
+
benchmarks/run_claim_check_v3.py
|
|
53
|
+
benchmarks/run_claim_check_v4.py
|
|
54
|
+
|
|
55
|
+
# Compiled notebooks / cached artifacts
|
|
56
|
+
.ipynb_checkpoints/
|
|
57
|
+
|
|
58
|
+
# Misc CSVs from one-off scripts
|
|
59
|
+
hdgpso_meta_search.csv
|
|
60
|
+
|
|
61
|
+
# pyswarms / scikit-optimize / etc. write log files
|
|
62
|
+
report.log
|
|
63
|
+
|
|
64
|
+
# -----------------------------------------------------------------------------
|
|
65
|
+
# Paper directory excluded pending author's doc review.
|
|
66
|
+
# When ready, remove the line below and selectively add LaTeX sources, figures,
|
|
67
|
+
# and the experimental appendix. Build artifacts and bundled third-party PDFs
|
|
68
|
+
# are explicitly excluded below regardless.
|
|
69
|
+
# -----------------------------------------------------------------------------
|
|
70
|
+
paper/
|
|
71
|
+
|
|
72
|
+
# LaTeX build artifacts (if paper/ is later partially un-ignored)
|
|
73
|
+
*.aux
|
|
74
|
+
*.bbl
|
|
75
|
+
*.blg
|
|
76
|
+
*.fdb_latexmk
|
|
77
|
+
*.fls
|
|
78
|
+
*.out
|
|
79
|
+
*.synctex.gz
|
|
80
|
+
*.toc
|
|
81
|
+
*.lof
|
|
82
|
+
*.lot
|
|
83
|
+
*.nav
|
|
84
|
+
*.snm
|
|
85
|
+
*.vrb
|
|
86
|
+
|
|
87
|
+
# Third-party reference PDFs bundled under paper/literature/ (copyright / size)
|
|
88
|
+
paper/literature/
|
|
89
|
+
|
|
90
|
+
# Compiled paper PDF (regenerable from LaTeX source)
|
|
91
|
+
paper/IEEE_Conference_HDGPSO.pdf
|
|
92
|
+
|
|
93
|
+
# Local scratch CSVs / logs
|
|
94
|
+
*.tmp
|
|
95
|
+
*.bak
|
hdgpso-0.1.0/CHANGELOG.md
ADDED
|
@@ -0,0 +1,18 @@
|
|
|
1
|
+
# Changelog
|
|
2
|
+
|
|
3
|
+
This file lists notable changes between releases.
|
|
4
|
+
|
|
5
|
+
## Version 0.1.0 (May 2026)
|
|
6
|
+
|
|
7
|
+
This is the first public release of the HDGPSO package.
|
|
8
|
+
|
|
9
|
+
The release includes the implementation of HDGPSO, the hybrid hyperparameter optimizer described in the paper. Each iteration updates the same population using Differential Evolution, Grey Wolf Optimization, and Particle Swarm Optimization in sequence. DE helps explore the search space, GWO guides the population toward better regions, and PSO refines candidates using personal-best and global-best memory. A small RandomForest surrogate is refit every few iterations and is used to screen candidates before the real objective function is evaluated.
|
|
10
|
+
|
|
11
|
+
The package supports mixed search spaces through `Float`, `Int`, and `Categorical` parameters, which can be combined into a `SearchSpace`. Float parameters may also be sampled on a log scale. Results are returned in an `OptimizeResult` object containing the best parameters, best loss, trial history as a pandas DataFrame, elapsed time, and stopping reason. Runs are reproducible when a fixed seed and deterministic objective are used.
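As a rough illustration of what log-scale sampling of a float parameter means, the sketch below draws uniformly in log space and maps back. It does not use the package's `Float` class: `sample_log_uniform` is a hypothetical helper, and how `hdgpso` implements log scaling internally is an assumption here.

```python
import math
import random


def sample_log_uniform(low, high, rng):
    """Draw a value between low and high that is uniform in log space (hypothetical helper)."""
    if low <= 0 or high <= low:
        raise ValueError("log-scale bounds must satisfy 0 < low < high")
    # Sample the exponent uniformly, then map back to the original scale.
    return math.exp(rng.uniform(math.log(low), math.log(high)))


rng = random.Random(0)  # fixed seed so the draws are repeatable
samples = [sample_log_uniform(1e-5, 1e-1, rng) for _ in range(1000)]
# Roughly half of the draws should fall below the geometric midpoint 1e-3.
below_mid = sum(s < 1e-3 for s in samples) / len(samples)
```

With a plain uniform draw over the same bounds, almost every sample would land above `1e-3`, which is why log scaling is the usual choice for learning rates and regularization strengths.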
|
|
12
|
+
|
|
13
|
+
This release also includes statistical helpers in `hdgpso.stats` for comparing optimizers, including Friedman testing, Nemenyi post-hoc analysis, Critical Difference diagrams, Cliff’s delta, and bootstrap confidence intervals. A benchmark harness is provided under `benchmarks/`, with adapters for GridSearch, RandomSearch, Bayesian Optimization, Optuna-TPE, scipy Differential Evolution, pyswarms Particle Swarm, and HDGPSO. The benchmark scripts reproduce the main 60-evaluation experiment and the budget sweep at 20, 40, 60, and 100 evaluations.
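For readers unfamiliar with Cliff's delta, a minimal pure-Python sketch of the effect size follows. It is not the implementation in `hdgpso.stats`; the function name `cliffs_delta_sketch` and the sample losses are made up for illustration.

```python
def cliffs_delta_sketch(xs, ys):
    """Cliff's delta: P(x > y) - P(x < y) over all cross pairs."""
    if not xs or not ys:
        raise ValueError("both samples must be non-empty")
    greater = sum(1 for x in xs for y in ys if x > y)
    less = sum(1 for x in xs for y in ys if x < y)
    return (greater - less) / (len(xs) * len(ys))


# Losses from two hypothetical tuners (lower is better).
tuner_a = [0.10, 0.12, 0.11]
tuner_b = [0.15, 0.14, 0.13]
delta = cliffs_delta_sketch(tuner_a, tuner_b)  # every value in a is below every value in b
```

The value ranges from -1 to 1. For losses, a value near -1 means the first tuner is almost always lower (better) than the second.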
|
|
14
|
+
|
|
15
|
+
An experimental multi-fidelity variant, `HDGPSOMF`, is also included. It uses BOHB-style successive halving and trains the surrogate only on full-fidelity evaluations. This variant is provided for testing and is not part of the main headline results. Its API may change before a stable 1.0 release.
|
|
16
|
+
|
|
17
|
+
Default settings follow common values: `F = 0.8`, `CR = 0.5`, `c1 = c2 = 2.0`, and inertia decreasing from 0.7 to 0.4. The minimum population size is 4 because DE/rand/1 needs three other population members besides the target candidate.
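The population-size constraint can be seen in a small sketch of the index selection that DE/rand/1 requires. `pick_three_others` is a hypothetical helper, not a function from the package.

```python
import random


def pick_three_others(pop_size, target, rng):
    """Pick three distinct indices, none equal to `target` (hypothetical helper)."""
    if pop_size < 4:
        raise ValueError("DE/rand/1 needs at least 4 population members")
    candidates = [i for i in range(pop_size) if i != target]
    return rng.sample(candidates, 3)


rng = random.Random(0)
# With exactly 4 members, the three picks are forced to be everyone except the target.
r1, r2, r3 = pick_three_others(pop_size=4, target=2, rng=rng)
```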
|
|
18
|
+
|
hdgpso-0.1.0/CITATION.cff
ADDED
|
@@ -0,0 +1,27 @@
|
|
|
1
|
+
cff-version: 1.2.0
|
|
2
|
+
message: "If you use HDGPSO in your research, please cite the following work."
|
|
3
|
+
title: "HDGPSO: A surrogate-assisted hybrid DE-GWO-PSO method for hyperparameter optimization"
|
|
4
|
+
type: software
|
|
5
|
+
authors:
|
|
6
|
+
- family-names: Kumar
|
|
7
|
+
given-names: Ashutosh
|
|
8
|
+
affiliation: University of North Dakota
|
|
9
|
+
email: ashutosh.kumar@und.edu
|
|
10
|
+
repository-code: "https://github.com/ashuxen/hdgpso"
|
|
11
|
+
license: MIT
|
|
12
|
+
keywords:
|
|
13
|
+
- hyperparameter optimization
|
|
14
|
+
- metaheuristic
|
|
15
|
+
- differential evolution
|
|
16
|
+
- grey wolf optimizer
|
|
17
|
+
- particle swarm optimization
|
|
18
|
+
- surrogate-assisted optimization
|
|
19
|
+
- AutoML
|
|
20
|
+
preferred-citation:
|
|
21
|
+
type: article
|
|
22
|
+
title: "HDGPSO: A Surrogate-Assisted Hybrid DE-GWO-PSO Method for Hyperparameter Optimization"
|
|
23
|
+
authors:
|
|
24
|
+
- family-names: Kumar
|
|
25
|
+
given-names: Ashutosh
|
|
26
|
+
year: 2026
|
|
27
|
+
journal: "TBD"
|
hdgpso-0.1.0/LICENSE
ADDED
|
@@ -0,0 +1,21 @@
|
|
|
1
|
+
MIT License
|
|
2
|
+
|
|
3
|
+
Copyright (c) 2026 Ashutosh Kumar
|
|
4
|
+
|
|
5
|
+
Permission is hereby granted, free of charge, to any person obtaining a copy
|
|
6
|
+
of this software and associated documentation files (the "Software"), to deal
|
|
7
|
+
in the Software without restriction, including without limitation the rights
|
|
8
|
+
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
|
|
9
|
+
copies of the Software, and to permit persons to whom the Software is
|
|
10
|
+
furnished to do so, subject to the following conditions:
|
|
11
|
+
|
|
12
|
+
The above copyright notice and this permission notice shall be included in all
|
|
13
|
+
copies or substantial portions of the Software.
|
|
14
|
+
|
|
15
|
+
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
|
|
16
|
+
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
|
|
17
|
+
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
|
|
18
|
+
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
|
|
19
|
+
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
|
|
20
|
+
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
|
|
21
|
+
SOFTWARE.
|
hdgpso-0.1.0/PKG-INFO
ADDED
|
@@ -0,0 +1,234 @@
|
|
|
1
|
+
Metadata-Version: 2.4
|
|
2
|
+
Name: hdgpso
|
|
3
|
+
Version: 0.1.0
|
|
4
|
+
Summary: HDGPSO: a hybrid DE-GWO-PSO hyperparameter optimizer with RandomForest surrogate filtering.
|
|
5
|
+
Project-URL: Homepage, https://github.com/ashuxen/hdgpso
|
|
6
|
+
Project-URL: Documentation, https://github.com/ashuxen/hdgpso#readme
|
|
7
|
+
Project-URL: Repository, https://github.com/ashuxen/hdgpso
|
|
8
|
+
Project-URL: Issues, https://github.com/ashuxen/hdgpso/issues
|
|
9
|
+
Author-email: Ashutosh Kumar <ashutosh.kumar@und.edu>
|
|
10
|
+
License: MIT License
|
|
11
|
+
|
|
12
|
+
Copyright (c) 2026 Ashutosh Kumar
|
|
13
|
+
|
|
14
|
+
Permission is hereby granted, free of charge, to any person obtaining a copy
|
|
15
|
+
of this software and associated documentation files (the "Software"), to deal
|
|
16
|
+
in the Software without restriction, including without limitation the rights
|
|
17
|
+
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
|
|
18
|
+
copies of the Software, and to permit persons to whom the Software is
|
|
19
|
+
furnished to do so, subject to the following conditions:
|
|
20
|
+
|
|
21
|
+
The above copyright notice and this permission notice shall be included in all
|
|
22
|
+
copies or substantial portions of the Software.
|
|
23
|
+
|
|
24
|
+
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
|
|
25
|
+
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
|
|
26
|
+
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
|
|
27
|
+
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
|
|
28
|
+
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
|
|
29
|
+
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
|
|
30
|
+
SOFTWARE.
|
|
31
|
+
License-File: LICENSE
|
|
32
|
+
Keywords: automl,differential-evolution,grey-wolf-optimizer,hyperparameter-optimization,metaheuristic,particle-swarm-optimization,surrogate-assisted-optimization
|
|
33
|
+
Classifier: Development Status :: 4 - Beta
|
|
34
|
+
Classifier: Intended Audience :: Science/Research
|
|
35
|
+
Classifier: License :: OSI Approved :: MIT License
|
|
36
|
+
Classifier: Operating System :: OS Independent
|
|
37
|
+
Classifier: Programming Language :: Python :: 3
|
|
38
|
+
Classifier: Programming Language :: Python :: 3.10
|
|
39
|
+
Classifier: Programming Language :: Python :: 3.11
|
|
40
|
+
Classifier: Programming Language :: Python :: 3.12
|
|
41
|
+
Classifier: Programming Language :: Python :: 3.13
|
|
42
|
+
Classifier: Topic :: Scientific/Engineering :: Artificial Intelligence
|
|
43
|
+
Classifier: Topic :: Scientific/Engineering :: Mathematics
|
|
44
|
+
Requires-Python: >=3.10
|
|
45
|
+
Requires-Dist: numpy>=1.23
|
|
46
|
+
Requires-Dist: pandas>=1.5
|
|
47
|
+
Requires-Dist: scikit-learn>=1.2
|
|
48
|
+
Requires-Dist: scipy>=1.10
|
|
49
|
+
Provides-Extra: benchmarks
|
|
50
|
+
Requires-Dist: matplotlib>=3.6; extra == 'benchmarks'
|
|
51
|
+
Requires-Dist: optuna>=3.0; extra == 'benchmarks'
|
|
52
|
+
Requires-Dist: pyswarms>=1.3; extra == 'benchmarks'
|
|
53
|
+
Requires-Dist: scikit-optimize>=0.10; extra == 'benchmarks'
|
|
54
|
+
Requires-Dist: xgboost>=2.0; extra == 'benchmarks'
|
|
55
|
+
Provides-Extra: deep
|
|
56
|
+
Requires-Dist: torch>=2.0; extra == 'deep'
|
|
57
|
+
Provides-Extra: dev
|
|
58
|
+
Requires-Dist: pytest-cov>=4.0; extra == 'dev'
|
|
59
|
+
Requires-Dist: pytest>=7.0; extra == 'dev'
|
|
60
|
+
Requires-Dist: ruff>=0.1; extra == 'dev'
|
|
61
|
+
Provides-Extra: stats
|
|
62
|
+
Requires-Dist: matplotlib>=3.6; extra == 'stats'
|
|
63
|
+
Description-Content-Type: text/markdown
|
|
64
|
+
|
|
65
|
+
# hdgpso
|
|
66
|
+
|
|
67
|
+
|
|
68
|
+
|
|
69
|
+
|
|
70
|
+
This repository contains the Python implementation of HDGPSO, a hybrid method for hyperparameter optimization. The method combines Differential Evolution, Grey Wolf Optimization, and Particle Swarm Optimization in one sequential search process. A lightweight RandomForest surrogate is also used to filter candidate solutions before expensive model training is performed.
|
|
71
|
+
|
|
72
|
+
The motivation for the design is simple. Each of the three optimizers has a useful behavior, but each one also has limitations when used alone. DE helps explore the search space. GWO moves the population toward good candidate regions. PSO refines solutions using memory of past good positions. By applying them together in a single iteration, the method tries to keep the strengths of each component and reduce the impact of their individual weaknesses.
|
|
73
|
+
|
|
74
|
+
On the benchmark used in the paper, HDGPSO achieves the lowest mean rank at the standard 60-evaluation budget. The benchmark contains nine valid (dataset, model) pairs across four datasets and four model classes, evaluated under seven tuners with three random seeds. HDGPSO obtains a mean rank of 2.63, compared to 2.85 for Optuna-TPE and 2.89 for Bayesian Optimization. On the GradientBoosting cells specifically, HDGPSO wins every tested cell.
|
|
75
|
+
|
|
76
|
+
## Install
|
|
77
|
+
|
|
78
|
+
```bash
|
|
79
|
+
pip install git+https://github.com/ashuxen/hdgpso.git
|
|
80
|
+
```
|
|
81
|
+
|
|
82
|
+
From source:
|
|
83
|
+
|
|
84
|
+
```bash
|
|
85
|
+
git clone https://github.com/ashuxen/hdgpso.git
|
|
86
|
+
cd hdgpso
|
|
87
|
+
pip install -e ".[stats]" # adds matplotlib for plotting
|
|
88
|
+
pip install -e ".[benchmarks,deep,dev]" # full reproduction stack
|
|
89
|
+
```
|
|
90
|
+
|
|
91
|
+
Optional extras:
|
|
92
|
+
|
|
93
|
+
| Extra | Adds | When to use |
|
|
94
|
+
|-------|------|-------------|
|
|
95
|
+
| `[stats]` | matplotlib | `hdgpso.stats.cd_diagram` and `hdgpso.plots.*` |
|
|
96
|
+
| `[benchmarks]` | scikit-optimize, optuna, pyswarms, xgboost | the full 7-tuner benchmark |
|
|
97
|
+
| `[deep]` | torch | MLP and PINN-Heat objectives |
|
|
98
|
+
| `[dev]` | pytest, pytest-cov, ruff | running the tests |
|
|
99
|
+
|
|
100
|
+
## Quickstart
|
|
101
|
+
|
|
102
|
+
```python
|
|
103
|
+
from hdgpso import HDGPSO, SearchSpace, Float, Int, Categorical
|
|
104
|
+
|
|
105
|
+
space = SearchSpace({
|
|
106
|
+
"n_estimators": Int(20, 300),
|
|
107
|
+
"max_depth": Int(2, 20),
|
|
108
|
+
"max_features": Categorical(["sqrt", "log2", 0.5, 1.0]),
|
|
109
|
+
})
|
|
110
|
+
|
|
111
|
+
def objective(params):
|
|
112
|
+
from sklearn.ensemble import RandomForestClassifier
|
|
113
|
+
from sklearn.model_selection import cross_val_score
|
|
114
|
+
from sklearn.datasets import load_breast_cancer
|
|
115
|
+
X, y = load_breast_cancer(return_X_y=True)
|
|
116
|
+
model = RandomForestClassifier(**params, random_state=0, n_jobs=1)
|
|
117
|
+
return -cross_val_score(model, X, y, cv=3).mean() # lower is better
|
|
118
|
+
|
|
119
|
+
result = HDGPSO(space, objective, population_size=10,
|
|
120
|
+
iterations=15, seed=0).optimize()
|
|
121
|
+
print(result.best_params)
|
|
122
|
+
print(f"Best loss: {result.best_loss:.4f}")
|
|
123
|
+
```
|
|
124
|
+
|
|
125
|
+
In typical use, the caller only specifies the search space, the objective function, and the population and iteration budget. The remaining parameters default to values from the original papers, with the inertia schedule as the one deliberate deviation:
|
|
126
|
+
|
|
127
|
+
- For DE: `F = 0.8` and `CR = 0.5`, which sit inside the Storn–Price recommended range.
|
|
128
|
+
- For PSO: `c1 = c2 = 2.0`, which is the classical Kennedy–Eberhart setting.
|
|
129
|
+
- The inertia weight decays linearly from 0.7 to 0.4. This is slightly tighter than the canonical 0.9 to 0.4, because the preceding DE and GWO stages already supply enough exploration and PSO is used here mainly for refinement.
|
|
130
|
+
- The RandomForest surrogate refits every 4 iterations once at least 12 trial points are available.
|
|
131
|
+
|
|
132
|
+
All of these are exposed as constructor arguments. The complete list is available through `help(HDGPSO)`.
|
|
133
|
+
|
|
134
|
+
## What the algorithm does
|
|
135
|
+
|
|
136
|
+
Each iteration of HDGPSO runs three operator stages on the same population.
|
|
137
|
+
|
|
138
|
+
First, Differential Evolution. For each candidate, three other population members are chosen at random and combined into a donor vector using the DE/rand/1 rule. Binomial crossover is then applied between the donor and the current candidate, and the resulting trial replaces the candidate only when it improves the loss.
|
|
139
|
+
|
|
140
|
+
Second, Grey Wolf Optimizer. The population is sorted by current loss, and the top three members are labeled α, β, and δ. Every other member is updated using a weighted blend of these three leaders, following the standard Mirjalili (2014) update.
|
|
141
|
+
|
|
142
|
+
Third, Particle Swarm Optimization. Each particle maintains its personal best position, and the swarm keeps a global best. The velocity is updated using both terms, scaled by a linearly decaying inertia weight, and the particle moves accordingly.
|
|
143
|
+
|
|
144
|
+
Alongside the three stages, a RandomForest surrogate is fit on the trial history and refit periodically (every 4 iterations by default, once at least 12 trial points are available). The surrogate only filters proposed candidates: it predicts the mean and tree-variance of each proposal and keeps those with the best lower-confidence-bound score. It never replaces a real objective evaluation; its only role is to screen out clearly weak proposals before they reach the expensive objective.
|
|
145
|
+
|
|
146
|
+
## Repository layout
|
|
147
|
+
|
|
148
|
+
```
|
|
149
|
+
hdgpso/
|
|
150
|
+
├── src/hdgpso/ # installable package
|
|
151
|
+
│ ├── __init__.py # public API
|
|
152
|
+
│ ├── core.py # HDGPSO + SearchSpace types
│ ├── multifidelity.py # HDGPSOMF (experimental)
|
|
153
|
+
│ ├── stats.py # Friedman / Nemenyi / CD / bootstrap
|
|
154
|
+
│ └── plots.py # convergence / rank-bar / wins
|
|
155
|
+
├── benchmarks/ # paper reproduction scripts
|
|
156
|
+
│ ├── benchmark.py # main run_benchmark() driver
|
|
157
|
+
│ ├── tuners.py # uniform adapters for the baseline tuners
|
|
158
|
+
│ ├── deep_objectives.py # MLP + PINN-Heat objectives (torch)
|
|
159
|
+
│ ├── run_claim_check_v*.py # main paper run at b = 60
|
|
160
|
+
│ └── run_budget_sweep.py # budget sensitivity sweep
|
|
161
|
+
├── tests/test_hdgpso.py # unit tests
|
|
162
|
+
└── examples/ # standalone usage examples
|
|
163
|
+
```
|
|
164
|
+
|
|
165
|
+
## Reproducing the paper benchmark
|
|
166
|
+
|
|
167
|
+
```bash
|
|
168
|
+
pip install -e ".[benchmarks,deep]"
|
|
169
|
+
cd benchmarks
|
|
170
|
+
|
|
171
|
+
# Main 7-tuner comparison at budget = 60
|
|
172
|
+
python run_claim_check_v5.py
|
|
173
|
+
|
|
174
|
+
# Budget sensitivity sweep over {20, 40, 60, 100}
|
|
175
|
+
python run_budget_sweep.py
|
|
176
|
+
```
|
|
177
|
+
|
|
178
|
+
Results are written into `results_*/` directories, which are gitignored.
|
|
179
|
+
|
|
180
|
+
## Statistical analysis (Demšar 2006)
|
|
181
|
+
|
|
182
|
+
The `hdgpso.stats` module implements the rank-based protocol used in the paper for comparing several tuners across several cells.
|
|
183
|
+
|
|
184
|
+
```python
|
|
185
|
+
import pandas as pd
|
|
186
|
+
from hdgpso.stats import (
|
|
187
|
+
friedman_test, nemenyi_matrix, cd_diagram,
|
|
188
|
+
hdgpso_vs_baselines_table, bootstrap_rank_ci, cliffs_delta,
|
|
189
|
+
)
|
|
190
|
+
|
|
191
|
+
summary = pd.read_csv("results/summary.csv")
|
|
192
|
+
print(friedman_test(summary)) # global rejection check
|
|
193
|
+
print(bootstrap_rank_ci(summary)) # 95% CI per tuner
|
|
194
|
+
print(hdgpso_vs_baselines_table(summary, target="HDGPSO"))
|
|
195
|
+
cd_diagram(summary, save_path="fig_cd.png", title="My benchmark")
|
|
196
|
+
```
|
|
197
|
+
|
|
198
|
+
## Tests
|
|
199
|
+
|
|
200
|
+
```bash
|
|
201
|
+
pip install -e ".[dev]"
|
|
202
|
+
pytest -v
|
|
203
|
+
```
|
|
204
|
+
|
|
205
|
+
## Experimental: HDGPSOMF
|
|
206
|
+
|
|
207
|
+
The package also includes an experimental multi-fidelity variant, called HDGPSOMF, which wraps HDGPSO with BOHB-style successive halving over fidelity units. This variant is not part of the published headline results. On the benchmark mix used in this study it did not show a consistent advantage. The main reason is that the sklearn models are already inexpensive to train at full fidelity, and the low-fidelity proxies introduced enough noise to flip candidate rankings.
|
|
208
|
+
|
|
209
|
+
The variant may be more useful in settings where fidelity (for example, the number of training epochs) is exact and informative, and where each full-fidelity evaluation is genuinely expensive. Usage is shown in `examples/02_multifidelity.py`, and the interface is documented in `help(hdgpso.HDGPSOMF)`. The API is not yet stable.
|
|
210
|
+
|
|
211
|
+
## References
|
|
212
|
+
|
|
213
|
+
- Demšar, J. (2006). *Statistical Comparisons of Classifiers over Multiple Data Sets.* JMLR 7.
|
|
214
|
+
- Mirjalili, S., Mirjalili, S. M., & Lewis, A. (2014). *Grey Wolf Optimizer.* Adv. Eng. Software 69.
|
|
215
|
+
- Storn, R. & Price, K. (1997). *Differential Evolution.* J. Global Optimization 11.
|
|
216
|
+
- Kennedy, J. & Eberhart, R. (1995). *Particle Swarm Optimization.* Proc. IEEE Int. Conf. on Neural Networks (ICNN'95).
|
|
217
|
+
|
|
218
|
+
## Citation
|
|
219
|
+
|
|
220
|
+
See [CITATION.cff](CITATION.cff). BibTeX entry:
|
|
221
|
+
|
|
222
|
+
```bibtex
|
|
223
|
+
@software{kumar2026hdgpso,
|
|
224
|
+
author = {Ashutosh Kumar},
|
|
225
|
+
title = {hdgpso: Hybrid DE-GWO-PSO hyperparameter optimization},
|
|
226
|
+
year = {2026},
|
|
227
|
+
url = {https://github.com/ashuxen/hdgpso},
|
|
228
|
+
license = {MIT}
|
|
229
|
+
}
|
|
230
|
+
```
|
|
231
|
+
|
|
232
|
+
## License
|
|
233
|
+
|
|
234
|
+
MIT. See [LICENSE](LICENSE).
|
hdgpso-0.1.0/README.md
ADDED
|
@@ -0,0 +1,170 @@
|
|
|
1
|
+
# hdgpso
|
|
2
|
+
|
|
3
|
+
|
|
4
|
+
|
|
5
|
+
|
|
6
|
+
This repository contains the Python implementation of HDGPSO, a hybrid method for hyperparameter optimization. The method combines Differential Evolution, Grey Wolf Optimization, and Particle Swarm Optimization in one sequential search process. A lightweight RandomForest surrogate is also used to filter candidate solutions before expensive model training is performed.
|
|
7
|
+
|
|
8
|
+
The motivation for the design is simple. Each of the three optimizers has a useful behavior, but each one also has limitations when used alone. DE helps explore the search space. GWO moves the population toward good candidate regions. PSO refines solutions using memory of past good positions. By applying them together in a single iteration, the method tries to keep the strengths of each component and reduce the impact of their individual weaknesses.
|
|
9
|
+
|
|
10
|
+
On the benchmark used in the paper, HDGPSO achieves the lowest mean rank at the standard 60-evaluation budget. The benchmark contains nine valid (dataset, model) pairs across four datasets and four model classes, evaluated under seven tuners with three random seeds. HDGPSO obtains a mean rank of 2.63, compared to 2.85 for Optuna-TPE and 2.89 for Bayesian Optimization. On the GradientBoosting cells specifically, HDGPSO wins every tested cell.
|
|
11
|
+
|
|
12
|
+
## Install
|
|
13
|
+
|
|
14
|
+
```bash
|
|
15
|
+
pip install git+https://github.com/ashuxen/hdgpso.git
|
|
16
|
+
```
|
|
17
|
+
|
|
18
|
+
From source:
|
|
19
|
+
|
|
20
|
+
```bash
|
|
21
|
+
git clone https://github.com/ashuxen/hdgpso.git
|
|
22
|
+
cd hdgpso
|
|
23
|
+
pip install -e ".[stats]" # adds matplotlib for plotting
|
|
24
|
+
pip install -e ".[benchmarks,deep,dev]" # full reproduction stack
|
|
25
|
+
```
|
|
26
|
+
|
|
27
|
+
Optional extras:
|
|
28
|
+
|
|
29
|
+
| Extra | Adds | When to use |
|
|
30
|
+
|-------|------|-------------|
|
|
31
|
+
| `[stats]` | matplotlib | `hdgpso.stats.cd_diagram` and `hdgpso.plots.*` |
|
|
32
|
+
| `[benchmarks]` | scikit-optimize, optuna, pyswarms, xgboost | the full 7-tuner benchmark |
|
|
33
|
+
| `[deep]` | torch | MLP and PINN-Heat objectives |
|
|
34
|
+
| `[dev]` | pytest, pytest-cov, ruff | running the tests |
|
|
35
|
+
|
|
36
|
+
## Quickstart
|
|
37
|
+
|
|
38
|
+
```python
|
|
39
|
+
from hdgpso import HDGPSO, SearchSpace, Float, Int, Categorical
|
|
40
|
+
|
|
41
|
+
space = SearchSpace({
|
|
42
|
+
"n_estimators": Int(20, 300),
|
|
43
|
+
"max_depth": Int(2, 20),
|
|
44
|
+
"max_features": Categorical(["sqrt", "log2", 0.5, 1.0]),
|
|
45
|
+
})
|
|
46
|
+
|
|
47
|
+
def objective(params):
|
|
48
|
+
from sklearn.ensemble import RandomForestClassifier
|
|
49
|
+
from sklearn.model_selection import cross_val_score
|
|
50
|
+
from sklearn.datasets import load_breast_cancer
|
|
51
|
+
X, y = load_breast_cancer(return_X_y=True)
|
|
52
|
+
model = RandomForestClassifier(**params, random_state=0, n_jobs=1)
|
|
53
|
+
return -cross_val_score(model, X, y, cv=3).mean() # lower is better
|
|
54
|
+
|
|
55
|
+
result = HDGPSO(space, objective, population_size=10,
|
|
56
|
+
iterations=15, seed=0).optimize()
|
|
57
|
+
print(result.best_params)
|
|
58
|
+
print(f"Best loss: {result.best_loss:.4f}")
|
|
59
|
+
```
|
|
60
|
+
|
|
61
|
+
In typical use, the caller only specifies the search space, the objective function, and the population and iteration budget. The remaining parameters default to values from the original papers, with the inertia schedule as the one deliberate deviation:
|
|
62
|
+
|
|
63
|
+
- For DE: `F = 0.8` and `CR = 0.5`, which sit inside the Storn–Price recommended range.
|
|
64
|
+
- For PSO: `c1 = c2 = 2.0`, which is the classical Kennedy–Eberhart setting.
|
|
65
|
+
- The inertia weight decays linearly from 0.7 to 0.4. This is slightly tighter than the canonical 0.9 to 0.4, because the preceding DE and GWO stages already supply enough exploration and PSO is used here mainly for refinement.
|
|
66
|
+
- The RandomForest surrogate refits every 4 iterations once at least 12 trial points are available.
|
|
67
|
+
|
|
68
|
+
All of these are exposed as constructor arguments. The complete list is available through `help(HDGPSO)`.
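The linear inertia decay listed above can be sketched as a simple interpolation. The function below is illustrative only: the name `inertia_weight`, the zero-based iteration index, and reaching 0.4 exactly on the last iteration are assumptions, not details confirmed by the package.

```python
def inertia_weight(iteration, total_iterations, w_start=0.7, w_end=0.4):
    """Linearly interpolate the inertia weight from w_start to w_end."""
    if total_iterations <= 1:
        return w_start
    frac = iteration / (total_iterations - 1)  # 0.0 at the first iteration, 1.0 at the last
    return w_start + (w_end - w_start) * frac


# Schedule for a 15-iteration run, matching the Quickstart budget.
schedule = [inertia_weight(t, 15) for t in range(15)]
```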
|
|
69
|
+
|
|
70
|
+
## What the algorithm does
|
|
71
|
+
|
|
72
|
+
Each iteration of HDGPSO runs three operator stages on the same population.
|
|
73
|
+
|
|
74
|
+
First, Differential Evolution. For each candidate, three other population members are chosen at random and combined into a donor vector using the DE/rand/1 rule. Binomial crossover is then applied between the donor and the current candidate, and the resulting trial replaces the candidate only when it improves the loss.
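A minimal NumPy sketch of this stage, assuming candidates are encoded as vectors scaled to [0, 1], is shown below. The function `de_rand_1_bin`, the toy `sphere` loss, and the clipping step are illustrative assumptions and are not taken from `core.py`.

```python
import numpy as np


def de_rand_1_bin(pop, target, F=0.8, CR=0.5, rng=None):
    """Build one DE/rand/1 trial vector with binomial crossover (illustrative sketch)."""
    rng = np.random.default_rng() if rng is None else rng
    n, dim = pop.shape
    # Three distinct members, all different from the target.
    others = [i for i in range(n) if i != target]
    r1, r2, r3 = rng.choice(others, size=3, replace=False)
    donor = pop[r1] + F * (pop[r2] - pop[r3])
    # Binomial (per-coordinate) crossover: take each coordinate from the donor with
    # probability CR, and force one donor coordinate so the donor always contributes.
    mask = rng.random(dim) < CR
    mask[rng.integers(dim)] = True
    trial = np.where(mask, donor, pop[target])
    return np.clip(trial, 0.0, 1.0)  # assumes coordinates are scaled to [0, 1]


def sphere(x):
    """Toy loss: squared distance from the centre of the unit cube."""
    return float(np.sum((x - 0.5) ** 2))


rng = np.random.default_rng(0)
pop = rng.random((6, 3))
loss_before = sphere(pop[0])
trial = de_rand_1_bin(pop, target=0, rng=rng)
# Greedy selection: keep the trial only if it lowers the loss.
if sphere(trial) < loss_before:
    pop[0] = trial
```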
|
|
75
|
+
|
|
76
|
+
Second, Grey Wolf Optimizer. The population is sorted by current loss, and the top three members are labeled α, β, and δ. Every other member is updated using a weighted blend of these three leaders, following the standard Mirjalili (2014) update.
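The leader-guided update can be sketched as follows, using the standard GWO equations and leaving the three leaders unchanged as described above. The function name `gwo_step`, the fixed value of the control parameter `a`, and the [0, 1] clipping are assumptions for illustration; the package may schedule `a` and handle bounds differently.

```python
import numpy as np


def gwo_step(pop, losses, a, rng):
    """One standard GWO position update (Mirjalili et al., 2014); sketch only."""
    order = np.argsort(losses)      # ascending: lowest loss first
    leaders = pop[order[:3]]        # alpha, beta, delta
    new_pop = pop.copy()
    for i in order[3:]:             # the leaders themselves are left untouched here
        estimates = []
        for leader in leaders:
            A = 2.0 * a * rng.random(pop.shape[1]) - a
            C = 2.0 * rng.random(pop.shape[1])
            D = np.abs(C * leader - pop[i])
            estimates.append(leader - A * D)
        # Equal-weight average of the three leader-guided estimates.
        new_pop[i] = np.mean(estimates, axis=0)
    return np.clip(new_pop, 0.0, 1.0)  # assumes coordinates are scaled to [0, 1]


rng = np.random.default_rng(0)
pop = rng.random((6, 2))
losses = np.sum((pop - 0.5) ** 2, axis=1)  # toy loss for illustration
new_pop = gwo_step(pop, losses, a=1.0, rng=rng)
```

In the standard algorithm, `a` decreases linearly from 2 to 0 over the run, which shifts the update from exploration toward exploitation.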
|
|
77
|
+
|
|
78
|
+
Third, Particle Swarm Optimization. Each particle maintains its personal best position, and the swarm keeps a global best. The velocity is updated using both terms, scaled by a linearly decaying inertia weight, and the particle moves accordingly.
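A compact sketch of the velocity and position update is given below. `pso_step` is a hypothetical name, and the absence of velocity clamping and the [0, 1] position clipping are simplifying assumptions rather than documented behavior of the package.

```python
import numpy as np


def pso_step(pos, vel, pbest, gbest, w, c1=2.0, c2=2.0, rng=None):
    """One PSO update: inertia term plus pulls toward personal and global bests."""
    rng = np.random.default_rng() if rng is None else rng
    r1 = rng.random(pos.shape)
    r2 = rng.random(pos.shape)
    new_vel = w * vel + c1 * r1 * (pbest - pos) + c2 * r2 * (gbest - pos)
    new_pos = np.clip(pos + new_vel, 0.0, 1.0)  # assumes coordinates are scaled to [0, 1]
    return new_pos, new_vel


rng = np.random.default_rng(0)
pos = rng.random((5, 2))
vel = np.zeros_like(pos)
pbest = pos.copy()   # every particle starts at its own personal best
gbest = pos[0]       # pretend particle 0 currently holds the global best
new_pos, new_vel = pso_step(pos, vel, pbest, gbest, w=0.7, rng=rng)
```

Particle 0 does not move in this example because it sits on both its personal best and the global best and starts with zero velocity; the other particles are pulled toward `gbest`.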
|
|
79
|
+
|
|
80
|
+
Alongside the three stages, a RandomForest surrogate is fit on the trial history and refit periodically (every 4 iterations by default, once at least 12 trial points are available). The surrogate only filters proposed candidates: it predicts the mean and tree-variance of each proposal and keeps those with the best lower-confidence-bound score. It never replaces a real objective evaluation; its only role is to screen out clearly weak proposals before they reach the expensive objective.
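The screening step can be illustrated with a lower-confidence-bound filter over per-tree predictions. This is a sketch under stated assumptions: the function `lcb_screen`, the `kappa` weight, and the use of the standard deviation across trees are illustrative choices, and the prediction matrix is made up. With scikit-learn's `RandomForestRegressor`, such a matrix can be built by calling `predict` on each member of the fitted forest's `estimators_`.

```python
import numpy as np


def lcb_screen(tree_preds, keep, kappa=1.0):
    """Return indices of the `keep` candidates with the lowest LCB score.

    tree_preds has shape (n_trees, n_candidates): one predicted loss per tree.
    """
    mean = tree_preds.mean(axis=0)
    std = tree_preds.std(axis=0)   # spread across trees as an uncertainty proxy
    lcb = mean - kappa * std       # optimistic estimate for a minimization problem
    return np.argsort(lcb)[:keep]


# Three trees, four candidates (made-up predictions).
tree_preds = np.array([
    [0.30, 0.20, 0.50, 0.10],
    [0.30, 0.40, 0.50, 0.12],
    [0.30, 0.00, 0.50, 0.11],
])
kept = lcb_screen(tree_preds, keep=2)
```

Candidate 1 is kept despite its higher mean than candidate 3 because the trees disagree about it, so its optimistic bound is lower. Only the kept candidates would go on to a real objective evaluation.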
|
|
81
|
+
|
|
82
|
+
## Repository layout
|
|
83
|
+
|
|
84
|
+
```
|
|
85
|
+
hdgpso/
|
|
86
|
+
├── src/hdgpso/ # installable package
|
|
87
|
+
│ ├── __init__.py # public API
|
|
88
|
+
│ ├── core.py # HDGPSO + SearchSpace types
│ ├── multifidelity.py # HDGPSOMF (experimental)
|
|
89
|
+
│ ├── stats.py # Friedman / Nemenyi / CD / bootstrap
|
|
90
|
+
│ └── plots.py # convergence / rank-bar / wins
|
|
91
|
+
├── benchmarks/ # paper reproduction scripts
|
|
92
|
+
│ ├── benchmark.py # main run_benchmark() driver
|
|
93
|
+
│ ├── tuners.py # uniform adapters for the baseline tuners
|
|
94
|
+
│ ├── deep_objectives.py # MLP + PINN-Heat objectives (torch)
|
|
95
|
+
│ ├── run_claim_check_v*.py # main paper run at b = 60
|
|
96
|
+
│ └── run_budget_sweep.py # budget sensitivity sweep
|
|
97
|
+
├── tests/test_hdgpso.py # unit tests
|
|
98
|
+
└── examples/ # standalone usage examples
|
|
99
|
+
```
|
|
100
|
+
|
|
101
|
+
## Reproducing the paper benchmark
|
|
102
|
+
|
|
103
|
+
```bash
|
|
104
|
+
pip install -e ".[benchmarks,deep]"
|
|
105
|
+
cd benchmarks
|
|
106
|
+
|
|
107
|
+
# Main 7-tuner comparison at budget = 60
|
|
108
|
+
python run_claim_check_v5.py
|
|
109
|
+
|
|
110
|
+
# Budget sensitivity sweep over {20, 40, 60, 100}
|
|
111
|
+
python run_budget_sweep.py
|
|
112
|
+
```
|
|
113
|
+
|
|
114
|
+
Results are written into `results_*/` directories, which are gitignored.
|
|
115
|
+
|
|
116
|
+
## Statistical analysis (Demšar 2006)
|
|
117
|
+
|
|
118
|
+
The `hdgpso.stats` module implements the rank-based protocol used in the paper for comparing several tuners across several cells.
|
|
119
|
+
|
|
120
|
+
```python
|
|
121
|
+
import pandas as pd
|
|
122
|
+
from hdgpso.stats import (
|
|
123
|
+
friedman_test, nemenyi_matrix, cd_diagram,
|
|
124
|
+
hdgpso_vs_baselines_table, bootstrap_rank_ci, cliffs_delta,
|
|
125
|
+
)
|
|
126
|
+
|
|
127
|
+
summary = pd.read_csv("results/summary.csv")
|
|
128
|
+
print(friedman_test(summary)) # global rejection check
|
|
129
|
+
print(bootstrap_rank_ci(summary)) # 95% CI per tuner
|
|
130
|
+
print(hdgpso_vs_baselines_table(summary, target="HDGPSO"))
|
|
131
|
+
cd_diagram(summary, save_path="fig_cd.png", title="My benchmark")
|
|
132
|
+
```
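To make the rank-based protocol concrete, the sketch below computes mean ranks and the Friedman chi-square statistic by hand with NumPy. It is independent of `hdgpso.stats`, and the loss table is made up; the layout (rows as cells, columns as tuners) is an assumption for illustration and may differ from the `summary.csv` schema.

```python
import numpy as np

# Rows are benchmark cells, columns are tuners; entries are made-up losses (lower is better).
losses = np.array([
    [0.10, 0.12, 0.15],
    [0.20, 0.25, 0.22],
    [0.05, 0.07, 0.09],
    [0.30, 0.28, 0.35],
])
n_cells, n_tuners = losses.shape

# Rank tuners within each cell (1 = best). Double argsort is valid only without ties.
ranks = losses.argsort(axis=1).argsort(axis=1) + 1
mean_ranks = ranks.mean(axis=0)

# Friedman chi-square statistic computed from the mean ranks.
chi2 = 12 * n_cells / (n_tuners * (n_tuners + 1)) * (
    np.sum(mean_ranks ** 2) - n_tuners * (n_tuners + 1) ** 2 / 4
)
```

The statistic is compared against a chi-square distribution with `n_tuners - 1` degrees of freedom. Post-hoc tests such as Nemenyi are meaningful only if this global test rejects the hypothesis that all tuners rank equally.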
|
|
133
|
+
|
|
134
|
+
## Tests
|
|
135
|
+
|
|
136
|
+
```bash
|
|
137
|
+
pip install -e ".[dev]"
|
|
138
|
+
pytest -v
|
|
139
|
+
```
|
|
140
|
+
|
|
141
|
+
## Experimental: HDGPSOMF
|
|
142
|
+
|
|
143
|
+
The package also includes an experimental multi-fidelity variant, called HDGPSOMF, which wraps HDGPSO with BOHB-style successive halving over fidelity units. This variant is not part of the published headline results. On the benchmark mix used in this study it did not show a consistent advantage. The main reason is that the sklearn models are already inexpensive to train at full fidelity, and the low-fidelity proxies introduced enough noise to flip candidate rankings.
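For orientation, a generic successive-halving loop is sketched below. It is not the `HDGPSOMF` implementation: the function name, the reduction factor `eta`, the rung schedule, and the toy objective are all assumptions, and the coupling with the HDGPSO operators and the surrogate is omitted.

```python
def successive_halving(candidates, evaluate, min_fidelity, max_fidelity, eta=3):
    """Keep the best 1/eta of the candidates at each rung, raising fidelity by eta."""
    survivors = list(candidates)
    fidelity = min_fidelity
    while True:
        scored = sorted(survivors, key=lambda c: evaluate(c, fidelity))
        if fidelity >= max_fidelity or len(scored) == 1:
            return scored[0]
        survivors = scored[: max(1, len(scored) // eta)]
        fidelity = min(max_fidelity, fidelity * eta)


# Toy objective: the loss is |x - 0.3| plus a bias that shrinks as fidelity grows.
def toy_loss(x, fidelity):
    return abs(x - 0.3) + 0.01 / fidelity


grid = [0.0, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8]
best = successive_halving(grid, toy_loss, min_fidelity=1, max_fidelity=9)
```

Here the low-fidelity bias is identical for every candidate, so rankings are preserved across rungs. The ranking flips described above arise when low-fidelity noise differs between candidates.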
|
|
144
|
+
|
|
145
|
+
The variant may be more useful in settings where fidelity (for example, the number of training epochs) is exact and informative, and where each full-fidelity evaluation is genuinely expensive. Usage is shown in `examples/02_multifidelity.py`, and the interface is documented in `help(hdgpso.HDGPSOMF)`. The API is not yet stable.
|
|
146
|
+
|
|
147
|
+
## References
|
|
148
|
+
|
|
149
|
+
- Demšar, J. (2006). *Statistical Comparisons of Classifiers over Multiple Data Sets.* JMLR 7.
|
|
150
|
+
- Mirjalili, S., Mirjalili, S. M., & Lewis, A. (2014). *Grey Wolf Optimizer.* Adv. Eng. Software 69.
|
|
151
|
+
- Storn, R. & Price, K. (1997). *Differential Evolution.* J. Global Optimization 11.
|
|
152
|
+
- Kennedy, J. & Eberhart, R. (1995). *Particle Swarm Optimization.* Proc. IEEE Int. Conf. on Neural Networks (ICNN'95).
|
|
153
|
+
|
|
154
|
+
## Citation
|
|
155
|
+
|
|
156
|
+
See [CITATION.cff](CITATION.cff). BibTeX entry:
|
|
157
|
+
|
|
158
|
+
```bibtex
|
|
159
|
+
@software{kumar2026hdgpso,
|
|
160
|
+
author = {Ashutosh Kumar},
|
|
161
|
+
title = {hdgpso: Hybrid DE-GWO-PSO hyperparameter optimization},
|
|
162
|
+
year = {2026},
|
|
163
|
+
url = {https://github.com/ashuxen/hdgpso},
|
|
164
|
+
license = {MIT}
|
|
165
|
+
}
|
|
166
|
+
```
|
|
167
|
+
|
|
168
|
+
## License
|
|
169
|
+
|
|
170
|
+
MIT. See [LICENSE](LICENSE).
|