DFO-LS 1.4.1__py3-none-any.whl → 1.5.0__py3-none-any.whl
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Potentially problematic release: this version of DFO-LS has been flagged as possibly problematic.
- {DFO_LS-1.4.1.dist-info → DFO_LS-1.5.0.dist-info}/METADATA +14 -34
- DFO_LS-1.5.0.dist-info/RECORD +14 -0
- {DFO_LS-1.4.1.dist-info → DFO_LS-1.5.0.dist-info}/WHEEL +1 -1
- dfols/__init__.py +1 -1
- dfols/controller.py +136 -45
- dfols/model.py +46 -29
- dfols/params.py +18 -2
- dfols/solver.py +86 -58
- dfols/trust_region.py +86 -7
- dfols/util.py +20 -9
- DFO_LS-1.4.1.dist-info/RECORD +0 -14
- {DFO_LS-1.4.1.dist-info → DFO_LS-1.5.0.dist-info}/LICENSE.txt +0 -0
- {DFO_LS-1.4.1.dist-info → DFO_LS-1.5.0.dist-info}/top_level.txt +0 -0
{DFO_LS-1.4.1.dist-info → DFO_LS-1.5.0.dist-info}/METADATA CHANGED

@@ -1,6 +1,6 @@
 Metadata-Version: 2.1
 Name: DFO-LS
-Version: 1.4.1
+Version: 1.5.0
 Summary: A flexible derivative-free solver for (bound constrained) nonlinear least-squares minimization
 Author-email: Lindon Roberts <lindon.roberts@sydney.edu.au>
 Maintainer-email: Lindon Roberts <lindon.roberts@sydney.edu.au>

@@ -68,7 +68,7 @@ DFO-LS: Derivative-Free Optimizer for Least-Squares
 DFO-LS is a flexible package for solving nonlinear least-squares minimization, without requiring derivatives of the objective. It is particularly useful when evaluations of the objective function are expensive and/or noisy. DFO-LS is more flexible version of `DFO-GN <https://github.com/numericalalgorithmsgroup/dfogn>`_.
-
+The main algorithm is described in our paper [1] below.
 
 If you are interested in solving general optimization problems (without a least-squares structure), you may wish to try `Py-BOBYQA <https://github.com/numericalalgorithmsgroup/pybobyqa>`_, which has many of the same features as DFO-LS.

@@ -78,13 +78,15 @@ See manual.pdf or `here <https://numericalalgorithmsgroup.github.io/dfols/>`_.
 Citation
 --------
-
+The development of DFO-LS is outlined over several publications:
 
-
+1. C Cartis, J Fiala, B Marteau and L Roberts, `Improving the Flexibility and Robustness of Model-Based Derivative-Free Optimization Solvers <https://doi.org/10.1145/3338517>`_, *ACM Transactions on Mathematical Software*, 45:3 (2019), pp. 32:1-32:41 [`preprint arXiv 1804.00154 <https://arxiv.org/abs/1804.00154>`_].
+2. M Hough and L Roberts, `Model-Based Derivative-Free Methods for Convex-Constrained Optimization <https://doi.org/10.1137/21M1460971>`_, *SIAM Journal on Optimization*, 21:4 (2022), pp. 2552-2579 [`preprint arXiv 2111.05443 <https://arxiv.org/abs/2111.05443>`_].
+3. Y Liu, K H Lam and L Roberts, `Black-box Optimization Algorithms for Regularized Least-squares Problems <http://arxiv.org/abs/2407.14915>`_, *arXiv preprint arXiv:2407.14915*, 2024.
 
-If you use DFO-LS
-
-
+If you use DFO-LS in a paper, please cite [1].
+If your problem has constraints, including bound constraints, please cite [1,2].
+If your problem includes a regularizer, please cite [1,3].
 
 Requirements
 ------------

@@ -114,27 +116,13 @@ For easy installation, use `pip <http://www.pip-installer.org/>`_ as root:
 .. code-block:: bash
 
-    $
-
-or alternatively *easy_install*:
-
-.. code-block:: bash
-
-    $ [sudo] easy_install DFO-LS
-
-If you do not have root privileges or you want to install DFO-LS for your private use, you can use:
-
-.. code-block:: bash
-
-    $ pip install --user DFO-LS
-
-which will install DFO-LS in your home directory.
+    $ pip install DFO-LS
 
 Note that if an older install of DFO-LS is present on your system you can use:
 
 .. code-block:: bash
 
-    $
+    $ pip install --upgrade DFO-LS
 
 to upgrade DFO-LS to the latest version.

@@ -151,22 +139,14 @@ DFO-LS is written in pure Python and requires no compilation. It can be installed...
 .. code-block:: bash
 
-    $
-
-If you do not have root privileges or you want to install DFO-LS for your private use, you can use:
-
-.. code-block:: bash
-
-    $ pip install --user .
-
-instead.
+    $ pip install .
 
 To upgrade DFO-LS to the latest version, navigate to the top-level directory (i.e. the one containing :code:`pyproject.toml`) and rerun the installation using :code:`pip`, as above:
 
 .. code-block:: bash
 
     $ git pull
-    $
+    $ pip install .
 
 Testing
 -------

@@ -189,7 +169,7 @@ If DFO-LS was installed using *pip* you can uninstall as follows:
 .. code-block:: bash
 
-    $
+    $ pip uninstall DFO-LS
 
 If DFO-LS was installed manually you have to remove the installed files by hand (located in your python site-packages directory).
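The regularizer-related changes in this release (entry [3] in the citation list above, and the new `h`, `lh`, `prox_uh`, `argsh`, `argsprox` arguments threaded through `solve`, `Controller` and `Model` below) add support for objectives of the form min ||r(x)||² + h(x). A minimal usage sketch under stated assumptions: the keyword names are taken from the `solve` signature in this diff, `prox_uh(x, u)` is assumed to evaluate prox_{u·h}(x), `lh` is assumed to be a strictly positive Lipschitz-type constant for h (it feeds the `tau` computation later in the diff), and the data and regularization weight are illustrative only.

.. code-block:: python

    import numpy as np
    import dfols

    # Toy regularized least-squares (LASSO-style) problem: min ||A x - b||^2 + lam*||x||_1
    A = np.array([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]])
    b = np.array([1.0, 2.0, 3.0])
    lam = 0.1

    objfun = lambda x: A.dot(x) - b                     # residual vector r(x)
    h = lambda x: lam * np.sum(np.abs(x))               # regularizer h(x)
    prox_uh = lambda x, u: np.sign(x) * np.maximum(np.abs(x) - u * lam, 0.0)  # prox_{u*h}(x)

    x0 = np.zeros(2)
    lh = lam * np.sqrt(len(x0))   # Lipschitz constant of h w.r.t. the 2-norm; must be > 0

    soln = dfols.solve(objfun, x0, h=h, lh=lh, prox_uh=prox_uh)
    print(soln.x, soln.obj)       # per the solver.py diff, OptimResults stores the objective as .obj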
DFO_LS-1.5.0.dist-info/RECORD (new file)

@@ -0,0 +1,14 @@
+dfols/__init__.py,sha256=nMJ4G3JcmjQ82lYXV2ywxjHWQqd9nq7Ak6GIlrN70Tw,1605
+dfols/controller.py,sha256=gz4yGpk8KyfsWxrAkI8y69K5ckSHZ3Xdq0fEVFtIcPk,49925
+dfols/diagnostic_info.py,sha256=2kEUkL-MS4eDENUf1r2hOWsntP8OxMDKi_kyHmrC9V4,6081
+dfols/hessian.py,sha256=sExx4J4KoGwHItbthX2odosB2ONbQFvLdlcod7PIh4k,4262
+dfols/model.py,sha256=i-TcGNFAeYt4uu3R_-THTk2rOCDvgU_mcZQQXfE1ODA,19786
+dfols/params.py,sha256=GzJGO0TByH1X3B0NbLOCOqmYG8dRiKPKjjX7or_fOqI,18342
+dfols/solver.py,sha256=QUF84UYnSitvlpVssKLdcMF9e_zdA9qlZlg5e8IegeQ,63173
+dfols/trust_region.py,sha256=JbHLBDw7H88a3cIMuialh7kpMNGjL3Lp9JsjrBNpDWQ,28231
+dfols/util.py,sha256=efGVAKPb7YrHya1IOgyzacwa_h0u2jHHs5FhuxUlYDg,10282
+DFO_LS-1.5.0.dist-info/LICENSE.txt,sha256=jOtLnuWt7d5Hsx6XXB2QxzrSe2sWWh3NgMfFRetluQM,35147
+DFO_LS-1.5.0.dist-info/METADATA,sha256=JIQNs15kBtVr5_cA7JnXDbT-uQ06pqTd3RD_MRYmB7w,8069
+DFO_LS-1.5.0.dist-info/WHEEL,sha256=cVxcB9AmuTcXqmwrtPhNK88dr7IR_b6qagTj0UvIEbY,91
+DFO_LS-1.5.0.dist-info/top_level.txt,sha256=UfxRhaDN8HQx2_l17KbrDrERJ90OCN7VKkDMpYYbRLU,6
+DFO_LS-1.5.0.dist-info/RECORD,,
dfols/__init__.py CHANGED

dfols/controller.py CHANGED
@@ -100,14 +100,19 @@ class ExitInformation(object):
 
 class Controller(object):
-    def __init__(self, objfun,
-                 scaling_changes, do_logging):
+    def __init__(self, objfun, argsf, x0, r0, r0_nsamples, xl, xu, projections, npt, rhobeg, rhoend, nf, nx, maxfun, params,
+                 scaling_changes, do_logging, h=None, lh=None, argsh = (), prox_uh=None, argsprox = ()):
         self.do_logging = do_logging
         self.objfun = objfun
-        self.
+        self.h = h
+        self.argsf = argsf
+        self.argsh = argsh
+        self.lh = lh
+        self.prox_uh = prox_uh #TODO: add instruction for prox_uh
+        self.argsprox = argsprox
         self.maxfun = maxfun
-        self.model = Model(npt, x0, r0, xl, xu, projections, r0_nsamples, precondition=params("interpolation.precondition"),
-                           abs_tol = params("model.abs_tol"), rel_tol = params("model.rel_tol"), do_logging=do_logging)
+        self.model = Model(npt, x0, r0, xl, xu, projections, r0_nsamples, h=self.h, argsh = argsh, precondition=params("interpolation.precondition"),
+                           abs_tol = params("model.abs_tol"), rel_tol = params("model.rel_tol"), do_logging=do_logging, scaling_changes=scaling_changes)
         self.nf = nf
         self.nx = nx
         self.rhobeg = rhobeg

@@ -230,7 +235,7 @@ class Controller(object):
         for k in range(0,self.n()):
             # Evaluate objective at this new point
             x = self.model.as_absolute_coordinates(D[k, :])
-            rvec_list,
+            rvec_list, obj_list, num_samples_run, exit_info = self.evaluate_objective(x, number_of_samples, params)
 
             # Handle exit conditions (f < min obj value or maxfun reached)
             if exit_info is not None:

@@ -289,7 +294,7 @@ class Controller(object):
 
             # Evaluate objective at this new point
             x = self.model.as_absolute_coordinates(xpts_added[k, :])
-            rvec_list,
+            rvec_list, obj_list, num_samples_run, exit_info = self.evaluate_objective(x, number_of_samples, params)
 
             # Handle exit conditions (f < min obj value or maxfun reached)
             if exit_info is not None:

@@ -309,7 +314,7 @@ class Controller(object):
             # Note: this works because the steps for (k) and (k-n) points were in the same coordinate direction
             if self.n() + 1 <= k < 2 * self.n() + 1:
                 # Only swap if steps were in different directions AND new pt has lower objective
-                if stepa * stepb < 0.0 and self.model.
+                if stepa * stepb < 0.0 and self.model.objval[k] < self.model.objval[k - self.n()]:
                     xpts_added[[k, k-self.n()]] = xpts_added[[k-self.n(), k]]
 
         return None  # return & continue

@@ -342,7 +347,7 @@ class Controller(object):
         for ndirns in range(num_directions):
             new_point = xopt + dirns[ndirns, :]  # alway base move around best value so far
             x = self.model.as_absolute_coordinates(new_point)
-            rvec_list,
+            rvec_list, obj_list, num_samples_run, exit_info = eval_obj_results[ndirns]
             # Handle exit conditions (f < min obj value or maxfun reached)
             if exit_info is not None:
                 if num_samples_run > 0:

@@ -361,7 +366,7 @@ class Controller(object):
 
             # Evaluate objective
             x = self.model.as_absolute_coordinates(new_point)
-            rvec_list,
+            rvec_list, obj_list, num_samples_run, exit_info = self.evaluate_objective(x, number_of_samples, params)
 
             # Handle exit conditions (f < min obj value or maxfun reached)
             if exit_info is not None:

@@ -398,7 +403,7 @@ class Controller(object):
         for j in range(num_steps):
             xnew = self.model.xopt() + (step_length / LA.norm(dirns[j, :])) * dirns[j, :]
             x = self.model.as_absolute_coordinates(xnew)
-            rvec_list,
+            rvec_list, obj_list, num_samples_run, exit_info = self.evaluate_objective(x, number_of_samples, params)
 
             # Handle exit conditions (f < min obj value or maxfun reached)
             if exit_info is not None:
@@ -436,13 +441,85 @@ class Controller(object):
 
         return dirn * (step_length / LA.norm(dirn))
 
-    def
-    #
+    def evaluate_criticality_measure(self, params):
+        # Calculate criticality measure for regularized problems (h is not None)
+
+        # Build model for full least squares function
         gopt, H = self.model.build_full_model()
+
+        if np.any(np.isnan(gopt)) or np.any(np.isnan(H)) or not np.all(np.isfinite(gopt)) or not np.all(np.isfinite(H)):
+            module_logger.debug("nan/inf values in gopt and/or H, skipping ctrsbox_sfista (criticality measure calc)")
+            # d = np.zeros(gopt.shape)
+            # gnew = gopt.copy()
+            # crvmin = -1
+            return np.inf
+
+        # NOTE: smaller params here to get more iterations in S-FISTA
+        func_tol = params("func_tol.criticality_measure") * self.delta
         if self.model.projections:
-            d, gnew, crvmin =
+            d, gnew, crvmin = ctrsbox_sfista(self.model.xopt(abs_coordinates=True), gopt, np.zeros(H.shape), self.model.projections, 1,
+                                             self.h, self.lh, self.prox_uh, argsh = self.argsh, argsprox=self.argsprox, func_tol=func_tol,
+                                             max_iters=params("func_tol.max_iters"), d_max_iters=params("dykstra.max_iters"), d_tol=params("dykstra.d_tol"),
+                                             scaling_changes=self.scaling_changes, sfista_iters_scale=params("sfista.max_iters_scaling"))
+        else:
+            proj = lambda x: pbox(x, self.model.sl, self.model.su)
+            d, gnew, crvmin = ctrsbox_sfista(self.model.xopt(abs_coordinates=True), gopt, np.zeros(H.shape), [proj], 1,
+                                             self.h, self.lh, self.prox_uh, argsh = self.argsh, argsprox=self.argsprox, func_tol=func_tol,
+                                             max_iters=params("func_tol.max_iters"), d_max_iters=params("dykstra.max_iters"), d_tol=params("dykstra.d_tol"),
+                                             scaling_changes=self.scaling_changes, sfista_iters_scale=params("sfista.max_iters_scaling"))
+
+        # Calculate criticality measure
+        criticality_measure = self.h(remove_scaling(self.model.xopt(abs_coordinates=True), self.scaling_changes), *self.argsh) - model_value(gopt, np.zeros(H.shape), d, self.model.xopt(abs_coordinates=True), self.h, self.argsh, self.scaling_changes)
+        return criticality_measure
+
+    def trust_region_step(self, params, criticality_measure=1e-2):
+        # Build model for full least squares function
+        gopt, H = self.model.build_full_model()
+        # Build func_tol for trust region step
+        # QUESTION: c1 = min{1, 1/delta_max^2}, but choose c1=1 here; choose maxhessian = max(||H||_2,1)
+        # QUESTION: when criticality_measure = 0? choose max(criticality_measure,1)
+        func_tol = (1-params("func_tol.tr_step")) * 1 * max(criticality_measure,1) * min(self.delta, max(criticality_measure,1) / max(np.linalg.norm(H, 2),1))
+
+        if self.h is None:
+            if self.model.projections:
+                # Running PGD/SFISTA is generally slower than trsbox, so don't do this if gopt or H have bad values
+                # (this will ultimately lead to a manual setting of d=0 and calling a safety step anyway)
+                if np.any(np.isnan(gopt)) or np.any(np.isnan(H)) or not np.all(np.isfinite(gopt)) or not np.all(np.isfinite(H)):
+                    module_logger.debug("nan/inf values in gopt and/or H, skipping ctrsbox_pgd")
+                    d = np.zeros(gopt.shape)
+                    gnew = gopt.copy()
+                    crvmin = -1
+                else:
+                    d, gnew, crvmin = ctrsbox_pgd(self.model.xopt(abs_coordinates=True), gopt, H, self.model.projections, self.delta, d_max_iters=params("dykstra.max_iters"), d_tol=params("dykstra.d_tol"))
+            else:
+                d, gnew, crvmin = trsbox(self.model.xopt(), gopt, H, self.model.sl, self.model.su, self.delta)
         else:
-
+            # Running PGD/SFISTA is generally slower than trsbox, so don't do this if gopt or H have bad values
+            # (this will ultimately lead to a manual setting of d=0 and calling a safety step anyway)
+            if np.any(np.isnan(gopt)) or np.any(np.isnan(H)) or not np.all(np.isfinite(gopt)) or not np.all(np.isfinite(H)):
+                module_logger.debug("nan/inf values in gopt and/or H, skipping ctrsbox_sfista")
+                d = np.zeros(gopt.shape)
+                gnew = gopt.copy()
+                crvmin = -1
+            elif self.model.projections:
+                d, gnew, crvmin = ctrsbox_sfista(self.model.xopt(abs_coordinates=True), gopt, H, self.model.projections, self.delta,
+                                                 self.h, self.lh, self.prox_uh, argsh = self.argsh, argsprox=self.argsprox, func_tol=func_tol,
+                                                 max_iters=params("func_tol.max_iters"), d_max_iters=params("dykstra.max_iters"), d_tol=params("dykstra.d_tol"),
+                                                 scaling_changes=self.scaling_changes, sfista_iters_scale=params("sfista.max_iters_scaling"))
+            else:
+                # NOTE: alternative way if using trsbox
+                # d, gnew, crvmin = trsbox(self.model.xopt(), gopt, H, self.model.sl, self.model.su, self.delta)
+                proj = lambda x: pbox(x, self.model.sl, self.model.su)
+                d, gnew, crvmin = ctrsbox_sfista(self.model.xopt(abs_coordinates=True), gopt, H, [proj], self.delta,
+                                                 self.h, self.lh, self.prox_uh, argsh = self.argsh, argsprox=self.argsprox, func_tol=func_tol,
+                                                 max_iters=params("func_tol.max_iters"), d_max_iters=params("dykstra.max_iters"), d_tol=params("dykstra.d_tol"),
+                                                 scaling_changes=self.scaling_changes, sfista_iters_scale=params("sfista.max_iters_scaling"))
+
+            # NOTE: check sufficient decrease. If increase in the model, set zero step
+            pred_reduction = self.h(remove_scaling(self.model.xopt(abs_coordinates=True), self.scaling_changes), *self.argsh) - model_value(gopt, H, d, self.model.xopt(abs_coordinates=True), self.h, self.argsh, self.scaling_changes)
+            if pred_reduction < 0.0:
+                d = np.zeros(d.shape)
+
         return d, gopt, H, gnew, crvmin
 
     def geometry_step(self, knew, adelt, number_of_samples, params):
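Written out, the quantities that the new `evaluate_criticality_measure` and `trust_region_step` methods manipulate are the following (a sketch implied by the calls above; the exact constants are controlled by the `func_tol.*` and `sfista.*` parameters added in `params.py`):

.. math::

    f(x) = \|r(x)\|_2^2 + h(x), \qquad
    m_k(d) = g_k^\top d + \tfrac{1}{2} d^\top H_k d + h(x_k + d),

    \eta(x_k) \approx h(x_k) \;-\; \min_{\|d\| \le 1,\; x_k + d \in C} \big[\, g_k^\top d + h(x_k + d) \,\big],

where C is the feasible set described by the projections. The criticality measure η comes from the S-FISTA solve with H replaced by zero and trust-region radius 1, and the sufficient-decrease guard at the end of `trust_region_step` discards any step with h(x_k) − m_k(d) < 0 (the step is reset to d = 0, which ultimately triggers a safety step).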
@@ -463,10 +540,10 @@ class Controller(object):
             return exit_info  # didn't fix geometry - return & quit
 
         gopt, H = self.model.build_full_model()  # save here, to calculate predicted value from geometry step
-
+        objopt = self.model.objopt()  # again, evaluate now, before model.change_point()
         d = xnew - self.model.xopt()
         x = self.model.as_absolute_coordinates(xnew)
-        rvec_list,
+        rvec_list, obj_list, num_samples_run, exit_info = self.evaluate_objective(x, number_of_samples, params)
 
         # Handle exit conditions (f < min obj value or maxfun reached)
         if exit_info is not None:

@@ -481,11 +558,14 @@ class Controller(object):
             self.model.add_new_sample(knew, rvec_extra=rvec_list[i, :])
 
         # Estimate actual reduction to add to diffs vector
-
-
+        obj = sumsq(np.mean(rvec_list[:num_samples_run, :], axis=0))  # estimate actual objective value
         # pred_reduction = - calculate_model_value(gopt, H, d)
         pred_reduction = - model_value(gopt, H, d)
-
+        if self.h is not None:
+            obj += self.h(remove_scaling(x, self.scaling_changes), *self.argsh)
+            # since m(0) = h(x)
+            pred_reduction = self.h(remove_scaling(x, self.scaling_changes), *self.argsh) - model_value(gopt, H, d, x, self.h, self.argsh, self.scaling_changes)
+        actual_reduction = objopt - obj
         self.diffs = [abs(pred_reduction - actual_reduction), self.diffs[0], self.diffs[1]]
         return None  # exit_info = None
@@ -513,7 +593,7 @@ class Controller(object):
     def evaluate_objective(self, x, number_of_samples, params):
         # Sample from objective function several times, keeping track of maxfun and min_obj_value throughout
         rvec_list = np.zeros((number_of_samples, self.m()))
-
+        obj_list = np.zeros((number_of_samples,))
         num_samples_run = 0
         incremented_nx = False
         exit_info = None

@@ -527,19 +607,24 @@ class Controller(object):
             if not incremented_nx:
                 self.nx += 1
                 incremented_nx = True
-            rvec_list[i, :],
-
+            rvec_list[i, :], obj_list[i] = eval_least_squares_with_regularisation(self.objfun, remove_scaling(x, self.scaling_changes), self.h,
+                                              argsf=self.argsf, argsh=self.argsh, verbose=self.do_logging, eval_num=self.nf, pt_num=self.nx,
                                               full_x_thresh=params("logging.n_to_print_whole_x_vector"),
-                                              check_for_overflow=params("general.check_objfun_for_overflow")
-                                              verbose=self.do_logging)
+                                              check_for_overflow=params("general.check_objfun_for_overflow"))
             num_samples_run += 1
 
         # Check if the average value was below our threshold
-
-
-
+        # QUESTION: how to choose x in h when using averaged values
+        if self.h is None:
+            if num_samples_run > 0 and \
+                    sumsq(np.mean(rvec_list[:num_samples_run, :], axis=0)) <= self.model.min_objective_value():
+                exit_info = ExitInformation(EXIT_SUCCESS, "Objective is sufficiently small")
+        else:
+            if num_samples_run > 0 and \
+                    sumsq(np.mean(rvec_list[:num_samples_run, :], axis=0)) + self.h(remove_scaling(x, self.scaling_changes), *self.argsh) <= self.model.min_objective_value():
+                exit_info = ExitInformation(EXIT_SUCCESS, "Objective is sufficiently small")
 
-        return rvec_list,
+        return rvec_list, obj_list, num_samples_run, exit_info
 
     def choose_point_to_replace(self, d, skip_kopt=True):
         delsq = self.delta ** 2
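`eval_least_squares_with_regularisation` replaces the previous objective-evaluation helper at every call site in this diff and returns the residual vector together with the scalar objective. A minimal behavioural sketch, assuming only what the call sites above imply (the real helper in DFO-LS also handles logging, evaluation counters and overflow checks, and its exact signature is taken from these calls):

.. code-block:: python

    import numpy as np

    def eval_least_squares_with_regularisation_sketch(objfun, x, h=None, argsf=(), argsh=()):
        # Evaluate r(x) and the scalar objective f(x) = ||r(x)||^2 (+ h(x) if a regularizer is given)
        rvec = np.asarray(objfun(x, *argsf), dtype=float)
        obj = float(np.dot(rvec, rvec))
        if h is not None:
            obj += h(x, *argsh)
        return rvec, obj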
@@ -615,11 +700,18 @@ class Controller(object):
             self.last_successful_iter = current_iter  # reset successful iteration check
             return
 
-    def calculate_ratio(self, current_iter, rvec_list, d, gopt, H):
+    def calculate_ratio(self, x, current_iter, rvec_list, d, gopt, H):
         exit_info = None
-
-
-
+        # estimate actual objective value
+        obj = sumsq(np.mean(rvec_list, axis=0))
+        # pred_reduction = - calculate_model_value(gopt, H, d)
+        pred_reduction = - model_value(gopt, H, d)
+        if self.h is not None:
+            # QUESTION: x+d here correct? rvec_list takes mean value
+            obj += self.h(remove_scaling(x+d, self.scaling_changes), *self.argsh)
+            # since m(0) = h(x)
+            pred_reduction = self.h(remove_scaling(x, self.scaling_changes), *self.argsh) - model_value(gopt, H, d, x, self.h, self.argsh, self.scaling_changes)
+        actual_reduction = self.model.objopt() - obj
         self.diffs = [abs(actual_reduction - pred_reduction), self.diffs[0], self.diffs[1]]
         if min(sqrt(sumsq(d)), self.delta) > self.rho:  # if ||d|| >= rho, successful!
             self.last_successful_iter = current_iter

@@ -627,8 +719,7 @@ class Controller(object):
             if len(self.model.projections) > 1:  # if we are using multiple projections, only warn since likely due to constraint intersection
                 exit_info = ExitInformation(EXIT_TR_INCREASE_WARNING, "Either multiple constraints are active or trust region step gave model increase")
             else:
-                exit_info = ExitInformation(EXIT_TR_INCREASE_ERROR, "
-
+                exit_info = ExitInformation(EXIT_TR_INCREASE_ERROR, "Trust region step gave model increase")
         ratio = actual_reduction / pred_reduction
         return ratio, exit_info
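Compactly, `calculate_ratio` now computes the usual trust-region acceptance ratio with the regularizer folded into both reductions (a restatement of the code above):

.. math::

    \rho_k = \frac{f(x_k) - f(x_k + d_k)}{m_k(0) - m_k(d_k)},
    \qquad
    m_k(0) - m_k(d_k) = h(x_k) - \big[\, g_k^\top d_k + \tfrac{1}{2} d_k^\top H_k d_k + h(x_k + d_k) \,\big],

since m_k(0) = h(x_k) when a regularizer is present; without one the denominator reduces to -model_value(gopt, H, d), exactly as in 1.4.1.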
@@ -636,13 +727,13 @@ class Controller(object):
         if len(self.last_iters_step_taken) <= params("slow.history_for_slow"):
             # Not enough info, simply append
             self.last_iters_step_taken.append(current_iter)
-            self.last_fopts_step_taken.append(self.model.
+            self.last_fopts_step_taken.append(self.model.objopt())
             this_iter_slow = False
         else:
             # Enough info - shift values
             self.last_iters_step_taken = self.last_iters_step_taken[1:] + [current_iter]
-            self.last_fopts_step_taken = self.last_fopts_step_taken[1:] + [self.model.
-            this_iter_slow = (log(self.last_fopts_step_taken[0]) - log(self.model.
+            self.last_fopts_step_taken = self.last_fopts_step_taken[1:] + [self.model.objopt()]
+            this_iter_slow = (log(self.last_fopts_step_taken[0]) - log(self.model.objopt())) / \
                             float(params("slow.history_for_slow")) < params("slow.thresh_for_slow")
         # Update counter of number of slow iterations
         if this_iter_slow:

@@ -659,9 +750,9 @@ class Controller(object):
     def soft_restart(self, number_of_samples, nruns_so_far, params, x_in_abs_coords_to_save=None, rvec_to_save=None,
                      nsamples_to_save=None):
         # A successful run is one where we reduced fopt
-        if self.model.
+        if self.model.objopt() < self.last_run_fopt:
             self.last_successful_run = nruns_so_far
-            self.last_run_fopt = self.model.
+            self.last_run_fopt = self.model.objopt()
 
         ok_to_do_restart = (nruns_so_far - self.last_successful_run < params("restarts.max_unsuccessful_restarts")) and \
                            (self.nf < self.maxfun)

@@ -682,7 +773,7 @@ class Controller(object):
                               self.model.nsamples[self.model.kopt], x_in_abs_coords=True)
 
         if self.do_logging:
-            module_logger.info("Soft restart [currently, f = %g after %g function evals]" % (self.model.
+            module_logger.info("Soft restart [currently, f = %g after %g function evals]" % (self.model.objopt(), self.nf))
         # Resetting method: reset delta and rho, then move the closest 'num_steps' points to xk to improve geometry
         # Note: closest points because we are suddenly increasing delta & rho, so we want to encourage spreading out points
         self.delta = self.rhobeg

@@ -724,7 +815,7 @@ class Controller(object):
         for i in range(num_pts_to_add):
             xnew = self.model.xopt() + dirns[i, :]  # always base move around best value so far
             x = self.model.as_absolute_coordinates(xnew)
-            rvec_list,
+            rvec_list, obj_list, num_samples_run, exit_info = self.evaluate_objective(x, number_of_samples, params)
 
             # Handle exit conditions (f < min obj value or maxfun reached)
             if exit_info is not None:

@@ -771,11 +862,11 @@ class Controller(object):
             add_noise = params("noise.scale_factor_for_quit") * params("noise.additive_noise_level")
             for k in range(self.model.npt()):
                 all_fvals_within_noise = all_fvals_within_noise and \
-                                         (self.model.
+                                         (self.model.objval[k] <= self.model.objopt() + add_noise / sqrt(self.model.nsamples[k]))
         else:  # noise_level_multiplicative
             ratio = 1.0 + params("noise.scale_factor_for_quit") * params("noise.multiplicative_noise_level")
             for k in range(self.model.npt()):
-                this_ratio = self.model.
+                this_ratio = self.model.objval[k] / self.model.objopt()  # fval_opt strictly positive (would have quit o/w)
                 all_fvals_within_noise = all_fvals_within_noise and (
                         this_ratio <= ratio / sqrt(self.model.nsamples[k]))
         return all_fvals_within_noise

@@ -804,7 +895,7 @@ class Controller(object):
             dirns[i, :] = -dirns[i, :]
             xnew = np.maximum(np.minimum(self.model.xopt() + dirns[i, :], self.model.su), self.model.sl)
             x = self.model.as_absolute_coordinates(xnew)
-            rvec_list,
+            rvec_list, obj_list, num_samples_run, exit_info = self.evaluate_objective(x, number_of_samples, params)
 
             # Handle exit conditions (f < min obj value or maxfun reached)
             if exit_info is not None:
dfols/model.py CHANGED

@@ -36,7 +36,7 @@ import numpy as np
 import scipy.linalg as LA
 
 from .trust_region import trsbox_geometry
-from .util import sumsq, dykstra
+from .util import sumsq, dykstra, remove_scaling
 
 __all__ = ['Model']

@@ -44,8 +44,8 @@ module_logger = logging.getLogger(__name__)
 
 class Model(object):
-    def __init__(self, npt, x0, r0, xl, xu, projections, r0_nsamples, n=None, m=None, abs_tol=1e-12, rel_tol=1e-20, precondition=True,
-                 do_logging=True):
+    def __init__(self, npt, x0, r0, xl, xu, projections, r0_nsamples, h=None, argsh=(), n=None, m=None, abs_tol=1e-12, rel_tol=1e-20, precondition=True,
+                 do_logging=True, scaling_changes=None):
         if n is None:
             n = len(x0)
         if m is None:

@@ -56,11 +56,15 @@ class Model(object):
         assert xu.shape == (n,), "xu has wrong shape (got %s, expect (%g,))" % (str(xu.shape), n)
         assert r0.shape == (m,), "r0 has wrong shape (got %s, expect (%g,))" % (str(r0.shape), m)
         self.do_logging = do_logging
+        self.scaling_changes = scaling_changes
         self.dim = n
         self.resid_dim = m
         self.num_pts = npt
         self.npt_so_far = 1  # number of points added so far (with function values)
+        self.h = h
+        self.argsh = argsh
+
         # Initialise to blank some useful stuff
         # Interpolation points
         self.xbase = x0.copy()

@@ -72,12 +76,15 @@ class Model(object):
         # Function values
         self.fval_v = np.inf * np.ones((npt, m))  # residuals for each xpt
         self.fval_v[0, :] = r0.copy()
-
-        self.
+
+        self.objval = np.inf * np.ones((npt, ))  # overall objective value for each xpt
+        self.objval[0] = sumsq(r0)
+        if h is not None:
+            self.objval[0] += h(remove_scaling(x0, self.scaling_changes), *argsh)
         self.kopt = 0  # index of current iterate (should be best value so far)
         self.nsamples = np.zeros((npt,), dtype=int)  # number of samples used to evaluate objective at each point
         self.nsamples[0] = r0_nsamples
-        self.
+        self.objbeg = self.objval[0]  # f(x0), saved to check for sufficient reduction
 
         # Termination criteria
         self.abs_tol = abs_tol

@@ -90,7 +97,7 @@ class Model(object):
         # Saved point (in absolute coordinates) - always check this value before quitting solver
         self.xsave = None
         self.rsave = None
-        self.
+        self.objsave = None
         self.jacsave = None
         self.nsamples_save = None

@@ -118,8 +125,8 @@ class Model(object):
     def ropt(self):
         return self.fval_v[self.kopt, :]  # residuals for current iterate
 
-    def
-        return self.
+    def objopt(self):
+        return self.objval[self.kopt]
 
     def xpt(self, k, abs_coordinates=False):
         assert 0 <= k < self.npt(), "Invalid index %g" % k

@@ -135,9 +142,9 @@ class Model(object):
         assert 0 <= k < self.npt(), "Invalid index %g" % k
         return self.fval_v[k, :]
 
-    def
+    def objval(self, k):
         assert 0 <= k < self.npt(), "Invalid index %g" % k
-        return self.
+        return self.objval[k]
 
     def as_absolute_coordinates(self, x, full_dykstra=False):
         # If x were an interpolation point, get the absolute coordinates of x

@@ -177,18 +184,20 @@ class Model(object):
 
         self.points[k, :] = x.copy()
         self.fval_v[k, :] = rvec.copy()
-        self.
+        self.objval[k] = sumsq(rvec)
+        if self.h is not None:
+            self.objval[k] += self.h(remove_scaling(self.xbase + x, self.scaling_changes), *self.argsh)
         self.nsamples[k] = 1
         self.factorisation_current = False
 
-        if allow_kopt_update and self.
+        if allow_kopt_update and self.objval[k] < self.objopt():
             self.kopt = k
         return
 
     def swap_points(self, k1, k2):
         self.points[[k1, k2], :] = self.points[[k2, k1], :]
         self.fval_v[[k1, k2], :] = self.fval_v[[k2, k1], :]
-        self.
+        self.objval[[k1, k2]] = self.objval[[k2, k1]]
         if self.kopt == k1:
             self.kopt = k2
         elif self.kopt == k2:

@@ -201,22 +210,27 @@ class Model(object):
         assert 0 <= k < self.npt(), "Invalid index %g" % k
         t = float(self.nsamples[k]) / float(self.nsamples[k] + 1)
         self.fval_v[k, :] = t * self.fval_v[k, :] + (1 - t) * rvec_extra
-
+        # NOTE: how to sample when we have h? still at xpt(k), then add h(xpt(k)). Modify test if incorrect!
+        self.objval[k] = sumsq(self.fval_v[k, :])
+        if self.h is not None:
+            self.objval[k] += self.h(remove_scaling(self.xbase + self.points[k, :], self.scaling_changes), *self.argsh)
         self.nsamples[k] += 1
 
-        self.kopt = np.argmin(self.
+        self.kopt = np.argmin(self.objval[:self.npt()])  # make sure kopt is always the best value we have
         return
 
     def add_new_point(self, x, rvec):
         self.points = np.append(self.points, x.reshape((1, self.n())), axis=0)  # append row to xpt
         self.fval_v = np.append(self.fval_v, rvec.reshape((1, self.m())), axis=0)  # append row to fval_v
-
-
+        obj = sumsq(rvec)
+        if self.h is not None:
+            obj += self.h(remove_scaling(self.xbase + x, self.scaling_changes), *self.argsh)
+        self.objval = np.append(self.objval, obj)  # append entry to fval
         self.nsamples = np.append(self.nsamples, 1)  # add new sample number
         self.num_pts += 1  # make sure npt is updated
         self.npt_so_far += 1
 
-        if
+        if obj < self.objopt():
             self.kopt = self.npt() - 1
 
         self.factorisation_current = False

@@ -236,11 +250,14 @@ class Model(object):
         return
 
     def save_point(self, x, rvec, nsamples, x_in_abs_coords=True):
-
-
-
+        xabs = x.copy() if x_in_abs_coords else self.as_absolute_coordinates(x)
+        obj = sumsq(rvec)
+        if self.h is not None:
+            obj += self.h(remove_scaling(xabs, self.scaling_changes), *self.argsh)
+        if self.objsave is None or obj <= self.objsave:
+            self.xsave = xabs
             self.rsave = rvec.copy()
-            self.
+            self.objsave = obj
             self.jacsave = self.model_jac.copy()
             self.nsamples_save = nsamples
             return True

@@ -248,15 +265,15 @@ class Model(object):
         return False  # this value is worse than what we have already - didn't save
 
     def get_final_results(self):
-        # Return x and
-        if self.
-            return self.xopt(abs_coordinates=True).copy(), self.ropt().copy(), self.
+        # Return x and objval for optimal point (either from xsave+objsave or kopt)
+        if self.objsave is None or self.objopt() <= self.objsave:  # optimal has changed since xsave+objsave were last set
+            return self.xopt(abs_coordinates=True).copy(), self.ropt().copy(), self.objopt(), self.model_jac.copy(), self.nsamples[self.kopt]
         else:
-            return self.xsave.copy(), self.rsave.copy(), self.
+            return self.xsave.copy(), self.rsave.copy(), self.objsave, self.jacsave, self.nsamples_save
 
     def min_objective_value(self):
         # Get termination criterion for f small: f <= abs_tol or f <= rel_tol * f0
-        return max(self.abs_tol, self.rel_tol * self.
+        return max(self.abs_tol, self.rel_tol * self.objbeg)
 
     def model_value(self, d, d_based_at_xopt=True, with_const_term=False):
         if d_based_at_xopt:

@@ -375,7 +392,7 @@ class Model(object):
         return True, interp_error, sqrt(norm_J_error), linalg_resid, ls_interp_cond_num  # flag ok
 
     def build_full_model(self):
-        # Build full least squares
+        # Build full least squares model from mini-models
         # Centred around xopt
         r = self.model_const + np.dot(self.model_jac, self.xopt())  # constant term (for inexact interpolation)
         J = self.model_jac
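In summary, `Model` now keeps a per-point scalar objective `objval` alongside the residual matrix `fval_v`, and every update path in the diff above maintains the same invariant:

.. math::

    \text{objval}[k] = \|r(x_k)\|_2^2 + h(x_k) \quad (\text{the } h \text{ term only when a regularizer is supplied}),
    \qquad
    \text{kopt} = \arg\min_k \text{objval}[k],

with `objopt()`, `objbeg` and `objsave` taking over the objective bookkeeping that previously tracked only the sum-of-squares value.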
dfols/params.py CHANGED

@@ -82,7 +82,7 @@ class ParameterList(object):
         self.params["restarts.use_soft_restarts"] = True
         self.params["restarts.soft.num_geom_steps"] = 3
         self.params["restarts.soft.move_xk"] = True
-        self.params["restarts.soft.max_fake_successful_steps"] = maxfun  # number ratio>0 steps below
+        self.params["restarts.soft.max_fake_successful_steps"] = maxfun  # number ratio>0 steps below objsave allowed
         self.params["restarts.hard.use_old_rk"] = True  # recycle r(xk) from previous run?
         self.params["restarts.increase_npt"] = False
         self.params["restarts.increase_npt_amt"] = 1

@@ -109,12 +109,20 @@ class ParameterList(object):
         self.params["growing.full_rank.min_sing_val"] = 1e-6  # absolute floor on singular values
         self.params["growing.full_rank.svd_max_jac_cond"] = 1e8  # maximum condition number of Jacobian
         self.params["growing.perturb_trust_region_step"] = False  # add random direction onto TRS solution?
+
         # Dykstra's algorithm
         self.params["dykstra.d_tol"] = 1e-10
         self.params["dykstra.max_iters"] = 100
+
         # Matrix rank algorithm
         self.params["matrix_rank.r_tol"] = 1e-18
-
+
+        # Function tolerance when applying S-FISTA method
+        self.params["func_tol.criticality_measure"] = 1e-3
+        self.params["func_tol.tr_step"] = 1-1e-1
+        self.params["func_tol.max_iters"] = 500
+        self.params["sfista.max_iters_scaling"] = 2.0
+
         self.params_changed = {}
         for p in self.params:
             self.params_changed[p] = False

@@ -268,6 +276,14 @@ class ParameterList(object):
             type_str, nonetype_ok, lower, upper = 'int', False, 0, None
         elif key == "matrix_rank.r_tol":
             type_str, nonetype_ok, lower, upper = 'float', False, 0.0, None
+        elif key == "func_tol.criticality_measure":
+            type_str, nonetype_ok, lower, upper = 'float', False, 0.0, 1.0
+        elif key == "func_tol.tr_step":
+            type_str, nonetype_ok, lower, upper = 'float', False, 0.0, 1.0
+        elif key == "func_tol.max_iters":
+            type_str, nonetype_ok, lower, upper = 'int', False, 0, None
+        elif key == "sfista.max_iters_scaling":
+            type_str, nonetype_ok, lower, upper = 'float', False, 1.0, None
         else:
             assert False, "ParameterList.param_type() has unknown key: %s" % key
         return type_str, nonetype_ok, lower, upper
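The new entries are configured like any other DFO-LS option, through the `user_params` dictionary passed to `dfols.solve`. A hedged example (values are illustrative; the keys and their allowed ranges are exactly those registered above, and `objfun`, `x0`, `h`, `lh`, `prox_uh` are as in the earlier regularized-solve sketch):

.. code-block:: python

    import dfols

    user_params = {
        "func_tol.criticality_measure": 1e-3,  # float bounded by 0 and 1; scales the S-FISTA tolerance for the criticality measure
        "func_tol.tr_step": 0.9,               # float bounded by 0 and 1; controls func_tol in the trust-region step
        "func_tol.max_iters": 1000,            # int >= 0; maximum S-FISTA iterations
        "sfista.max_iters_scaling": 2.0,       # float >= 1.0
    }

    soln = dfols.solve(objfun, x0, h=h, lh=lh, prox_uh=prox_uh, user_params=user_params)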
dfols/solver.py CHANGED

@@ -48,10 +48,10 @@ module_logger = logging.getLogger(__name__)
 
 # A container for the results of the optimization routine
 class OptimResults(object):
-    def __init__(self, xmin, rmin,
+    def __init__(self, xmin, rmin, objmin, jacmin, nf, nx, nruns, exit_flag, exit_msg):
         self.x = xmin
         self.resid = rmin
-        self.
+        self.obj = objmin
         self.jacobian = jacmin
         self.nf = nf
         self.nx = nx

@@ -77,7 +77,7 @@ class OptimResults(object):
             output += "Residual vector = %s\n" % str(self.resid)
         else:
             output += "Not showing residual vector because it is too long; check self.resid\n"
-        output += "Objective value f(xmin) = %.10g\n" % self.
+        output += "Objective value f(xmin) = %.10g\n" % self.obj
         output += "Needed %g objective evaluations (at %g points)\n" % (self.nf, self.nx)
         if self.nruns > 1:
             output += "Did a total of %g runs\n" % self.nruns

@@ -95,8 +95,8 @@ class OptimResults(object):
         return output
 
 
-def solve_main(objfun, x0,
-               diagnostic_info, scaling_changes, r0_avg_old=None, r0_nsamples_old=None, default_growing_method_set_by_user=None,
+def solve_main(objfun, x0, argsf, xl, xu, projections, npt, rhobeg, rhoend, maxfun, nruns_so_far, nf_so_far, nx_so_far, nsamples, params,
+               diagnostic_info, scaling_changes, h=None, lh=None, argsh=(), prox_uh=None, argsprox=None, r0_avg_old=None, r0_nsamples_old=None, default_growing_method_set_by_user=None,
               do_logging=True, print_progress=False):
     # Evaluate at x0 (keep nf, nx correct and check for f < 1e-12)
     # The hard bit is determining what m = len(r0) should be, and allocating memory appropriately

@@ -105,18 +105,17 @@ def solve_main(objfun, x0, args, xl, xu, projections, npt, rhobeg, rhoend, maxfun, ...
     # Evaluate the first time...
     nf = nf_so_far + 1
     nx = nx_so_far + 1
-    r0,
-
+    r0, obj0 = eval_least_squares_with_regularisation(objfun, remove_scaling(x0, scaling_changes), h,
+                                            argsf=argsf, argsh=argsh, verbose=do_logging, eval_num=nf, pt_num=nx,
                                             full_x_thresh=params("logging.n_to_print_whole_x_vector"),
-                                            check_for_overflow=params("general.check_objfun_for_overflow")
-                                            verbose=do_logging)
+                                            check_for_overflow=params("general.check_objfun_for_overflow"))
     m = len(r0)
 
     # Now we have m, we can evaluate the rest of the times
     rvec_list = np.zeros((number_of_samples, m))
-
+    obj_list = np.zeros((number_of_samples,))
     rvec_list[0, :] = r0
-
+    obj_list[0] = obj0
     num_samples_run = 1
     exit_info = None

@@ -128,15 +127,20 @@ def solve_main(objfun, x0, args, xl, xu, projections, npt, rhobeg, rhoend, maxfun, ...
 
         nf += 1
         # Don't increment nx for x0 - we did this earlier
-        rvec_list[i, :],
+        rvec_list[i, :], obj_list[i] = eval_least_squares_with_regularisation(objfun, remove_scaling(x0, scaling_changes), h,
+                                            argsf=argsf, argsh=argsh, verbose=do_logging, eval_num=nf, pt_num=nx,
                                             full_x_thresh=params("logging.n_to_print_whole_x_vector"),
-                                            check_for_overflow=params("general.check_objfun_for_overflow")
-                                            verbose=do_logging)
+                                            check_for_overflow=params("general.check_objfun_for_overflow"))
         num_samples_run += 1
 
     r0_avg = np.mean(rvec_list[:num_samples_run, :], axis=0)
-
-
+    # NOTE: modify objvalue here
+    if h is None:
+        if sumsq(r0_avg) <= params("model.abs_tol"):
+            exit_info = ExitInformation(EXIT_SUCCESS, "Objective is sufficiently small")
+    else:
+        if sumsq(r0_avg) + h(remove_scaling(x0, scaling_changes), *argsh) <= params("model.abs_tol"):
+            exit_info = ExitInformation(EXIT_SUCCESS, "Objective is sufficiently small")
 
     if exit_info is not None:
         return x0, r0_avg, sumsq(r0_avg), None, num_samples_run, nf, nx, nruns_so_far+1, exit_info, diagnostic_info

@@ -162,8 +166,8 @@ def solve_main(objfun, x0, args, xl, xu, projections, npt, rhobeg, rhoend, maxfun, ...
         params('growing.delta_scale_new_dirns', new_value=0.1)
 
     # Initialise controller
-    control = Controller(objfun,
-                         params, scaling_changes, do_logging)
+    control = Controller(objfun, argsf, x0, r0_avg, num_samples_run, xl, xu, projections, npt, rhobeg, rhoend, nf, nx, maxfun,
+                         params, scaling_changes, do_logging, h=h, lh=lh, argsh=argsh, prox_uh=prox_uh, argsprox=argsprox)
 
     # Initialise interpolation set
     number_of_samples = max(nsamples(control.delta, control.rho, 0, nruns_so_far), 1)

@@ -178,8 +182,8 @@ def solve_main(objfun, x0, args, xl, xu, projections, npt, rhobeg, rhoend, maxfun, ...
         module_logger.info("Initialising (coordinate directions)")
         exit_info = control.initialise_coordinate_directions(number_of_samples, num_directions, params)
         if exit_info is not None:
-            x, rvec,
-            return x, rvec,
+            x, rvec, obj, jacmin, nsamples = control.model.get_final_results()
+            return x, rvec, obj, None, nsamples, control.nf, control.nx, nruns_so_far + 1, exit_info, diagnostic_info
 
     finished_growing = (control.model.npt() >= control.model.num_pts)  # have we finished growing the initial set yet?
@@ -271,16 +275,30 @@ def solve_main(objfun, x0, args, xl, xu, projections, npt, rhobeg, rhoend, maxfun, ...
             nruns_so_far += 1
             break  # quit
 
-
-
+        tau = 1.0  # ratio used in the safety phase
+        if h is None:
+            # Trust region step
+            d, gopt, H, gnew, crvmin = control.trust_region_step(params)
+        else:
+            # Calculate criticality measure
+            criticality_measure = control.evaluate_criticality_measure(params)
+            # Trust region step
+            d, gopt, H, gnew, crvmin = control.trust_region_step(params, criticality_measure)
+            try:
+                tau = min(criticality_measure/(LA.norm(gopt)+lh), 1.0)
+            except ValueError:
+                # In some instances, gopt can have nan/inf values -- this ultimately calls a safety step and is generally fine
+                # but we need to set a value for tau nonetheless
+                tau = 1.0
+
         if do_logging:
             module_logger.debug("Trust region step is d = " + str(d))
+
         xnew = control.model.xopt() + d
         dnorm = min(LA.norm(d), control.delta)
 
         if print_progress:
-            print("{:^5}{:^7}{:^10.2e}{:^10.2e}{:^10.2e}{:^10.2e}{:^7}".format(nruns_so_far+1, current_iter+1, control.model.
+            print("{:^5}{:^7}{:^10.2e}{:^10.2e}{:^10.2e}{:^10.2e}{:^7}".format(nruns_so_far+1, current_iter+1, control.model.objopt(), np.linalg.norm(gopt), control.delta, control.rho, control.nf))
 
         if params("logging.save_diagnostic_info"):
             diagnostic_info.save_info_from_control(control, nruns_so_far, current_iter,

@@ -289,7 +307,7 @@ def solve_main(objfun, x0, args, xl, xu, projections, npt, rhobeg, rhoend, maxfun, ...
             diagnostic_info.update_interpolation_information(interp_error, ls_interp_cond_num, linalg_resid,
                                                              sqrt(norm_J_error), LA.norm(gopt), LA.norm(d))
 
-        if dnorm < params("general.safety_step_thresh") * control.rho and not finished_growing and params("growing.safety.do_safety_step"):
+        if dnorm < tau * params("general.safety_step_thresh") * control.rho and not finished_growing and params("growing.safety.do_safety_step"):
            if do_logging:
                 module_logger.debug("Safety step during growing phase")
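For regularized problems the new scalar `tau` ties the criticality measure into the safety-step test (and, further down this diff, into the trust-region radius reduction). Restating the code above:

.. math::

    \tau_k = \min\!\left( \frac{\eta(x_k)}{\|g_k\|_2 + L_h},\ 1 \right),

where L_h is the `lh` input. The safety step is taken when ||d_k|| < τ_k · γ_S · ρ_k (with γ_S = `general.safety_step_thresh`), and unsuccessful iterations later divide the reduced radius min(γ_dec Δ_k, ||d_k||) by τ_k. When `h` is `None`, τ_k = 1 and the behaviour matches 1.4.1.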
@@ -415,10 +433,10 @@ def solve_main(objfun, x0, args, xl, xu, projections, npt, rhobeg, rhoend, maxfun, ...
                 if do_logging:
                     module_logger.info("New rho = %g after %i function evaluations" % (control.rho, control.nf))
                     if control.n() < params("logging.n_to_print_whole_x_vector"):
-                        module_logger.debug("Best so far: f = %.15g at x = " % (control.model.
+                        module_logger.debug("Best so far: f = %.15g at x = " % (control.model.objopt())
                                             + str(control.model.xopt(abs_coordinates=True)))
                     else:
-                        module_logger.debug("Best so far: f = %.15g at x = [...]" % (control.model.
+                        module_logger.debug("Best so far: f = %.15g at x = [...]" % (control.model.objopt()))
                 continue  # next iteration
             else:
                 # Quit on rho=rhoend

@@ -439,8 +457,9 @@ def solve_main(objfun, x0, args, xl, xu, projections, npt, rhobeg, rhoend, maxfun, ...
             else:
                 # Cannot reduce rho, so check xnew and quit
                 x = control.model.as_absolute_coordinates(xnew)
+                ##print("x from xnew", x)
                 number_of_samples = max(nsamples(control.delta, control.rho, current_iter, nruns_so_far), 1)
-                rvec_list,
+                rvec_list, obj_list, num_samples_run, exit_info = control.evaluate_objective(x, number_of_samples,
                                                                                              params)
 
                 if num_samples_run > 0:

@@ -514,8 +533,9 @@ def solve_main(objfun, x0, args, xl, xu, projections, npt, rhobeg, rhoend, maxfun, ...
 
         # Evaluate new point
         x = control.model.as_absolute_coordinates(xnew)
+        ##print("x from xnew again", x)
         number_of_samples = max(nsamples(control.delta, control.rho, current_iter, nruns_so_far), 1)
-        rvec_list,
+        rvec_list, obj_list, num_samples_run, exit_info = control.evaluate_objective(x, number_of_samples, params)
         if np.any(np.isnan(rvec_list)):
             # Just exit without saving the current point
             # We should be able to do a hard restart though, because it's unlikely

@@ -535,7 +555,7 @@ def solve_main(objfun, x0, args, xl, xu, projections, npt, rhobeg, rhoend, maxfun, ...
             break  # quit
 
         # Estimate f in order to compute 'actual reduction'
-        ratio, exit_info = control.calculate_ratio(current_iter, rvec_list[:num_samples_run, :], d, gopt, H)
+        ratio, exit_info = control.calculate_ratio(control.model.xopt(abs_coordinates=True), current_iter, rvec_list[:num_samples_run, :], d, gopt, H)
         if exit_info is not None:
             if exit_info.able_to_do_restart() and params("restarts.use_restarts") and params(
                     "restarts.use_soft_restarts"):

@@ -565,9 +585,9 @@ def solve_main(objfun, x0, args, xl, xu, projections, npt, rhobeg, rhoend, maxfun, ...
             diagnostic_info.update_slow_iter(-1)  # n/a, unless otherwise update
         if ratio < params("tr_radius.eta1"):  # ratio < 0.1
             if finished_growing:
-                control.delta = min(params("tr_radius.gamma_dec") * control.delta, dnorm)
+                control.delta = min(params("tr_radius.gamma_dec") * control.delta, dnorm) / tau
             else:
-                control.delta = min(params("growing.gamma_dec") * control.delta, dnorm)  # different gamma_dec
+                control.delta = min(params("growing.gamma_dec") * control.delta, dnorm) / tau  # different gamma_dec
             if params("logging.save_diagnostic_info"):
                 diagnostic_info.update_iter_type(ITER_ACCEPTABLE_NO_GEOM if ratio > 0.0
                                                  else ITER_UNSUCCESSFUL_NO_GEOM)  # we flag geom update below

@@ -651,7 +671,7 @@ def solve_main(objfun, x0, args, xl, xu, projections, npt, rhobeg, rhoend, maxfun, ...
                 break  # quit
 
         # Update list of successful steps
-        this_step_was_not_improvement = control.model.
+        this_step_was_not_improvement = control.model.objsave is not None and control.model.objopt() > control.model.objsave
         succ_steps_not_improvement.pop()  # remove last item
         succ_steps_not_improvement.insert(0, this_step_was_not_improvement)  # add at beginning
         # Terminate (not restart) if all are True

@@ -828,10 +848,10 @@ def solve_main(objfun, x0, args, xl, xu, projections, npt, rhobeg, rhoend, maxfun, ...
                 if do_logging:
                     module_logger.info("New rho = %g after %i function evaluations" % (control.rho, control.nf))
                     if control.n() < params("logging.n_to_print_whole_x_vector"):
-                        module_logger.debug("Best so far: f = %.15g at x = " % (control.model.
+                        module_logger.debug("Best so far: f = %.15g at x = " % (control.model.objopt())
                                             + str(control.model.xopt(abs_coordinates=True)))
                     else:
-                        module_logger.debug("Best so far: f = %.15g at x = [...]" % (control.model.
+                        module_logger.debug("Best so far: f = %.15g at x = [...]" % (control.model.objopt()))
                 continue  # next iteration
             else:
                 # Quit on rho=rhoend

@@ -857,14 +877,14 @@ def solve_main(objfun, x0, args, xl, xu, projections, npt, rhobeg, rhoend, maxfun, ...
     # (end main loop)
 
     # Quit & return the important information
-    x, rvec,
+    x, rvec, obj, jacmin, nsamples = control.model.get_final_results()
     if do_logging:
         module_logger.debug("At return from DFO-LS, number of function evals = %i" % nf)
-        module_logger.debug("Smallest objective value = %.15g at x = " %
-    return x, rvec,
+        module_logger.debug("Smallest objective value = %.15g at x = " % obj + str(x))
+    return x, rvec, obj, jacmin, nsamples, control.nf, control.nx, nruns_so_far, exit_info, diagnostic_info
 
 
-def solve(objfun, x0,
+def solve(objfun, x0, h=None, lh=None, prox_uh=None, argsf=(), argsh=(), argsprox=(), bounds=None, projections=[], npt=None, rhobeg=None, rhoend=1e-8, maxfun=None, nsamples=None, user_params=None,
          objfun_has_noise=False, scaling_within_bounds=False, do_logging=True, print_progress=False):
     x0 = x0.astype(float)
     n = len(x0)

@@ -934,13 +954,21 @@ def solve(objfun, x0, args=(), bounds=None, projections=[], npt=None, rhobeg=None, ...
 
     exit_info = None
     # Input & parameter checks
+    if exit_info is None and h is not None:
+        if prox_uh is None:
+            exit_info = ExitInformation(EXIT_INPUT_ERROR, "Must provide prox_uh input if h is not None")
+        elif lh is None:
+            exit_info = ExitInformation(EXIT_INPUT_ERROR, "Must provide lh input if h is not None")
+        elif lh <= 0.0:
+            exit_info = ExitInformation(EXIT_INPUT_ERROR, "lh must be strictly positive")
+
     if exit_info is None and npt < n + 1:
         exit_info = ExitInformation(EXIT_INPUT_ERROR, "npt must be >= n+1 for linear models with inexact interpolation")
 
-    if exit_info is None and rhobeg
+    if exit_info is None and rhobeg <= 0.0:
         exit_info = ExitInformation(EXIT_INPUT_ERROR, "rhobeg must be strictly positive")
 
-    if exit_info is None and rhoend
+    if exit_info is None and rhoend <= 0.0:
         exit_info = ExitInformation(EXIT_INPUT_ERROR, "rhoend must be strictly positive")
 
     if exit_info is None and rhobeg <= rhoend:
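The new input checks above insist on `prox_uh` and a strictly positive `lh` whenever a regularizer `h` is supplied. As a hedged illustration of the assumed convention — that `prox_uh(x, u)` returns argmin_z { u·h(z) + ½‖z − x‖² } — here is the closed-form prox of h(x) = λ‖x‖₁ (soft-thresholding), together with a small numerical sanity check that is not part of DFO-LS:

.. code-block:: python

    import numpy as np
    from scipy.optimize import minimize

    lam = 0.1
    h = lambda x: lam * np.sum(np.abs(x))
    prox_uh = lambda x, u: np.sign(x) * np.maximum(np.abs(x) - u * lam, 0.0)  # soft-thresholding

    # Check prox_uh(x, u) ~= argmin_z u*h(z) + 0.5*||z - x||^2 at a test point
    x, u = np.array([0.3, -0.02, 1.2]), 0.5
    obj = lambda z: u * h(z) + 0.5 * np.sum((z - x) ** 2)
    z_num = minimize(obj, x, method="Nelder-Mead", options={"xatol": 1e-9, "fatol": 1e-12}).x
    print(np.allclose(prox_uh(x, u), z_num, atol=1e-4))  # should print True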
@@ -1013,12 +1041,12 @@ def solve(objfun, x0, args=(), bounds=None, projections=[], npt=None, rhobeg=None, ...
         x0 = xp.copy()
 
     # Enforce lower & upper bounds on x0
-    idx = (x0
+    idx = (x0 < xl)
     if np.any(idx):
         warnings.warn("x0 below lower bound, adjusting", RuntimeWarning)
     x0[idx] = xl[idx]
 
-    idx = (x0
+    idx = (x0 > xu)
     if np.any(idx):
         warnings.warn("x0 above upper bound, adjusting", RuntimeWarning)
     x0[idx] = xu[idx]

@@ -1028,9 +1056,9 @@ def solve(objfun, x0, args=(), bounds=None, projections=[], npt=None, rhobeg=None, ...
     nruns = 0
     nf = 0
     nx = 0
-    xmin, rmin,
-        solve_main(objfun, x0,
-                   diagnostic_info, scaling_changes, default_growing_method_set_by_user=default_growing_method_set_by_user,
+    xmin, rmin, objmin, jacmin, nsamples_min, nf, nx, nruns, exit_info, diagnostic_info = \
+        solve_main(objfun, x0, argsf, xl, xu, projections, npt, rhobeg, rhoend, maxfun, nruns, nf, nx, nsamples, params,
+                   diagnostic_info, scaling_changes, h, lh, argsh, prox_uh, argsprox, default_growing_method_set_by_user=default_growing_method_set_by_user,
                    do_logging=do_logging, print_progress=print_progress)
 
     # Hard restarts loop

@@ -1045,27 +1073,27 @@ def solve(objfun, x0, args=(), bounds=None, projections=[], npt=None, rhobeg=None, ...
 
         if do_logging:
             module_logger.info("Restarting from finish point (f = %g) after %g function evals; using rhobeg = %g and rhoend = %g"
-                               % (
+                               % (objmin, nf, rhobeg, rhoend))
         if params("restarts.hard.use_old_rk"):
-            xmin2, rmin2,
-                solve_main(objfun, xmin,
-                           diagnostic_info, scaling_changes, r0_avg_old=rmin, r0_nsamples_old=nsamples_min,
+            xmin2, rmin2, objmin2, jacmin2, nsamples2, nf, nx, nruns, exit_info, diagnostic_info = \
+                solve_main(objfun, xmin, argsf, xl, xu, projections, npt, rhobeg, rhoend, maxfun, nruns, nf, nx, nsamples, params,
+                           diagnostic_info, scaling_changes, h, lh, argsh, prox_uh, argsprox, r0_avg_old=rmin, r0_nsamples_old=nsamples_min,
                            do_logging=do_logging, print_progress=print_progress)
         else:
-            xmin2, rmin2,
-                solve_main(objfun, xmin,
-                           diagnostic_info, scaling_changes, do_logging=do_logging, print_progress=print_progress)
+            xmin2, rmin2, objmin2, jacmin2, nsamples2, nf, nx, nruns, exit_info, diagnostic_info = \
+                solve_main(objfun, xmin, argsf, xl, xu, projections, npt, rhobeg, rhoend, maxfun, nruns, nf, nx, nsamples, params,
+                           diagnostic_info, scaling_changes, h, lh, argsh, prox_uh, argsprox, do_logging=do_logging, print_progress=print_progress)
 
-        if
+        if objmin2 < objmin or np.isnan(objmin):
             if do_logging:
-                module_logger.info("Successful run with new f = %s compared to old f = %s" % (
+                module_logger.info("Successful run with new f = %s compared to old f = %s" % (objmin2, objmin))
             last_successful_run = nruns
-            (xmin, rmin,
+            (xmin, rmin, objmin, nsamples_min) = (xmin2, rmin2, objmin2, nsamples2)
             if jacmin2 is not None:  # may be None if finished during setup phase, in which case just use old Jacobian
                 jacmin = jacmin2
         else:
            if do_logging:
-                module_logger.info("Unsuccessful run with new f = %s compared to old f = %s" % (
+                module_logger.info("Unsuccessful run with new f = %s compared to old f = %s" % (objmin2, objmin))
 
             if nruns - last_successful_run >= params("restarts.max_unsuccessful_restarts"):
                 exit_info = ExitInformation(EXIT_SUCCESS, "Reached maximum number of unsuccessful restarts")

@@ -1077,7 +1105,7 @@ def solve(objfun, x0, args=(), bounds=None, projections=[], npt=None, rhobeg=None, ...
     if scaling_changes is not None and jacmin is not None:
         for i in range(n):
             jacmin[:, i] = jacmin[:, i] / scaling_changes[1][i]
-    results = OptimResults(remove_scaling(xmin, scaling_changes), rmin,
+    results = OptimResults(remove_scaling(xmin, scaling_changes), rmin, objmin, jacmin, nf, nx, nruns, exit_flag, exit_msg)
1081
1109
|
if params("logging.save_diagnostic_info"):
|
|
1082
1110
|
df = diagnostic_info.to_dataframe(with_xk=params("logging.save_xk"), with_rk=params("logging.save_rk"))
|
|
1083
1111
|
results.diagnostic_info = df
|
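The solver.py changes above extend the public entry point to regularised problems: solve() now accepts a regulariser h, its Lipschitz constant lh, and the proximal operator prox_uh of u*h (plus optional argsf/argsh/argsprox tuples), and the new input checks reject calls where h is given but prox_uh is missing or lh is not strictly positive. A minimal illustrative sketch of the new calling convention with a made-up residual model and an L1 regulariser (the data, lam and starting point are assumptions for illustration, not part of the package)::

    import numpy as np
    import dfols

    # Made-up residuals: fit y = x0 * exp(-x1 * t) to synthetic data
    tdata = np.linspace(0.0, 1.0, 20)
    ydata = 2.0 * np.exp(-1.5 * tdata)

    def residuals(x):
        return x[0] * np.exp(-x[1] * tdata) - ydata

    lam = 0.1  # regularisation weight (arbitrary for this sketch)

    def h(x):
        # regulariser h(x) = lam * ||x||_1
        return lam * np.linalg.norm(x, 1)

    def prox_uh(x, u):
        # prox of u*h is soft-thresholding at level u*lam
        return np.sign(x) * np.maximum(np.abs(x) - u * lam, 0.0)

    lh = lam * np.sqrt(2)  # Lipschitz constant of h in the Euclidean norm (n=2 variables here)

    x0 = np.array([1.0, 1.0])
    soln = dfols.solve(residuals, x0, h=h, lh=lh, prox_uh=prox_uh)
    print(soln)

Omitting prox_uh, or passing lh=None or lh <= 0.0 while h is set, triggers the new EXIT_INPUT_ERROR checks added in the hunk above.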
dfols/trust_region.py
CHANGED
|
@@ -29,14 +29,14 @@ solves
|
|
|
29
29
|
s.t. lower <= x <= upper
|
|
30
30
|
||x-xbase|| <= Delta
|
|
31
31
|
With this value, the variable d=x-xbase solves the problem
|
|
32
|
-
|
|
32
|
+
min_d abs(c + g' * d)
|
|
33
33
|
s.t. lower <= xbase + d <= upper
|
|
34
34
|
||d|| <= delta
|
|
35
35
|
Again, we have a version of this for handling arbitrary constraints
|
|
36
36
|
The call
|
|
37
37
|
x = ctrsbox_geometry(xbase, c, g, projections, Delta)
|
|
38
38
|
Solves
|
|
39
|
-
|
|
39
|
+
min_d abs(c + g' * d)
|
|
40
40
|
s.t. xbase + d is feasible w.r.t. the constraint set C
|
|
41
41
|
||d|| <= delta
|
|
42
42
|
|
|
@@ -70,7 +70,7 @@ alternative licensing.
|
|
|
70
70
|
# Ensure compatibility with Python 2
|
|
71
71
|
from __future__ import absolute_import, division, print_function, unicode_literals
|
|
72
72
|
|
|
73
|
-
from math import sqrt
|
|
73
|
+
from math import sqrt, ceil
|
|
74
74
|
import numpy as np
|
|
75
75
|
try:
|
|
76
76
|
import trustregion
|
|
@@ -79,13 +79,93 @@ except ImportError:
|
|
|
79
79
|
# Fall back to Python implementation
|
|
80
80
|
USE_FORTRAN = False
|
|
81
81
|
|
|
82
|
-
from .util import dykstra, pball, pbox, sumsq, model_value
|
|
82
|
+
from .util import dykstra, pball, pbox, sumsq, model_value, remove_scaling
|
|
83
83
|
|
|
84
|
-
__all__ = ['
|
|
84
|
+
__all__ = ['ctrsbox_sfista', 'ctrsbox_pgd', 'ctrsbox_geometry', 'trsbox', 'trsbox_geometry']
|
|
85
85
|
|
|
86
86
|
ZERO_THRESH = 1e-14
|
|
87
87
|
|
|
88
|
-
def
|
|
88
|
+
def ctrsbox_sfista(xopt, g, H, projections, delta, h, L_h, prox_uh, argsh=(), argsprox=(), func_tol=1e-3, max_iters=500, d_max_iters=100, d_tol=1e-10, use_fortran=USE_FORTRAN, scaling_changes=None, sfista_iters_scale=1.0):
|
|
89
|
+
n = xopt.size
|
|
90
|
+
assert xopt.shape == (n,), "xopt has wrong shape (should be vector)"
|
|
91
|
+
assert g.shape == (n,), "g and xopt have incompatible sizes"
|
|
92
|
+
assert len(H.shape) == 2, "H must be a matrix"
|
|
93
|
+
assert H.shape == (n,n), "H and xopt have incompatible sizes"
|
|
94
|
+
assert np.allclose(H, H.T), "H must be symmetric"
|
|
95
|
+
assert delta > 0.0, "delta must be strictly positive"
|
|
96
|
+
|
|
97
|
+
# Initialization
|
|
98
|
+
d = np.zeros(n) # start with zero vector
|
|
99
|
+
y = np.zeros(n)
|
|
100
|
+
t = 1
|
|
101
|
+
k_H = np.linalg.norm(H, 2)
|
|
102
|
+
crvmin = -1.0
|
|
103
|
+
|
|
104
|
+
# Number of iterations & smoothing parameter, from Theorem 10.57 in
|
|
105
|
+
# [A. Beck. First-order methods in optimization, SIAM, 2017]
|
|
106
|
+
# We do not use the values of k and mu given in the theorem statement, but rather the intermediate
|
|
107
|
+
# results on p313 (K1 for number of iterations, and the immediate next line for mu)
|
|
108
|
+
# Note: in the book's notation, Gamma=delta^2, alpha=1, beta=L_h^2/2, Lf=k_H [alpha and beta from Thm 10.51]
|
|
109
|
+
try:
|
|
110
|
+
MAX_LOOP_ITERS = ceil(sfista_iters_scale * delta * (L_h+sqrt(L_h*L_h+2*k_H*func_tol)) / func_tol)
|
|
111
|
+
MAX_LOOP_ITERS = min(MAX_LOOP_ITERS, max_iters)
|
|
112
|
+
except ValueError:
|
|
113
|
+
MAX_LOOP_ITERS = max_iters
|
|
114
|
+
u = 2 * delta / (MAX_LOOP_ITERS * L_h) # smoothing parameter
|
|
115
|
+
# u = 2 * func_tol / (L_h ** 2 + L_h * sqrt(L_h ** 2 + 2 * k_H * func_tol)) # the above choice works better in practice
|
|
116
|
+
|
|
117
|
+
def gradient_Fu(xopt, g, H, u, prox_uh, d):
|
|
118
|
+
# Calculate gradient_Fu,
|
|
119
|
+
# where Fu(d) := g(d) + h_u(d) and h_u(d) is a 1/u-smooth approximation of h.
|
|
120
|
+
# We assume that h is globally Lipschitz continuous with constant L_h,
|
|
121
|
+
# then we can let h_u(d) be the Moreau Envelope M_h_u(d) of h.
|
|
122
|
+
return g + H @ d + (xopt + d - prox_uh(remove_scaling(xopt + d, scaling_changes), u, *argsprox)) / u
|
|
123
|
+
|
|
124
|
+
# Lipschitz constant of gradient_Fu
|
|
125
|
+
l = k_H + 1 / u
|
|
126
|
+
|
|
127
|
+
# trust region is a ball of radius delta around xopt
|
|
128
|
+
trproj = lambda w: pball(w, xopt, delta)
|
|
129
|
+
|
|
130
|
+
# combine trust region constraints with user-entered constraints
|
|
131
|
+
P = list(projections) # make a copy of the projections list
|
|
132
|
+
P.append(trproj)
|
|
133
|
+
def proj(d0):
|
|
134
|
+
p = dykstra(P, xopt+d0, max_iter=d_max_iters, tol=d_tol)
|
|
135
|
+
# we want the step only, so we subtract xopt
|
|
136
|
+
# from the new point: proj(xk+d) - xk
|
|
137
|
+
return p - xopt
|
|
138
|
+
|
|
139
|
+
# general step
|
|
140
|
+
model_value_best = model_value(g, H, d, xopt, h, argsh, scaling_changes)
|
|
141
|
+
d_best = d.copy()
|
|
142
|
+
for k in range(MAX_LOOP_ITERS):
|
|
143
|
+
prev_d = d.copy()
|
|
144
|
+
prev_t = t
|
|
145
|
+
# gradient_Fu at y
|
|
146
|
+
g_Fu = gradient_Fu(xopt, g, H, u, prox_uh, d, *argsprox)
|
|
147
|
+
|
|
148
|
+
# main update step
|
|
149
|
+
d = proj(y - g_Fu / l)
|
|
150
|
+
new_model_value = model_value(g, H, d, xopt, h, argsh, scaling_changes)
|
|
151
|
+
if new_model_value < model_value_best:
|
|
152
|
+
d_best = d.copy()
|
|
153
|
+
model_value_best = new_model_value
|
|
154
|
+
|
|
155
|
+
# update true gradient
|
|
156
|
+
# gnew is the gradient of the smoothed function
|
|
157
|
+
gnew = gradient_Fu(xopt, g, H, u, prox_uh, d, *argsprox)
|
|
158
|
+
|
|
159
|
+
# update CRVMIN
|
|
160
|
+
crv = d.dot(H).dot(d)/sumsq(d) if sumsq(d) >= ZERO_THRESH else crvmin
|
|
161
|
+
crvmin = min(crvmin, crv) if crvmin != -1.0 else crv
|
|
162
|
+
|
|
163
|
+
# momentum update
|
|
164
|
+
t = (1 + sqrt(1 + 4*t*t)) / 2
|
|
165
|
+
y = d + (prev_t - 1) * (d - prev_d) / t
|
|
166
|
+
return d, gnew, crvmin
|
|
167
|
+
|
|
168
|
+
def ctrsbox_pgd(xopt, g, H, projections, delta, d_max_iters=100, d_tol=1e-10, use_fortran=USE_FORTRAN):
|
|
89
169
|
n = xopt.size
|
|
90
170
|
assert xopt.shape == (n,), "xopt has wrong shape (should be vector)"
|
|
91
171
|
assert g.shape == (n,), "g and xopt have incompatible sizes"
|
|
@@ -151,7 +231,6 @@ def ctrsbox(xopt, g, H, projections, delta, d_max_iters=100, d_tol=1e-10, use_fo
|
|
|
151
231
|
|
|
152
232
|
return d, gnew, crvmin
|
|
153
233
|
|
|
154
|
-
|
|
155
234
|
def trsbox(xopt, g, H, sl, su, delta, use_fortran=USE_FORTRAN):
|
|
156
235
|
if use_fortran:
|
|
157
236
|
return trustregion.solve(g, H, delta,
|
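The new ctrsbox_sfista routine above handles the regularised trust-region subproblem by smoothing h with its Moreau envelope (smoothing parameter u) and running FISTA on the smoothed model: ignoring the scaling hook, the gradient it builds is g + H*d plus (xopt + d - prox_uh(xopt + d))/u, and that last term is exactly the gradient of the Moreau envelope of h. A small self-contained check of that identity for the scalar case h = |.| (whose envelope is the Huber function), together with the standard FISTA momentum sequence used in the loop; everything below is an illustration, not dfols code::

    import numpy as np

    u = 0.5  # smoothing parameter (arbitrary for this check)

    def prox_uh(x, u):
        # prox of u*|.| is soft-thresholding at level u
        return np.sign(x) * np.maximum(np.abs(x) - u, 0.0)

    def moreau_grad(x, u):
        # gradient of the Moreau envelope of |.|, the analogue of the last term in gradient_Fu
        return (x - prox_uh(x, u)) / u

    x = np.linspace(-2.0, 2.0, 9)
    # For h = |.| the envelope is the Huber function, so its gradient is clip(x/u, -1, 1)
    assert np.allclose(moreau_grad(x, u), np.clip(x / u, -1.0, 1.0))

    # FISTA momentum sequence t_{k+1} = (1 + sqrt(1 + 4*t_k^2)) / 2, as in the main loop
    t = 1.0
    for _ in range(5):
        t = (1.0 + np.sqrt(1.0 + 4.0 * t * t)) / 2.0
    print(t)  # grows roughly like k/2, which is what drives the acceleration

The iteration cap ceil(sfista_iters_scale * delta * (L_h + sqrt(L_h^2 + 2*k_H*func_tol)) / func_tol) and the choice u = 2*delta/(MAX_LOOP_ITERS*L_h) come from the smoothing analysis in Beck (2017) cited in the comments; sfista_iters_scale simply rescales that cap.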
dfols/util.py
CHANGED
|
@@ -31,7 +31,7 @@ import scipy.linalg as LA
|
|
|
31
31
|
import sys
|
|
32
32
|
|
|
33
33
|
|
|
34
|
-
__all__ = ['sumsq', '
|
|
34
|
+
__all__ = ['sumsq', 'eval_least_squares_with_regularisation', 'model_value', 'random_orthog_directions_within_bounds',
|
|
35
35
|
'random_directions_within_bounds', 'apply_scaling', 'remove_scaling', 'pbox', 'pball', 'dykstra', 'qr_rank']
|
|
36
36
|
|
|
37
37
|
module_logger = logging.getLogger(__name__)
|
|
@@ -47,9 +47,9 @@ def sumsq(x):
|
|
|
47
47
|
return np.dot(x, x)
|
|
48
48
|
|
|
49
49
|
|
|
50
|
-
def
|
|
50
|
+
def eval_least_squares_with_regularisation(objfun, x, h=None, argsf=(), argsh=(), verbose=True, eval_num=0, pt_num=0, full_x_thresh=6, check_for_overflow=True):
|
|
51
51
|
# Evaluate least squares function
|
|
52
|
-
fvec = objfun(x, *
|
|
52
|
+
fvec = objfun(x, *argsf)
|
|
53
53
|
|
|
54
54
|
if check_for_overflow:
|
|
55
55
|
try:
|
|
@@ -62,20 +62,31 @@ def eval_least_squares_objective(objfun, x, args=(), verbose=True, eval_num=0, p
|
|
|
62
62
|
else:
|
|
63
63
|
f = sumsq(fvec)
|
|
64
64
|
|
|
65
|
+
# objective = least-squares + regularisation
|
|
66
|
+
obj = f
|
|
67
|
+
if h is not None:
|
|
68
|
+
# Evaluate regularisation term
|
|
69
|
+
hvalue = h(x, *argsh)
|
|
70
|
+
obj = f + hvalue
|
|
71
|
+
|
|
65
72
|
if verbose:
|
|
66
73
|
if len(x) < full_x_thresh:
|
|
67
|
-
module_logger.info("Function eval %i at point %i has
|
|
74
|
+
module_logger.info("Function eval %i at point %i has obj = %.15g at x = " % (eval_num, pt_num, obj) + str(x))
|
|
68
75
|
else:
|
|
69
|
-
module_logger.info("Function eval %i at point %i has
|
|
76
|
+
module_logger.info("Function eval %i at point %i has obj = %.15g at x = [...]" % (eval_num, pt_num, obj))
|
|
70
77
|
|
|
71
|
-
return fvec,
|
|
78
|
+
return fvec, obj
|
|
72
79
|
|
|
73
80
|
|
|
74
|
-
def model_value(g, H, s):
|
|
75
|
-
# Calculate model value (s^T * g + 0.5* s^T * H * s) = s^T * (gopt + 0.5 * H*s)
|
|
81
|
+
def model_value(g, H, s, xopt=(), h=None, argsh=(), scaling_changes=None):
|
|
82
|
+
# Calculate model value (s^T * g + 0.5* s^T * H * s) + h(xopt + s) = s^T * (gopt + 0.5 * H*s) + h(xopt + s)
|
|
76
83
|
assert g.shape == s.shape, "g and s have incompatible sizes"
|
|
77
84
|
Hs = H.dot(s)
|
|
78
|
-
|
|
85
|
+
rtn = np.dot(s, g + 0.5*Hs)
|
|
86
|
+
if h is not None:
|
|
87
|
+
hvalue = h(remove_scaling(xopt+s, scaling_changes), *argsh)
|
|
88
|
+
rtn += hvalue
|
|
89
|
+
return rtn
|
|
79
90
|
|
|
80
91
|
|
|
81
92
|
def get_scale(dirn, delta, lower, upper):
|
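In util.py the regulariser is threaded through both evaluation helpers: eval_least_squares_with_regularisation returns (fvec, obj) with obj equal to the sum of squared residuals plus h(x), and model_value now adds h at the shifted point xopt + s (after undoing any variable scaling). A standalone restatement of the model_value formula, ignoring the scaling_changes hook and using made-up numbers::

    import numpy as np

    def model_value(g, H, s, xopt, h=None):
        # quadratic model of the least-squares part, plus the regulariser at the trial point
        val = np.dot(s, g + 0.5 * H.dot(s))
        if h is not None:
            val += h(xopt + s)
        return val

    g = np.array([1.0, -2.0])
    H = np.array([[2.0, 0.0], [0.0, 4.0]])
    xopt = np.array([0.5, 0.5])
    s = np.array([0.1, 0.2])
    h = lambda x: 0.1 * np.linalg.norm(x, 1)  # illustrative L1 regulariser

    # s'g + 0.5 s'Hs = -0.3 + 0.09 = -0.21, and h(xopt + s) = 0.1 * 1.3 = 0.13
    print(model_value(g, H, s, xopt, h))  # approx -0.08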
DFO_LS-1.4.1.dist-info/RECORD
DELETED
|
@@ -1,14 +0,0 @@
|
|
|
1
|
-
dfols/__init__.py,sha256=D-x5glfZFfJ8-bdjA-4k4JFTDu1Eylaz3EL4GSH28eI,1605
|
|
2
|
-
dfols/controller.py,sha256=LSeHZoKaKUEYgB1_2subjKskHJ8mWccMbn-LOpxJ7LM,42769
|
|
3
|
-
dfols/diagnostic_info.py,sha256=2kEUkL-MS4eDENUf1r2hOWsntP8OxMDKi_kyHmrC9V4,6081
|
|
4
|
-
dfols/hessian.py,sha256=sExx4J4KoGwHItbthX2odosB2ONbQFvLdlcod7PIh4k,4262
|
|
5
|
-
dfols/model.py,sha256=q70zuqocNtsaXzNjWHcTdrS209BdQt4uY0GNtp0qlI8,18809
|
|
6
|
-
dfols/params.py,sha256=_Va1ybnQDIzWaXvImcSeH8xnNE_A2zpAfBgDG74sc5c,17557
|
|
7
|
-
dfols/solver.py,sha256=IKg3xWPLYlOW_zuTc_-HY_3ZvdDEfkyxARerERUQHlU,61264
|
|
8
|
-
dfols/trust_region.py,sha256=hRKQx0fpSxol7dLZO0yrT7O5IDptPPSnDvxKQNZ3r0M,24603
|
|
9
|
-
dfols/util.py,sha256=ysdIHTkrkWwCRKuGffofehKl-t5dT3sD9dfy0muI4ZI,9852
|
|
10
|
-
DFO_LS-1.4.1.dist-info/LICENSE.txt,sha256=jOtLnuWt7d5Hsx6XXB2QxzrSe2sWWh3NgMfFRetluQM,35147
|
|
11
|
-
DFO_LS-1.4.1.dist-info/METADATA,sha256=RR6KhJi4Ae_1PES8Bpzqm3AYK2w12V-2MyDyjaCDe80,8552
|
|
12
|
-
DFO_LS-1.4.1.dist-info/WHEEL,sha256=GJ7t_kWBFywbagK5eo9IoUwLW6oyOeTKmQ-9iHFVNxQ,92
|
|
13
|
-
DFO_LS-1.4.1.dist-info/top_level.txt,sha256=UfxRhaDN8HQx2_l17KbrDrERJ90OCN7VKkDMpYYbRLU,6
|
|
14
|
-
DFO_LS-1.4.1.dist-info/RECORD,,
|
|
File without changes
|
|
File without changes
|