DFO-LS 1.4.1__py3-none-any.whl → 1.5.0__py3-none-any.whl

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

@@ -1,6 +1,6 @@
1
1
  Metadata-Version: 2.1
2
2
  Name: DFO-LS
3
- Version: 1.4.1
3
+ Version: 1.5.0
4
4
  Summary: A flexible derivative-free solver for (bound constrained) nonlinear least-squares minimization
5
5
  Author-email: Lindon Roberts <lindon.roberts@sydney.edu.au>
6
6
  Maintainer-email: Lindon Roberts <lindon.roberts@sydney.edu.au>
@@ -68,7 +68,7 @@ DFO-LS: Derivative-Free Optimizer for Least-Squares
68
68
 
69
69
  DFO-LS is a flexible package for solving nonlinear least-squares minimization, without requiring derivatives of the objective. It is particularly useful when evaluations of the objective function are expensive and/or noisy. DFO-LS is a more flexible version of `DFO-GN <https://github.com/numericalalgorithmsgroup/dfogn>`_.
70
70
 
71
- This is an implementation of the algorithm from our paper: C. Cartis, J. Fiala, B. Marteau and L. Roberts, `Improving the Flexibility and Robustness of Model-Based Derivative-Free Optimization Solvers <https://doi.org/10.1145/3338517>`_, *ACM Transactions on Mathematical Software*, 45:3 (2019), pp. 32:1-32:41 [`preprint <https://arxiv.org/abs/1804.00154>`_]. For reproducibility of all figures in this paper, please feel free to contact the authors.
71
+ The main algorithm is described in our paper [1] below.
72
72
 
73
73
  If you are interested in solving general optimization problems (without a least-squares structure), you may wish to try `Py-BOBYQA <https://github.com/numericalalgorithmsgroup/pybobyqa>`_, which has many of the same features as DFO-LS.
74
74
 
@@ -78,13 +78,15 @@ See manual.pdf or `here <https://numericalalgorithmsgroup.github.io/dfols/>`_.
78
78
 
79
79
  Citation
80
80
  --------
81
- If you use DFO-LS in a paper, please cite:
81
+ The development of DFO-LS is outlined over several publications:
82
82
 
83
- Cartis, C., Fiala, J., Marteau, B. and Roberts, L., `Improving the Flexibility and Robustness of Model-Based Derivative-Free Optimization Solvers <https://doi.org/10.1145/3338517>`_, *ACM Transactions on Mathematical Software*, 45:3 (2019), pp. 32:1-32:41.
83
+ 1. C Cartis, J Fiala, B Marteau and L Roberts, `Improving the Flexibility and Robustness of Model-Based Derivative-Free Optimization Solvers <https://doi.org/10.1145/3338517>`_, *ACM Transactions on Mathematical Software*, 45:3 (2019), pp. 32:1-32:41 [`preprint arXiv 1804.00154 <https://arxiv.org/abs/1804.00154>`_].
84
+ 2. M Hough and L Roberts, `Model-Based Derivative-Free Methods for Convex-Constrained Optimization <https://doi.org/10.1137/21M1460971>`_, *SIAM Journal on Optimization*, 21:4 (2022), pp. 2552-2579 [`preprint arXiv 2111.05443 <https://arxiv.org/abs/2111.05443>`_].
85
+ 3. Y Liu, K H Lam and L Roberts, `Black-box Optimization Algorithms for Regularized Least-squares Problems <http://arxiv.org/abs/2407.14915>`_, *arXiv preprint arXiv:2407.14915*, 2024.
84
86
 
85
- If you use DFO-LS for problems with constraints, including bound constraints, please also cite:
86
-
87
- Hough, M. and Roberts, L., `Model-Based Derivative-Free Methods for Convex-Constrained Optimization <https://doi.org/10.1137/21M1460971>`_, *SIAM Journal on Optimization*, 21:4 (2022), pp. 2552-2579.
87
+ If you use DFO-LS in a paper, please cite [1].
88
+ If your problem has constraints, including bound constraints, please cite [1,2].
89
+ If your problem includes a regularizer, please cite [1,3].
88
90
 
89
91
  Requirements
90
92
  ------------
@@ -114,27 +116,13 @@ For easy installation, use `pip <http://www.pip-installer.org/>`_ as root:
114
116
 
115
117
  .. code-block:: bash
116
118
 
117
- $ [sudo] pip install DFO-LS
118
-
119
- or alternatively *easy_install*:
120
-
121
- .. code-block:: bash
122
-
123
- $ [sudo] easy_install DFO-LS
124
-
125
- If you do not have root privileges or you want to install DFO-LS for your private use, you can use:
126
-
127
- .. code-block:: bash
128
-
129
- $ pip install --user DFO-LS
130
-
131
- which will install DFO-LS in your home directory.
119
+ $ pip install DFO-LS
132
120
 
133
121
  Note that if an older install of DFO-LS is present on your system you can use:
134
122
 
135
123
  .. code-block:: bash
136
124
 
137
- $ [sudo] pip install --upgrade DFO-LS
125
+ $ pip install --upgrade DFO-LS
138
126
 
139
127
  to upgrade DFO-LS to the latest version.
140
128
 
@@ -151,22 +139,14 @@ DFO-LS is written in pure Python and requires no compilation. It can be installe
151
139
 
152
140
  .. code-block:: bash
153
141
 
154
- $ [sudo] pip install .
155
-
156
- If you do not have root privileges or you want to install DFO-LS for your private use, you can use:
157
-
158
- .. code-block:: bash
159
-
160
- $ pip install --user .
161
-
162
- instead.
142
+ $ pip install .
163
143
 
164
144
  To upgrade DFO-LS to the latest version, navigate to the top-level directory (i.e. the one containing :code:`pyproject.toml`) and rerun the installation using :code:`pip`, as above:
165
145
 
166
146
  .. code-block:: bash
167
147
 
168
148
  $ git pull
169
- $ [sudo] pip install . # with admin privileges
149
+ $ pip install .
170
150
 
171
151
  Testing
172
152
  -------
@@ -189,7 +169,7 @@ If DFO-LS was installed using *pip* you can uninstall as follows:
189
169
 
190
170
  .. code-block:: bash
191
171
 
192
- $ [sudo] pip uninstall DFO-LS
172
+ $ pip uninstall DFO-LS
193
173
 
194
174
  If DFO-LS was installed manually you have to remove the installed files by hand (located in your python site-packages directory).
195
175
 
@@ -0,0 +1,14 @@
1
+ dfols/__init__.py,sha256=nMJ4G3JcmjQ82lYXV2ywxjHWQqd9nq7Ak6GIlrN70Tw,1605
2
+ dfols/controller.py,sha256=gz4yGpk8KyfsWxrAkI8y69K5ckSHZ3Xdq0fEVFtIcPk,49925
3
+ dfols/diagnostic_info.py,sha256=2kEUkL-MS4eDENUf1r2hOWsntP8OxMDKi_kyHmrC9V4,6081
4
+ dfols/hessian.py,sha256=sExx4J4KoGwHItbthX2odosB2ONbQFvLdlcod7PIh4k,4262
5
+ dfols/model.py,sha256=i-TcGNFAeYt4uu3R_-THTk2rOCDvgU_mcZQQXfE1ODA,19786
6
+ dfols/params.py,sha256=GzJGO0TByH1X3B0NbLOCOqmYG8dRiKPKjjX7or_fOqI,18342
7
+ dfols/solver.py,sha256=QUF84UYnSitvlpVssKLdcMF9e_zdA9qlZlg5e8IegeQ,63173
8
+ dfols/trust_region.py,sha256=JbHLBDw7H88a3cIMuialh7kpMNGjL3Lp9JsjrBNpDWQ,28231
9
+ dfols/util.py,sha256=efGVAKPb7YrHya1IOgyzacwa_h0u2jHHs5FhuxUlYDg,10282
10
+ DFO_LS-1.5.0.dist-info/LICENSE.txt,sha256=jOtLnuWt7d5Hsx6XXB2QxzrSe2sWWh3NgMfFRetluQM,35147
11
+ DFO_LS-1.5.0.dist-info/METADATA,sha256=JIQNs15kBtVr5_cA7JnXDbT-uQ06pqTd3RD_MRYmB7w,8069
12
+ DFO_LS-1.5.0.dist-info/WHEEL,sha256=cVxcB9AmuTcXqmwrtPhNK88dr7IR_b6qagTj0UvIEbY,91
13
+ DFO_LS-1.5.0.dist-info/top_level.txt,sha256=UfxRhaDN8HQx2_l17KbrDrERJ90OCN7VKkDMpYYbRLU,6
14
+ DFO_LS-1.5.0.dist-info/RECORD,,
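Note on the RECORD entries above: each line has the form `path,sha256=<digest>,<size>`, where the digest is the URL-safe base64 encoding of the file's SHA-256 hash with the trailing `=` padding removed. The sketch below shows how such a line could be rebuilt for a file to compare against the listing. The helper name `record_entry` is hypothetical and is not part of DFO-LS or any packaging tool.

```python
import base64
import hashlib


def record_entry(path: str, data: bytes) -> str:
    """Build one RECORD-style line for a file (hypothetical helper).

    The digest is the raw SHA-256 hash, URL-safe base64 encoded,
    with '=' padding stripped, followed by the size in bytes.
    """
    digest = hashlib.sha256(data).digest()
    encoded = base64.urlsafe_b64encode(digest).rstrip(b"=").decode("ascii")
    return f"{path},sha256={encoded},{len(data)}"


if __name__ == "__main__":
    # Example with in-memory bytes; for a real check, read the file in binary mode.
    print(record_entry("dfols/example.py", b"print('hello')\n"))
```

A 32-byte SHA-256 digest always encodes to 43 characters once the padding is removed, which matches the length of the hashes listed above.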
@@ -1,5 +1,5 @@
1
1
  Wheel-Version: 1.0
2
- Generator: bdist_wheel (0.43.0)
2
+ Generator: setuptools (74.1.2)
3
3
  Root-Is-Purelib: true
4
4
  Tag: py3-none-any
5
5
 
dfols/__init__.py CHANGED
@@ -39,7 +39,7 @@ alternative licensing.
39
39
  from __future__ import absolute_import, division, print_function, unicode_literals
40
40
 
41
41
  # DFO-LS version
42
- __version__ = '1.4.1'
42
+ __version__ = '1.5.0'
43
43
 
44
44
  # Main solver & exit flags
45
45
  from .solver import *
dfols/controller.py CHANGED
@@ -100,14 +100,19 @@ class ExitInformation(object):
100
100
 
101
101
 
102
102
  class Controller(object):
103
- def __init__(self, objfun, args, x0, r0, r0_nsamples, xl, xu, projections, npt, rhobeg, rhoend, nf, nx, maxfun, params,
104
- scaling_changes, do_logging):
103
+ def __init__(self, objfun, argsf, x0, r0, r0_nsamples, xl, xu, projections, npt, rhobeg, rhoend, nf, nx, maxfun, params,
104
+ scaling_changes, do_logging, h=None, lh=None, argsh = (), prox_uh=None, argsprox = ()):
105
105
  self.do_logging = do_logging
106
106
  self.objfun = objfun
107
- self.args = args
107
+ self.h = h
108
+ self.argsf = argsf
109
+ self.argsh = argsh
110
+ self.lh = lh
111
+ self.prox_uh = prox_uh #TODO: add instruction for prox_uh
112
+ self.argsprox = argsprox
108
113
  self.maxfun = maxfun
109
- self.model = Model(npt, x0, r0, xl, xu, projections, r0_nsamples, precondition=params("interpolation.precondition"),
110
- abs_tol = params("model.abs_tol"), rel_tol = params("model.rel_tol"), do_logging=do_logging)
114
+ self.model = Model(npt, x0, r0, xl, xu, projections, r0_nsamples, h=self.h, argsh = argsh, precondition=params("interpolation.precondition"),
115
+ abs_tol = params("model.abs_tol"), rel_tol = params("model.rel_tol"), do_logging=do_logging, scaling_changes=scaling_changes)
111
116
  self.nf = nf
112
117
  self.nx = nx
113
118
  self.rhobeg = rhobeg
@@ -230,7 +235,7 @@ class Controller(object):
230
235
  for k in range(0,self.n()):
231
236
  # Evaluate objective at this new point
232
237
  x = self.model.as_absolute_coordinates(D[k, :])
233
- rvec_list, f_list, num_samples_run, exit_info = self.evaluate_objective(x, number_of_samples, params)
238
+ rvec_list, obj_list, num_samples_run, exit_info = self.evaluate_objective(x, number_of_samples, params)
234
239
 
235
240
  # Handle exit conditions (f < min obj value or maxfun reached)
236
241
  if exit_info is not None:
@@ -289,7 +294,7 @@ class Controller(object):
289
294
 
290
295
  # Evaluate objective at this new point
291
296
  x = self.model.as_absolute_coordinates(xpts_added[k, :])
292
- rvec_list, f_list, num_samples_run, exit_info = self.evaluate_objective(x, number_of_samples, params)
297
+ rvec_list, obj_list, num_samples_run, exit_info = self.evaluate_objective(x, number_of_samples, params)
293
298
 
294
299
  # Handle exit conditions (f < min obj value or maxfun reached)
295
300
  if exit_info is not None:
@@ -309,7 +314,7 @@ class Controller(object):
309
314
  # Note: this works because the steps for (k) and (k-n) points were in the same coordinate direction
310
315
  if self.n() + 1 <= k < 2 * self.n() + 1:
311
316
  # Only swap if steps were in different directions AND new pt has lower objective
312
- if stepa * stepb < 0.0 and self.model.fval[k] < self.model.fval[k - self.n()]:
317
+ if stepa * stepb < 0.0 and self.model.objval[k] < self.model.objval[k - self.n()]:
313
318
  xpts_added[[k, k-self.n()]] = xpts_added[[k-self.n(), k]]
314
319
 
315
320
  return None # return & continue
@@ -342,7 +347,7 @@ class Controller(object):
342
347
  for ndirns in range(num_directions):
343
348
  new_point = xopt + dirns[ndirns, :] # always base move around best value so far
344
349
  x = self.model.as_absolute_coordinates(new_point)
345
- rvec_list, f_list, num_samples_run, exit_info = eval_obj_results[ndirns]
350
+ rvec_list, obj_list, num_samples_run, exit_info = eval_obj_results[ndirns]
346
351
  # Handle exit conditions (f < min obj value or maxfun reached)
347
352
  if exit_info is not None:
348
353
  if num_samples_run > 0:
@@ -361,7 +366,7 @@ class Controller(object):
361
366
 
362
367
  # Evaluate objective
363
368
  x = self.model.as_absolute_coordinates(new_point)
364
- rvec_list, f_list, num_samples_run, exit_info = self.evaluate_objective(x, number_of_samples, params)
369
+ rvec_list, obj_list, num_samples_run, exit_info = self.evaluate_objective(x, number_of_samples, params)
365
370
 
366
371
  # Handle exit conditions (f < min obj value or maxfun reached)
367
372
  if exit_info is not None:
@@ -398,7 +403,7 @@ class Controller(object):
398
403
  for j in range(num_steps):
399
404
  xnew = self.model.xopt() + (step_length / LA.norm(dirns[j, :])) * dirns[j, :]
400
405
  x = self.model.as_absolute_coordinates(xnew)
401
- rvec_list, f_list, num_samples_run, exit_info = self.evaluate_objective(x, number_of_samples, params)
406
+ rvec_list, obj_list, num_samples_run, exit_info = self.evaluate_objective(x, number_of_samples, params)
402
407
 
403
408
  # Handle exit conditions (f < min obj value or maxfun reached)
404
409
  if exit_info is not None:
@@ -436,13 +441,85 @@ class Controller(object):
436
441
 
437
442
  return dirn * (step_length / LA.norm(dirn))
438
443
 
439
- def trust_region_step(self, params):
440
- # Build model for full least squares objectives
444
+ def evaluate_criticality_measure(self, params):
445
+ # Calculate criticality measure for regularized problems (h is not None)
446
+
447
+ # Build model for full least squares function
441
448
  gopt, H = self.model.build_full_model()
449
+
450
+ if np.any(np.isnan(gopt)) or np.any(np.isnan(H)) or not np.all(np.isfinite(gopt)) or not np.all(np.isfinite(H)):
451
+ module_logger.debug("nan/inf values in gopt and/or H, skipping ctrsbox_sfista (criticality measure calc)")
452
+ # d = np.zeros(gopt.shape)
453
+ # gnew = gopt.copy()
454
+ # crvmin = -1
455
+ return np.inf
456
+
457
+ # NOTE: smaller params here to get more iterations in S-FISTA
458
+ func_tol = params("func_tol.criticality_measure") * self.delta
442
459
  if self.model.projections:
443
- d, gnew, crvmin = ctrsbox(self.model.xopt(abs_coordinates=True), gopt, H, self.model.projections, self.delta, d_max_iters=params("dykstra.max_iters"), d_tol=params("dykstra.d_tol"))
460
+ d, gnew, crvmin = ctrsbox_sfista(self.model.xopt(abs_coordinates=True), gopt, np.zeros(H.shape), self.model.projections, 1,
461
+ self.h, self.lh, self.prox_uh, argsh = self.argsh, argsprox=self.argsprox, func_tol=func_tol,
462
+ max_iters=params("func_tol.max_iters"), d_max_iters=params("dykstra.max_iters"), d_tol=params("dykstra.d_tol"),
463
+ scaling_changes=self.scaling_changes, sfista_iters_scale=params("sfista.max_iters_scaling"))
464
+ else:
465
+ proj = lambda x: pbox(x, self.model.sl, self.model.su)
466
+ d, gnew, crvmin = ctrsbox_sfista(self.model.xopt(abs_coordinates=True), gopt, np.zeros(H.shape), [proj], 1,
467
+ self.h, self.lh, self.prox_uh, argsh = self.argsh, argsprox=self.argsprox, func_tol=func_tol,
468
+ max_iters=params("func_tol.max_iters"), d_max_iters=params("dykstra.max_iters"), d_tol=params("dykstra.d_tol"),
469
+ scaling_changes=self.scaling_changes, sfista_iters_scale=params("sfista.max_iters_scaling"))
470
+
471
+ # Calculate criticality measure
472
+ criticality_measure = self.h(remove_scaling(self.model.xopt(abs_coordinates=True), self.scaling_changes), *self.argsh) - model_value(gopt, np.zeros(H.shape), d, self.model.xopt(abs_coordinates=True), self.h, self.argsh, self.scaling_changes)
473
+ return criticality_measure
474
+
475
+ def trust_region_step(self, params, criticality_measure=1e-2):
476
+ # Build model for full least squares function
477
+ gopt, H = self.model.build_full_model()
478
+ # Build func_tol for trust region step
479
+ # QUESTION: c1 = min{1, 1/delta_max^2}, but choose c1=1 here; choose maxhessian = max(||H||_2,1)
480
+ # QUESTION: when criticality_measure = 0? choose max(criticality_measure,1)
481
+ func_tol = (1-params("func_tol.tr_step")) * 1 * max(criticality_measure,1) * min(self.delta, max(criticality_measure,1) / max(np.linalg.norm(H, 2),1))
482
+
483
+ if self.h is None:
484
+ if self.model.projections:
485
+ # Running PGD/SFISTA is generally slower than trsbox, so don't do this if gopt or H have bad values
486
+ # (this will ultimately lead to a manual setting of d=0 and calling a safety step anyway)
487
+ if np.any(np.isnan(gopt)) or np.any(np.isnan(H)) or not np.all(np.isfinite(gopt)) or not np.all(np.isfinite(H)):
488
+ module_logger.debug("nan/inf values in gopt and/or H, skipping ctrsbox_pgd")
489
+ d = np.zeros(gopt.shape)
490
+ gnew = gopt.copy()
491
+ crvmin = -1
492
+ else:
493
+ d, gnew, crvmin = ctrsbox_pgd(self.model.xopt(abs_coordinates=True), gopt, H, self.model.projections, self.delta, d_max_iters=params("dykstra.max_iters"), d_tol=params("dykstra.d_tol"))
494
+ else:
495
+ d, gnew, crvmin = trsbox(self.model.xopt(), gopt, H, self.model.sl, self.model.su, self.delta)
444
496
  else:
445
- d, gnew, crvmin = trsbox(self.model.xopt(), gopt, H, self.model.sl, self.model.su, self.delta)
497
+ # Running PGD/SFISTA is generally slower than trsbox, so don't do this if gopt or H have bad values
498
+ # (this will ultimately lead to a manual setting of d=0 and calling a safety step anyway)
499
+ if np.any(np.isnan(gopt)) or np.any(np.isnan(H)) or not np.all(np.isfinite(gopt)) or not np.all(np.isfinite(H)):
500
+ module_logger.debug("nan/inf values in gopt and/or H, skipping ctrsbox_sfista")
501
+ d = np.zeros(gopt.shape)
502
+ gnew = gopt.copy()
503
+ crvmin = -1
504
+ elif self.model.projections:
505
+ d, gnew, crvmin = ctrsbox_sfista(self.model.xopt(abs_coordinates=True), gopt, H, self.model.projections, self.delta,
506
+ self.h, self.lh, self.prox_uh, argsh = self.argsh, argsprox=self.argsprox, func_tol=func_tol,
507
+ max_iters=params("func_tol.max_iters"), d_max_iters=params("dykstra.max_iters"), d_tol=params("dykstra.d_tol"),
508
+ scaling_changes=self.scaling_changes, sfista_iters_scale=params("sfista.max_iters_scaling"))
509
+ else:
510
+ # NOTE: alternative way if using trsbox
511
+ # d, gnew, crvmin = trsbox(self.model.xopt(), gopt, H, self.model.sl, self.model.su, self.delta)
512
+ proj = lambda x: pbox(x, self.model.sl, self.model.su)
513
+ d, gnew, crvmin = ctrsbox_sfista(self.model.xopt(abs_coordinates=True), gopt, H, [proj], self.delta,
514
+ self.h, self.lh, self.prox_uh, argsh = self.argsh, argsprox=self.argsprox, func_tol=func_tol,
515
+ max_iters=params("func_tol.max_iters"), d_max_iters=params("dykstra.max_iters"), d_tol=params("dykstra.d_tol"),
516
+ scaling_changes=self.scaling_changes, sfista_iters_scale=params("sfista.max_iters_scaling"))
517
+
518
+ # NOTE: check sufficient decrease. If increase in the model, set zero step
519
+ pred_reduction = self.h(remove_scaling(self.model.xopt(abs_coordinates=True), self.scaling_changes), *self.argsh) - model_value(gopt, H, d, self.model.xopt(abs_coordinates=True), self.h, self.argsh, self.scaling_changes)
520
+ if pred_reduction < 0.0:
521
+ d = np.zeros(d.shape)
522
+
446
523
  return d, gopt, H, gnew, crvmin
447
524
 
448
525
  def geometry_step(self, knew, adelt, number_of_samples, params):
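Note on the `trust_region_step` hunk above: the new code repeats the same NaN/inf guard three times and computes the S-FISTA tolerance `func_tol` inline. The following is a minimal standalone sketch of those two pieces, written to mirror the expressions in the diff. The function names `has_bad_values` and `tr_step_func_tol` are hypothetical, and the value passed for `tr_step` in the example is an assumption, since the default of `params("func_tol.tr_step")` is not shown in this diff.

```python
import numpy as np


def has_bad_values(gopt, H):
    """Return True if the model gradient or Hessian contains NaN or +/-inf.

    np.isfinite is already False for NaN, so a single isfinite check
    covers both conditions tested in the diff.
    """
    return not (np.all(np.isfinite(gopt)) and np.all(np.isfinite(H)))


def tr_step_func_tol(tr_step, criticality_measure, delta, H):
    """Mirror the func_tol expression from trust_region_step, with c1 = 1."""
    crit = max(criticality_measure, 1)          # floor the criticality measure at 1
    max_hessian = max(np.linalg.norm(H, 2), 1)  # spectral norm, floored at 1
    return (1 - tr_step) * crit * min(delta, crit / max_hessian)


if __name__ == "__main__":
    H = 4.0 * np.eye(2)
    g = np.array([1.0, -2.0])
    if not has_bad_values(g, H):
        # tr_step=0.8 is an illustrative value only
        print(tr_step_func_tol(0.8, 1e-2, 0.5, H))
```

Because `max(criticality_measure, 1)` floors the measure at 1, the default `criticality_measure=1e-2` has no effect on the result in this expression, which may be what the QUESTION comments in the diff are flagging.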
@@ -463,10 +540,10 @@ class Controller(object):
463
540
  return exit_info # didn't fix geometry - return & quit
464
541
 
465
542
  gopt, H = self.model.build_full_model() # save here, to calculate predicted value from geometry step
466
- fopt = self.model.fopt() # again, evaluate now, before model.change_point()
543
+ objopt = self.model.objopt() # again, evaluate now, before model.change_point()
467
544
  d = xnew - self.model.xopt()
468
545
  x = self.model.as_absolute_coordinates(xnew)
469
- rvec_list, f_list, num_samples_run, exit_info = self.evaluate_objective(x, number_of_samples, params)
546
+ rvec_list, obj_list, num_samples_run, exit_info = self.evaluate_objective(x, number_of_samples, params)
470
547
 
471
548
  # Handle exit conditions (f < min obj value or maxfun reached)
472
549
  if exit_info is not None:
@@ -481,11 +558,14 @@ class Controller(object):
481
558
  self.model.add_new_sample(knew, rvec_extra=rvec_list[i, :])
482
559
 
483
560
  # Estimate actual reduction to add to diffs vector
484
- f = sumsq(np.mean(rvec_list[:num_samples_run, :], axis=0)) # estimate actual objective value
485
-
561
+ obj = sumsq(np.mean(rvec_list[:num_samples_run, :], axis=0)) # estimate actual objective value
486
562
  # pred_reduction = - calculate_model_value(gopt, H, d)
487
563
  pred_reduction = - model_value(gopt, H, d)
488
- actual_reduction = fopt - f
564
+ if self.h is not None:
565
+ obj += self.h(remove_scaling(x, self.scaling_changes), *self.argsh)
566
+ # since m(0) = h(x)
567
+ pred_reduction = self.h(remove_scaling(x, self.scaling_changes), *self.argsh) - model_value(gopt, H, d, x, self.h, self.argsh, self.scaling_changes)
568
+ actual_reduction = objopt - obj
489
569
  self.diffs = [abs(pred_reduction - actual_reduction), self.diffs[0], self.diffs[1]]
490
570
  return None # exit_info = None
491
571
 
@@ -513,7 +593,7 @@ class Controller(object):
513
593
  def evaluate_objective(self, x, number_of_samples, params):
514
594
  # Sample from objective function several times, keeping track of maxfun and min_obj_value throughout
515
595
  rvec_list = np.zeros((number_of_samples, self.m()))
516
- f_list = np.zeros((number_of_samples,))
596
+ obj_list = np.zeros((number_of_samples,))
517
597
  num_samples_run = 0
518
598
  incremented_nx = False
519
599
  exit_info = None
@@ -527,19 +607,24 @@ class Controller(object):
527
607
  if not incremented_nx:
528
608
  self.nx += 1
529
609
  incremented_nx = True
530
- rvec_list[i, :], f_list[i] = eval_least_squares_objective(self.objfun, remove_scaling(x, self.scaling_changes),
531
- args=self.args, eval_num=self.nf, pt_num=self.nx,
610
+ rvec_list[i, :], obj_list[i] = eval_least_squares_with_regularisation(self.objfun, remove_scaling(x, self.scaling_changes), self.h,
611
+ argsf=self.argsf, argsh=self.argsh, verbose=self.do_logging, eval_num=self.nf, pt_num=self.nx,
532
612
  full_x_thresh=params("logging.n_to_print_whole_x_vector"),
533
- check_for_overflow=params("general.check_objfun_for_overflow"),
534
- verbose=self.do_logging)
613
+ check_for_overflow=params("general.check_objfun_for_overflow"))
535
614
  num_samples_run += 1
536
615
 
537
616
  # Check if the average value was below our threshold
538
- if num_samples_run > 0 and \
539
- sumsq(np.mean(rvec_list[:num_samples_run, :], axis=0)) <= self.model.min_objective_value():
540
- exit_info = ExitInformation(EXIT_SUCCESS, "Objective is sufficiently small")
617
+ # QUESTION: how to choose x in h when using averaged values
618
+ if self.h is None:
619
+ if num_samples_run > 0 and \
620
+ sumsq(np.mean(rvec_list[:num_samples_run, :], axis=0)) <= self.model.min_objective_value():
621
+ exit_info = ExitInformation(EXIT_SUCCESS, "Objective is sufficiently small")
622
+ else:
623
+ if num_samples_run > 0 and \
624
+ sumsq(np.mean(rvec_list[:num_samples_run, :], axis=0)) + self.h(remove_scaling(x, self.scaling_changes),*self.argsh) <= self.model.min_objective_value():
625
+ exit_info = ExitInformation(EXIT_SUCCESS, "Objective is sufficiently small")
541
626
 
542
- return rvec_list, f_list, num_samples_run, exit_info
627
+ return rvec_list, obj_list, num_samples_run, exit_info
543
628
 
544
629
  def choose_point_to_replace(self, d, skip_kopt=True):
545
630
  delsq = self.delta ** 2
@@ -615,11 +700,18 @@ class Controller(object):
615
700
  self.last_successful_iter = current_iter # reset successful iteration check
616
701
  return
617
702
 
618
- def calculate_ratio(self, current_iter, rvec_list, d, gopt, H):
703
+ def calculate_ratio(self, x, current_iter, rvec_list, d, gopt, H):
619
704
  exit_info = None
620
- f = sumsq(np.mean(rvec_list, axis=0)) # estimate actual objective value
621
- pred_reduction = - model_value(gopt, H, d) # negative of m since m(0) = 0
622
- actual_reduction = self.model.fopt() - f
705
+ # estimate actual objective value
706
+ obj = sumsq(np.mean(rvec_list, axis=0))
707
+ # pred_reduction = - calculate_model_value(gopt, H, d)
708
+ pred_reduction = - model_value(gopt, H, d)
709
+ if self.h is not None:
710
+ # QUESTION: x+d here correct? rvec_list takes mean value
711
+ obj += self.h(remove_scaling(x+d, self.scaling_changes), *self.argsh)
712
+ # since m(0) = h(x)
713
+ pred_reduction = self.h(remove_scaling(x, self.scaling_changes), *self.argsh) - model_value(gopt, H, d, x, self.h, self.argsh, self.scaling_changes)
714
+ actual_reduction = self.model.objopt() - obj
623
715
  self.diffs = [abs(actual_reduction - pred_reduction), self.diffs[0], self.diffs[1]]
624
716
  if min(sqrt(sumsq(d)), self.delta) > self.rho: # if ||d|| >= rho, successful!
625
717
  self.last_successful_iter = current_iter
@@ -627,8 +719,7 @@ class Controller(object):
627
719
  if len(self.model.projections) > 1: # if we are using multiple projections, only warn since likely due to constraint intersection
628
720
  exit_info = ExitInformation(EXIT_TR_INCREASE_WARNING, "Either multiple constraints are active or trust region step gave model increase")
629
721
  else:
630
- exit_info = ExitInformation(EXIT_TR_INCREASE_ERROR, "Either rust region step gave model increase")
631
-
722
+ exit_info = ExitInformation(EXIT_TR_INCREASE_ERROR, "Trust region step gave model increase")
632
723
  ratio = actual_reduction / pred_reduction
633
724
  return ratio, exit_info
634
725
 
@@ -636,13 +727,13 @@ class Controller(object):
636
727
  if len(self.last_iters_step_taken) <= params("slow.history_for_slow"):
637
728
  # Not enough info, simply append
638
729
  self.last_iters_step_taken.append(current_iter)
639
- self.last_fopts_step_taken.append(self.model.fopt())
730
+ self.last_fopts_step_taken.append(self.model.objopt())
640
731
  this_iter_slow = False
641
732
  else:
642
733
  # Enough info - shift values
643
734
  self.last_iters_step_taken = self.last_iters_step_taken[1:] + [current_iter]
644
- self.last_fopts_step_taken = self.last_fopts_step_taken[1:] + [self.model.fopt()]
645
- this_iter_slow = (log(self.last_fopts_step_taken[0]) - log(self.model.fopt())) / \
735
+ self.last_fopts_step_taken = self.last_fopts_step_taken[1:] + [self.model.objopt()]
736
+ this_iter_slow = (log(self.last_fopts_step_taken[0]) - log(self.model.objopt())) / \
646
737
  float(params("slow.history_for_slow")) < params("slow.thresh_for_slow")
647
738
  # Update counter of number of slow iterations
648
739
  if this_iter_slow:
@@ -659,9 +750,9 @@ class Controller(object):
659
750
  def soft_restart(self, number_of_samples, nruns_so_far, params, x_in_abs_coords_to_save=None, rvec_to_save=None,
660
751
  nsamples_to_save=None):
661
752
  # A successful run is one where we reduced fopt
662
- if self.model.fopt() < self.last_run_fopt:
753
+ if self.model.objopt() < self.last_run_fopt:
663
754
  self.last_successful_run = nruns_so_far
664
- self.last_run_fopt = self.model.fopt()
755
+ self.last_run_fopt = self.model.objopt()
665
756
 
666
757
  ok_to_do_restart = (nruns_so_far - self.last_successful_run < params("restarts.max_unsuccessful_restarts")) and \
667
758
  (self.nf < self.maxfun)
@@ -682,7 +773,7 @@ class Controller(object):
682
773
  self.model.nsamples[self.model.kopt], x_in_abs_coords=True)
683
774
 
684
775
  if self.do_logging:
685
- module_logger.info("Soft restart [currently, f = %g after %g function evals]" % (self.model.fopt(), self.nf))
776
+ module_logger.info("Soft restart [currently, f = %g after %g function evals]" % (self.model.objopt(), self.nf))
686
777
  # Resetting method: reset delta and rho, then move the closest 'num_steps' points to xk to improve geometry
687
778
  # Note: closest points because we are suddenly increasing delta & rho, so we want to encourage spreading out points
688
779
  self.delta = self.rhobeg
@@ -724,7 +815,7 @@ class Controller(object):
724
815
  for i in range(num_pts_to_add):
725
816
  xnew = self.model.xopt() + dirns[i, :] # always base move around best value so far
726
817
  x = self.model.as_absolute_coordinates(xnew)
727
- rvec_list, f_list, num_samples_run, exit_info = self.evaluate_objective(x, number_of_samples, params)
818
+ rvec_list, obj_list, num_samples_run, exit_info = self.evaluate_objective(x, number_of_samples, params)
728
819
 
729
820
  # Handle exit conditions (f < min obj value or maxfun reached)
730
821
  if exit_info is not None:
@@ -771,11 +862,11 @@ class Controller(object):
771
862
  add_noise = params("noise.scale_factor_for_quit") * params("noise.additive_noise_level")
772
863
  for k in range(self.model.npt()):
773
864
  all_fvals_within_noise = all_fvals_within_noise and \
774
- (self.model.fval[k] <= self.model.fopt() + add_noise / sqrt(self.model.nsamples[k]))
865
+ (self.model.objval[k] <= self.model.objopt() + add_noise / sqrt(self.model.nsamples[k]))
775
866
  else: # noise_level_multiplicative
776
867
  ratio = 1.0 + params("noise.scale_factor_for_quit") * params("noise.multiplicative_noise_level")
777
868
  for k in range(self.model.npt()):
778
- this_ratio = self.model.fval[k] / self.model.fopt() # fval_opt strictly positive (would have quit o/w)
869
+ this_ratio = self.model.objval[k] / self.model.objopt() # fval_opt strictly positive (would have quit o/w)
779
870
  all_fvals_within_noise = all_fvals_within_noise and (
780
871
  this_ratio <= ratio / sqrt(self.model.nsamples[k]))
781
872
  return all_fvals_within_noise
@@ -804,7 +895,7 @@ class Controller(object):
804
895
  dirns[i, :] = -dirns[i, :]
805
896
  xnew = np.maximum(np.minimum(self.model.xopt() + dirns[i, :], self.model.su), self.model.sl)
806
897
  x = self.model.as_absolute_coordinates(xnew)
807
- rvec_list, f_list, num_samples_run, exit_info = self.evaluate_objective(x, number_of_samples, params)
898
+ rvec_list, obj_list, num_samples_run, exit_info = self.evaluate_objective(x, number_of_samples, params)
808
899
 
809
900
  # Handle exit conditions (f < min obj value or maxfun reached)
810
901
  if exit_info is not None:
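Note on the `controller.py` changes above: the renaming from `f`/`fopt` to `obj`/`objopt` reflects that the tracked value is now the sum of squared residuals plus an optional regularizer `h(x)`. The sketch below illustrates how the actual and predicted reductions in `calculate_ratio` fit together under that reading. It rests on two assumptions: the updated `model_value` (in `util.py`, not shown in this diff) is taken to return `g^T d + 0.5 d^T H d + h(x + d)`, consistent with the "since m(0) = h(x)" comment, and variable scaling (`remove_scaling`) is omitted. All function names here are hypothetical.

```python
import numpy as np


def sumsq(v):
    """Sum of squares of a vector."""
    return float(np.dot(v, v))


def regularized_objective(rvec_samples, x, h=None, argsh=()):
    """Sum of squares of the sample-averaged residuals, plus h(x) if supplied."""
    obj = sumsq(np.mean(rvec_samples, axis=0))
    if h is not None:
        obj += h(x, *argsh)
    return obj


def predicted_reduction(gopt, H, d, x, h=None, argsh=()):
    """m(0) - m(d), assuming m(s) = g^T s + 0.5 s^T H s + h(x + s)."""
    quad = float(np.dot(gopt, d) + 0.5 * np.dot(d, H.dot(d)))
    if h is None:
        return -quad  # m(0) = 0 when there is no regularizer
    return h(x, *argsh) - (quad + h(x + d, *argsh))


if __name__ == "__main__":
    # Example regularizer: scaled L1 norm (illustrative choice)
    l1 = lambda x, lam: lam * float(np.sum(np.abs(x)))
    x = np.array([1.0, -2.0])
    d = np.array([0.5, 0.0])
    g = np.array([-1.0, 0.0])
    H = np.eye(2)
    print(predicted_reduction(g, H, d, x, h=l1, argsh=(0.5,)))
```

The ratio used to accept or reject a step is then `actual_reduction / pred_reduction`, with `actual_reduction` computed as the current best objective minus `regularized_objective` evaluated at the trial point.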
dfols/model.py CHANGED
@@ -36,7 +36,7 @@ import numpy as np
36
36
  import scipy.linalg as LA
37
37
 
38
38
  from .trust_region import trsbox_geometry
39
- from .util import sumsq, dykstra
39
+ from .util import sumsq, dykstra, remove_scaling
40
40
 
41
41
  __all__ = ['Model']
42
42
 
@@ -44,8 +44,8 @@ module_logger = logging.getLogger(__name__)
44
44
 
45
45
 
46
46
  class Model(object):
47
- def __init__(self, npt, x0, r0, xl, xu, projections, r0_nsamples, n=None, m=None, abs_tol=1e-12, rel_tol=1e-20, precondition=True,
48
- do_logging=True):
47
+ def __init__(self, npt, x0, r0, xl, xu, projections, r0_nsamples, h=None, argsh=(), n=None, m=None, abs_tol=1e-12, rel_tol=1e-20, precondition=True,
48
+ do_logging=True, scaling_changes=None):
49
49
  if n is None:
50
50
  n = len(x0)
51
51
  if m is None:
@@ -56,11 +56,15 @@ class Model(object):
56
56
  assert xu.shape == (n,), "xu has wrong shape (got %s, expect (%g,))" % (str(xu.shape), n)
57
57
  assert r0.shape == (m,), "r0 has wrong shape (got %s, expect (%g,))" % (str(r0.shape), m)
58
58
  self.do_logging = do_logging
59
+ self.scaling_changes = scaling_changes
59
60
  self.dim = n
60
61
  self.resid_dim = m
61
62
  self.num_pts = npt
62
63
  self.npt_so_far = 1 # number of points added so far (with function values)
63
64
 
65
+ self.h = h
66
+ self.argsh = argsh
67
+
64
68
  # Initialise to blank some useful stuff
65
69
  # Interpolation points
66
70
  self.xbase = x0.copy()
@@ -72,12 +76,15 @@ class Model(object):
72
76
  # Function values
73
77
  self.fval_v = np.inf * np.ones((npt, m)) # residuals for each xpt
74
78
  self.fval_v[0, :] = r0.copy()
75
- self.fval = np.inf * np.ones((npt, )) # overall objective value for each xpt
76
- self.fval[0] = sumsq(r0)
79
+
80
+ self.objval = np.inf * np.ones((npt, )) # overall objective value for each xpt
81
+ self.objval[0] = sumsq(r0)
82
+ if h is not None:
83
+ self.objval[0] += h(remove_scaling(x0, self.scaling_changes), *argsh)
77
84
  self.kopt = 0 # index of current iterate (should be best value so far)
78
85
  self.nsamples = np.zeros((npt,), dtype=int) # number of samples used to evaluate objective at each point
79
86
  self.nsamples[0] = r0_nsamples
80
- self.fbeg = self.fval[0] # f(x0), saved to check for sufficient reduction
87
+ self.objbeg = self.objval[0] # f(x0), saved to check for sufficient reduction
81
88
 
82
89
  # Termination criteria
83
90
  self.abs_tol = abs_tol
@@ -90,7 +97,7 @@ class Model(object):
90
97
  # Saved point (in absolute coordinates) - always check this value before quitting solver
91
98
  self.xsave = None
92
99
  self.rsave = None
93
- self.fsave = None
100
+ self.objsave = None
94
101
  self.jacsave = None
95
102
  self.nsamples_save = None
96
103
 
@@ -118,8 +125,8 @@ class Model(object):
118
125
  def ropt(self):
119
126
  return self.fval_v[self.kopt, :] # residuals for current iterate
120
127
 
121
- def fopt(self):
122
- return self.fval[self.kopt]
128
+ def objopt(self):
129
+ return self.objval[self.kopt]
123
130
 
124
131
  def xpt(self, k, abs_coordinates=False):
125
132
  assert 0 <= k < self.npt(), "Invalid index %g" % k
@@ -135,9 +142,9 @@ class Model(object):
135
142
  assert 0 <= k < self.npt(), "Invalid index %g" % k
136
143
  return self.fval_v[k, :]
137
144
 
138
- def fval(self, k):
145
+ def objval(self, k):
139
146
  assert 0 <= k < self.npt(), "Invalid index %g" % k
140
- return self.fval[k]
147
+ return self.objval[k]
141
148
 
142
149
  def as_absolute_coordinates(self, x, full_dykstra=False):
143
150
  # If x were an interpolation point, get the absolute coordinates of x
@@ -177,18 +184,20 @@ class Model(object):
177
184
 
178
185
  self.points[k, :] = x.copy()
179
186
  self.fval_v[k, :] = rvec.copy()
180
- self.fval[k] = sumsq(rvec)
187
+ self.objval[k] = sumsq(rvec)
188
+ if self.h is not None:
189
+ self.objval[k] += self.h(remove_scaling(self.xbase + x, self.scaling_changes), *self.argsh)
181
190
  self.nsamples[k] = 1
182
191
  self.factorisation_current = False
183
192
 
184
- if allow_kopt_update and self.fval[k] < self.fopt():
193
+ if allow_kopt_update and self.objval[k] < self.objopt():
185
194
  self.kopt = k
186
195
  return
187
196
 
188
197
  def swap_points(self, k1, k2):
189
198
  self.points[[k1, k2], :] = self.points[[k2, k1], :]
190
199
  self.fval_v[[k1, k2], :] = self.fval_v[[k2, k1], :]
191
- self.fval[[k1, k2]] = self.fval[[k2, k1]]
200
+ self.objval[[k1, k2]] = self.objval[[k2, k1]]
192
201
  if self.kopt == k1:
193
202
  self.kopt = k2
194
203
  elif self.kopt == k2:
@@ -201,22 +210,27 @@ class Model(object):
201
210
  assert 0 <= k < self.npt(), "Invalid index %g" % k
202
211
  t = float(self.nsamples[k]) / float(self.nsamples[k] + 1)
203
212
  self.fval_v[k, :] = t * self.fval_v[k, :] + (1 - t) * rvec_extra
204
- self.fval[k] = sumsq(self.fval_v[k, :])
213
+ # NOTE: how to sample when we have h? still at xpt(k), then add h(xpt(k)). Modify test if incorrect!
214
+ self.objval[k] = sumsq(self.fval_v[k, :])
215
+ if self.h is not None:
216
+ self.objval[k] += self.h(remove_scaling(self.xbase + self.points[k, :], self.scaling_changes), *self.argsh)
205
217
  self.nsamples[k] += 1
206
218
 
207
- self.kopt = np.argmin(self.fval[:self.npt()]) # make sure kopt is always the best value we have
219
+ self.kopt = np.argmin(self.objval[:self.npt()]) # make sure kopt is always the best value we have
208
220
  return
209
221
 
210
222
  def add_new_point(self, x, rvec):
211
223
  self.points = np.append(self.points, x.reshape((1, self.n())), axis=0) # append row to xpt
212
224
  self.fval_v = np.append(self.fval_v, rvec.reshape((1, self.m())), axis=0) # append row to fval_v
213
- f = np.dot(rvec, rvec)
214
- self.fval = np.append(self.fval, f) # append entry to fval
225
+ obj = sumsq(rvec)
226
+ if self.h is not None:
227
+ obj += self.h(remove_scaling(self.xbase + x, self.scaling_changes), *self.argsh)
228
+ self.objval = np.append(self.objval, obj) # append entry to objval
215
229
  self.nsamples = np.append(self.nsamples, 1) # add new sample number
216
230
  self.num_pts += 1 # make sure npt is updated
217
231
  self.npt_so_far += 1
218
232
 
219
- if f < self.fopt():
233
+ if obj < self.objopt():
220
234
  self.kopt = self.npt() - 1
221
235
 
222
236
  self.factorisation_current = False
@@ -236,11 +250,14 @@ class Model(object):
236
250
  return
237
251
 
238
252
  def save_point(self, x, rvec, nsamples, x_in_abs_coords=True):
239
- f = sumsq(rvec)
240
- if self.fsave is None or f <= self.fsave:
241
- self.xsave = x.copy() if x_in_abs_coords else self.as_absolute_coordinates(x)
253
+ xabs = x.copy() if x_in_abs_coords else self.as_absolute_coordinates(x)
254
+ obj = sumsq(rvec)
255
+ if self.h is not None:
256
+ obj += self.h(remove_scaling(xabs, self.scaling_changes), *self.argsh)
257
+ if self.objsave is None or obj <= self.objsave:
258
+ self.xsave = xabs
242
259
  self.rsave = rvec.copy()
243
- self.fsave = f
260
+ self.objsave = obj
244
261
  self.jacsave = self.model_jac.copy()
245
262
  self.nsamples_save = nsamples
246
263
  return True
@@ -248,15 +265,15 @@ class Model(object):
248
265
  return False # this value is worse than what we have already - didn't save
249
266
 
250
267
  def get_final_results(self):
251
- # Return x and fval for optimal point (either from xsave+fsave or kopt)
252
- if self.fsave is None or self.fopt() <= self.fsave: # optimal has changed since xsave+fsave were last set
253
- return self.xopt(abs_coordinates=True).copy(), self.ropt().copy(), self.fopt(), self.model_jac.copy(), self.nsamples[self.kopt]
268
+ # Return x and objval for optimal point (either from xsave+objsave or kopt)
269
+ if self.objsave is None or self.objopt() <= self.objsave: # optimal has changed since xsave+objsave were last set
270
+ return self.xopt(abs_coordinates=True).copy(), self.ropt().copy(), self.objopt(), self.model_jac.copy(), self.nsamples[self.kopt]
254
271
  else:
255
- return self.xsave.copy(), self.rsave.copy(), self.fsave, self.jacsave, self.nsamples_save
272
+ return self.xsave.copy(), self.rsave.copy(), self.objsave, self.jacsave, self.nsamples_save
256
273
 
257
274
  def min_objective_value(self):
258
275
  # Get termination criterion for f small: f <= abs_tol or f <= rel_tol * f0
259
- return max(self.abs_tol, self.rel_tol * self.fbeg)
276
+ return max(self.abs_tol, self.rel_tol * self.objbeg)
260
277
 
261
278
  def model_value(self, d, d_based_at_xopt=True, with_const_term=False):
262
279
  if d_based_at_xopt:
@@ -375,7 +392,7 @@ class Model(object):
375
392
  return True, interp_error, sqrt(norm_J_error), linalg_resid, ls_interp_cond_num # flag ok
376
393
 
377
394
  def build_full_model(self):
378
- # Build full least squares objective model from mini-models
395
+ # Build full least squares model from mini-models
379
396
  # Centred around xopt
380
397
  r = self.model_const + np.dot(self.model_jac, self.xopt()) # constant term (for inexact interpolation)
381
398
  J = self.model_jac
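The `objval` bookkeeping in the model.py hunks above follows one pattern: the sum of squared residuals, plus `h` evaluated at the unscaled point whenever a regulariser is supplied. A minimal sketch of that pattern, assuming a stand-in `sumsq` and a hypothetical helper named `regularised_objective` (neither is copied from the package, and the l1 regulariser is only an illustration):

```python
import numpy as np

def sumsq(r):
    # Stand-in for the package's sum-of-squares helper (assumed behaviour).
    return float(np.dot(r, r))

def regularised_objective(rvec, x, h=None, argsh=()):
    # Hypothetical helper: ||r(x)||^2, plus h(x, *argsh) when h is given.
    obj = sumsq(rvec)
    if h is not None:
        obj += h(x, *argsh)
    return obj

# Example regulariser (an assumption for illustration): lam * ||x||_1
l1 = lambda x, lam: lam * np.sum(np.abs(x))

r = np.array([1.0, 2.0])
x = np.array([3.0, -4.0])
plain = regularised_objective(r, x)                        # residual term only
with_h = regularised_objective(r, x, h=l1, argsh=(0.5,))   # residual term + h(x)
```

With `h=None` the value reduces to the plain least-squares objective, which matches the renamed `objval` fields behaving like the old `fval` fields for unregularised problems.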
dfols/params.py CHANGED
@@ -82,7 +82,7 @@ class ParameterList(object):
82
82
  self.params["restarts.use_soft_restarts"] = True
83
83
  self.params["restarts.soft.num_geom_steps"] = 3
84
84
  self.params["restarts.soft.move_xk"] = True
85
- self.params["restarts.soft.max_fake_successful_steps"] = maxfun # number ratio>0 steps below fsave allowed
85
+ self.params["restarts.soft.max_fake_successful_steps"] = maxfun # number ratio>0 steps below objsave allowed
86
86
  self.params["restarts.hard.use_old_rk"] = True # recycle r(xk) from previous run?
87
87
  self.params["restarts.increase_npt"] = False
88
88
  self.params["restarts.increase_npt_amt"] = 1
@@ -109,12 +109,20 @@ class ParameterList(object):
109
109
  self.params["growing.full_rank.min_sing_val"] = 1e-6 # absolute floor on singular values
110
110
  self.params["growing.full_rank.svd_max_jac_cond"] = 1e8 # maximum condition number of Jacobian
111
111
  self.params["growing.perturb_trust_region_step"] = False # add random direction onto TRS solution?
112
+
112
113
  # Dykstra's algorithm
113
114
  self.params["dykstra.d_tol"] = 1e-10
114
115
  self.params["dykstra.max_iters"] = 100
116
+
115
117
  # Matrix rank algorithm
116
118
  self.params["matrix_rank.r_tol"] = 1e-18
117
-
119
+
120
+ # Function tolerance when applying S-FISTA method
121
+ self.params["func_tol.criticality_measure"] = 1e-3
122
+ self.params["func_tol.tr_step"] = 1-1e-1
123
+ self.params["func_tol.max_iters"] = 500
124
+ self.params["sfista.max_iters_scaling"] = 2.0
125
+
118
126
  self.params_changed = {}
119
127
  for p in self.params:
120
128
  self.params_changed[p] = False
@@ -268,6 +276,14 @@ class ParameterList(object):
268
276
  type_str, nonetype_ok, lower, upper = 'int', False, 0, None
269
277
  elif key == "matrix_rank.r_tol":
270
278
  type_str, nonetype_ok, lower, upper = 'float', False, 0.0, None
279
+ elif key == "func_tol.criticality_measure":
280
+ type_str, nonetype_ok, lower, upper = 'float', False, 0.0, 1.0
281
+ elif key == "func_tol.tr_step":
282
+ type_str, nonetype_ok, lower, upper = 'float', False, 0.0, 1.0
283
+ elif key == "func_tol.max_iters":
284
+ type_str, nonetype_ok, lower, upper = 'int', False, 0, None
285
+ elif key == "sfista.max_iters_scaling":
286
+ type_str, nonetype_ok, lower, upper = 'float', False, 1.0, None
271
287
  else:
272
288
  assert False, "ParameterList.param_type() has unknown key: %s" % key
273
289
  return type_str, nonetype_ok, lower, upper
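The `(type_str, nonetype_ok, lower, upper)` tuples returned by `param_type()` above describe the allowed type and range for each parameter. How the package enforces them is not shown in this diff, so the following is only a sketch of one plausible check, with a hypothetical function name `check_param` and the assumption that both bounds are inclusive:

```python
def check_param(value, type_str, nonetype_ok, lower, upper):
    # Hypothetical validator; inclusive bounds are an assumption.
    if value is None:
        return nonetype_ok
    if isinstance(value, bool):
        return False  # bools are ints in Python, so reject them explicitly
    if type_str == 'int' and not isinstance(value, int):
        return False
    if type_str == 'float' and not isinstance(value, (int, float)):
        return False
    if lower is not None and value < lower:
        return False
    if upper is not None and value > upper:
        return False
    return True

# Specs copied from the new parameters in the hunk above
ok_crit = check_param(1e-3, 'float', False, 0.0, 1.0)    # func_tol.criticality_measure
bad_step = check_param(1.5, 'float', False, 0.0, 1.0)    # func_tol.tr_step out of range
ok_iters = check_param(500, 'int', False, 0, None)       # func_tol.max_iters
bad_scale = check_param(0.5, 'float', False, 1.0, None)  # sfista.max_iters_scaling < 1
```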
dfols/solver.py CHANGED
@@ -48,10 +48,10 @@ module_logger = logging.getLogger(__name__)
48
48
 
49
49
  # A container for the results of the optimization routine
50
50
  class OptimResults(object):
51
- def __init__(self, xmin, rmin, fmin, jacmin, nf, nx, nruns, exit_flag, exit_msg):
51
+ def __init__(self, xmin, rmin, objmin, jacmin, nf, nx, nruns, exit_flag, exit_msg):
52
52
  self.x = xmin
53
53
  self.resid = rmin
54
- self.f = fmin
54
+ self.obj = objmin
55
55
  self.jacobian = jacmin
56
56
  self.nf = nf
57
57
  self.nx = nx
@@ -77,7 +77,7 @@ class OptimResults(object):
77
77
  output += "Residual vector = %s\n" % str(self.resid)
78
78
  else:
79
79
  output += "Not showing residual vector because it is too long; check self.resid\n"
80
- output += "Objective value f(xmin) = %.10g\n" % self.f
80
+ output += "Objective value f(xmin) = %.10g\n" % self.obj
81
81
  output += "Needed %g objective evaluations (at %g points)\n" % (self.nf, self.nx)
82
82
  if self.nruns > 1:
83
83
  output += "Did a total of %g runs\n" % self.nruns
@@ -95,8 +95,8 @@ class OptimResults(object):
95
95
  return output
96
96
 
97
97
 
98
- def solve_main(objfun, x0, args, xl, xu, projections, npt, rhobeg, rhoend, maxfun, nruns_so_far, nf_so_far, nx_so_far, nsamples, params,
99
- diagnostic_info, scaling_changes, r0_avg_old=None, r0_nsamples_old=None, default_growing_method_set_by_user=None,
98
+ def solve_main(objfun, x0, argsf, xl, xu, projections, npt, rhobeg, rhoend, maxfun, nruns_so_far, nf_so_far, nx_so_far, nsamples, params,
99
+ diagnostic_info, scaling_changes, h=None, lh=None, argsh=(), prox_uh=None, argsprox=None, r0_avg_old=None, r0_nsamples_old=None, default_growing_method_set_by_user=None,
100
100
  do_logging=True, print_progress=False):
101
101
  # Evaluate at x0 (keep nf, nx correct and check for f < 1e-12)
102
102
  # The hard bit is determining what m = len(r0) should be, and allocating memory appropriately
@@ -105,18 +105,17 @@ def solve_main(objfun, x0, args, xl, xu, projections, npt, rhobeg, rhoend, maxfu
105
105
  # Evaluate the first time...
106
106
  nf = nf_so_far + 1
107
107
  nx = nx_so_far + 1
108
- r0, f0 = eval_least_squares_objective(objfun, remove_scaling(x0, scaling_changes),
109
- args=args, eval_num=nf, pt_num=nx,
108
+ r0, obj0 = eval_least_squares_with_regularisation(objfun, remove_scaling(x0, scaling_changes), h,
109
+ argsf=argsf, argsh=argsh, verbose=do_logging, eval_num=nf, pt_num=nx,
110
110
  full_x_thresh=params("logging.n_to_print_whole_x_vector"),
111
- check_for_overflow=params("general.check_objfun_for_overflow"),
112
- verbose=do_logging)
111
+ check_for_overflow=params("general.check_objfun_for_overflow"))
113
112
  m = len(r0)
114
113
 
115
114
  # Now we have m, we can evaluate the rest of the times
116
115
  rvec_list = np.zeros((number_of_samples, m))
117
- f_list = np.zeros((number_of_samples,))
116
+ obj_list = np.zeros((number_of_samples,))
118
117
  rvec_list[0, :] = r0
119
- f_list[0] = f0
118
+ obj_list[0] = obj0
120
119
  num_samples_run = 1
121
120
  exit_info = None
122
121
 
@@ -128,15 +127,20 @@ def solve_main(objfun, x0, args, xl, xu, projections, npt, rhobeg, rhoend, maxfu
128
127
 
129
128
  nf += 1
130
129
  # Don't increment nx for x0 - we did this earlier
131
- rvec_list[i, :], f_list[i] = eval_least_squares_objective(objfun, remove_scaling(x0, scaling_changes), args=args, eval_num=nf, pt_num=nx,
130
+ rvec_list[i, :], obj_list[i] = eval_least_squares_with_regularisation(objfun, remove_scaling(x0, scaling_changes), h,
131
+ argsf=argsf, argsh=argsh, verbose=do_logging, eval_num=nf, pt_num=nx,
132
132
  full_x_thresh=params("logging.n_to_print_whole_x_vector"),
133
- check_for_overflow=params("general.check_objfun_for_overflow"),
134
- verbose=do_logging)
133
+ check_for_overflow=params("general.check_objfun_for_overflow"))
135
134
  num_samples_run += 1
136
135
 
137
136
  r0_avg = np.mean(rvec_list[:num_samples_run, :], axis=0)
138
- if sumsq(r0_avg) <= params("model.abs_tol"):
139
- exit_info = ExitInformation(EXIT_SUCCESS, "Objective is sufficiently small")
137
+ # NOTE: modify objvalue here
138
+ if h is None:
139
+ if sumsq(r0_avg) <= params("model.abs_tol"):
140
+ exit_info = ExitInformation(EXIT_SUCCESS, "Objective is sufficiently small")
141
+ else:
142
+ if sumsq(r0_avg) + h(remove_scaling(x0, scaling_changes), *argsh)<= params("model.abs_tol"):
143
+ exit_info = ExitInformation(EXIT_SUCCESS, "Objective is sufficiently small")
140
144
 
141
145
  if exit_info is not None:
142
146
  return x0, r0_avg, sumsq(r0_avg), None, num_samples_run, nf, nx, nruns_so_far+1, exit_info, diagnostic_info
@@ -162,8 +166,8 @@ def solve_main(objfun, x0, args, xl, xu, projections, npt, rhobeg, rhoend, maxfu
162
166
  params('growing.delta_scale_new_dirns', new_value=0.1)
163
167
 
164
168
  # Initialise controller
165
- control = Controller(objfun, args, x0, r0_avg, num_samples_run, xl, xu, projections, npt, rhobeg, rhoend, nf, nx, maxfun,
166
- params, scaling_changes, do_logging)
169
+ control = Controller(objfun, argsf, x0, r0_avg, num_samples_run, xl, xu, projections, npt, rhobeg, rhoend, nf, nx, maxfun,
170
+ params, scaling_changes, do_logging, h=h, lh=lh, argsh=argsh, prox_uh=prox_uh, argsprox=argsprox)
167
171
 
168
172
  # Initialise interpolation set
169
173
  number_of_samples = max(nsamples(control.delta, control.rho, 0, nruns_so_far), 1)
@@ -178,8 +182,8 @@ def solve_main(objfun, x0, args, xl, xu, projections, npt, rhobeg, rhoend, maxfu
178
182
  module_logger.info("Initialising (coordinate directions)")
179
183
  exit_info = control.initialise_coordinate_directions(number_of_samples, num_directions, params)
180
184
  if exit_info is not None:
181
- x, rvec, f, jacmin, nsamples = control.model.get_final_results()
182
- return x, rvec, f, None, nsamples, control.nf, control.nx, nruns_so_far + 1, exit_info, diagnostic_info
185
+ x, rvec, obj, jacmin, nsamples = control.model.get_final_results()
186
+ return x, rvec, obj, None, nsamples, control.nf, control.nx, nruns_so_far + 1, exit_info, diagnostic_info
183
187
 
184
188
  finished_growing = (control.model.npt() >= control.model.num_pts) # have we finished growing the initial set yet?
185
189
 
@@ -271,16 +275,30 @@ def solve_main(objfun, x0, args, xl, xu, projections, npt, rhobeg, rhoend, maxfu
271
275
  nruns_so_far += 1
272
276
  break # quit
273
277
 
274
-
275
- # Trust region step
276
- d, gopt, H, gnew, crvmin = control.trust_region_step(params)
278
+ tau = 1.0 # ratio used in the safety phase
279
+ if h is None:
280
+ # Trust region step
281
+ d, gopt, H, gnew, crvmin = control.trust_region_step(params)
282
+ else:
283
+ # Calculate criticality measure
284
+ criticality_measure = control.evaluate_criticality_measure(params)
285
+ # Trust region step
286
+ d, gopt, H, gnew, crvmin = control.trust_region_step(params, criticality_measure)
287
+ try:
288
+ tau = min(criticality_measure/(LA.norm(gopt)+lh), 1.0)
289
+ except ValueError:
290
+ # In some instances, gopt can have nan/inf values -- this ultimately calls a safety step and is generally fine
291
+ # but we need to set a value for tau nonetheless
292
+ tau = 1.0
293
+
277
294
  if do_logging:
278
295
  module_logger.debug("Trust region step is d = " + str(d))
296
+
279
297
  xnew = control.model.xopt() + d
280
298
  dnorm = min(LA.norm(d), control.delta)
281
299
 
282
300
  if print_progress:
283
- print("{:^5}{:^7}{:^10.2e}{:^10.2e}{:^10.2e}{:^10.2e}{:^7}".format(nruns_so_far+1, current_iter+1, control.model.fopt(), np.linalg.norm(gopt), control.delta, control.rho, control.nf))
301
+ print("{:^5}{:^7}{:^10.2e}{:^10.2e}{:^10.2e}{:^10.2e}{:^7}".format(nruns_so_far+1, current_iter+1, control.model.objopt(), np.linalg.norm(gopt), control.delta, control.rho, control.nf))
284
302
 
285
303
  if params("logging.save_diagnostic_info"):
286
304
  diagnostic_info.save_info_from_control(control, nruns_so_far, current_iter,
@@ -289,7 +307,7 @@ def solve_main(objfun, x0, args, xl, xu, projections, npt, rhobeg, rhoend, maxfu
289
307
  diagnostic_info.update_interpolation_information(interp_error, ls_interp_cond_num, linalg_resid,
290
308
  sqrt(norm_J_error), LA.norm(gopt), LA.norm(d))
291
309
 
292
- if dnorm < params("general.safety_step_thresh") * control.rho and not finished_growing and params("growing.safety.do_safety_step"):
310
+ if dnorm < tau * params("general.safety_step_thresh") * control.rho and not finished_growing and params("growing.safety.do_safety_step"):
293
311
  if do_logging:
294
312
  module_logger.debug("Safety step during growing phase")
295
313
 
@@ -415,10 +433,10 @@ def solve_main(objfun, x0, args, xl, xu, projections, npt, rhobeg, rhoend, maxfu
415
433
  if do_logging:
416
434
  module_logger.info("New rho = %g after %i function evaluations" % (control.rho, control.nf))
417
435
  if control.n() < params("logging.n_to_print_whole_x_vector"):
418
- module_logger.debug("Best so far: f = %.15g at x = " % (control.model.fopt())
436
+ module_logger.debug("Best so far: f = %.15g at x = " % (control.model.objopt())
419
437
  + str(control.model.xopt(abs_coordinates=True)))
420
438
  else:
421
- module_logger.debug("Best so far: f = %.15g at x = [...]" % (control.model.fopt()))
439
+ module_logger.debug("Best so far: f = %.15g at x = [...]" % (control.model.objopt()))
422
440
  continue # next iteration
423
441
  else:
424
442
  # Quit on rho=rhoend
@@ -439,8 +457,9 @@ def solve_main(objfun, x0, args, xl, xu, projections, npt, rhobeg, rhoend, maxfu
439
457
  else:
440
458
  # Cannot reduce rho, so check xnew and quit
441
459
  x = control.model.as_absolute_coordinates(xnew)
460
+ ##print("x from xnew", x)
442
461
  number_of_samples = max(nsamples(control.delta, control.rho, current_iter, nruns_so_far), 1)
443
- rvec_list, f_list, num_samples_run, exit_info = control.evaluate_objective(x, number_of_samples,
462
+ rvec_list, obj_list, num_samples_run, exit_info = control.evaluate_objective(x, number_of_samples,
444
463
  params)
445
464
 
446
465
  if num_samples_run > 0:
@@ -514,8 +533,9 @@ def solve_main(objfun, x0, args, xl, xu, projections, npt, rhobeg, rhoend, maxfu
514
533
 
515
534
  # Evaluate new point
516
535
  x = control.model.as_absolute_coordinates(xnew)
536
+ ##print("x from xnew again", x)
517
537
  number_of_samples = max(nsamples(control.delta, control.rho, current_iter, nruns_so_far), 1)
518
- rvec_list, f_list, num_samples_run, exit_info = control.evaluate_objective(x, number_of_samples, params)
538
+ rvec_list, obj_list, num_samples_run, exit_info = control.evaluate_objective(x, number_of_samples, params)
519
539
  if np.any(np.isnan(rvec_list)):
520
540
  # Just exit without saving the current point
521
541
  # We should be able to do a hard restart though, because it's unlikely
@@ -535,7 +555,7 @@ def solve_main(objfun, x0, args, xl, xu, projections, npt, rhobeg, rhoend, maxfu
535
555
  break # quit
536
556
 
537
557
  # Estimate f in order to compute 'actual reduction'
538
- ratio, exit_info = control.calculate_ratio(current_iter, rvec_list[:num_samples_run, :], d, gopt, H)
558
+ ratio, exit_info = control.calculate_ratio(control.model.xopt(abs_coordinates=True), current_iter, rvec_list[:num_samples_run, :], d, gopt, H)
539
559
  if exit_info is not None:
540
560
  if exit_info.able_to_do_restart() and params("restarts.use_restarts") and params(
541
561
  "restarts.use_soft_restarts"):
@@ -565,9 +585,9 @@ def solve_main(objfun, x0, args, xl, xu, projections, npt, rhobeg, rhoend, maxfu
565
585
  diagnostic_info.update_slow_iter(-1) # n/a, unless otherwise update
566
586
  if ratio < params("tr_radius.eta1"): # ratio < 0.1
567
587
  if finished_growing:
568
- control.delta = min(params("tr_radius.gamma_dec") * control.delta, dnorm)
588
+ control.delta = min(params("tr_radius.gamma_dec") * control.delta, dnorm) / tau
569
589
  else:
570
- control.delta = min(params("growing.gamma_dec") * control.delta, dnorm) # different gamma_dec
590
+ control.delta = min(params("growing.gamma_dec") * control.delta, dnorm) / tau # different gamma_dec
571
591
  if params("logging.save_diagnostic_info"):
572
592
  diagnostic_info.update_iter_type(ITER_ACCEPTABLE_NO_GEOM if ratio > 0.0
573
593
  else ITER_UNSUCCESSFUL_NO_GEOM) # we flag geom update below
@@ -651,7 +671,7 @@ def solve_main(objfun, x0, args, xl, xu, projections, npt, rhobeg, rhoend, maxfu
651
671
  break # quit
652
672
 
653
673
  # Update list of successful steps
654
- this_step_was_not_improvement = control.model.fsave is not None and control.model.fopt() > control.model.fsave
674
+ this_step_was_not_improvement = control.model.objsave is not None and control.model.objopt() > control.model.objsave
655
675
  succ_steps_not_improvement.pop() # remove last item
656
676
  succ_steps_not_improvement.insert(0, this_step_was_not_improvement) # add at beginning
657
677
  # Terminate (not restart) if all are True
@@ -828,10 +848,10 @@ def solve_main(objfun, x0, args, xl, xu, projections, npt, rhobeg, rhoend, maxfu
828
848
  if do_logging:
829
849
  module_logger.info("New rho = %g after %i function evaluations" % (control.rho, control.nf))
830
850
  if control.n() < params("logging.n_to_print_whole_x_vector"):
831
- module_logger.debug("Best so far: f = %.15g at x = " % (control.model.fopt())
851
+ module_logger.debug("Best so far: f = %.15g at x = " % (control.model.objopt())
832
852
  + str(control.model.xopt(abs_coordinates=True)))
833
853
  else:
834
- module_logger.debug("Best so far: f = %.15g at x = [...]" % (control.model.fopt()))
854
+ module_logger.debug("Best so far: f = %.15g at x = [...]" % (control.model.objopt()))
835
855
  continue # next iteration
836
856
  else:
837
857
  # Quit on rho=rhoend
@@ -857,14 +877,14 @@ def solve_main(objfun, x0, args, xl, xu, projections, npt, rhobeg, rhoend, maxfu
857
877
  # (end main loop)
858
878
 
859
879
  # Quit & return the important information
860
- x, rvec, f, jacmin, nsamples = control.model.get_final_results()
880
+ x, rvec, obj, jacmin, nsamples = control.model.get_final_results()
861
881
  if do_logging:
862
882
  module_logger.debug("At return from DFO-LS, number of function evals = %i" % nf)
863
- module_logger.debug("Smallest objective value = %.15g at x = " % f + str(x))
864
- return x, rvec, f, jacmin, nsamples, control.nf, control.nx, nruns_so_far, exit_info, diagnostic_info
883
+ module_logger.debug("Smallest objective value = %.15g at x = " % obj + str(x))
884
+ return x, rvec, obj, jacmin, nsamples, control.nf, control.nx, nruns_so_far, exit_info, diagnostic_info
865
885
 
866
886
 
867
- def solve(objfun, x0, args=(), bounds=None, projections=[], npt=None, rhobeg=None, rhoend=1e-8, maxfun=None, nsamples=None, user_params=None,
887
+ def solve(objfun, x0, h=None, lh=None, prox_uh=None, argsf=(), argsh=(), argsprox=(), bounds=None, projections=[], npt=None, rhobeg=None, rhoend=1e-8, maxfun=None, nsamples=None, user_params=None,
868
888
  objfun_has_noise=False, scaling_within_bounds=False, do_logging=True, print_progress=False):
869
889
  x0 = x0.astype(float)
870
890
  n = len(x0)
@@ -934,13 +954,21 @@ def solve(objfun, x0, args=(), bounds=None, projections=[], npt=None, rhobeg=Non
934
954
 
935
955
  exit_info = None
936
956
  # Input & parameter checks
957
+ if exit_info is None and h is not None:
958
+ if prox_uh is None:
959
+ exit_info = ExitInformation(EXIT_INPUT_ERROR, "Must provide prox_uh input if h is not None")
960
+ elif lh is None:
961
+ exit_info = ExitInformation(EXIT_INPUT_ERROR, "Must provide lh input if h is not None")
962
+ elif lh <= 0.0:
963
+ exit_info = ExitInformation(EXIT_INPUT_ERROR, "lh must be strictly positive")
964
+
937
965
  if exit_info is None and npt < n + 1:
938
966
  exit_info = ExitInformation(EXIT_INPUT_ERROR, "npt must be >= n+1 for linear models with inexact interpolation")
939
967
 
940
- if exit_info is None and rhobeg < 0.0:
968
+ if exit_info is None and rhobeg <= 0.0:
941
969
  exit_info = ExitInformation(EXIT_INPUT_ERROR, "rhobeg must be strictly positive")
942
970
 
943
- if exit_info is None and rhoend < 0.0:
971
+ if exit_info is None and rhoend <= 0.0:
944
972
  exit_info = ExitInformation(EXIT_INPUT_ERROR, "rhoend must be strictly positive")
945
973
 
946
974
  if exit_info is None and rhobeg <= rhoend:
@@ -1013,12 +1041,12 @@ def solve(objfun, x0, args=(), bounds=None, projections=[], npt=None, rhobeg=Non
1013
1041
  x0 = xp.copy()
1014
1042
 
1015
1043
  # Enforce lower & upper bounds on x0
1016
- idx = (x0 <= xl)
1044
+ idx = (x0 < xl)
1017
1045
  if np.any(idx):
1018
1046
  warnings.warn("x0 below lower bound, adjusting", RuntimeWarning)
1019
1047
  x0[idx] = xl[idx]
1020
1048
 
1021
- idx = (x0 >= xu)
1049
+ idx = (x0 > xu)
1022
1050
  if np.any(idx):
1023
1051
  warnings.warn("x0 above upper bound, adjusting", RuntimeWarning)
1024
1052
  x0[idx] = xu[idx]
@@ -1028,9 +1056,9 @@ def solve(objfun, x0, args=(), bounds=None, projections=[], npt=None, rhobeg=Non
1028
1056
  nruns = 0
1029
1057
  nf = 0
1030
1058
  nx = 0
1031
- xmin, rmin, fmin, jacmin, nsamples_min, nf, nx, nruns, exit_info, diagnostic_info = \
1032
- solve_main(objfun, x0, args, xl, xu, projections, npt, rhobeg, rhoend, maxfun, nruns, nf, nx, nsamples, params,
1033
- diagnostic_info, scaling_changes, default_growing_method_set_by_user=default_growing_method_set_by_user,
1059
+ xmin, rmin, objmin, jacmin, nsamples_min, nf, nx, nruns, exit_info, diagnostic_info = \
1060
+ solve_main(objfun, x0, argsf, xl, xu, projections, npt, rhobeg, rhoend, maxfun, nruns, nf, nx, nsamples, params,
1061
+ diagnostic_info, scaling_changes, h, lh, argsh, prox_uh, argsprox, default_growing_method_set_by_user=default_growing_method_set_by_user,
1034
1062
  do_logging=do_logging, print_progress=print_progress)
1035
1063
 
1036
1064
  # Hard restarts loop
@@ -1045,27 +1073,27 @@ def solve(objfun, x0, args=(), bounds=None, projections=[], npt=None, rhobeg=Non
1045
1073
 
1046
1074
  if do_logging:
1047
1075
  module_logger.info("Restarting from finish point (f = %g) after %g function evals; using rhobeg = %g and rhoend = %g"
1048
- % (fmin, nf, rhobeg, rhoend))
1076
+ % (objmin, nf, rhobeg, rhoend))
1049
1077
  if params("restarts.hard.use_old_rk"):
1050
- xmin2, rmin2, fmin2, jacmin2, nsamples2, nf, nx, nruns, exit_info, diagnostic_info = \
1051
- solve_main(objfun, xmin, args, xl, xu, projections, npt, rhobeg, rhoend, maxfun, nruns, nf, nx, nsamples, params,
1052
- diagnostic_info, scaling_changes, r0_avg_old=rmin, r0_nsamples_old=nsamples_min,
1078
+ xmin2, rmin2, objmin2, jacmin2, nsamples2, nf, nx, nruns, exit_info, diagnostic_info = \
1079
+ solve_main(objfun, xmin, argsf, xl, xu, projections, npt, rhobeg, rhoend, maxfun, nruns, nf, nx, nsamples, params,
1080
+ diagnostic_info, scaling_changes, h, lh, argsh, prox_uh, argsprox, r0_avg_old=rmin, r0_nsamples_old=nsamples_min,
1053
1081
  do_logging=do_logging, print_progress=print_progress)
1054
1082
  else:
1055
- xmin2, rmin2, fmin2, jacmin2, nsamples2, nf, nx, nruns, exit_info, diagnostic_info = \
1056
- solve_main(objfun, xmin, args, xl, xu, projections, npt, rhobeg, rhoend, maxfun, nruns, nf, nx, nsamples, params,
1057
- diagnostic_info, scaling_changes, do_logging=do_logging, print_progress=print_progress)
1083
+ xmin2, rmin2, objmin2, jacmin2, nsamples2, nf, nx, nruns, exit_info, diagnostic_info = \
1084
+ solve_main(objfun, xmin, argsf, xl, xu, projections, npt, rhobeg, rhoend, maxfun, nruns, nf, nx, nsamples, params,
1085
+ diagnostic_info, scaling_changes, h, lh, argsh, prox_uh, argsprox, do_logging=do_logging, print_progress=print_progress)
1058
1086
 
1059
- if fmin2 < fmin or np.isnan(fmin):
1087
+ if objmin2 < objmin or np.isnan(objmin):
1060
1088
  if do_logging:
1061
- module_logger.info("Successful run with new f = %s compared to old f = %s" % (fmin2, fmin))
1089
+ module_logger.info("Successful run with new f = %s compared to old f = %s" % (objmin2, objmin))
1062
1090
  last_successful_run = nruns
1063
- (xmin, rmin, fmin, nsamples_min) = (xmin2, rmin2, fmin2, nsamples2)
1091
+ (xmin, rmin, objmin, nsamples_min) = (xmin2, rmin2, objmin2, nsamples2)
1064
1092
  if jacmin2 is not None: # may be None if finished during setup phase, in which case just use old Jacobian
1065
1093
  jacmin = jacmin2
1066
1094
  else:
1067
1095
  if do_logging:
1068
- module_logger.info("Unsuccessful run with new f = %s compared to old f = %s" % (fmin2, fmin))
1096
+ module_logger.info("Unsuccessful run with new f = %s compared to old f = %s" % (objmin2, objmin))
1069
1097
 
1070
1098
  if nruns - last_successful_run >= params("restarts.max_unsuccessful_restarts"):
1071
1099
  exit_info = ExitInformation(EXIT_SUCCESS, "Reached maximum number of unsuccessful restarts")
@@ -1077,7 +1105,7 @@ def solve(objfun, x0, args=(), bounds=None, projections=[], npt=None, rhobeg=Non
1077
1105
  if scaling_changes is not None and jacmin is not None:
1078
1106
  for i in range(n):
1079
1107
  jacmin[:, i] = jacmin[:, i] / scaling_changes[1][i]
1080
- results = OptimResults(remove_scaling(xmin, scaling_changes), rmin, fmin, jacmin, nf, nx, nruns, exit_flag, exit_msg)
1108
+ results = OptimResults(remove_scaling(xmin, scaling_changes), rmin, objmin, jacmin, nf, nx, nruns, exit_flag, exit_msg)
1081
1109
  if params("logging.save_diagnostic_info"):
1082
1110
  df = diagnostic_info.to_dataframe(with_xk=params("logging.save_xk"), with_rk=params("logging.save_rk"))
1083
1111
  results.diagnostic_info = df
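The new `solve` signature above requires `lh` and `prox_uh` whenever `h` is given. As an illustration of what those three inputs might look like, here is a sketch for the regulariser h(x) = lam * ||x||_1. The choice of regulariser and the names `lam` and `shrunk` are assumptions. The calling conventions `h(x, *argsh)` and `prox_uh(x, u, *argsprox)` are inferred from how this diff calls them.

```python
import numpy as np

def h(x, lam):
    # l1 regulariser: lam * ||x||_1
    return lam * np.sum(np.abs(x))

def prox_uh(x, u, lam):
    # Proximal operator of u*h, which for the l1 norm is
    # soft-thresholding with threshold lam*u.
    return np.sign(x) * np.maximum(np.abs(x) - lam * u, 0.0)

lam = 1.0
n = 3
# lam * ||x||_1 is Lipschitz in the Euclidean norm with constant lam * sqrt(n)
lh = lam * np.sqrt(n)

x = np.array([1.0, -0.2, 0.05])
shrunk = prox_uh(x, 0.1, lam)  # entries smaller than 0.1 in magnitude become 0

# These would then be passed along the lines of (not executed here):
# solve(objfun, x0, h=h, lh=lh, prox_uh=prox_uh, argsh=(lam,), argsprox=(lam,))
```

Note that `lh` must be strictly positive to pass the input checks added in this release.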
dfols/trust_region.py CHANGED
@@ -29,14 +29,14 @@ solves
29
29
  s.t. lower <= x <= upper
30
30
  ||x-xbase|| <= Delta
31
31
  With this value, the variable d=x-xbase solves the problem
32
- min_s abs(c + g' * d)
32
+ min_d abs(c + g' * d)
33
33
  s.t. lower <= xbase + d <= upper
34
34
  ||d|| <= delta
35
35
  Again, we have a version of this for handling arbitrary constraints
36
36
  The call
37
37
  x = ctrsbox_geometry(xbase, c, g, projections, Delta)
38
38
  Solves
39
- min_s abs(c + g' * d)
39
+ min_d abs(c + g' * d)
40
40
  s.t. xbase + d is feasible w.r.t. the constraint set C
41
41
  ||d|| <= delta
42
42
 
@@ -70,7 +70,7 @@ alternative licensing.
70
70
  # Ensure compatibility with Python 2
71
71
  from __future__ import absolute_import, division, print_function, unicode_literals
72
72
 
73
- from math import sqrt
73
+ from math import sqrt, ceil
74
74
  import numpy as np
75
75
  try:
76
76
  import trustregion
@@ -79,13 +79,93 @@ except ImportError:
79
79
  # Fall back to Python implementation
80
80
  USE_FORTRAN = False
81
81
 
82
- from .util import dykstra, pball, pbox, sumsq, model_value
82
+ from .util import dykstra, pball, pbox, sumsq, model_value, remove_scaling
83
83
 
84
- __all__ = ['ctrsbox', 'ctrsbox_geometry', 'trsbox', 'trsbox_geometry']
84
+ __all__ = ['ctrsbox_sfista', 'ctrsbox_pgd', 'ctrsbox_geometry', 'trsbox', 'trsbox_geometry']
85
85
 
86
86
  ZERO_THRESH = 1e-14
87
87
 
88
- def ctrsbox(xopt, g, H, projections, delta, d_max_iters=100, d_tol=1e-10, use_fortran=USE_FORTRAN):
88
+ def ctrsbox_sfista(xopt, g, H, projections, delta, h, L_h, prox_uh, argsh=(), argsprox=(), func_tol=1e-3, max_iters=500, d_max_iters=100, d_tol=1e-10, use_fortran=USE_FORTRAN, scaling_changes=None, sfista_iters_scale=1.0):
89
+ n = xopt.size
90
+ assert xopt.shape == (n,), "xopt has wrong shape (should be vector)"
91
+ assert g.shape == (n,), "g and xopt have incompatible sizes"
92
+ assert len(H.shape) == 2, "H must be a matrix"
93
+ assert H.shape == (n,n), "H and xopt have incompatible sizes"
94
+ assert np.allclose(H, H.T), "H must be symmetric"
95
+ assert delta > 0.0, "delta must be strictly positive"
96
+
97
+ # Initialization
98
+ d = np.zeros(n) # start with zero vector
99
+ y = np.zeros(n)
100
+ t = 1
101
+ k_H = np.linalg.norm(H, 2)
102
+ crvmin = -1.0
103
+
104
+ # Number of iterations & smoothing parameter, from Theorem 10.57 in
105
+ # [A. Beck. First-order methods in optimization, SIAM, 2017]
106
+ # We do not use the values of k and mu given in the theorem statement, but rather the intermediate
107
+ # results on p313 (K1 for number of iterations, and the immediate next line for mu)
108
+ # Note: in the book's notation, Gamma=delta^2, alpha=1, beta=L_h^2/2, Lf=k_H [alpha and beta from Thm 10.51]
109
+ try:
110
+ MAX_LOOP_ITERS = ceil(sfista_iters_scale * delta * (L_h+sqrt(L_h*L_h+2*k_H*func_tol)) / func_tol)
111
+ MAX_LOOP_ITERS = min(MAX_LOOP_ITERS, max_iters)
112
+ except ValueError:
113
+ MAX_LOOP_ITERS = max_iters
114
+ u = 2 * delta / (MAX_LOOP_ITERS * L_h) # smoothing parameter
115
+ # u = 2 * func_tol / (L_h ** 2 + L_h * sqrt(L_h ** 2 + 2 * k_H * func_tol)) # the above choice works better in practice
116
+
117
+ def gradient_Fu(xopt, g, H, u, prox_uh, d):
118
+ # Calculate gradient_Fu,
119
+ # where Fu(d) := g(d) + h_u(d) and h_u(d) is a 1/u-smooth approximation of h.
120
+ # We assume that h is globally Lipschitz continous with constant L_h,
121
+ # then we can let h_u(d) be the Moreau Envelope M_h_u(d) of h.
122
+ return g + H @ d + (xopt + d - prox_uh(remove_scaling(xopt + d, scaling_changes), u, *argsprox)) / u
123
+
124
+ # Lipschitz constant of gradient_Fu
125
+ l = k_H + 1 / u
126
+
127
+ # trust region is a ball of radius delta around xopt
128
+ trproj = lambda w: pball(w, xopt, delta)
129
+
130
+ # combine trust region constraints with user-entered constraints
131
+ P = list(projections) # make a copy of the projections list
132
+ P.append(trproj)
133
+ def proj(d0):
134
+ p = dykstra(P, xopt+d0, max_iter=d_max_iters, tol=d_tol)
135
+ # we want the step only, so we subtract xopt
136
+ # from the new point: proj(xk+d) - xk
137
+ return p - xopt
138
+
139
+ # general step
140
+ model_value_best = model_value(g, H, d, xopt, h, argsh, scaling_changes)
141
+ d_best = d.copy()
142
+ for k in range(MAX_LOOP_ITERS):
143
+ prev_d = d.copy()
144
+ prev_t = t
145
+ # gradient_Fu at y
146
+ g_Fu = gradient_Fu(xopt, g, H, u, prox_uh, d, *argsprox)
147
+
148
+ # main update step
149
+ d = proj(y - g_Fu / l)
150
+ new_model_value = model_value(g, H, d, xopt, h, argsh, scaling_changes)
151
+ if new_model_value < model_value_best:
152
+ d_best = d.copy()
153
+ model_value_best = new_model_value
154
+
155
+ # update true gradient
156
+ # gnew is the gradient of the smoothed function
157
+ gnew = gradient_Fu(xopt, g, H, u, prox_uh, d, *argsprox)
158
+
159
+ # update CRVMIN
160
+ crv = d.dot(H).dot(d)/sumsq(d) if sumsq(d) >= ZERO_THRESH else crvmin
161
+ crvmin = min(crvmin, crv) if crvmin != -1.0 else crv
162
+
163
+ # momentum update
164
+ t = (1 + sqrt(1 + 4*t*t)) / 2
165
+ y = d + (prev_t - 1) * (d - prev_d) / t
166
+ return d, gnew, crvmin
167
+
168
+ def ctrsbox_pgd(xopt, g, H, projections, delta, d_max_iters=100, d_tol=1e-10, use_fortran=USE_FORTRAN):
89
169
  n = xopt.size
90
170
  assert xopt.shape == (n,), "xopt has wrong shape (should be vector)"
91
171
  assert g.shape == (n,), "g and xopt have incompatible sizes"
@@ -151,7 +231,6 @@ def ctrsbox(xopt, g, H, projections, delta, d_max_iters=100, d_tol=1e-10, use_fo
151
231
 
152
232
  return d, gnew, crvmin
153
233
 
154
-
155
234
  def trsbox(xopt, g, H, sl, su, delta, use_fortran=USE_FORTRAN):
156
235
  if use_fortran:
157
236
  return trustregion.solve(g, H, delta,
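Note on the trust_region.py hunk above: the comments in the new `ctrsbox_sfista` describe replacing the nonsmooth term h with its Moreau envelope h_u, whose gradient is (x - prox_{u h}(x)) / u. The following is a minimal standalone sketch of that identity, not the package's code. It assumes h(x) = lam * ||x||_1 as an illustrative regulariser; `prox_l1`, `grad_moreau_envelope` and `smoothed_model_gradient` are hypothetical names, and variable scaling (`remove_scaling`) is ignored here.

```python
import numpy as np

def prox_l1(x, u, lam=1.0):
    # Hypothetical example regulariser: h(x) = lam * ||x||_1.
    # Its proximal operator with step u is soft-thresholding at lam * u.
    return np.sign(x) * np.maximum(np.abs(x) - lam * u, 0.0)

def grad_moreau_envelope(x, u, prox, *args):
    # Gradient of the Moreau envelope of h with parameter u:
    # (x - prox_{u h}(x)) / u
    return (x - prox(x, u, *args)) / u

def smoothed_model_gradient(g, H, xopt, d, u, prox, *args):
    # Gradient of the smoothed model g'd + 0.5 d'Hd + h_u(xopt + d)
    # with respect to the step d (no variable scaling in this sketch).
    return g + H @ d + grad_moreau_envelope(xopt + d, u, prox, *args)

# Small usage example with made-up numbers
x = np.array([0.3, -2.0, 0.05])
grad_h = grad_moreau_envelope(x, 0.1, prox_l1, 1.0)

g = np.array([1.0, 0.0, -1.0])
H = np.eye(3)
grad_F = smoothed_model_gradient(g, H, np.zeros(3), x, 0.1, prox_l1, 1.0)
```

For the l1 example, the envelope gradient is x/u clipped to [-lam, lam], which is why a smaller u tracks h more closely but increases the Lipschitz constant `k_H + 1/u` used as the step-size denominator in the hunk.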
dfols/util.py CHANGED
@@ -31,7 +31,7 @@ import scipy.linalg as LA
31
31
  import sys
32
32
 
33
33
 
34
- __all__ = ['sumsq', 'eval_least_squares_objective', 'model_value', 'random_orthog_directions_within_bounds',
34
+ __all__ = ['sumsq', 'eval_least_squares_with_regularisation', 'model_value', 'random_orthog_directions_within_bounds',
35
35
  'random_directions_within_bounds', 'apply_scaling', 'remove_scaling', 'pbox', 'pball', 'dykstra', 'qr_rank']
36
36
 
37
37
  module_logger = logging.getLogger(__name__)
@@ -47,9 +47,9 @@ def sumsq(x):
47
47
  return np.dot(x, x)
48
48
 
49
49
 
50
- def eval_least_squares_objective(objfun, x, args=(), verbose=True, eval_num=0, pt_num=0, full_x_thresh=6, check_for_overflow=True):
50
+ def eval_least_squares_with_regularisation(objfun, x, h=None, argsf=(), argsh=(), verbose=True, eval_num=0, pt_num=0, full_x_thresh=6, check_for_overflow=True):
51
51
  # Evaluate least squares function
52
- fvec = objfun(x, *args)
52
+ fvec = objfun(x, *argsf)
53
53
 
54
54
  if check_for_overflow:
55
55
  try:
@@ -62,20 +62,31 @@ def eval_least_squares_objective(objfun, x, args=(), verbose=True, eval_num=0, p
62
62
  else:
63
63
  f = sumsq(fvec)
64
64
 
65
+ # objective = least-squares + regularisation
66
+ obj = f
67
+ if h is not None:
68
+ # Evaluate regularisation term
69
+ hvalue = h(x, *argsh)
70
+ obj = f + hvalue
71
+
65
72
  if verbose:
66
73
  if len(x) < full_x_thresh:
67
- module_logger.info("Function eval %i at point %i has f = %.15g at x = " % (eval_num, pt_num, f) + str(x))
74
+ module_logger.info("Function eval %i at point %i has obj = %.15g at x = " % (eval_num, pt_num, obj) + str(x))
68
75
  else:
69
- module_logger.info("Function eval %i at point %i has f = %.15g at x = [...]" % (eval_num, pt_num, f))
76
+ module_logger.info("Function eval %i at point %i has obj = %.15g at x = [...]" % (eval_num, pt_num, obj))
70
77
 
71
- return fvec, f
78
+ return fvec, obj
72
79
 
73
80
 
74
- def model_value(g, H, s):
75
- # Calculate model value (s^T * g + 0.5* s^T * H * s) = s^T * (gopt + 0.5 * H*s)
81
+ def model_value(g, H, s, xopt=(), h=None,argsh=(), scaling_changes=None):
82
+ # Calculate model value (s^T * g + 0.5* s^T * H * s) + h(xopt + s) = s^T * (gopt + 0.5 * H*s) + h(xopt + s)
76
83
  assert g.shape == s.shape, "g and s have incompatible sizes"
77
84
  Hs = H.dot(s)
78
- return np.dot(s, g + 0.5*Hs)
85
+ rtn = np.dot(s, g + 0.5*Hs)
86
+ if h is not None:
87
+ hvalue = h(remove_scaling(xopt+s, scaling_changes), *argsh)
88
+ rtn += hvalue
89
+ return rtn
79
90
 
80
91
 
81
92
  def get_scale(dirn, delta, lower, upper):
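Note on the util.py hunks above: both `eval_least_squares_with_regularisation` and `model_value` now add an optional regularisation term h to the least-squares quantity. The sketch below restates that arithmetic in isolation, assuming no variable scaling and omitting the overflow checks and logging; the function names `regularised_objective` and `regularised_model_value`, and the l1 regulariser, are illustrative and not part of the package.

```python
import numpy as np

def regularised_objective(objfun, x, h=None, argsf=(), argsh=()):
    # obj = ||r(x)||^2 + h(x), where r is the residual vector
    fvec = objfun(x, *argsf)
    obj = float(np.dot(fvec, fvec))
    if h is not None:
        obj += h(x, *argsh)
    return fvec, obj

def regularised_model_value(g, H, s, xopt, h=None, argsh=()):
    # s^T g + 0.5 s^T H s + h(xopt + s)
    val = float(np.dot(s, g + 0.5 * H.dot(s)))
    if h is not None:
        val += h(xopt + s, *argsh)
    return val

# Illustrative residuals and regulariser (assumptions, not from the diff)
residuals = lambda x: np.array([x[0] - 1.0, x[1] + 2.0])
l1 = lambda x, lam: lam * np.sum(np.abs(x))

fvec, obj = regularised_objective(residuals, np.array([1.0, -1.0]), h=l1, argsh=(0.5,))
mval = regularised_model_value(np.array([1.0, 0.0]), np.eye(2),
                               np.array([1.0, 0.0]), np.zeros(2), h=l1, argsh=(0.5,))
```

With `h=None` both helpers reduce to the plain least-squares value and the quadratic model value, which matches the behaviour of the removed 1.4.1 versions shown in the hunk.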
@@ -1,14 +0,0 @@
1
- dfols/__init__.py,sha256=D-x5glfZFfJ8-bdjA-4k4JFTDu1Eylaz3EL4GSH28eI,1605
2
- dfols/controller.py,sha256=LSeHZoKaKUEYgB1_2subjKskHJ8mWccMbn-LOpxJ7LM,42769
3
- dfols/diagnostic_info.py,sha256=2kEUkL-MS4eDENUf1r2hOWsntP8OxMDKi_kyHmrC9V4,6081
4
- dfols/hessian.py,sha256=sExx4J4KoGwHItbthX2odosB2ONbQFvLdlcod7PIh4k,4262
5
- dfols/model.py,sha256=q70zuqocNtsaXzNjWHcTdrS209BdQt4uY0GNtp0qlI8,18809
6
- dfols/params.py,sha256=_Va1ybnQDIzWaXvImcSeH8xnNE_A2zpAfBgDG74sc5c,17557
7
- dfols/solver.py,sha256=IKg3xWPLYlOW_zuTc_-HY_3ZvdDEfkyxARerERUQHlU,61264
8
- dfols/trust_region.py,sha256=hRKQx0fpSxol7dLZO0yrT7O5IDptPPSnDvxKQNZ3r0M,24603
9
- dfols/util.py,sha256=ysdIHTkrkWwCRKuGffofehKl-t5dT3sD9dfy0muI4ZI,9852
10
- DFO_LS-1.4.1.dist-info/LICENSE.txt,sha256=jOtLnuWt7d5Hsx6XXB2QxzrSe2sWWh3NgMfFRetluQM,35147
11
- DFO_LS-1.4.1.dist-info/METADATA,sha256=RR6KhJi4Ae_1PES8Bpzqm3AYK2w12V-2MyDyjaCDe80,8552
12
- DFO_LS-1.4.1.dist-info/WHEEL,sha256=GJ7t_kWBFywbagK5eo9IoUwLW6oyOeTKmQ-9iHFVNxQ,92
13
- DFO_LS-1.4.1.dist-info/top_level.txt,sha256=UfxRhaDN8HQx2_l17KbrDrERJ90OCN7VKkDMpYYbRLU,6
14
- DFO_LS-1.4.1.dist-info/RECORD,,