I was curious how well the simple momentum approach shown in the first interactive example compares to alternative methods. The function used in that example is named bananaf (a Rosenbrock-style "banana" function), defined as<p><pre><code> var s = 3
var x = xy[0]; var y = xy[1]*s
var fx = (1-x)*(1-x) + 20*(y - x*x )*(y - x*x )
var dfx = [-2*(1-x) - 80*x*(-x*x + y), s*40*(-x*x + y)]
</code></pre>
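For anyone who wants to experiment outside the browser, here is a rough Python port of the function together with a momentum loop. It is only a sketch under stated assumptions: the update rule is the standard heavy-ball form (z = beta*z + grad, w = w - alpha*z), which may not match the article's code exactly, and the names S, bananaf_grad and momentum_descent, as well as the alpha and beta values, are made up for illustration rather than taken from the interactive example.

```python
import numpy as np

S = 3.0  # scale applied to the second coordinate, as in the JavaScript snippet


def bananaf(xy):
    """Function value, ported from the JavaScript definition."""
    x, y = xy[0], xy[1] * S
    return (1 - x) ** 2 + 20 * (y - x * x) ** 2


def bananaf_grad(xy):
    """Gradient with respect to the unscaled inputs xy[0] and xy[1]."""
    x, y = xy[0], xy[1] * S
    return np.array([
        -2 * (1 - x) - 80 * x * (y - x * x),
        S * 40 * (y - x * x),
    ])


def momentum_descent(grad, w0, alpha, beta, iters=150):
    """Fixed-step heavy-ball momentum (assumed form), no convergence test."""
    w = np.asarray(w0, dtype=float)
    z = np.zeros_like(w)  # accumulated "velocity"
    for _ in range(iters):
        z = beta * z + grad(w)
        w = w - alpha * z
    return w


# alpha and beta are illustrative guesses, not tuned values
w = momentum_descent(bananaf_grad, [-1.21, 0.853], alpha=0.001, beta=0.9)
print(w, bananaf(w))
```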
The interactive example uses an initial guess of [-1.21, 0.853] and a fixed 150 iterations, with no convergence test.<p>From manually fiddling with the alpha (step size) and beta (momentum) parameters, and editing the code to use fewer iterations, it seems quite difficult to tune this momentum-based approach so that it gets near the minimum and stays there, without bouncing away, in 50 iterations or fewer.<p>Out of curiosity, I compared this with minimising the same bananaf function using scipy.optimize.minimize, starting from the same initial guess.<p>If we force scipy.optimize.minimize to use method='cg', leaving all other parameters at their defaults, it converges to the optimal solution of [1.0, 1./3.], requiring 43 evaluations of fx and dfx.<p>If we let scipy.optimize.minimize use all defaults, including the default method='bfgs', it converges to the optimal solution after only 34 evaluations of fx and dfx.<p>Under the hood, scipy's method='cg' and method='bfgs' solvers do not use a fixed step size or momentum. Instead, at each iteration they solve a line search problem: find a step size along the current search direction that satisfies a sufficient decrease condition and a curvature condition (see the Wolfe conditions [1]). Scipy's default line search method, used for both cg and bfgs, is a Python port [2] of the dcsrch routine from MINPACK2. A good reference covering line search methods and BFGS is Nocedal and Wright's 2006 book Numerical Optimization.<p>[1] <a href="https://en.wikipedia.org/wiki/Wolfe_conditions" rel="nofollow">https://en.wikipedia.org/wiki/Wolfe_conditions</a>
[2] <a href="https://github.com/scipy/scipy/blob/main/scipy/optimize/_dcsrch.py">https://github.com/scipy/scipy/blob/main/scipy/optimize/_dcs...</a>
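The scipy comparison described above can be reproduced roughly as follows. This is a minimal sketch rather than the exact script behind the numbers quoted: it assumes the analytic gradient is passed through the jac argument, and S, bananaf and bananaf_grad are illustrative Python ports of the JavaScript snippet. Evaluation counts may differ between scipy versions, so none are hard-coded here.

```python
import numpy as np
from scipy.optimize import minimize

S = 3.0  # scale applied to the second coordinate, as in the JavaScript snippet


def bananaf(xy):
    """Function value, ported from the JavaScript definition."""
    x, y = xy[0], xy[1] * S
    return (1 - x) ** 2 + 20 * (y - x * x) ** 2


def bananaf_grad(xy):
    """Gradient with respect to the unscaled inputs xy[0] and xy[1]."""
    x, y = xy[0], xy[1] * S
    return np.array([
        -2 * (1 - x) - 80 * x * (y - x * x),
        S * 40 * (y - x * x),
    ])


x0 = np.array([-1.21, 0.853])  # same initial guess as the interactive example

# Conjugate gradient, all other options left at their defaults
res_cg = minimize(bananaf, x0, jac=bananaf_grad, method="CG")

# No method given: scipy picks BFGS for an unconstrained, unbounded problem
res_bfgs = minimize(bananaf, x0, jac=bananaf_grad)

for name, res in [("CG", res_cg), ("BFGS", res_bfgs)]:
    # nfev / njev are the function and gradient evaluation counts
    print(name, res.x, "nfev:", res.nfev, "njev:", res.njev)
```

The nfev and njev fields on the result object are where the evaluation counts come from, which makes it easy to compare against the fixed 150 gradient evaluations of the momentum loop.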