
Nonlinear conjugate gradient #554

Merged
adler-j merged 13 commits into master from issue-251__nonlinear_cg on Feb 6, 2017

Conversation

@adler-j (Member) commented Aug 31, 2016

Work in progress, but I feel that we need this sooner or later.

@adler-j (Member, Author) commented Sep 25, 2016

This obviously needs PR #587 to land before proceeding.

@adler-j force-pushed the issue-251__nonlinear_cg branch from 6e54df9 to b1cb674 on December 10, 2016
@adler-j (Member, Author) commented Dec 11, 2016

This is now ready for a review. The only remaining task is to move this out of `iterative` and into `scalar`.

@adler-j (Member, Author) commented Jan 3, 2017

Bump for review

@kohr-h (Member) left a comment

Some fixes needed; the method itself and the tests look good, though.

See Also
--------
conjugate_gradient : Optimized solver for symmetric matrices
conjugate_gradient_normal : Equivalent solver but for nonlinear case
Member

normal -> nonlinear

conjugate_gradient : Optimized solver for linear and symmetric case
conjugate_gradient_normal : Equivalent solver but for linear case
"""
# TODO: add a book reference
Member

Should be OK with a Wikipedia link?

Member Author

Agree

Member

As I wrote higher up, we have a book reference for FR: GNS2009

conjugate_gradient_normal : Equivalent solver but for linear case
"""
# TODO: add a book reference
# TODO: update doc
Member

Still TODO, see above.


The method is described in a
`Wikipedia article
<https://en.wikipedia.org/wiki/Nonlinear_conjugate_gradient_method>`_.
Member

Would be nice to have a problem statement.

Member Author

Will get that done, also short word on algorithm.

Member

If you want a book reference, there is one in GNS2009 (at least for what Wikipedia calls the Fletcher-Reeves update), which we already have in our list of references.

Operator in the inverse problem. If not linear, it must have
an implementation of `Operator.derivative`, which
in turn must implement `Operator.adjoint`, i.e.
the call ``op.derivative(x).adjoint`` must be valid.
Member

update

tol : float, optional
Tolerance that should be used for terminating the iteration.
beta_method : {'FR', 'PR', 'HS', 'DY'}
Method to calculate ``beta`` in the iterates. TODO
Member

Seems like a short math section would make sense to explain this.

Member Author

Yeah, I'll add that, but it goes up in Notes, not here, since it would clog this.

Member Author

On second thought, I won't. This can easily be googled, and writing it down here would require quite a few lines.

niter : int
Number of iterations per reset.
nreset : int, optional
Number of times the solver should be reset. Default: no reset.
Member

I find this API confusing. niter or max_iter were always the total number of iterations in some sense, now the total number is (nreset + 1) * niter. Total surprise to me, and not a positive one. (I know the docstring says it, but if you're familiar with an API and pattern matching tells you it's the same here, this tiny change in text goes through totally unnoticed.)
IMO, either change the algorithm to use niter_before_reset = niter // (nreset + 1) and make sure you really run niter in total by adding the remaining ones at the end (I would do that), or rename niter to niter_before_reset or niter_inner vs. niter_outer or something that makes clear it's not the total number (I wouldn't do that because it unnecessarily changes the API).

Member Author

I'll do your first suggestion.
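The suggestion above can be sketched with a hypothetical helper (not the actual ODL code): keep `niter` as the total iteration count, split it evenly over the `nreset + 1` runs, and add the remainder to the final run so exactly `niter` iterations are executed.

```python
# Hypothetical sketch of kohr-h's suggestion: treat `niter` as the total
# number of iterations and split it over the `nreset + 1` runs, putting
# leftover iterations into the final run so exactly `niter` are executed.
def split_iterations(niter, nreset):
    """Return a list with the iteration count for each of nreset + 1 runs."""
    runs = nreset + 1
    base = niter // runs
    counts = [base] * runs
    counts[-1] += niter - base * runs  # remainder goes to the last run
    return counts
```

For example, `split_iterations(10, 2)` gives `[3, 3, 4]`, which still sums to the full 10 iterations.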

elif not callable(line_search):
line_search = ConstantLineSearch(line_search)

for rest_nr in range(nreset + 1):
Member

Iteration variable not used

x.lincomb(1, x, a, dx) # x = x + a * dx

dx_old = dx
s = dx # for 'HS' and 'DY' beta methods
Member

Not only that, you also use them in the search direction update and the line search, no?

def solver(op, x, rhs):
norm2 = op.adjoint(op(x)).norm() / x.norm()
odl.solvers.landweber(op, x, rhs, niter=10, omega=0.5 / norm2)
odl.solvers.landweber(op, x, rhs, niter=50, omega=0.5 / norm2)
Member

reason?

Member Author

It started failing due to some other change.

@aringh (Member) left a comment

Looks great :-). The only critical comment I have is on how to handle the case when we set beta = 0. It can be handled as it is now, but then we are in some sense halfway into "more advanced" resetting options, and nreset becomes only a lower bound.

Right-hand side of the equation defining the inverse problem
niter : int
Number of iterations per reset.
nreset : int, optional
Member

If we are interested, there are more advanced ways to determine when to reset. The one I know about is by Powell from 1977, but maybe there are more recent ones as well: http://link.springer.com/article/10.1007/BF01593790

raise ValueError('unknown ``beta_method``')

# Reset beta if negative.
beta = max(0, beta)
Member

If you take beta = 0, you will effectively reset the conjugate gradient method, since s just becomes the steepest descent direction. I don't know if that actually matters, but we should think through how we want to handle it.

Member Author

I basically got this from Wikipedia; I'm far from an expert. Personally, I'd feel fine with this version for now and improving it later (if anyone wants to use it).

Member

I don't consider myself an expert in the area either ;-) and it is a perfectly fine solution for me (if you look at the reference that Wikipedia uses, it claims convergence of the PR version if this is used). I just wanted to bring to attention that it actually is a restart of the method.
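The restart effect being discussed can be seen in a small hypothetical scalar sketch (not the ODL code): once `beta` is clipped to zero, the update `s = dx + beta * s` reduces to the steepest descent direction `dx`.

```python
# Hypothetical scalar sketch: with dx the NEGATIVE gradient, clipping a
# negative beta to zero turns the CG direction update into a plain
# steepest-descent step, i.e. an implicit restart of the method.
def next_direction(dx, s_old, beta):
    beta = max(0, beta)        # negative beta is discarded
    return dx + beta * s_old   # beta == 0  ->  pure steepest descent
```

For example, `next_direction(1.0, 2.0, -0.5)` returns `1.0` (the restart case), while `next_direction(1.0, 2.0, 0.5)` returns `2.0`.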

@adler-j force-pushed the issue-251__nonlinear_cg branch 2 times, most recently from 442aafe to 2e96668 on January 16, 2017
@adler-j (Member, Author) commented Jan 16, 2017

Fixed the comments and moved the code into the smooth solvers; I guess it needs a re-review.

@aringh (Member) left a comment

Minor comments, except for a potential sign error.

If there is a sign error, it is a bit weird that the tests pass. I have double-checked with the reference on Wikipedia for the DY formula, which contains all four expressions for beta, and it looks like there should indeed be a minus sign for HS and DY.

line_search : float or `LineSearch`, optional
Strategy to choose the step length. If a float is given, uses it as a
fixed step length.
maxiter : int
Member

optional. In cg and cg_normal this is called niter; is there a reason for using a different name?

Member

I guess because the algorithm can terminate early and is not guaranteed to run this number of iterations. Should be cross-checked anyway and changed in the other methods if it applies there, too. I like the distinction.

Member Author

Agree with @kohr-h, that is why. Keeping this

raise TypeError('`x` {!r} is not in the domain of `f` {!r}'
''.format(x, f.domain))

if line_search is None:
Member

The default value is set to 1, but maybe you want to catch this anyway?

Member Author

No, you are correct; it should be an isinstance check, or perhaps a try/except around a float conversion.
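A sketch of what that check could look like (assumed names; `ConstantLineSearch` is the ODL wrapper visible in the diff, stood in for here by a closure):

```python
# Hypothetical sketch of the suggested check: accept either a callable line
# search or a plain number used as a constant step length, and fail early
# with a clear error otherwise.
def normalize_line_search(line_search):
    if callable(line_search):
        return line_search
    try:
        step = float(line_search)
    except (TypeError, ValueError):
        raise TypeError('`line_search` must be callable or convertible '
                        'to float, got {!r}'.format(line_search))
    # Stand-in for ConstantLineSearch(step): ignores its arguments and
    # always returns the fixed step length.
    return lambda *args, **kwargs: step
```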

elif beta_method == 'PR':
beta = dx.inner(dx - dx_old) / dx_old.inner(dx_old)
elif beta_method == 'HS':
beta = dx.inner(dx - dx_old) / s.inner(dx - dx_old)
Member

It looks like there is a minus sign missing

Member Author

Where? What source are you using to see that?

elif beta_method == 'HS':
beta = dx.inner(dx - dx_old) / s.inner(dx - dx_old)
elif beta_method == 'DY':
beta = dx.inner(dx) / s.inner(dx - dx_old)
Member

It looks like there is a minus sign missing

@kohr-h (Member) commented Jan 18, 2017

Minor comments, except for a potential sign error.

If there is a sign error, it is a bit weird that the tests pass. I have double-checked with the reference on Wikipedia for the DY formula, which contains all four expressions for beta, and it looks like there should indeed be a minus sign for HS and DY.

Could also be a sign error in the Wikipedia reference. Looking at the formulas it seems weird that the last two have this ominous minus sign although the rest looks kind of similar to the first two formulas. The simplest thing would be to flip the sign and see what happens.

@aringh (Member) commented Jan 18, 2017

The simplest thing would be to flip the sign and see what happens.

I did, and the tests still pass. Which is a bit troublesome.

The reason the minus signs appear on Wikipedia but not in the reference, which is non-obvious at first glance, is that the Wikipedia page works with delta x, which is -grad, while the reference works with grad directly. But someone should double-check that I haven't made other sign errors when I tried to verify it :-)
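To make the convention concrete, here is a sketch of all four beta formulas in the "dx = -grad" convention, with the minus signs for HS and DY that this thread concluded were missing (plain-Python dot products, hypothetical function names; not the ODL implementation):

```python
# Sketch of the beta updates for nonlinear CG in the convention where
# dx is the NEGATIVE gradient (as on the Wikipedia page). In this
# convention HS and DY carry a leading minus sign, which disappears if
# one works with the gradient directly -- the source of the sign error.
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def beta_update(method, dx, dx_old, s):
    """beta for the next search direction s_new = dx + beta * s."""
    y = [a - b for a, b in zip(dx, dx_old)]  # dx - dx_old
    if method == 'FR':    # Fletcher-Reeves
        return dot(dx, dx) / dot(dx_old, dx_old)
    elif method == 'PR':  # Polak-Ribiere
        return dot(dx, y) / dot(dx_old, dx_old)
    elif method == 'HS':  # Hestenes-Stiefel (note the minus sign)
        return -dot(dx, y) / dot(s, y)
    elif method == 'DY':  # Dai-Yuan (note the minus sign)
        return -dot(dx, dx) / dot(s, y)
    raise ValueError('unknown method {!r}'.format(method))
```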

@adler-j (Member, Author) commented Jan 18, 2017

Holding on until the sign error is fixed, I'll have a look. Nice catch.

@adler-j changed the title from "ENH: initial commit for nonlinear conjugate gradient" to "Nonlinear conjugate gradient" on Jan 19, 2017
@adler-j (Member, Author) commented Jan 19, 2017

The sign error was indeed a sign error. Fixed it, and the number of iterations needed went down drastically.

@adler-j (Member, Author) commented Jan 19, 2017

Can I merge this?

@kohr-h (Member) left a comment

Minor stuff; after the fixes it is good for merge from my point of view.

@@ -1,4 +1,4 @@
# Copyright 2014-2016 The ODL development group
# Copyright 2014-2017 The ODL development group
Member

Oh right, we need to change that across the board (not here though).

# You should have received a copy of the GNU General Public License
# along with ODL. If not, see <http://www.gnu.org/licenses/>.

"""Nonlinear version of conjugate gradient."""
Member

the conjugate gradient method

callback=None):
"""Conjugate gradient for nonlinear problems.

Notes
Member

Any reason why this comes before Parameters?

Member Author

We do that in some cases, particularly when the description becomes too "thin" otherwise.

Member

Ok, but in the long run I find it preferable to get directly to the parameters without much ado since that's what users have to look at most frequently. The math explanation is useful once or a couple of times, but later it's just in the way to the parameter reference.
So I prefer the note section after Parameters in almost all cases.


:math:`\min f(x)`

for a differentiable function
Member

functional


The method is described in a
`Wikipedia article
<https://en.wikipedia.org/wiki/Nonlinear_conjugate_gradient_method>`_.
Member

Nice, that's exactly the amount of mathy documentation reasonable for such a docstring.

See Also
--------
bfgs_method : Quasi-newton solver for the same problem
conjugate_gradient : Optimized solver for linear and symmetric case
Member

"least-squares problem with linear and symmetric operator" maybe? Otherwise not so clear what "linear case" means, could also be "linear functional"

--------
bfgs_method : Quasi-newton solver for the same problem
conjugate_gradient : Optimized solver for linear and symmetric case
conjugate_gradient_normal : Equivalent solver but for linear case
Member

accordingly here

def conjugate_gradient_nonlinear(f, x, line_search=1.0, maxiter=1000, nreset=0,
tol=1e-16, beta_method='FR',
callback=None):
"""Conjugate gradient for nonlinear problems.
Member

method
also perhaps "general nonlinear problems" since we're also solving a nonlinear problem in the other case (the least squares problem)

used as starting point of the iteration, and its values are
updated in each iteration step.
line_search : float or `LineSearch`, optional
Strategy to choose the step length. If a float is given, uses it as a
Member

it is used


# Find optimal step along s
dir_derivative = -dx.inner(s)
if abs(dir_derivative) < tol:
Member

Strictly speaking it should be <= here, because "tolerance" is inclusive. It's also relevant if a user sets tol=0, because in some (toy) cases the update could actually be 0, and the method should terminate then.
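The inclusive check being suggested, in isolation (hypothetical helper name):

```python
# With `<=`, tol=0 still terminates when the directional derivative is
# exactly zero, a case that a strict `<` comparison would miss.
def should_terminate(dir_derivative, tol):
    return abs(dir_derivative) <= tol
```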

@adler-j (Member, Author) commented Jan 19, 2017

Ready for merge?

@kohr-h (Member) commented Jan 19, 2017

Ready for merge?

That's for @aringh to decide ;-)

@kohr-h (Member) commented Jan 19, 2017

Perhaps you can also squash the first two commits to get rid of that WIP commit.

@adler-j (Member, Author) commented Jan 19, 2017

Will do once this is accepted

@adler-j (Member, Author) commented Jan 23, 2017

Bump for @aringh's attention.

@aringh (Member) left a comment

Tiny comment. Looks good to me :-)

Number of times the solver should be reset. Default: no reset.
tol : float, optional
Tolerance that should be used for terminating the iteration.
beta_method : {'FR', 'PR', 'HS', 'DY'}
Member

optional. Mention default value?

Member

We do that only if it's not clear from the signature, i.e., when the parameter is in kwargs. Or if the default is None and we need to explain what happens in that case.

The convention for "limited-range" parameters is to name the default first. So I'd say it's fine as it is.

@adler-j (Member, Author) commented Jan 25, 2017

Well, @aringh had this innocent comment about a missing "optional", so I made a script to find issues like this, and needless to say I found something like 100+ errors.

Fixed them all, so this is going to need a further review.

@adler-j force-pushed the issue-251__nonlinear_cg branch 2 times, most recently from e56c7f1 to 7507c52 on January 25, 2017
@aringh (Member) commented Jan 27, 2017

I had a quick look, and it looks good to me :-)

@adler-j force-pushed the issue-251__nonlinear_cg branch from accbad0 to 463a1d3 on February 6, 2017
@adler-j (Member, Author) commented Feb 6, 2017

Merge after CI.

@adler-j merged commit bd9e4c5 into master on Feb 6, 2017
@adler-j deleted the issue-251__nonlinear_cg branch on February 6, 2017