Conversation
This obviously needs PR #587 before proceeding.
6e54df9 to b1cb674
This is now ready for a review. The only remaining thing to do is that this should be moved out of
Bump for review
kohr-h
left a comment
Some fixes needed; the method itself and the tests look good though.
odl/solvers/iterative/iterative.py
Outdated
| See Also | ||
| -------- | ||
| conjugate_gradient : Optimized solver for symmetric matrices | ||
| conjugate_gradient_normal : Equivalent solver but for nonlinear case |
odl/solvers/iterative/iterative.py
Outdated
| conjugate_gradient : Optimized solver for linear and symmetric case | ||
| conjugate_gradient_normal : Equivalent solver but for linear case | ||
| """ | ||
| # TODO: add a book reference |
Would it be OK with a Wikipedia link?
As I wrote higher up, we have a book reference for FR: GNS2009.
odl/solvers/iterative/iterative.py
Outdated
| conjugate_gradient_normal : Equivalent solver but for linear case | ||
| """ | ||
| # TODO: add a book reference | ||
| # TODO: update doc |
odl/solvers/iterative/iterative.py
Outdated
| The method is described in a | ||
| `Wikipedia article | ||
| <https://en.wikipedia.org/wiki/Nonlinear_conjugate_gradient_method>`_. |
It would be nice to have a problem statement.
Will get that done, along with a short word on the algorithm.
If you want a book reference, there is one in GNS2009 (at least for what Wikipedia calls the Fletcher-Reeves update), which we already have in our list of references.
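For readers landing on this thread without context, the method under review can be sketched in a few lines. This is a hypothetical, minimal Fletcher-Reeves variant on a toy quadratic with exact line search, for orientation only; it is not ODL's implementation, and all names here are made up:

```python
# Toy Fletcher-Reeves nonlinear CG for f(x) = 0.5 x^T A x - b^T x.
# NOT ODL's code -- a self-contained sketch with invented names.

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def grad(x, A, b):
    # gradient of f: A x - b
    return [dot(row, x) - bi for row, bi in zip(A, b)]

def nonlinear_cg_fr(x, A, b, niter=10):
    g = grad(x, A, b)
    s = [-gi for gi in g]                  # start with steepest descent
    for _ in range(niter):
        As = [dot(row, s) for row in A]
        denom = dot(s, As)
        if denom == 0.0:                   # stationary: nothing left to do
            break
        alpha = -dot(g, s) / denom         # exact line search for a quadratic
        x = [xi + alpha * si for xi, si in zip(x, s)]
        g_new = grad(x, A, b)
        beta = dot(g_new, g_new) / dot(g, g)   # Fletcher-Reeves update
        s = [-gi + beta * si for gi, si in zip(g_new, s)]
        g = g_new
    return x

A = [[2.0, 0.0], [0.0, 4.0]]
b = [2.0, 4.0]
x = nonlinear_cg_fr([0.0, 0.0], A, b)      # minimizer solves A x = b
```

With exact line search on a quadratic, this reduces to linear CG and converges in at most two steps here.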
odl/solvers/iterative/iterative.py
Outdated
| Operator in the inverse problem. If not linear, it must have | ||
| an implementation of `Operator.derivative`, which | ||
| in turn must implement `Operator.adjoint`, i.e. | ||
| the call ``op.derivative(x).adjoint`` must be valid. |
odl/solvers/iterative/iterative.py
Outdated
| tol : float, optional | ||
| Tolerance that should be used for terminating the iteration. | ||
| beta_method : {'FR', 'PR', 'HS', 'DY'} | ||
| Method to calculate ``beta`` in the iterates. TODO |
Seems like a short math section would make sense to explain this.
Yeah, I'll add that, but it goes in Notes, not here, since it would clog this section.
On second thought, I won't. This can easily be googled, and writing it down here would require quite a few lines.
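For the record, since the thread decided not to spell the math out in the docstring: the four beta rules can be sketched compactly, written here in terms of gradients (which matches the sign-error discussion further down in this thread). A sketch with assumed names, not the PR's actual code:

```python
# Sketch of the four beta update rules, in gradient form.
# g_new, g_old: current/previous gradients; s: previous search direction.

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def beta_update(method, g_new, g_old, s):
    y = [gn - go for gn, go in zip(g_new, g_old)]  # gradient difference
    if method == 'FR':    # Fletcher-Reeves
        return dot(g_new, g_new) / dot(g_old, g_old)
    elif method == 'PR':  # Polak-Ribiere
        return dot(g_new, y) / dot(g_old, g_old)
    elif method == 'HS':  # Hestenes-Stiefel
        return dot(g_new, y) / dot(s, y)
    elif method == 'DY':  # Dai-Yuan
        return dot(g_new, g_new) / dot(s, y)
    raise ValueError('unknown beta_method')
```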
odl/solvers/iterative/iterative.py
Outdated
| niter : int | ||
| Number of iterations per reset. | ||
| nreset : int, optional | ||
| Number of times the solver should be reset. Default: no reset. |
I find this API confusing. niter or max_iter were always the total number of iterations in some sense, now the total number is (nreset + 1) * niter. Total surprise to me, and not a positive one. (I know the docstring says it, but if you're familiar with an API and pattern matching tells you it's the same here, this tiny change in text goes through totally unnoticed.)
IMO, either change the algorithm to use niter_before_reset = niter // (nreset + 1) and make sure you really run niter in total by adding the remaining ones at the end (I would do that), or rename niter to niter_before_reset or niter_inner vs. niter_outer or something that makes clear it's not the total number (I wouldn't do that because it unnecessarily changes the API).
I'll do your first suggestion.
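The first suggestion could look roughly like this (hypothetical helper with assumed names): split `niter` total iterations into `nreset + 1` sweeps and put the integer-division remainder into the last sweep, so the total is exactly `niter`.

```python
def sweep_lengths(niter, nreset):
    """Split `niter` total iterations into nreset + 1 reset sweeps."""
    base = niter // (nreset + 1)
    lengths = [base] * (nreset + 1)
    lengths[-1] += niter % (nreset + 1)   # remainder goes to the last sweep
    return lengths
```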
odl/solvers/iterative/iterative.py
Outdated
| elif not callable(line_search): | ||
| line_search = ConstantLineSearch(line_search) | ||
| for rest_nr in range(nreset + 1): |
odl/solvers/iterative/iterative.py
Outdated
| x.lincomb(1, x, a, dx) # x = x + a * dx | ||
| dx_old = dx | ||
| s = dx # for 'HS' and 'DY' beta methods |
Not only that, you also use them in the search direction update and the line search, no?
| def solver(op, x, rhs): | ||
| norm2 = op.adjoint(op(x)).norm() / x.norm() | ||
| odl.solvers.landweber(op, x, rhs, niter=10, omega=0.5 / norm2) | ||
| odl.solvers.landweber(op, x, rhs, niter=50, omega=0.5 / norm2) |
It started failing due to some other change.
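For context, the Landweber iteration exercised by the test above is x ← x + omega * A^T (rhs - A x), which converges for 0 < omega < 2 / ||A||^2; the `norm2` line is a rough one-step estimate of ||A^T A|| = ||A||^2, which justifies `omega = 0.5 / norm2`. A toy sketch of the iteration (not ODL's implementation; names assumed):

```python
def landweber(A, At, x, rhs, niter, omega):
    # A, At: callables applying the operator and its adjoint to a vector.
    for _ in range(niter):
        residual = [r - a for r, a in zip(rhs, A(x))]
        correction = At(residual)
        x = [xi + omega * ci for xi, ci in zip(x, correction)]
    return x
```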
aringh
left a comment
Looks great :-). The only critical comment I have is on how to handle the case when we set beta = 0. It can be handled as it is now, but then we are in some sense halfway into "more advanced" resetting options, and nreset becomes only a lower bound.
odl/solvers/iterative/iterative.py
Outdated
| Right-hand side of the equation defining the inverse problem | ||
| niter : int | ||
| Number of iterations per reset. | ||
| nreset : int, optional |
If we are interested, there are more advanced ways to determine when to reset. The one I know about is by Powell from 1977, but maybe there are more recent ones as well: http://link.springer.com/article/10.1007/BF01593790
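For reference, the Powell (1977) restart criterion mentioned here is commonly stated as: restart when successive gradients are far from orthogonal. A sketch, with the threshold 0.2 taken as the usually quoted value (treat both the name and the constant as assumptions):

```python
def _dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def powell_restart(g_new, g_old, nu=0.2):
    # Restart when successive gradients are far from orthogonal:
    # |<g_new, g_old>| >= nu * ||g_new||^2.
    return abs(_dot(g_new, g_old)) >= nu * _dot(g_new, g_new)
```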
odl/solvers/iterative/iterative.py
Outdated
| raise ValueError('unknown ``beta_method``') | ||
| # Reset beta if negative. | ||
| beta = max(0, beta) |
If you take beta = 0, you will effectively reset the conjugate gradient method, since s just becomes the steepest descent direction. I don't know if that actually matters, but we should think through how we want to handle it.
I basically got this from Wikipedia; I'm far from an expert. Personally, I'd feel fine with this version for now and with improving it later (if anyone wants to use it).
I don't consider myself an expert in the area either ;-) and it is a perfectly fine solution for me (if you look at the reference that Wikipedia uses, it claims convergence of the PR version if this is used). I just wanted to bring to attention that it actually is a restart of the method.
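To illustrate the point numerically: with beta clamped at 0, the direction update collapses to the plain steepest descent direction, i.e. an implicit restart. A toy sketch, assuming the common update form s = -g + beta * s (the actual ODL code works with the negative gradient, so names here are illustrative):

```python
def update_direction(g, s, beta):
    beta = max(0, beta)   # the clamp discussed in this thread
    # with beta == 0 this is exactly -g, i.e. steepest descent
    return [-gi + beta * si for gi, si in zip(g, s)]
```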
442aafe to 2e96668
Fixed the comments and moved the code into the smooth solvers; I guess it needs a re-review.
aringh
left a comment
Minor comments, except for a potential sign error.
If there is a sign error, it is a bit weird that the tests pass. I have double-checked with the reference on Wikipedia for the DY formula, which contains all four expressions for beta, and it looks like there should indeed be a minus sign for HS and DY.
odl/solvers/smooth/nonlinear_cg.py
Outdated
| line_search : float or `LineSearch`, optional | ||
| Strategy to choose the step length. If a float is given, uses it as a | ||
| fixed step length. | ||
| maxiter : int |
Should be optional. In cg and cg normal this is called niter; is there a reason for using a different name?
I guess because the algorithm can terminate early and is not guaranteed to run this number of iterations. Should be cross-checked anyway and changed in the other methods if it applies there, too. I like the distinction.
Agree with @kohr-h, that is why. Keeping this.
odl/solvers/smooth/nonlinear_cg.py
Outdated
| raise TypeError('`x` {!r} is not in the domain of `f` {!r}' | ||
| ''.format(x, f.domain)) | ||
| if line_search is None: |
The default value is set to 1, but maybe you want to catch this anyway?
No, you are correct; it should be an `isinstance` check, or perhaps a try/except float conversion.
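A sketch of that normalization: accept either a callable line-search object or a number that is wrapped into a constant step. `ConstantLineSearch` below is a stand-in for ODL's class with an assumed call signature, and `normalize_line_search` is a hypothetical helper, not the PR's code:

```python
class ConstantLineSearch:
    """Stand-in for ODL's constant-step line search (signature assumed)."""

    def __init__(self, step):
        self.step = float(step)   # rejects non-numeric input early

    def __call__(self, x, direction, dir_derivative):
        return self.step

def normalize_line_search(line_search):
    # Callables pass through; numbers are wrapped into a constant step.
    if callable(line_search):
        return line_search
    return ConstantLineSearch(line_search)
```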
odl/solvers/smooth/nonlinear_cg.py
Outdated
| elif beta_method == 'PR': | ||
| beta = dx.inner(dx - dx_old) / dx_old.inner(dx_old) | ||
| elif beta_method == 'HS': | ||
| beta = dx.inner(dx - dx_old) / s.inner(dx - dx_old) |
It looks like there is a minus sign missing.
Where? What source are you using to see that?
odl/solvers/smooth/nonlinear_cg.py
Outdated
| elif beta_method == 'HS': | ||
| beta = dx.inner(dx - dx_old) / s.inner(dx - dx_old) | ||
| elif beta_method == 'DY': | ||
| beta = dx.inner(dx) / s.inner(dx - dx_old) |
It looks like there is a minus sign missing.
Could also be a sign error in the Wikipedia reference. Looking at the formulas, it seems weird that the last two have this ominous minus sign although they otherwise look kind of similar to the first two. The simplest thing would be to flip the sign and see what happens.
I did, and the tests still pass, which is a bit troublesome. The reason the minus signs appear on Wikipedia but not in the reference, which is non-obvious at first look, is that the Wikipedia page works with the negative gradient rather than the gradient itself.
Holding off until the sign error is fixed, I'll have a look. Nice catch.
The sign error was indeed a sign error. Fixed it, and the number of iterations needed went down drastically.
12900f9 to a296b3a
Can I merge this?
kohr-h
left a comment
Minor stuff; after the fix it's good for merge from my point of view.
| @@ -1,4 +1,4 @@ | |||
| # Copyright 2014-2016 The ODL development group | |||
| # Copyright 2014-2017 The ODL development group | |||
Oh right, we need to change that across the board (not here though).
odl/solvers/smooth/nonlinear_cg.py
Outdated
| # You should have received a copy of the GNU General Public License | ||
| # along with ODL. If not, see <http://www.gnu.org/licenses/>. | ||
| """Nonlinear version of conjugate gradient.""" |
odl/solvers/smooth/nonlinear_cg.py
Outdated
| callback=None): | ||
| """Conjugate gradient for nonlinear problems. | ||
| Notes |
Any reason why this comes before Parameters?
We do that in some cases, particularly when the description would otherwise become too "thin".
Ok, but in the long run I find it preferable to get directly to the parameters without much ado, since that's what users have to look at most frequently. The math explanation is useful once or a couple of times, but later it's just in the way of the parameter reference. So I prefer the Notes section after Parameters in almost all cases.
odl/solvers/smooth/nonlinear_cg.py
Outdated
| :math:`\min f(x)` | ||
| for a differentiable function |
odl/solvers/smooth/nonlinear_cg.py
Outdated
| The method is described in a | ||
| `Wikipedia article | ||
| <https://en.wikipedia.org/wiki/Nonlinear_conjugate_gradient_method>`_. |
Nice, that's exactly the amount of mathy documentation reasonable for such a docstring.
odl/solvers/smooth/nonlinear_cg.py
Outdated
| See Also | ||
| -------- | ||
| bfgs_method : Quasi-newton solver for the same problem | ||
| conjugate_gradient : Optimized solver for linear and symmetric case |
"Least-squares problem with linear and symmetric operator", maybe? Otherwise it's not so clear what "linear case" means; it could also be "linear functional".
odl/solvers/smooth/nonlinear_cg.py
Outdated
| -------- | ||
| bfgs_method : Quasi-newton solver for the same problem | ||
| conjugate_gradient : Optimized solver for linear and symmetric case | ||
| conjugate_gradient_normal : Equivalent solver but for linear case |
odl/solvers/smooth/nonlinear_cg.py
Outdated
| def conjugate_gradient_nonlinear(f, x, line_search=1.0, maxiter=1000, nreset=0, | ||
| tol=1e-16, beta_method='FR', | ||
| callback=None): | ||
| """Conjugate gradient for nonlinear problems. |
Should be "method". Also, perhaps "general nonlinear problems", since we're also solving a nonlinear problem in the other case (the least-squares problem).
odl/solvers/smooth/nonlinear_cg.py
Outdated
| used as starting point of the iteration, and its values are | ||
| updated in each iteration step. | ||
| line_search : float or `LineSearch`, optional | ||
| Strategy to choose the step length. If a float is given, uses it as a |
odl/solvers/smooth/nonlinear_cg.py
Outdated
| # Find optimal step along s | ||
| dir_derivative = -dx.inner(s) | ||
| if abs(dir_derivative) < tol: |
Strictly speaking it should be <= here, because a "tolerance" is inclusive. It's also relevant if a user sets tol=0, because in some (toy) cases the update can actually be 0, and the method should terminate then.
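The inclusive check being suggested amounts to this (hypothetical helper name): with `<=`, setting tol=0 still terminates when the directional derivative is exactly zero.

```python
def should_terminate(dir_derivative, tol):
    # Inclusive: with tol == 0 an exactly-zero directional derivative
    # still stops the iteration.
    return abs(dir_derivative) <= tol
```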
Ready for merge?
That's for @aringh to decide ;-)
Perhaps you can also squash the first two commits to get rid of that WIP commit.
Will do once this is accepted.
Bump for @aringh's attention.
aringh
left a comment
Tiny comment. Looks good to me :-)
odl/solvers/smooth/nonlinear_cg.py
Outdated
| Number of times the solver should be reset. Default: no reset. | ||
| tol : float, optional | ||
| Tolerance that should be used for terminating the iteration. | ||
| beta_method : {'FR', 'PR', 'HS', 'DY'} |
Should be optional. Mention the default value?
We do that only if it's not clear from the signature, i.e., when the parameter is in kwargs. Or if the default is None and we need to explain what happens in that case.
The convention for "limited-range" parameters is to name the default first. So I'd say it's fine as it is.
e56c7f1 to 7507c52
I had a quick look, and it looks good to me :-)
accbad0 to 463a1d3
Merge after CI.
Work in progress, but I feel that we need this sooner or later.