Stateful optimization #2188
Memory benchmark result

| Test Name | %Δ | Master (MB) | PR (MB) | Δ (MB) | Time PR (s) | Time Master (s) |
| -------------------------------------- | ------------ | ------------------ | ------------------ | ------------ | ------------------ | ------------------ |
| test_objective_jac_w7x | -2.39 % | 4.155e+03 | 4.055e+03 | -99.25 | 34.95 | 31.74 |
| test_proximal_jac_w7x_with_eq_update | 0.37 % | 6.553e+03 | 6.577e+03 | 24.37 | 158.82 | 153.16 |
| test_proximal_freeb_jac | 0.06 % | 1.343e+04 | 1.343e+04 | 7.75 | 86.15 | 81.22 |
| test_proximal_freeb_jac_blocked | 0.07 % | 7.755e+03 | 7.760e+03 | 5.31 | 73.00 | 69.58 |
| test_proximal_freeb_jac_batched | 0.79 % | 7.659e+03 | 7.719e+03 | 60.19 | 71.22 | 69.38 |
| test_proximal_jac_ripple | 1.82 % | 3.591e+03 | 3.656e+03 | 65.37 | 56.97 | 55.34 |
| test_proximal_jac_ripple_bounce1d | -1.30 % | 3.819e+03 | 3.770e+03 | -49.64 | 72.76 | 69.32 |
| test_eq_solve | 0.30 % | 2.167e+03 | 2.173e+03 | 6.59 | 92.16 | 90.14 |

For the memory plots, go to the summary of
Codecov Report
❌ Patch coverage is
Additional details and impacted files
@@ Coverage Diff @@
## master #2188 +/- ##
==========================================
+ Coverage 94.40% 94.44% +0.03%
==========================================
Files 101 101
Lines 28739 28803 +64
==========================================
+ Hits 27132 27202 +70
+ Misses 1607 1601 -6
The primal system is A f = b.
The correct tangent system is on Wikipedia.
update state,
keep both; see above.
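
For reference, a short derivation, assuming the tangent system meant here is the one obtained by differentiating the primal system. The perturbation symbols `dA` and `db` are introduced only for illustration; `df` is the tangent solution discussed in this thread.

```latex
A f = b
\quad\Longrightarrow\quad
dA\, f + A\, df = db
\quad\Longrightarrow\quad
A\, df = db - dA\, f
```

Under that assumption the primal and tangent solves share the same matrix `A` and differ only in the right-hand side.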
The goal is to use known information to estimate the initial guess for the next primal and tangent solves. My suggestion above does that exactly to first order for the primal solve, and uses past history to do it to nearly first order for the tangent solve. You can choose any other estimation heuristic from the optimization literature to estimate df_next too, e.g. momentum, mirror, etc. In the worst case, the simplest initial guess for the next tangent solve is the previous tangent solution, i.e. df_new = df. Choosing df_new = 0 doesn't make sense to me.
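
A minimal sketch of the warm-start idea for the primal solve, under stated assumptions: a small hand-picked 3x3 system, a plain Gauss-Seidel iteration standing in for whatever iterative solver is actually used, and hypothetical names (`gauss_seidel`, `compare_initial_guesses`, `dA`, `db`) that are not part of DESC. It compares three initial guesses for the perturbed system `(A + dA) f_new = b + db`: zero, the previous solution `f`, and the first-order estimate `f + df`, where `df` solves `A df = db - dA f`.

```python
def matvec(A, x):
    """Dense matrix-vector product for lists of lists."""
    return [sum(a * xj for a, xj in zip(row, x)) for row in A]


def residual_norm(A, x, b):
    """Max-norm of the residual b - A x."""
    return max(abs(bi - yi) for bi, yi in zip(b, matvec(A, x)))


def gauss_seidel(A, b, x0, tol=1e-10, max_iter=10_000):
    """Gauss-Seidel iteration started from x0. Returns (solution, sweeps used)."""
    x = list(x0)
    n = len(b)
    for it in range(max_iter):
        if residual_norm(A, x, b) < tol:
            return x, it
        for i in range(n):
            s = sum(A[i][j] * x[j] for j in range(n) if j != i)
            x[i] = (b[i] - s) / A[i][i]
    return x, max_iter


def compare_initial_guesses():
    """Return {guess name: (initial residual, sweeps)} for the perturbed solve."""
    # Illustrative data only: a diagonally dominant matrix and a small perturbation.
    A = [[4.0, 1.0, 0.0], [1.0, 3.0, 1.0], [0.0, 1.0, 2.0]]
    b = [1.0, 2.0, 3.0]
    dA = [[0.02, 0.01, 0.0], [0.01, 0.02, 0.01], [0.0, 0.01, 0.02]]
    db = [0.01, -0.02, 0.03]
    zero = [0.0, 0.0, 0.0]

    # Primal solve: A f = b.
    f, _ = gauss_seidel(A, b, zero)
    # Tangent solve: A df = db - dA f.
    rhs = [dbi - yi for dbi, yi in zip(db, matvec(dA, f))]
    df, _ = gauss_seidel(A, rhs, zero)

    # Next primal system: (A + dA) f_new = b + db.
    A_new = [[a + d for a, d in zip(ra, rd)] for ra, rd in zip(A, dA)]
    b_new = [bi + di for bi, di in zip(b, db)]

    guesses = {
        "zero": zero,
        "previous": f,
        "first_order": [fi + di for fi, di in zip(f, df)],
    }
    results = {}
    for name, x0 in guesses.items():
        _, sweeps = gauss_seidel(A_new, b_new, x0)
        results[name] = (residual_norm(A_new, x0, b_new), sweeps)
    return results


for name, (res0, sweeps) in compare_initial_guesses().items():
    print(f"{name:12s} initial residual = {res0:.2e}, sweeps = {sweeps}")
```

The initial residual of the first-order guess is `dA df`, which is second order in the perturbation, while the previous-solution guess leaves a first-order residual `db - dA f`. The same comparison could be repeated for the tangent solve with `df_new = df` versus `df_new = 0` as starting points.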
First pass at #1034. For now this just adds the lowest-level API, allowing DESC optimizers to maintain and update state between iterations.
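
As a rough illustration of what "state between iterations" can buy, here is a toy sketch. It is not the API added in this PR: `step`, `optimize`, `inner_solve`, and the state keys `y_guess` and `inner_iters` are all hypothetical. The outer loop drives `y(x)` to 2, where `y(x)` is defined implicitly by `y**3 + y = x` and found by an inner Newton solve; the state carries the last inner solution forward as the next initial guess.

```python
def inner_solve(x, y0, tol=1e-10, max_iter=100):
    """Newton iteration for y**3 + y = x, started from y0. Returns (y, iterations)."""
    y = y0
    for it in range(max_iter):
        h = y**3 + y - x
        if abs(h) < tol:
            return y, it
        y -= h / (3 * y**2 + 1)
    return y, max_iter


def step(x, state):
    """One outer iteration. Takes and returns the state explicitly."""
    y, iters = inner_solve(x, state["y_guess"])
    # Newton step on r(x) = y(x) - 2, using dy/dx = 1 / (3 y^2 + 1).
    x_new = x - (y - 2.0) * (3 * y**2 + 1)
    new_state = {"y_guess": y, "inner_iters": state["inner_iters"] + iters}
    return x_new, new_state


def optimize(x0, n_steps=20, keep_state=True):
    """Run the outer loop; optionally discard the warm-start guess each step."""
    x = x0
    state = {"y_guess": 0.0, "inner_iters": 0}
    for _ in range(n_steps):
        x, state = step(x, state)
        if not keep_state:
            # Stateless variant: keep only the counter, reset the guess.
            state = {"y_guess": 0.0, "inner_iters": state["inner_iters"]}
    return x, state["inner_iters"]


x_warm, n_warm = optimize(1.0, keep_state=True)
x_cold, n_cold = optimize(1.0, keep_state=False)
print(f"with state:    x = {x_warm:.8f}, inner iterations = {n_warm}")
print(f"without state: x = {x_cold:.8f}, inner iterations = {n_cold}")
```

Passing the state in and out of `step` explicitly, rather than mutating it in place, is one possible design; it keeps each iteration a pure function of its inputs.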
Still to do:
Some open questions:
Resolves #1034