This is a new optimizer that combines elementwise adaptivity with orthogonal updates for large networks.
Novelty:
- an elementwise second-moment estimator applied to the update directions
- a sign-stabilized update, where the momentum is sign-transformed before orthogonalization
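
A rough sketch of how the two points above could fit together in a single step, for one 2D weight matrix. Everything here is an assumption made for illustration and not the paper's reference implementation: the function names (`orthogonalize`, `sketch_step`), the hyperparameter defaults, the ordering of the state updates, and the use of an exact SVD for orthogonalization (a stand-in chosen for clarity; the paper may use a different orthogonalization routine and additional scaling).

```python
import numpy as np


def orthogonalize(m):
    """Return the semi-orthogonal polar factor of m via SVD.

    Assumption: exact SVD is used here only as a simple stand-in
    for whatever orthogonalization routine the optimizer uses.
    """
    u, _, vt = np.linalg.svd(m, full_matrices=False)
    return u @ vt


def sketch_step(w, grad, m, v, lr=0.01, beta1=0.95, beta2=0.95, eps=1e-8):
    """One hypothetical update step for a 2D weight matrix.

    m: momentum buffer, v: elementwise second-moment buffer.
    All default values are illustrative, not taken from the paper.
    """
    # Momentum accumulation (exponential moving average of gradients).
    m = beta1 * m + (1.0 - beta1) * grad
    # Sign-transform the momentum, then orthogonalize it.
    o = orthogonalize(np.sign(m))
    # Elementwise second-moment estimate of the update direction.
    v = beta2 * v + (1.0 - beta2) * o**2
    # Elementwise-scaled orthogonal update.
    w = w - lr * o / (np.sqrt(v) + eps)
    return w, m, v


# Usage with made-up values.
w0 = np.ones((3, 2))
g0 = np.array([[1.0, -2.0], [3.0, 4.0], [-5.0, 6.0]])
w1, m1, v1 = sketch_step(w0, g0, np.zeros_like(w0), np.zeros_like(w0))
```

The second-moment buffer is fed the orthogonalized direction rather than the raw gradient, which is how the first bullet is read here; the actual method may differ in details such as bias correction or update rescaling.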
Paper link: https://arxiv.org/abs/2507.11005