Research Scientist, Meta
1 paper at NeurIPS 2025
We investigate the role of learning rate grafting and the staleness of the preconditioner in Shampoo by decoupling the updates of the eigenvalues and eigenbasis of its preconditioner.
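As a rough illustration of what decoupling the two updates could look like, the sketch below refreshes the eigenbasis of a single Kronecker factor only every few steps while re-estimating the eigenvalues in that (possibly stale) basis at every step. It is a minimal sketch under stated assumptions, not the paper's algorithm: the function name `decoupled_factor_step`, the parameters `basis_every` and `beta`, the exponential moving average, and the projection `diag(Q^T L Q)` as the eigenvalue estimate are all hypothetical choices made for illustration.

```python
import numpy as np


def decoupled_factor_step(L, Q, grad, step, basis_every=10, beta=0.9):
    """Update one (left) Kronecker factor with decoupled basis/eigenvalue refresh.

    All names and defaults here are illustrative assumptions.
    L: (m, m) running factor statistic, Q: (m, m) orthogonal eigenbasis,
    grad: (m, n) gradient matrix, step: integer iteration counter.
    """
    # Accumulate the factor statistic as an exponential moving average of G G^T.
    L = beta * L + (1.0 - beta) * (grad @ grad.T)

    # Eigenbasis: recomputed only every `basis_every` steps, so it is stale in between.
    if step % basis_every == 0:
        _, Q = np.linalg.eigh(L)

    # Eigenvalues: re-estimated every step in the current basis as diag(Q^T L Q).
    eigvals = np.einsum("ij,ij->j", Q, L @ Q)
    return L, Q, eigvals


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    L, Q = np.zeros((4, 4)), np.eye(4)
    for step in range(5):
        G = rng.standard_normal((4, 3))
        L, Q, eigvals = decoupled_factor_step(L, Q, G, step, basis_every=3)
    print(eigvals)
```

Because `Q` stays orthogonal, the per-step eigenvalue estimates always sum to the trace of `L`, even when the basis is stale; with `basis_every=1` they coincide with the exact eigenvalues of `L`.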