Researcher, Alan Turing Institute
3 papers at NeurIPS 2025
We investigate the role of learning rate grafting and preconditioner staleness in Shampoo by decoupling the updates of the preconditioner's eigenvalues and eigenbasis.
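A minimal sketch of the idea behind this decoupling, not the paper's actual algorithm: a Shampoo-style left preconditioner whose eigenbasis is refreshed only occasionally (so it goes stale), while its eigenvalues are re-estimated in that stale basis at every step. The class name `ShampooSketch`, the `basis_every` schedule, and the one-sided preconditioning are illustrative assumptions.

```python
import numpy as np

class ShampooSketch:
    """Illustrative Shampoo-like update with decoupled eigenvalue/eigenbasis refreshes."""

    def __init__(self, dim, lr=0.1, eps=1e-8, basis_every=20):
        self.L = eps * np.eye(dim)     # accumulated statistic  L = sum_t G_t G_t^T
        self.Q = np.eye(dim)           # eigenbasis of L (allowed to go stale)
        self.lam = np.full(dim, eps)   # eigenvalues of L, re-estimated every step
        self.lr, self.basis_every, self.t = lr, basis_every, 0

    def step(self, params, grad):
        self.t += 1
        self.L += grad @ grad.T        # update the preconditioner statistic
        if self.t % self.basis_every == 0:
            # Expensive: refresh the eigenbasis with a full eigendecomposition.
            self.lam, self.Q = np.linalg.eigh(self.L)
        else:
            # Cheap: keep the stale basis and re-estimate eigenvalues by
            # projecting the current statistic onto it:  lam_i ~ q_i^T L q_i.
            self.lam = np.einsum('ij,jk,ki->i', self.Q.T, self.L, self.Q)
        # Precondition with L^{-1/4} = Q diag(lam^{-1/4}) Q^T (left side only, for brevity).
        inv_root = self.Q @ np.diag(np.maximum(self.lam, 1e-12) ** -0.25) @ self.Q.T
        return params - self.lr * inv_root @ grad
```

Varying `basis_every` is one way to probe how stale the eigenbasis can be before performance degrades, which is the kind of intervention the summary describes.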
We directly estimate interventional distributions from observational data using meta-learning.
This paper introduces distributional training data attribution, a framework that accounts for the stochasticity of deep learning training and provides a mathematical justification for why influence functions work in this setting.