Researcher, Samsung - SAIT AI Lab, Montreal
1 paper at NeurIPS 2025
We observe that Adam’s performance in training transformers degrades differently under different types of random rotations of the objective function, highlighting the need for new, basis-dependent theory to fully understand Adam’s success.
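The idea of rotating an objective can be sketched in a few lines. Below is a minimal NumPy illustration, not the paper's transformer experiments: Adam is run on an ill-conditioned quadratic and on the same quadratic composed with a random orthogonal rotation. Because Adam's coordinate-wise preconditioning is not rotation-invariant, the two runs can behave differently. All names, the quadratic setup, and the hyperparameters are illustrative assumptions.

```python
import numpy as np

def adam(grad, x0, steps=500, lr=0.1, b1=0.9, b2=0.999, eps=1e-8):
    """Plain Adam on a deterministic gradient oracle."""
    x = x0.copy()
    m = np.zeros_like(x)  # first-moment estimate
    v = np.zeros_like(x)  # second-moment estimate
    for t in range(1, steps + 1):
        g = grad(x)
        m = b1 * m + (1 - b1) * g
        v = b2 * v + (1 - b2) * g * g
        mhat = m / (1 - b1 ** t)  # bias correction
        vhat = v / (1 - b2 ** t)
        x -= lr * mhat / (np.sqrt(vhat) + eps)
    return x

rng = np.random.default_rng(0)
d = 20
D = np.diag(np.logspace(0, 3, d))                 # ill-conditioned diagonal Hessian
Q, _ = np.linalg.qr(rng.standard_normal((d, d)))  # random orthogonal matrix

f = lambda x: 0.5 * x @ D @ x        # axis-aligned quadratic f(x)
grad_f = lambda x: D @ x
grad_g = lambda x: Q.T @ (D @ (Q @ x))  # gradient of the rotated objective f(Qx)

x0 = rng.standard_normal(d)
x_plain = adam(grad_f, x0)
x_rot = adam(grad_g, x0)
print("final loss, axis-aligned:", f(x_plain))
print("final loss, rotated:     ", f(Q @ x_rot))
```

Comparing the two printed losses (or the full loss curves) across many sampled rotations is one way to probe how basis-dependent an optimizer's behavior is.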