PhD student, Swiss Federal Institute of Technology Zurich
1 paper at NeurIPS 2025
We propose a unified framework for model merging that leverages multiple symmetry classes to enable low- and zero-loss interpolation between independently trained Transformer models, including Vision Transformers and GPT-2.