4 papers across 3 sessions
We propose InfiFPO, a novel model fusion method for preference alignment that integrates probability information from multiple source models to enhance LLM performance, outperforming existing approaches across 11 benchmarks.
We propose a unified framework for model merging that exploits multiple symmetry classes to enable low- or zero-loss interpolation between independently trained Transformer models, including Vision Transformers and GPT-2.