3 papers across 2 sessions
We introduce Set-LLM, a permutation-invariant LLM architecture that eliminates order bias and sensitivity to input permutations.
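To make the invariance guarantee concrete, here is a minimal Deep-Sets-style sketch (illustrative only, not Set-LLM's actual mechanism, which the summary does not detail; `set_encoder` and `W` are hypothetical names): embedding each set element independently and pooling with a sum yields a representation that is identical under any reordering of the input.

```python
import numpy as np

rng = np.random.default_rng(0)

def set_encoder(items, W):
    # Embed each element independently, then pool with a sum.
    # Summation ignores order, so the output is the same for any
    # permutation of the input set.
    return np.tanh(items @ W).sum(axis=0)

items = rng.normal(size=(5, 8))   # a "set" of 5 elements, dim 8
W = rng.normal(size=(8, 16))      # illustrative embedding weights

perm = rng.permutation(5)
assert np.allclose(set_encoder(items, W), set_encoder(items[perm], W))
```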
We propose a unified framework for model merging that leverages multiple symmetry classes to enable low- and zero-loss interpolation between independently trained Transformer models, including Vision Transformers and GPT-2.
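As a rough illustration of symmetry-aware merging (a sketch under strong simplifications: the paper handles multiple symmetry classes in full Transformers, while this toy uses only the permutation symmetry of a two-layer MLP; `align_hidden_units` and `merge` are hypothetical helpers), one can permute the hidden units of one model to match the other before interpolating weights:

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def align_hidden_units(W1_a, W1_b):
    # Match model B's hidden units to model A's by similarity of their
    # incoming weight vectors, exploiting the permutation symmetry of
    # hidden neurons.
    cost = -W1_a @ W1_b.T              # negate similarity -> min-cost matching
    _, perm = linear_sum_assignment(cost)
    return perm

def merge(theta_a, theta_b, alpha=0.5):
    (W1a, W2a), (W1b, W2b) = theta_a, theta_b
    perm = align_hidden_units(W1a, W1b)
    # Permuting hidden units means reordering W1's rows and W2's columns.
    W1b, W2b = W1b[perm], W2b[:, perm]
    return (alpha * W1a + (1 - alpha) * W1b,
            alpha * W2a + (1 - alpha) * W2b)

# Toy usage: two random "models" with hidden width 4.
rng = np.random.default_rng(0)
theta_a = (rng.normal(size=(4, 3)), rng.normal(size=(2, 4)))
theta_b = (rng.normal(size=(4, 3)), rng.normal(size=(2, 4)))
W1m, W2m = merge(theta_a, theta_b)
```

The alignment step is the point: naive interpolation between independently trained models typically passes through a high-loss region, and accounting for the symmetry first is what makes low-loss interpolation possible.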
We present graph-based symbolic regression that captures expression equivalences on graph representations and incorporates constrained search using hybrid neural-guided Monte Carlo tree search.
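To illustrate why expression equivalence matters for the search (a stand-in sketch: the paper works on graph representations, whereas this uses SymPy canonicalization), collapsing syntactically different but semantically equal candidates deduplicates the space before the tree search spends evaluation budget on them:

```python
import sympy as sp

x, y = sp.symbols("x y")

# Syntactically different candidates that denote the same function.
e1 = x * (x + y)
e2 = x**2 + x * y

# Canonicalizing (here via expansion) collapses equivalent expressions,
# so the search need not evaluate both.
assert sp.expand(e1) == sp.expand(e2)
```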