Intern, NVIDIA
1 paper at NeurIPS 2025
Language models are surprisingly robust to non-canonical tokenizations of the input, which can even improve performance
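As a minimal illustration of what "non-canonical tokenization" means (a toy vocabulary invented for this sketch, not the paper's actual setup): a string usually admits many segmentations into vocabulary tokens, all of which decode back to the same text, but a tokenizer deterministically emits only one of them, the canonical segmentation.

```python
# Toy sketch: one string, many valid tokenizations over a (hypothetical)
# vocabulary; the greedy longest-match output stands in for a real
# tokenizer's canonical segmentation.

VOCAB = {"unhappy", "un", "happy", "ha", "ppy",
         "u", "n", "h", "a", "p", "y"}

def all_tokenizations(s, vocab):
    """Enumerate every way to split s into tokens from vocab."""
    if not s:
        return [[]]
    out = []
    for i in range(1, len(s) + 1):
        if s[:i] in vocab:
            for rest in all_tokenizations(s[i:], vocab):
                out.append([s[:i]] + rest)
    return out

def canonical(s, vocab):
    """Greedy longest-match tokenization (a stand-in for real BPE)."""
    tokens = []
    while s:
        for i in range(len(s), 0, -1):
            if s[:i] in vocab:
                tokens.append(s[:i])
                s = s[i:]
                break
        else:
            raise ValueError("untokenizable input")
    return tokens

segs = all_tokenizations("unhappy", VOCAB)
print(canonical("unhappy", VOCAB))  # → ['unhappy'] (the canonical one)
print(len(segs))                    # → 11 valid segmentations
# Every segmentation decodes to the identical string:
assert all("".join(t) == "unhappy" for t in segs)
```

The other 10 segmentations (e.g. `["un", "happy"]` or `["u", "n", "ha", "ppy"]`) are the non-canonical tokenizations the paper's claim refers to: the model rarely sees them in training, yet handles them robustly.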