4 papers across 3 sessions
We generalize CLIP training to worldwide web scale, scoring +0.8% over the English-only counterpart on zero-shot ImageNet classification (no compromise) and setting SoTA zero-shot multilingual results: 57.4% on CVQA and 50.2% on Babel-ImageNet.
We identify several factors that lead to token premium effects in monolingual tokenizers and provide two interventions that significantly reduce tokenizer inequities.
We introduce a systematic framework for interpreting the translation mechanisms of LLMs from the perspective of their computational components, an area previously unexplored.