Researcher, EleutherAI
1 paper at NeurIPS 2025
We identify several factors that lead to token premium effects in monolingual tokenizers and propose two interventions that significantly reduce tokenizer inequities.