1 paper across 1 session
We identify several factors that lead to token premium effects in monolingual tokenizers and provide two interventions which significantly reduce tokenizer inequities.