Research scientist, Google DeepMind
2 papers at NeurIPS 2025
We identify several factors that lead to token premium effects in monolingual tokenizers and propose two interventions that significantly reduce tokenizer inequities.
We find bigram subnetworks in Transformer language models that are critical to model performance.