PhD student, University of Oslo
1 paper at NeurIPS 2025
An end-to-end learnable tokenizer for Vision Transformers that enhances spatial and semantic learning by letting pretrained models be retrofitted to use pixel-level tokens