ViT - NeurIPS 2025

1 paper across 1 session

Poster Session 2

Wednesday, December 3, 2025 · 4:30 PM → 7:30 PM

Differentiable Hierarchical Visual Tokenization

#4911 Spotlight · Marius Aasan, Martine Hjelkrem Tan, Nico Catalano, Changkyu Choi, Adín Ramírez Rivera

An end-to-end learnable tokenizer for Vision Transformers that enhances spatial and semantic learning by letting pretrained models be retrofitted to use pixel-level tokens.