Full Professor, Central South University
3 papers at NeurIPS 2025
We present a vision-centric token compression method for LLMs, inspired by the selective reading strategy humans use.
To train an adaptive video tokenizer, we introduce probabilistic taildrop, which injects a visual-complexity prior into the tokenizer, and we incorporate GRPO for post-training, further boosting efficiency in a task-aware, adaptive manner.
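The core of taildrop-style training is randomly truncating the token sequence so that earlier tokens learn to carry coarse content while later tokens add detail, letting the tokenizer serve variable budgets at inference. A minimal sketch of that idea, assuming a uniform keep-length distribution and representing tokens as a plain Python sequence (the function name, `min_keep` parameter, and distribution are illustrative assumptions, not the paper's exact formulation):

```python
import random

def probabilistic_taildrop(tokens, min_keep=4):
    """Randomly drop the tail of a token sequence during training.

    tokens   : ordered sequence of visual tokens (coarse-to-fine).
    min_keep : smallest budget the tokenizer must still decode from.
    Sampling a keep-length each step exposes the model to every
    truncation level, so any prefix remains a usable representation.
    """
    keep = random.randint(min_keep, len(tokens))  # inclusive bounds
    return tokens[:keep]

# Example: a 32-token clip may be truncated anywhere in [4, 32].
clip_tokens = list(range(32))
kept = probabilistic_taildrop(clip_tokens)
```

At inference, a complexity- or task-aware policy (e.g. the GRPO-trained one) would pick the keep-length instead of sampling it uniformly.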