Assistant Professor, Westlake University
4 papers at NeurIPS 2025
We enable dynamic inference in next-scale autoregressive generation by restructuring intermediate representations with frequency-aware supervision.
HoliTom introduces a training-free, holistic outer-inner token-merging framework for video LLMs, significantly accelerating inference with negligible performance degradation.