Video LLMs; Token Reduction; Discrete Token Representation; Extreme Token Reduction; Token Merge; Token Clustering

Poster Session 6

Friday, December 5, 2025 · 4:30 PM → 7:30 PM

VQToken: Neural Discrete Token Representation Learning for Extreme Token Reduction in Video Large Language Models

We introduce the task of extreme token reduction and a discrete token representation, VQToken, that adaptively shortens video token sequences by 99.93% (to 0.07% of their original length) with only a 0.66% drop in accuracy.
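
To make the scale of that reduction concrete, here is a minimal arithmetic sketch. The 10,000-token input length is a hypothetical figure chosen for illustration, not a number from the paper; only the 99.93% rate comes from the summary above.

```python
def tokens_after_reduction(original_tokens: int, reduction_pct: float) -> float:
    """Return how many tokens remain after removing reduction_pct percent of them."""
    return original_tokens * (100.0 - reduction_pct) / 100.0


# Hypothetical sequence length, assumed only for illustration.
original = 10_000
remaining = tokens_after_reduction(original, 99.93)
print(round(remaining))
```

Under this assumption, a video that would otherwise occupy ten thousand tokens is passed to the language model as only a handful.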