ML Resident, Yandex
1 paper at NeurIPS 2025
Automatically detecting task-specific important tokens to accelerate speculative decoding