PhD student, Tsinghua University
1 paper at NeurIPS 2025
We propose MK-CAViT, a multi-kernel Vision Transformer whose correlation attention is based on Hirschfeld-Gebelein-Rényi (HGR) maximal correlation, for efficient multi-scale feature learning.