Principal Researcher, TikTok
4 papers at NeurIPS 2025
We propose the first robust weak-to-strong generalization framework, which elicits robust knowledge from a strong student VLM in an unsupervised setting.
A state-of-the-art sequence parallelism method for linear attention, built on a brand-new collective communication scheme.
Efficient Reasoning VLM