Distinguished Scientist, Microsoft
4 papers at NeurIPS 2025
We introduce MMTU, a new comprehensive benchmark designed to evaluate models' ability to understand, reason over, and manipulate diverse tables.
We propose GUI-Actor, a VLM-based, coordinate-free GUI grounding method with an attention-based action head and verifier, achieving state-of-the-art results and strong generalization.