Full Professor, Xi'an Jiaotong University
2 papers at NeurIPS 2025
This paper measures and mitigates shortcut learning in misinformation detection.
Current evaluation of LLM-generated code is undermined by weak test cases. We propose SAGA, a novel method that uses human expertise to generate stronger verifiers, and demonstrate it with our new CodeComPass benchmark and TCGCoder-7B model for more reliable assessment.