Researcher, Fremont Unified School District
1 paper at NeurIPS 2025
We introduce RAGuard, the first benchmark to evaluate the robustness of retrieval-augmented generation (RAG) systems against naturally misleading evidence, revealing that even strong LLMs underperform when exposed to real-world retrieval noise.