PhD student, School of Computer Science, Carnegie Mellon University
1 paper at NeurIPS 2025
We introduce a principled framework for validating LLM-as-a-judge systems under rating indeterminacy, where multiple ratings can be "correct."