Golnoosh Farnadi

Assistant Professor, McGill University

1 paper at NeurIPS 2025

Homepage · OpenReview · Semantic Scholar · Google Scholar

Poster Session 2

1 paper

Wednesday, December 3, 2025 · 4:30 PM → 7:30 PM

Exhibit Hall C,D,E

Neither Valid nor Reliable? Investigating the Use of LLMs as Judges

#1313 · Khaoula Chehbouni, Mohammed Haddou, Jackie CK Cheung, Golnoosh Farnadi

In this position paper, we investigate the validity and reliability of LLMs as judges and highlight challenges inherent to their use and existing practices in NLG evaluation.