In this position paper, we investigate the validity and reliability of LLMs as judges and highlight challenges inherent to their use and to existing practices in NLG evaluation.