2 papers across 2 sessions
A peer-prediction-based automatic evaluator for scoring human values in crowdsourcing datasets contaminated by LLM-generated responses.