4 papers across 3 sessions
We train reward models to encourage fairer step-by-step reasoning in LLMs, reducing bias on high-stakes decision-making tasks.
FairNet introduces a dynamic, instance-level fairness correction method for machine learning models.
We propose RBD, a plug-in module that detects and corrects biased LLM evaluations through structured reasoning, significantly improving accuracy, consistency, and scalability across multiple bias types and evaluator models.