Silvia Sapora

PhD student, University of Oxford

1 paper at NeurIPS 2025

OpenReview · Semantic Scholar · Google Scholar

Poster Session 2

1 paper

Wednesday, December 3, 2025 · 4:30 PM → 7:30 PM

Exhibit Hall C,D,E

Meta-Learning Objectives for Preference Optimization

#212 · Carlo Alfano, Silvia Sapora, Jakob Nicolaus Foerster, Patrick Rebeschini, Yee Whye Teh

Using a new suite of MuJoCo tasks for systematic evaluation, we develop specialized mirror descent-based preference optimization algorithms that outperform existing methods in both MuJoCo and LLM alignment tasks.