Researcher, Max Planck Institute for Software Systems (MPI-SWS)
3 papers at NeurIPS 2025
This purely theoretical paper introduces and studies new models of query learning with contrastive examples.
We propose a novel inference-time personalized alignment method that elicits the user's preferences with a few preference queries.
We propose a curriculum strategy for training agents that must operate under strict trajectory constraints during deployment; the curriculum adaptively tightens the constraints based on the agent's performance.