1 paper across 1 session
Introduced Refined Regularized Preference Optimization and a self-alignment framework that lets large video language models learn from their own errors, enabling fine-grained alignment.