1 paper across 1 session
This paper proposes VIBE, an annotation-free method that selects video summaries for human decision-making by scoring them on task relevance and visual grounding, without requiring any retraining.