Researcher, Google
1 paper at NeurIPS 2025
We propose an inference strategy for long-video QA that substantially improves the accuracy of a VLM by curating its input context.