3 papers across 3 sessions
MimeQA is a question-answering dataset built on mime videos to evaluate video LLMs' nonverbal social reasoning capabilities; models perform below human performance.
Robo2VLM is a framework that generates VQA datasets for robotic manipulation using real-world robot trajectories and non-visual sensor data.
LibriBrain is the largest single-subject, non-invasive MEG dataset (over 50 hours), recorded while the subject listened to naturalistic speech, and is designed to advance scalable and reproducible machine learning methods for speech decoding from brain activity.