Associate Professor, University of Minnesota - Twin Cities
2 papers at NeurIPS 2025
We have built a highly modular, multimodal general-purpose agent that can interact with a computer via text, images, audio, and video.