Agent - NeurIPS 2025

Agent

14 papers across 3 sessions

Poster Session 4

Thursday, December 4, 2025 · 4:30 PM → 7:30 PM

3DLLM-Mem: Long-Term Spatial-Temporal Memory for Embodied 3D Large Language Model

#4806 · Wenbo Hu, Yining Hong, Yanjun Wang, Leison Gao, Zibu Wei, Xingcheng Yao, Nanyun Peng, Yonatan Bitton, Idan Szpektor, Kai-Wei Chang

We introduce a new benchmark for long-term spatial-temporal memory in 3D embodied agents. We also propose a novel model with a memory fusion technique for enhanced memory capabilities.

Deep Video Discovery: Agentic Search with Tool Use for Long-form Video Understanding

#4802 · Xiaoyi Zhang, Zhaoyang Jia, Zongyu Guo, Jiahao Li, Bin Li, Houqiang Li, Yan Lu

To address long-form video understanding, we propose the Deep Video Discovery agent, which iteratively reasons over and gathers information from video content via an agentic search and tool-use strategy.

Wide-Horizon Thinking and Simulation-Based Evaluation for Real-World LLM Planning with Multifaceted Constraints

#5407 Spotlight · Dongjie Yang, Chengqiang Lu, Qimeng Wang, Xinbei Ma, Yan Gao, Yao Hu, Hai Zhao

GUI Exploration Lab: Enhancing Screen Navigation in Agents via Multi-Turn Reinforcement Learning

#415 · Haolong Yan, Yeqing Shen, Xin Huang, Jia Wang, Kaijun Tan, Zhixuan Liang, Hongxin Li, Zheng Ge, Osamu Yoshie, Si Li, Xiangyu Zhang, Daxin Jiang

We introduce GUI Exploration Lab, a flexible simulator for GUI agent navigation. Experiments show a staged SFT + RL approach (especially multi-turn RL) significantly boosts navigation and exploration capabilities.

CPathAgent: An Agent-based Foundation Model for Interpretable High-Resolution Pathology Image Analysis Mimicking Pathologists' Diagnostic Logic

#1806 · YUXUAN SUN, Yixuan Si, Chenglu Zhu, Kai Zhang, Zhongyi Shui, Bowen Ding, Tao Lin, Lin Yang

Generalizing Experience for Language Agents with Hierarchical MetaFlows

#2001 · Shengda Fan, Xin Cong, Zhong Zhang, Yuepeng Fu, Yesai Wu, Hao Wang, Xinyu Zhang, Enrui Hu, Yankai Lin

MedChain: Bridging the Gap Between LLM Agents and Clinical Practice with Interactive Sequence

#5100 Spotlight · Jie Liu, Wenxuan Wang, Zizhan Ma, Guolin Huang, Yihang SU, Kao-Jung Chang, Haoliang Li, Linlin Shen, Michael R Lyu, Wenting Chen

MedChain is a dataset of 12,163 clinical cases that mimics real-world medical practice through personalization, interactivity, and sequentiality. MedChain-Agent is an AI system with case-based learning and feedback mechanisms.

Factorio Learning Environment

#312 · Jack Hopkins, Mart Bakler, Akbir Khan

The Factorio Learning Environment is an evaluation for frontier models that offers exponentially scaling challenges.