3 papers across 2 sessions
We propose LangHOPS, the first Multimodal Large Language Model (MLLM)-based framework for open-vocabulary object–part instance segmentation.
We propose a method for creating and accessing episodic memory when training transformer policies with RL on long-horizon tasks that require recalling information from the past.