We present ARGenSeg, a unified framework for multimodal understanding and pixel-level perception that achieves state-of-the-art performance on image segmentation.