5 papers across 3 sessions
We propose an Interactive Text-to-3D Scene Retrieval Method to handle the inherent limitations of text queries.
We propose CellCLIP, a multimodal contrastive learning framework for Cell Painting data. It aligns image profiles and perturbation descriptions using pretrained encoders and a novel channel-wise scheme.
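The paper's specific channel-wise scheme is not detailed here, but the general CLIP-style contrastive alignment it builds on can be sketched as a symmetric InfoNCE loss over paired image and text embeddings. This is a generic illustration (function names, temperature value, and the NumPy implementation are assumptions), not CellCLIP's actual implementation:

```python
import numpy as np

def l2_normalize(x, axis=-1):
    # project embeddings onto the unit sphere so dot products are cosine similarities
    return x / np.linalg.norm(x, axis=axis, keepdims=True)

def clip_style_loss(img_emb, txt_emb, temperature=0.07):
    """Symmetric InfoNCE loss: matched (image, text) pairs sit on the
    diagonal of the batch similarity matrix and are treated as positives;
    all other pairs in the batch are negatives."""
    img = l2_normalize(img_emb)
    txt = l2_normalize(txt_emb)
    logits = img @ txt.T / temperature        # (B, B) scaled similarities
    labels = np.arange(len(logits))           # positive index = diagonal

    def cross_entropy(lg):
        lg = lg - lg.max(axis=1, keepdims=True)  # numerical stability
        log_probs = lg - np.log(np.exp(lg).sum(axis=1, keepdims=True))
        return -log_probs[labels, labels].mean()

    # average the image->text and text->image directions
    return 0.5 * (cross_entropy(logits) + cross_entropy(logits.T))

rng = np.random.default_rng(0)
img = rng.normal(size=(4, 16))
txt = img + 0.01 * rng.normal(size=(4, 16))   # nearly matched pairs
loss = clip_style_loss(img, txt)
```

With well-aligned pairs the diagonal dominates and the loss approaches zero; in CellCLIP the two towers would be pretrained encoders for Cell Painting images and perturbation descriptions rather than raw arrays.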
We introduce GARE, a gap-aware retrieval framework that learns pair-specific increments to alleviate optimization tension and false-negative noise in cross-modal alignment, achieving better uniformity and semantic structure.
We propose a novel Cross-Scene Spatial Reasoning and Grounding task, along with a new baseline and a benchmark dataset.