1 paper across 1 session
We build a benchmark for attribute-focused text-to-image retrieval and propose a pipeline that uses promptable image embeddings to address it, which improves retrieval performance.