1 paper across 1 session
We propose to accumulate multimodal knowledge to overcome the label distribution shift caused by unseen compositions recombined from attributes and objects by leveraging unsupervised data at test time.