We introduce a new probing architecture capable of efficiently combining features from multiple pre-trained vision foundation models.