We show that interval-estimation-based methods produce better distilled embedders than MSE- or cosine-based methods in multi-teacher distillation settings.