We analyze the training loss under imbalanced data, showing that gradient descent dynamics can gradually reduce bias and recover minority-specific features with longer training.
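
As a toy illustration of this claim, and not the setting analyzed in the paper, the sketch below runs gradient descent on the average squared loss of a two-weight linear model. Every modeling choice here is an assumption made for illustration: the function name `train`, the 95/5 group split, the learning rate, and the idealization that majority and minority examples activate disjoint features. Because each group's gradient is scaled by its share of the data, the minority-specific weight converges more slowly but still approaches its target as training continues.

```python
def train(steps, n_major=95, n_minor=5, lr=0.5):
    """Gradient descent on the average squared loss of a two-weight linear model.

    Toy assumption: majority examples have feature vector (1, 0) and minority
    examples have (0, 1); every target is 1, so the ideal weights are (1, 1).
    """
    total = n_major + n_minor
    w_major, w_minor = 0.0, 0.0
    for _ in range(steps):
        # Each group's gradient is scaled by its share of the data,
        # so the minority-specific weight receives a much smaller update.
        grad_major = (n_major / total) * (w_major - 1.0)
        grad_minor = (n_minor / total) * (w_minor - 1.0)
        w_major -= lr * grad_major
        w_minor -= lr * grad_minor
    return w_major, w_minor


# Compare short and long training runs.
for steps in (10, 100, 1000):
    w_major, w_minor = train(steps)
    print(steps, round(w_major, 3), round(w_minor, 3))
```

In this sketch the gap between the two weights stands in for bias toward the majority: it is large after a few steps and shrinks as the number of steps grows. Real models share features across groups, so the actual dynamics are more involved than this decoupled example suggests.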