Researcher, Apple
1 paper at NeurIPS 2025
A DiT's generalization is shaped by the inductive bias of attention locality, rather than by harmonic bases as in a UNet. Restricting its attention windows can modify its generalization ability.
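As a rough illustration of what an attention window restriction can look like, the sketch below builds a local mask over a grid of image patches and applies it in single-head attention. It is a minimal NumPy example under stated assumptions, not the paper's implementation: the function names `local_window_mask` and `masked_attention`, the square-neighborhood definition of a window, and the `window` radius parameter are all hypothetical choices made here for illustration.

```python
import numpy as np

def local_window_mask(height, width, window):
    """Boolean (H*W, H*W) mask: True where key patch j lies within a
    (2*window+1) x (2*window+1) neighborhood of query patch i.
    The square neighborhood is an assumption made for this sketch."""
    rows, cols = np.divmod(np.arange(height * width), width)
    row_dist = np.abs(rows[:, None] - rows[None, :])
    col_dist = np.abs(cols[:, None] - cols[None, :])
    return (row_dist <= window) & (col_dist <= window)

def masked_attention(q, k, v, mask):
    """Single-head scaled dot-product attention restricted by `mask`.
    Returns the attended values and the attention weights."""
    scores = q @ k.T / np.sqrt(q.shape[-1])
    # Disallowed positions get -inf so they receive zero weight.
    scores = np.where(mask, scores, -np.inf)
    # Subtract the row maximum for numerical stability; every row keeps
    # at least its own diagonal entry, so the maximum is finite.
    scores = scores - scores.max(axis=-1, keepdims=True)
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ v, weights

# Example: a 4x4 grid of patches with 8-dimensional tokens.
rng = np.random.default_rng(0)
tokens = rng.standard_normal((16, 8))
mask = local_window_mask(4, 4, window=1)
out, weights = masked_attention(tokens, tokens, tokens, mask)
```

Widening `window` until it covers the whole grid recovers ordinary global attention, so the radius acts as a single knob for how local the attention pattern is.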