Researcher, Apple
2 papers at NeurIPS 2025
The generalization of a diffusion transformer (DiT) is shaped by the inductive bias of attention locality, rather than by harmonic bases as in a UNet. Restricting the attention window can modify its generalization ability.
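To make the idea of an attention window restriction concrete, here is a minimal sketch, assuming image tokens laid out on a 2D patch grid. The function name `local_window_mask`, the square (Chebyshev-distance) window, and the `radius` parameter are illustrative assumptions, not the method used in the papers.

```python
def local_window_mask(height, width, radius):
    """Boolean attention mask for a height x width grid of patch tokens.

    mask[q][k] is True when key token k lies within `radius` patches
    (Chebyshev distance) of query token q, i.e. attention is allowed.
    Hypothetical helper for illustration only.
    """
    # Flatten the grid row by row, the usual token order for patchified images.
    coords = [(r, c) for r in range(height) for c in range(width)]
    return [
        [max(abs(qr - kr), abs(qc - kc)) <= radius for (kr, kc) in coords]
        for (qr, qc) in coords
    ]


# A 4x4 patch grid where each token may attend only to its 3x3 neighborhood.
mask = local_window_mask(height=4, width=4, radius=1)
allowed_per_query = [sum(row) for row in mask]
```

In an attention layer, such a mask would typically be applied by setting the disallowed query-key scores to negative infinity before the softmax. A smaller `radius` enforces stronger locality, while a radius covering the whole grid recovers unrestricted global attention.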