MS student, University of Pittsburgh
1 paper at NeurIPS 2025
The way we rasterize images to 1D sequences to feed into long sequence models is sub-optimal! We show that orders other than row major can be better, and provide an RL method to learn the optimal ordering.