Assistant Professor, Department of Computer Science, University of Wisconsin-Madison
2 papers at NeurIPS 2025
This paper gives a pure DP (differentially private) algorithm for the all-pairs min-cut problem with the same error as private min-s-t-cut.
We propose the first efficient, training-free online routing algorithm for high-volume LLM serving under token budget constraints, achieving significant improvements in both routing performance and cost efficiency.