PhD student, Technical University of Munich
1 paper at NeurIPS 2025
We propose a new algorithm that guarantees a minimum user satisfaction rate in language model zoos while optimizing for operating cost, which could make it practical for inference endpoint services.