PhD student, Nanyang Technological University
1 paper at NeurIPS 2025
We propose Router-R1, an RL-based framework that interleaves multi-round reasoning with dynamic LLM selection, supports zero-shot integration of new models, and optimizes performance-cost trade-offs.