PhD student, University of Cambridge
1 paper at NeurIPS 2025
We propose a cascaded multi-LLM framework for QA tasks that combines deferral and abstention policies with online learning. The framework delegates tasks across models and humans to balance accuracy, cost, and abstention, improving efficiency.
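
As a rough illustration of the cascade idea, the sketch below routes a question through a list of models ordered from cheap to expensive. It is not the paper's implementation: the `Stage` class, the `run_cascade` function, the fixed confidence thresholds, and the toy models are all hypothetical, and the online learning component (which would adapt the policies over time) is left out. Here a stage accepts its answer when confidence is high, abstains to a human when confidence is very low, and otherwise defers to the next model.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple


@dataclass
class Stage:
    """One model in the cascade (hypothetical structure, for illustration)."""
    name: str
    cost: float                                  # cost charged per call
    answer: Callable[[str], Tuple[str, float]]   # returns (answer, confidence)
    accept_at: float                             # accept if confidence >= this
    abstain_below: float                         # abstain if confidence < this


def run_cascade(
    question: str, stages: List[Stage]
) -> Tuple[Optional[str], str, float]:
    """Return (answer or None, who handled it, total model cost)."""
    total_cost = 0.0
    for stage in stages:
        total_cost += stage.cost
        ans, conf = stage.answer(question)
        if conf >= stage.accept_at:
            return ans, stage.name, total_cost   # confident: answer here
        if conf < stage.abstain_below:
            return None, "human", total_cost     # very unsure: abstain
        # Otherwise defer to the next, more expensive stage.
    # No stage was confident enough, so hand the question to a human.
    return None, "human", total_cost


# Toy stand-ins for real models (assumed behaviour, not real LLM calls).
small = Stage(
    "small", 1.0,
    lambda q: ("4", 0.9) if q == "2+2" else ("?", 0.5),
    accept_at=0.8, abstain_below=0.2,
)
large = Stage(
    "large", 10.0,
    lambda q: ("Paris", 0.95) if q == "capital of France" else ("?", 0.1),
    accept_at=0.8, abstain_below=0.2,
)

easy = run_cascade("2+2", [small, large])
hard = run_cascade("capital of France", [small, large])
unknown = run_cascade("unanswerable question", [small, large])
```

In this toy setup the easy question stops at the small model, the harder one is deferred to the large model, and the unanswerable one ends in abstention. Fixed thresholds are the simplest possible policy; a learned policy could replace the two comparisons without changing the control flow.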