Undergraduate student, Princeton University
1 paper at NeurIPS 2025
The paper proposes a learned eviction algorithm that predicts the probability that a conversation will continue and uses that prediction to guide LLM prefix cache eviction.
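
To make the idea concrete, here is a minimal Python sketch of continuation-aware eviction. It is an illustration only, not the paper's implementation: the names `CacheEntry`, `ContinuationAwareCache`, `toy_predictor`, and the `p_continue` feature are hypothetical, and the "learned" predictor is replaced by a stand-in that reads a precomputed probability. When the cache is full, the sketch evicts the cached prefix whose conversation is judged least likely to continue.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class CacheEntry:
    """A cached prefix for one conversation (hypothetical structure)."""
    conversation_id: str
    num_tokens: int
    features: Dict[str, float] = field(default_factory=dict)


class ContinuationAwareCache:
    """Evicts the entry with the lowest predicted continuation probability."""

    def __init__(self, capacity_tokens: int,
                 predict_continuation: Callable[[CacheEntry], float]) -> None:
        self.capacity_tokens = capacity_tokens
        # Assumption: any callable mapping an entry to a probability in [0, 1].
        self.predict_continuation = predict_continuation
        self.entries: Dict[str, CacheEntry] = {}
        self.used_tokens = 0

    def insert(self, entry: CacheEntry) -> List[str]:
        """Insert an entry and return the ids evicted to make room for it."""
        if entry.num_tokens > self.capacity_tokens:
            raise ValueError("entry is larger than the whole cache")

        # Replace any older prefix stored for the same conversation.
        old = self.entries.pop(entry.conversation_id, None)
        if old is not None:
            self.used_tokens -= old.num_tokens

        evicted: List[str] = []
        while self.used_tokens + entry.num_tokens > self.capacity_tokens:
            # Victim: the conversation least likely to send another turn.
            victim = min(self.entries.values(), key=self.predict_continuation)
            del self.entries[victim.conversation_id]
            self.used_tokens -= victim.num_tokens
            evicted.append(victim.conversation_id)

        self.entries[entry.conversation_id] = entry
        self.used_tokens += entry.num_tokens
        return evicted


def toy_predictor(entry: CacheEntry) -> float:
    """Stand-in for a learned model: reads a precomputed probability."""
    return entry.features.get("p_continue", 0.0)


cache = ContinuationAwareCache(capacity_tokens=100,
                               predict_continuation=toy_predictor)
cache.insert(CacheEntry("a", 40, {"p_continue": 0.9}))
cache.insert(CacheEntry("b", 40, {"p_continue": 0.1}))
evicted = cache.insert(CacheEntry("c", 40, {"p_continue": 0.5}))
print(evicted)  # ['b']
```

Under a recency-based policy such as LRU, the oldest entry `a` would have been evicted here. Scoring by predicted continuation keeps `a`, because its conversation is judged likely to return and reuse the cached prefix.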