Principal Staff Software Engineer, Machine Learning, LinkedIn
1 paper at NeurIPS 2025
Theoretical analysis of scheduling algorithms for latency-constrained LLM queries served with RadixAttention, together with a novel scheduling algorithm.