Full Professor, Stanford University
2 papers at NeurIPS 2025
We present the first systematic study of lossy latency–quality trade-offs in LLM agents, introducing the HFTBench and StreetFighter benchmarks and proposing an adaptive mixed-precision framework for real-world latency-sensitive tasks.
This paper introduces ItDPDM, a Poisson-based diffusion model that directly models discrete data without relying on continuous embeddings or variational approximations, achieving improved likelihood estimation on music and image benchmarks.