We present the first systematic study of lossy latency–quality trade-offs in LLM agents, introduce the HFTBench and StreetFighter benchmarks, and propose an adaptive mixed-precision framework for real-world latency-sensitive tasks.