1 paper across 1 session
A model-free speculative decoding method that uses suffix trees to accelerate agentic AI workloads. It achieves a 5.3x speedup on multi-agent tasks.
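To make the idea concrete, here is a minimal sketch of how draft tokens could be proposed from previously seen outputs without a separate draft model. It is an illustration under stated assumptions, not the paper's implementation: a bounded-depth trie over token n-grams stands in for a real suffix tree, and the names `SuffixIndex`, `propose`, `verify`, and `target_next` are hypothetical. Verification is simulated with one call per token, whereas a real system would score all draft positions in a single forward pass of the target model.

```python
class SuffixIndex:
    """Bounded-depth trie over the suffixes of earlier token sequences.

    Assumption: this simple trie stands in for a suffix tree.
    """

    def __init__(self, max_depth=8):
        self.max_depth = max_depth
        self.root = {"children": {}, "count": 0}

    def add(self, tokens):
        # Insert every suffix of `tokens`, truncated to max_depth tokens,
        # counting how often each path has been seen.
        for start in range(len(tokens)):
            node = self.root
            for tok in tokens[start:start + self.max_depth]:
                children = node["children"]
                if tok not in children:
                    children[tok] = {"children": {}, "count": 0}
                node = children[tok]
                node["count"] += 1

    def _walk(self, pattern):
        # Follow `pattern` from the root; return None if it is not indexed.
        node = self.root
        for tok in pattern:
            node = node["children"].get(tok)
            if node is None:
                return None
        return node

    def propose(self, context, max_draft=4):
        # Match the longest indexed suffix of the context that still has
        # at least one recorded continuation.
        for k in range(min(len(context), self.max_depth - 1), 0, -1):
            node = self._walk(context[-k:])
            if node is not None and node["children"]:
                break
        else:
            return []  # nothing matched, so there is no draft to offer
        # Greedily follow the most frequently seen continuation.
        draft = []
        while node["children"] and len(draft) < max_draft:
            tok, node = max(node["children"].items(),
                            key=lambda item: item[1]["count"])
            draft.append(tok)
        return draft


def verify(context, draft, target_next):
    """Greedy verification against a stand-in target model.

    `target_next(tokens)` is a hypothetical callable returning the token
    the target model would emit next. Draft tokens are kept while they
    agree; the first disagreement is replaced by the target's own token.
    """
    accepted = []
    for tok in draft:
        expected = target_next(context + accepted)
        if tok != expected:
            accepted.append(expected)
            break
        accepted.append(tok)
    return accepted
```

In this sketch the speedup would come from repetition: agentic workloads tend to re-emit long spans they have already produced, so a long draft is often accepted in one verification step instead of one decoding step per token.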