3 papers across 2 sessions
A model-free speculative decoding method that accelerates agentic AI workloads using suffix trees. Achieves 5.3x speedup on multi-agent tasks.