Researcher, Snowflake
1 paper at NeurIPS 2025
A model-free speculative decoding method that uses suffix trees to accelerate agentic AI workloads. Achieves a 5.3x speedup on multi-agent tasks.
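
A minimal sketch of how suffix-based drafting can work, offered as an illustration and not as the paper's implementation. It assumes a depth-bounded trie in place of a true compressed suffix tree, and it omits the verification step in which the target model accepts or rejects the drafted tokens. The names `Node`, `SuffixTrie`, `add`, `propose`, `max_depth`, `max_draft`, and `max_match` are hypothetical.

```python
class Node:
    """One trie node: children keyed by token, plus how often it was visited."""

    def __init__(self):
        self.children = {}
        self.count = 0


class SuffixTrie:
    """Depth-bounded trie over every suffix of previously seen token sequences.

    Assumption: a simplified stand-in for a suffix tree, for illustration only.
    """

    def __init__(self, max_depth=8):
        self.root = Node()
        self.max_depth = max_depth

    def add(self, tokens):
        # Insert each suffix of the sequence, truncated to max_depth tokens.
        for start in range(len(tokens)):
            node = self.root
            for tok in tokens[start:start + self.max_depth]:
                node = node.children.setdefault(tok, Node())
                node.count += 1

    def _find(self, pattern):
        # Walk the trie along pattern; return None if the path does not exist.
        node = self.root
        for tok in pattern:
            node = node.children.get(tok)
            if node is None:
                return None
        return node

    def propose(self, context, max_draft=4, max_match=4):
        # Match the longest suffix of the context first, then back off.
        for k in range(min(max_match, len(context)), 0, -1):
            node = self._find(context[-k:])
            if node is not None and node.children:
                draft = []
                # Greedily follow the most frequently seen continuation.
                while node.children and len(draft) < max_draft:
                    tok, node = max(node.children.items(),
                                    key=lambda kv: kv[1].count)
                    draft.append(tok)
                return draft
        return []  # No match: fall back to ordinary one-token decoding.


# Usage: earlier outputs populate the trie, then the current context is matched.
trie = SuffixTrie()
trie.add("the cat sat on the mat".split())
trie.add("the cat sat on the rug".split())
trie.add("the cat sat on the mat".split())
print(trie.propose(["a", "cat", "sat"], max_draft=3))  # ['on', 'the', 'mat']
```

No draft model is involved: the candidate tokens come entirely from sequences already seen, which is why repetitive agentic workloads are a natural fit. In a full system, the target model would score the drafted tokens in one forward pass and keep only the prefix it agrees with.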