Associate Professor, Sun Yat-sen University
1 paper at NeurIPS 2025
DynaPipe dynamically redistributes layers and uses asynchronous coordination to balance computation during LLM inference, significantly reducing latency and outperforming existing pipeline-parallel systems.