Full Professor, University of California, Berkeley
5 papers at NeurIPS 2025
A dataset of multi-agent system traces, and a systematic analysis of failures in multi-agent LLM systems, featuring a structured taxonomy and an automated evaluation pipeline.
A sparse attention mechanism with $\mathcal{O}(n \log n)$ complexity for long video generation.
We propose a method to speed up video diffusion generation through efficient attention.
We present a model-aware approach that leverages the model’s own signals to dynamically choose training data, markedly boosting both training and data efficiency in RL fine-tuning.
Accelerating attention for long-context reasoning by identifying and loading important tokens and approximating attention to less important tokens.