PhD student, University of California, Berkeley
1 paper at NeurIPS 2025
A dataset of multi-agent system traces and a systematic analysis of failures in multi-agent LLM systems, featuring a structured taxonomy and an automated evaluation pipeline.