Lecturer, King's College London, University of London
2 papers at NeurIPS 2025
LLMs misuse key multi-agent system concepts; we call for aligning them more closely with foundational multi-agent theory.
We propose EffiBench-X, a multi-language code efficiency benchmark that addresses a gap in existing benchmarks, which focus primarily on a single programming language (e.g., Python).