3 papers across 2 sessions
MEMENTO improves neural routing solvers by using memory to adapt decisions at inference time, outperforming fine-tuning and search methods while pushing SOTA on 11 of 12 tasks.
Using search strategies at inference time can provide a massive performance boost on numerous complex reinforcement learning tasks, within only a couple of seconds of execution time.
We extend autoregressive multi-agent sequence models, including Sable and MAT, to the Offline MARL setting and demonstrate that they significantly outperform current state-of-the-art methods across a diverse set of benchmarks with up to 50 agents.