3 papers across 2 sessions
We propose a token- and task-wise Mixture-of-Experts (MoE) structure for in-context RL models, harnessing architectural advances in MoE to unleash RL's in-context learning capacity.
The paper presents AnyMDP, a framework for procedurally generating diverse tasks to improve the scalability of In-Context Reinforcement Learning (ICRL), and explores the trade-off between generalization and adaptation efficiency.