MS student, Nanjing University
2 papers at NeurIPS 2025
We introduce a token- and task-wise mixture-of-experts (MoE) structure into in-context RL models, harnessing architectural advances in MoE to unleash the in-context learning capacity of RL.
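As an illustration only (not the paper's actual implementation), token-wise MoE routing can be sketched as a softmax gate that sends each token embedding to its top-1 expert; the function name `moe_forward` and all shapes here are hypothetical:

```python
import numpy as np

def moe_forward(tokens, gate_w, experts):
    """Route each token to its top-1 expert via softmax gating.

    tokens:  (n, d) array of token embeddings
    gate_w:  (d, k) gating weights, k = number of experts
    experts: list of k (d, d) expert weight matrices
    """
    logits = tokens @ gate_w                       # (n, k) gating scores
    probs = np.exp(logits - logits.max(axis=1, keepdims=True))
    probs /= probs.sum(axis=1, keepdims=True)      # softmax over experts
    choice = probs.argmax(axis=1)                  # top-1 expert per token
    out = np.empty_like(tokens)
    for i, e in enumerate(choice):
        # scale the chosen expert's output by its gate probability
        out[i] = probs[i, e] * (tokens[i] @ experts[e])
    return out, choice

rng = np.random.default_rng(0)
n, d, k = 4, 8, 3
tokens = rng.normal(size=(n, d))
gate_w = rng.normal(size=(d, k))
experts = [rng.normal(size=(d, d)) for _ in range(k)]
out, choice = moe_forward(tokens, gate_w, experts)
```

A task-wise variant would route on a per-task embedding instead of per token; the paper's routing may differ in granularity and expert count.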
We learn offline meta-policies from natural-language supervision via contrastive language-decision pre-training, aligning text embeddings with decision representations so the model can comprehend environment dynamics.
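A minimal sketch of what "contrastive language-decision pre-training" could look like, assuming a CLIP-style symmetric InfoNCE objective over matched (text, trajectory) pairs; the name `clip_loss`, the temperature value, and the embedding shapes are assumptions, not the paper's specification:

```python
import numpy as np

def clip_loss(text_emb, traj_emb, temp=0.07):
    """Symmetric InfoNCE loss aligning text and decision (trajectory) embeddings.

    Matching (text_i, traj_i) pairs are positives; every other pairing
    in the batch serves as a negative, as in CLIP-style pre-training.
    """
    # L2-normalise both embedding sets
    t = text_emb / np.linalg.norm(text_emb, axis=1, keepdims=True)
    d = traj_emb / np.linalg.norm(traj_emb, axis=1, keepdims=True)
    logits = (t @ d.T) / temp                     # (n, n) similarity matrix
    labels = np.arange(len(t))

    def xent(l):
        # cross-entropy with the diagonal (matched pair) as the target class
        l = l - l.max(axis=1, keepdims=True)
        logp = l - np.log(np.exp(l).sum(axis=1, keepdims=True))
        return -logp[labels, labels].mean()

    # average the text->trajectory and trajectory->text directions
    return 0.5 * (xent(logits) + xent(logits.T))

rng = np.random.default_rng(1)
text = rng.normal(size=(5, 16))
loss_matched = clip_loss(text, text)        # perfectly aligned pairs
loss_mismatch = clip_loss(text, text[::-1]) # shuffled, mostly wrong pairs
```

Correctly matched pairs should yield a much lower loss than shuffled ones, which is what drives the text embeddings toward the decision space.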