2 papers across 2 sessions
We solve POMDPs by nesting sequential Monte Carlo methods
Meta-RL with self-supervised predictive coding modules can learn interpretable, task-relevant representations that approximate Bayes-optimal belief states more closely than black-box meta-RL models do, across diverse partially observable environments.
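Both notes above revolve around belief-state estimation in POMDPs. As a reference point (not the method of either paper), a single sequential Monte Carlo belief update — propagate particles through the transition model, reweight by the observation likelihood, resample on weight collapse — might be sketched as follows; the `transition` and `obs_likelihood` callables are illustrative placeholders:

```python
import numpy as np

def particle_filter_update(particles, weights, action, observation,
                           transition, obs_likelihood, rng):
    """One SMC belief-state update for a POMDP.

    particles: (N, d) array of sampled latent states
    weights: (N,) normalized particle weights
    transition(states, action, rng) -> next states (stochastic model)
    obs_likelihood(states, observation) -> per-particle likelihoods
    """
    # Propagate each particle through the (assumed) transition model.
    particles = transition(particles, action, rng)
    # Reweight by how well each particle explains the new observation.
    weights = weights * obs_likelihood(particles, observation)
    weights = weights / weights.sum()
    # Resample when the effective sample size collapses below N/2.
    ess = 1.0 / np.sum(weights ** 2)
    if ess < len(weights) / 2:
        idx = rng.choice(len(weights), size=len(weights), p=weights)
        particles = particles[idx]
        weights = np.full(len(weights), 1.0 / len(weights))
    return particles, weights
```

The returned weighted particle set is the belief state; nested SMC schemes run inner SMC procedures inside each step of an outer one, and the meta-RL claim is that a learned module can play the role this filter plays analytically.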