3 papers across 3 sessions
A novel intrinsic motivation method based on world-model memory mismatch enables embodied agents to exhibit robust autonomous behaviors that closely match whole-brain neural data from zebrafish.
To achieve personalization in LLMs, we leverage a user model to incorporate a curiosity-based intrinsic reward into multi-turn RLHF.