1 paper across 1 session
A method for constructing an optimal behavior basis for the Option Keyboard, enabling zero-shot identification of optimal solutions for any linear-reward task.
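
As a rough illustration of what "zero-shot" means here, the sketch below assumes the usual successor-features setting: each behavior in the basis has successor features psi_i(s, a), a task is given by a weight vector w with reward r = phi · w, and the agent acts by generalized policy improvement (GPI), choosing argmax over a of max over i of psi_i(s, a) · w. The function name `gpi_action` and the toy numbers are hypothetical, and this is not the paper's basis-construction procedure, only the evaluation step that a given basis would be plugged into.

```python
def gpi_action(psi, w):
    """Pick an action by GPI over a set of base behaviors.

    psi: nested list, psi[i][a] is the successor-feature vector of
         base behavior i for action a in the current state (assumed given).
    w:   reward weight vector of the new linear-reward task.
    """
    best_action, best_value = 0, float("-inf")
    for policy_sf in psi:                      # loop over base behaviors
        for action, features in enumerate(policy_sf):
            # Q-value of this behavior on task w: psi_i(s, a) . w
            q = sum(f * wi for f, wi in zip(features, w))
            if q > best_value:                 # max over behaviors and actions
                best_action, best_value = action, q
    return best_action


# Toy example: 2 base behaviors, 3 actions, 2 reward features.
psi = [
    [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]],
    [[0.0, 0.0], [2.0, 0.0], [0.0, 2.0]],
]
action_task_a = gpi_action(psi, [1.0, 0.0])
action_task_b = gpi_action(psi, [0.0, 1.0])
```

No learning happens when w changes: the same stored successor features are re-weighted, so whether the resulting action is optimal for every w depends entirely on which behaviors are in the basis.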