Researcher, Amazon
2 papers at NeurIPS 2025
We introduce CQN-AS, a value-based reinforcement learning (RL) algorithm that trains a critic network to output Q-values over a sequence of actions.
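
To make "Q-values over a sequence of actions" concrete, here is a minimal sketch of what such a critic's output could look like. It is not the CQN-AS implementation. The linear critic, the discretization of each action dimension into bins, the bin-centre decoding, and all names (`make_critic`, `greedy_action_sequence`, `num_bins`) are assumptions made for illustration only.

```python
import random


def make_critic(obs_dim, seq_len, action_dim, num_bins, seed=0):
    """Hypothetical linear critic (illustration only, not the paper's network).

    Returns a function mapping an observation to Q-values shaped
    [seq_len][action_dim][num_bins]: one value per discretized action bin,
    for every action dimension at every step of the action sequence.
    """
    rng = random.Random(seed)
    n_out = seq_len * action_dim * num_bins
    weights = [[rng.uniform(-1.0, 1.0) for _ in range(obs_dim)]
               for _ in range(n_out)]

    def critic(obs):
        # One dot product per output unit.
        flat = [sum(w * x for w, x in zip(row, obs)) for row in weights]
        # Reshape the flat output into [seq_len][action_dim][num_bins].
        q, i = [], 0
        for _ in range(seq_len):
            step = []
            for _ in range(action_dim):
                step.append(flat[i:i + num_bins])
                i += num_bins
            q.append(step)
        return q

    return critic


def greedy_action_sequence(q_values, low=-1.0, high=1.0):
    """Pick the highest-valued bin per step and action dimension,
    then decode it to the centre of that bin in [low, high]."""
    num_bins = len(q_values[0][0])
    width = (high - low) / num_bins
    sequence = []
    for step in q_values:
        action = []
        for bin_qs in step:
            best = max(range(num_bins), key=lambda b: bin_qs[b])
            action.append(low + (best + 0.5) * width)
        sequence.append(action)
    return sequence


critic = make_critic(obs_dim=4, seq_len=3, action_dim=2, num_bins=5)
q = critic([0.1, -0.2, 0.3, 0.4])
actions = greedy_action_sequence(q)  # 3 steps, 2 action dimensions each
```

The point of the sketch is only the output shape: a single forward pass scores every step of the action sequence at once, so a whole sequence can be selected greedily rather than one action at a time.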