PhD student, Mila - Quebec AI Institute, Université de Montréal
4 papers at NeurIPS 2025
We stabilize gradients when training increasingly deep reinforcement learning agents by combining a second-order optimizer with residual connections.
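A toy illustration (not the paper's method) of why the residual part helps: by the chain rule, stacking plain layers multiplies their per-layer derivatives, which shrinks gradients exponentially with depth, while a residual layer's derivative stays near 1 so the product stays usable. The slopes and depth below are made-up numbers.

```python
# Toy chain-rule demo of gradient stability under depth.
depth = 20

# Plain layer g(x) = 0.5 * x has derivative 0.5; twenty of them vanish.
plain_slope = 0.5
plain_grad = plain_slope ** depth          # ~9.5e-7: vanishing gradient

# Residual layer r(x) = x + f(x) has derivative 1 + f'(x); with a small
# f'(x) = -0.02 the product through depth stays close to 1.
res_slope = 1.0 + (-0.02)
res_grad = res_slope ** depth              # ~0.67: magnitude preserved

print(f"plain: {plain_grad:.2e}  residual: {res_grad:.2f}")
```

The identity path contributes the `1` in `1 + f'(x)`, which is what keeps the product of derivatives from collapsing; the second-order optimizer in the paper addresses a complementary failure mode and is not sketched here.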
Measuring neuron activity via activation values is unreliable in complex agents, as these values do not reflect a neuron's true learning capacity. We introduce GraMa, which offers robust quantification of that capacity and resetting guidance across diverse network architectures.
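A minimal sketch of the general idea, not the paper's implementation: score each neuron by the running magnitude of its gradients rather than its activations, then flag neurons whose normalized score falls below a threshold as reset candidates. The names, the mean-normalization, and the threshold `tau` are illustrative assumptions.

```python
def grad_magnitude_scores(grad_history):
    """Average per-neuron gradient magnitude over a window.

    grad_history[t][i] = |gradient| seen by neuron i at step t.
    """
    n_neurons = len(grad_history[0])
    n_steps = len(grad_history)
    return [sum(step[i] for step in grad_history) / n_steps
            for i in range(n_neurons)]

def neurons_to_reset(scores, tau=0.1):
    # Normalize by the layer mean so the threshold is scale-free
    # (an illustrative choice, not the paper's).
    mean = sum(scores) / len(scores)
    return [i for i, s in enumerate(scores) if s / (mean + 1e-8) < tau]

# Neuron 1 receives almost no gradient signal across steps,
# even though its activation might look perfectly healthy.
history = [
    [0.9, 0.001, 0.5],
    [1.1, 0.002, 0.4],
]
scores = grad_magnitude_scores(history)
print(neurons_to_reset(scores))  # → [1]
```

The point of the gradient-based score is that a neuron with large activations can still be unable to learn if no gradient flows through it, which an activation-based dormancy metric would miss.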
We improve the speed and performance of LLM post-training via a new asynchronous RL approach, leveraging an off-policy objective, a replay buffer, and tailored sampling strategies.
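A hedged, simplified sketch of the overall pattern (not the paper's algorithm): an asynchronous generator fills a replay buffer with samples tagged with the behavior policy's log-probability, and the trainer consumes stale batches, correcting for off-policyness with a clipped importance weight. The scalar log-probs, clip value, and REINFORCE-style surrogate are placeholder assumptions.

```python
import collections
import math
import random

Sample = collections.namedtuple("Sample", "logp_behavior reward")

# Replay buffer shared between the generator and the trainer
# (in a real system these would run in separate processes).
buffer = collections.deque(maxlen=1024)

# Generator side: store the behavior policy's log-prob with each reward
# so the trainer can correct for policy lag later.
for _ in range(64):
    buffer.append(Sample(logp_behavior=-1.0, reward=random.uniform(0.0, 1.0)))

def off_policy_loss(batch, logp_current, clip=2.0):
    """Importance-weighted surrogate loss over a stale replay batch."""
    total = 0.0
    for s in batch:
        # rho = pi_current / pi_behavior, clipped for stability.
        rho = min(math.exp(logp_current - s.logp_behavior), clip)
        total += -rho * s.reward  # REINFORCE-style surrogate
    return total / len(batch)

# Trainer side: sample a batch regardless of how stale it is;
# a single scalar logp_current stands in for a full model forward pass.
batch = random.sample(list(buffer), 16)
print(off_policy_loss(batch, logp_current=-1.2))
```

Decoupling generation from training this way keeps the GPUs that produce rollouts busy while the trainer updates, which is the source of the speedup; the importance weight and buffer sampling strategy control how much policy lag the objective can tolerate.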