Assistant Professor, Massachusetts Institute of Technology
2 papers at NeurIPS 2025
We use RL to train LLMs to generate data and update themselves so that they can adapt to new knowledge and tasks.
We propose a contextualized position encoding that uses dynamic Householder matrices in place of static rotary ones, along with a hardware-efficient training algorithm that improves state-tracking performance.
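
To illustrate the general idea, here is a minimal NumPy sketch of a position-dependent attention logit built from input-dependent Householder reflections. It is not the paper's implementation: the exact reflection form H = I - 2ww^T/(w^Tw), the projection `W_proj`, and the helper names `householder` and `positional_logit` are all assumptions made for illustration, and the hardware-efficient training algorithm is not shown.

```python
import numpy as np

def householder(w):
    """Return the reflection H = I - 2 w w^T / (w^T w) as a dense matrix."""
    w = np.asarray(w, dtype=float)
    return np.eye(w.size) - 2.0 * np.outer(w, w) / np.dot(w, w)

def positional_logit(q_i, k_j, ws):
    """Compute q_i^T (H_n ... H_1) k_j, where ws holds w_1, ..., w_n in order.

    Each reflection is applied to the key without forming the matrix,
    using H v = v - 2 w (w^T v) / (w^T w).
    """
    v = np.asarray(k_j, dtype=float)
    for w in ws:  # the first vector in ws is applied first
        v = v - 2.0 * w * np.dot(w, v) / np.dot(w, w)
    return float(np.dot(q_i, v))

rng = np.random.default_rng(0)
d, seq_len = 4, 6
x = rng.normal(size=(seq_len, d))   # toy token representations
W_proj = rng.normal(size=(d, d))    # hypothetical projection producing w_t
ws = x @ W_proj                     # one reflection vector per token
q, k = rng.normal(size=d), rng.normal(size=d)

# Logit between a query and a key separated by tokens 2, 3 and 4.
logit = positional_logit(q, k, ws[2:5])
```

Unlike a rotary encoding, where the transform between two positions depends only on their distance, the transform here depends on the tokens that lie between them. Each reflection is orthogonal, so the accumulated product preserves the norm of the key.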