PhD student, Institute of Computing Technology, Chinese Academy of Sciences
1 paper at NeurIPS 2025
We propose SEA, a simple inference-time alignment method. It reformulates alignment as an iterative optimization procedure over logits in continuous space, guided by an energy function defined by the optimal RLHF policy, with the goal of deep and effective alignment.
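
To make the idea of iterative optimization over logits concrete, here is a minimal toy sketch. It is not SEA's actual algorithm, which this summary does not specify. It assumes the standard closed form of the KL-regularized RLHF optimum, pi*(y|x) proportional to pi_ref(y|x) * exp(r(x, y) / beta), and therefore uses the energy E(y) = -(log pi_ref(y|x) + r(x, y) / beta). The per-position reference log-probabilities, the linear per-token reward `reward_w`, the plain gradient-descent update, and all function names are hypothetical choices made for illustration only.

```python
import numpy as np


def softmax(z):
    """Row-wise softmax with max-subtraction for numerical stability."""
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)


def energy(z, log_p_ref, reward_w, beta):
    """Toy energy of soft tokens p = softmax(z).

    Assumption: E = -sum_t p_t . (log_p_ref_t + reward_w / beta), i.e. the
    negative unnormalized log-density of the optimal RLHF policy under a
    hypothetical per-token linear reward.
    """
    p = softmax(z)
    cost = -(log_p_ref + reward_w / beta)
    return float((p * cost).sum())


def energy_grad(z, log_p_ref, reward_w, beta):
    """Analytic gradient of the toy energy with respect to the logits z."""
    p = softmax(z)
    cost = -(log_p_ref + reward_w / beta)
    expected_cost = (p * cost).sum(axis=-1, keepdims=True)
    # d/dz_i of sum_j p_j * cost_j is p_i * (cost_i - E_p[cost]) per row.
    return p * (cost - expected_cost)


def optimize_logits(log_p_ref, reward_w, beta=1.0, lr=0.1, steps=200):
    """Plain gradient descent on the logits, starting from uniform (z = 0).

    Assumption: a real method may use a different update rule, for example
    a noisy (Langevin-style) step or a learned reward model.
    """
    z = np.zeros_like(log_p_ref)
    for _ in range(steps):
        z = z - lr * energy_grad(z, log_p_ref, reward_w, beta)
    return z


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    seq_len, vocab = 4, 6
    log_p_ref = np.log(softmax(rng.normal(size=(seq_len, vocab))))
    reward_w = rng.normal(size=vocab)

    z0 = np.zeros_like(log_p_ref)
    z = optimize_logits(log_p_ref, reward_w)
    print("energy before:", energy(z0, log_p_ref, reward_w, 1.0))
    print("energy after: ", energy(z, log_p_ref, reward_w, 1.0))
    print("decoded token ids:", z.argmax(axis=-1))
```

In this toy setting the optimized logits concentrate on the token that maximizes log pi_ref + r / beta at each position. The point is only to show the mechanics: the discrete choice is relaxed to continuous logits, and an energy tied to the optimal policy is lowered step by step at inference time, without updating any model weights.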