PhD student, University of Pennsylvania
2 papers at NeurIPS 2025
We study how to compute an optimal large language model (LLM) policy for a constrained alignment problem.