2 papers across 2 sessions
We study the problem of computing an optimal large language model (LLM) policy for a constrained alignment problem.
We extend generalization guarantees for inequality-constrained statistical optimization problems to problems that also include statistical equality constraints.