3 papers across 2 sessions
We align instruction-tuned and reasoning LLMs to the instruction hierarchy using supervision from executable verifiers, enabling oracle-free, trace-free training that generalizes to safety benchmarks.