Researcher, Cycraft Inc.
1 paper at NeurIPS 2025
We align instruction-tuned and reasoning LLMs with the instruction hierarchy using executable verifier supervision, which enables oracle-free, trace-free training that generalizes to safety benchmarks.