1 paper across 1 session
We show that explicit chain-of-thought reasoning can hurt instruction-following in LLMs by reducing constraint adherence, and we propose four mitigation methods that recover or improve instruction-following performance.