We developed an iterative, weakly supervised pipeline that refines LLM-generated pseudo-labels. The pipeline consistently outperforms the original LLMs and existing self-refinement methods across diverse datasets, and it effectively supports LLM safety alignment.