Researcher, Microsoft Research, Redmond
2 papers at NeurIPS 2025
We propose SAS to simulate a larger number of attention heads and a larger hidden size per head for better performance, while keeping the original model size.
A reinforcement-learning post-training framework that teaches LLM assistants to reason about contextual integrity, sharply reducing inappropriate information disclosure while still helping users complete their tasks.