Principal Researcher, International Digital Economy Academy
2 papers at NeurIPS 2025
We propose GPAS, a simple method that scales activations without scaling gradients to accelerate pretraining convergence of LLMs.
ChemCoTBench bridges complex chemical reasoning with arithmetic-inspired step-by-step workflows, enabling LLMs to systematically tackle real-world tasks like molecular optimization and reaction prediction.