Researcher, China Three Gorges Corporation
2 papers at NeurIPS 2025
A new scaling law formula that accounts for learning rate annealing and can fit and predict full loss curves.