PhD student, Department of Computer Science, University of Maryland, College Park
1 paper at NeurIPS 2025
We fit scaling laws for large language models that vary in width-to-depth ratio and parameter count.