PhD student, Ohio State University, Columbus
2 papers at NeurIPS 2025
Current AI benchmarks suffer from systematic flaws such as data leakage and selective reporting. We propose PeerBench, a community-run evaluation platform that uses secret and live tests and reputation-weighted scoring to restore trust in AI performance claims.
A multimodal dataset and method for detecting reading activity, with applications to AI glasses.