MS student, New York University
2 papers at NeurIPS 2025
We build an LM-based system that outperforms expert AI researchers at predicting the outcomes of empirical AI research ideas, without running any actual experiments.
SAGE‑Eval is the first benchmark to test whether frontier LLMs robustly generalize critical safety knowledge to novel situations; the strongest model we tested passed only 58% of the safety facts evaluated.