PhD student, University of Toronto
3 papers at NeurIPS 2025
We use influence functions to trace toxic behaviors in LLMs back to the training examples that promote them, and to suppress those examples' effect.
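As a rough illustration of the attribution step, the sketch below computes Koh & Liang-style influence scores on a toy ridge-regression problem (a hypothetical stand-in for an LLM, where the Hessian is available in closed form). The score of training example z on a test point is -∇L(z_test)ᵀ H⁻¹ ∇L(z); a mislabeled duplicate of the test input should dominate the ranking. All names and the setup here are illustrative, not the authors' code.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy setup: linear model with squared loss.
n, d = 50, 5
X = rng.standard_normal((n, d))
w_true = rng.standard_normal(d)
y = X @ w_true + 0.1 * rng.standard_normal(n)

# A "harmful" training example: a mislabeled copy of the test input.
x_test = rng.standard_normal(d)
y_test = x_test @ w_true
X[0], y[0] = x_test, y_test + 10.0

# Fit ridge regression; its damped Hessian is available in closed form.
lam = 1e-2
H = X.T @ X / n + lam * np.eye(d)
w = np.linalg.solve(H, X.T @ y / n)

# Per-example gradients of the squared loss at the fitted parameters.
residuals = X @ w - y
grads = residuals[:, None] * X                  # shape (n, d)
g_test = (x_test @ w - y_test) * x_test

# Influence of upweighting example i on the test loss:
#   I(z_i) = -g_test^T H^{-1} g_i
scores = -grads @ np.linalg.solve(H, g_test)
top = int(np.argmax(np.abs(scores)))            # mislabeled example ranks first
```

Ranking training examples by |score| and downweighting the top offenders is the basic suppression recipe; for LLMs the Hessian solve must itself be approximated, which is where the scaling work below comes in.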
We scale influence-function-based data valuation to recent LLMs and their massive training datasets.
We apply the EKFAC preconditioner to Neumann-series iterations to obtain an unbiased inverse-Hessian-vector-product (iHVP) approximation for training data attribution (TDA), improving the performance of both influence functions and unrolled differentiation.
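A minimal deterministic sketch of the preconditioned Neumann iteration: to approximate H⁻¹v, iterate x ← x + α P⁻¹(v − Hx), which converges when the spectral radius of I − αP⁻¹H is below 1. For simplicity this toy uses a diagonal preconditioner as a stand-in for the full Kronecker-factored EKFAC approximation, and an exact small Hessian rather than the stochastic mini-batch estimates the unbiased method relies on; all names and constants are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy damped positive-definite "Hessian" H and query vector v.
d = 20
A = rng.standard_normal((d, d))
H = A @ A.T / d + 0.5 * np.eye(d)
v = rng.standard_normal(d)

# Diagonal preconditioner P ~ H (simplified stand-in for EKFAC).
P_inv = 1.0 / np.diag(H)

def preconditioned_neumann_ihvp(H, v, P_inv, alpha=0.3, steps=300):
    """Approximate H^{-1} v via the preconditioned Neumann iteration
    x_{k+1} = x_k + alpha * P^{-1} (v - H x_k)."""
    x = np.zeros_like(v)
    for _ in range(steps):
        x = x + alpha * P_inv * (v - H @ x)
    return x

x = preconditioned_neumann_ihvp(H, v, P_inv)
exact = np.linalg.solve(H, v)
rel_err = np.linalg.norm(x - exact) / np.linalg.norm(exact)
```

The preconditioner's role is to cluster the eigenvalues of P⁻¹H near 1, so the geometric error factor shrinks and far fewer Neumann steps are needed than with the raw Hessian.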