PhD student, Seoul National University
1 paper at NeurIPS 2025
Neuron Chunking is a hardware-aware sparsification framework for VLM inference. It abstracts neuron access patterns into contiguity distributions, which couples neuron selection with flash I/O behavior and improves I/O efficiency.
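
As a rough illustration of what a contiguity distribution might capture, the sketch below counts runs of consecutive neuron indices in a selection. It is a minimal sketch under assumptions, not the paper's implementation: the function name `contiguity_distribution`, the example index lists, and the premise that each contiguous run corresponds to one flash read are all hypothetical.

```python
from collections import Counter


def contiguity_distribution(selected):
    """Return a Counter mapping run length -> number of runs of
    consecutive indices in `selected` (hypothetical helper)."""
    runs = Counter()
    indices = sorted(set(selected))
    if not indices:
        return runs
    run_len = 1
    for prev, cur in zip(indices, indices[1:]):
        if cur == prev + 1:
            # Still inside the same contiguous chunk.
            run_len += 1
        else:
            # Gap found: close the current run and start a new one.
            runs[run_len] += 1
            run_len = 1
    runs[run_len] += 1  # close the final run
    return runs


# Two hypothetical selections of six neurons each.
scattered = [0, 2, 4, 6, 8, 10]   # every neuron is isolated
chunked = [0, 1, 2, 8, 9, 10]     # two chunks of three neurons

scattered_dist = contiguity_distribution(scattered)
chunked_dist = contiguity_distribution(chunked)

# Assumption: one contiguous run maps to one read request.
scattered_reads = sum(scattered_dist.values())
chunked_reads = sum(chunked_dist.values())
```

Both selections keep the same number of neurons, but under the one-read-per-run assumption the chunked selection needs fewer, larger reads, which is the kind of difference a selection criterion aware of flash I/O could take into account.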