Lecturer, Monash University
3 papers at NeurIPS 2025
We develop the unbiased sliced Wasserstein RBF kernel to better measure cross-modal alignment between acoustic and linguistic modalities for audio captioning and reasoning tasks.
We introduce the first graph foundation model specifically designed for retrieval-augmented generation in large language models.
FPSAttention is a training-aware FP8 quantization and sparsity co-design for video diffusion models that achieves up to 4.96× speedup without quality loss by aligning 3D tile granularity, denoising-step adaptation, and hardware-efficient kernels.