Bhiksha Raj

Full Professor, Carnegie Mellon University

3 papers at NeurIPS 2025

Poster Session 3

2 papers

Thursday, December 4, 2025 · 11:00 AM → 2:00 PM

Exhibit Hall C,D,E

Mellow: a small audio language model for reasoning

#2008 · Soham Deshmukh, Satvik Dixit, Rita Singh, Bhiksha Raj

Mellow is a small audio-language model for audio reasoning that achieves state-of-the-art performance with 50 times fewer parameters and 60 times fewer audio hours.

On Fairness of Unified Multimodal Large Language Model for Image Generation

#1206 · Ming Liu, Hao Chen, Jindong Wang, Liwen Wang, Bhiksha Raj, Wensheng Zhang

Poster Session 6

1 paper

Friday, December 5, 2025 · 4:30 PM → 7:30 PM

Exhibit Hall C,D,E

Directed-Tokens: A Robust Multi-Modality Alignment Approach to Large Language-Vision Models

#4611 · Thanh-Dat Truong, Huu-Thien Tran, Tran Thai Son, Bhiksha Raj, Khoa Luu

This paper introduces a simple but efficient learning mechanism that improves the robustness of alignment between visual and textual modalities by solving shuffling problems.