1 paper across 1 session
A small audio-language model for audio reasoning that achieves state-of-the-art (SoTA) performance with 50 times fewer parameters and 60 times fewer audio hours.