5 papers across 2 sessions
We introduce SMMILE, the first multimodal medical benchmark for evaluating the in-context learning abilities of vision-language models.
We propose ModuLM, a flexible framework for LLM-based molecular relational learning that supports multimodal inputs and dynamic model construction.