Poster Session 2 · Wednesday, December 3, 2025 4:30 PM → 7:30 PM
#2406
AgMMU: A Comprehensive Agricultural Multimodal Understanding Benchmark
Abstract
We present AgMMU, a challenging real-world benchmark for evaluating and advancing vision-language models (VLMs) in the knowledge-intensive domain of agriculture. Unlike prior datasets that rely on crowdsourced prompts, AgMMU is distilled from 116,231 authentic dialogues between everyday growers and USDA-authorized Cooperative Extension experts. Through a three-stage pipeline of automated knowledge extraction, QA generation, and human verification, we construct:
- AgMMU, an evaluation set of 746 multiple-choice questions (MCQs) and 746 open-ended questions (OEQs), and
- AgBase, a development corpus of 57,079 multimodal facts covering five high-stakes agricultural topics: insect identification, species identification, disease categorization, symptom description, and management instruction.
AgMMU has three key advantages:
- Authentic & Expert-Verified: All facts, images, and answers originate from real farmer and gardener inquiries answered by credentialed specialists, ensuring high-fidelity agricultural knowledge.
- Complete Development Suite: AgMMU uniquely couples a dual-format evaluation benchmark (MCQ and OEQ) with AgBase, a large-scale training set, enabling both rigorous assessment and targeted improvement of VLMs.
- Knowledge-Intensive Challenge: Our tasks demand the synergy of nuanced visual perception and domain expertise, exposing fundamental limitations of current general-purpose models and charting a path toward robust, application-ready agricultural AI.
Benchmarking 12 leading VLMs reveals pronounced gaps in fine-grained perception and factual grounding. Open-source models trail proprietary ones by a wide margin. Simple fine-tuning on AgBase boosts open-source model performance on the challenging OEQs by up to 11.6% on average, narrowing this gap and motivating future research on better strategies for knowledge extraction and distillation from AgBase.
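As a rough illustration of how the MCQ half of such a benchmark could be scored per topic, here is a minimal Python sketch. The abstract does not describe the AgMMU data format or scoring code, so the record layout (`MCQItem` and its fields) and the `mcq_accuracy` helper are hypothetical; only the five topic names come from the abstract.

```python
from dataclasses import dataclass

# The five topics named in the abstract.
TOPICS = (
    "insect identification",
    "species identification",
    "disease categorization",
    "symptom description",
    "management instruction",
)


@dataclass
class MCQItem:
    """Hypothetical record layout; the real AgMMU schema may differ."""
    topic: str
    question: str
    options: dict  # option letter -> option text
    answer: str    # letter of the correct option


def mcq_accuracy(items, predictions):
    """Return {topic: accuracy} by comparing predicted letters to answers."""
    if len(items) != len(predictions):
        raise ValueError("items and predictions must have the same length")
    correct, total = {}, {}
    for item, pred in zip(items, predictions):
        total[item.topic] = total.get(item.topic, 0) + 1
        # Compare letters case-insensitively, ignoring surrounding whitespace.
        if pred.strip().upper() == item.answer.strip().upper():
            correct[item.topic] = correct.get(item.topic, 0) + 1
    return {topic: correct.get(topic, 0) / n for topic, n in total.items()}
```

Open-ended questions cannot be scored by letter matching like this; they would need a separate grading procedure, which this sketch does not attempt.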
We hope AgMMU stimulates research on domain-specific knowledge integration and trustworthy decision support in agricultural AI.