knowledge-intensive task

1 paper across 1 session

Poster Session 4

Thursday, December 4, 2025 · 4:30 PM → 7:30 PM

HawkBench: Investigating Resilience of RAG Methods on Stratified Information-Seeking Tasks

#1915 Spotlight · Hongjin Qian, Zheng Liu, Chao Gao, Yankai Wang, Defu Lian, Zhicheng Dou

HawkBench is a human-labeled, multi-domain benchmark of 1,600 samples for evaluating RAG systems on diverse queries. It reveals limits in how well RAG methods generalize and points to the need for adaptive strategies.