We introduce a benchmark using simulated biological systems to evaluate LLMs' scientific discovery capabilities.