2 papers across 2 sessions
ClinBench is an open-source, multi-model, multi-domain framework for rigorously benchmarking large language models on clinical information-extraction tasks.
BenchmarkCards provides standardized documentation for large language model benchmarks, simplifying benchmark selection and usage.