1 paper across 1 session
DataSIR is a multi-format benchmark dataset of sensitive information, comprising 1,647,501 samples, designed to evaluate how well different models recognize sensitive information under evolving data leakage techniques.