Poster Session 5 · Friday, December 5, 2025 11:00 AM → 2:00 PM
#5209
GIST: Greedy Independent Set Thresholding for Max-Min Diversification with Submodular Utility
Abstract
This work studies a novel subset selection problem called max-min diversification with monotone submodular utility (MDMS), which has a wide range of applications in machine learning, e.g., data sampling and feature selection. Given a set of points in a metric space, the goal of MDMS is to maximize subject to a cardinality constraint , where is a monotone submodular function and is the max-min diversity objective.
We propose the GIST algorithm, which gives a -approximation guarantee for MDMS by approximating a series of maximum independent set problems with a bicriteria greedy algorithm. We also prove that it is NP-hard to approximate within a factor of .
Finally, we show in our empirical study that GIST outperforms state-of-the-art benchmarks for a single-shot data sampling task on ImageNet.