Guarantees on Nearest-Neighbor Condensation heuristics
Nearest-neighbor (NN) condensation aims to reduce the size of the training set of a nearest-neighbor classifier while maintaining its classification accuracy. Although many condensation techniques have been proposed, few bounds have been proved on the amount of reduction they achieve. In this paper, we present one of the first theoretical results for practical NN condensation algorithms. We propose two condensation algorithms, called RSS and VSS, along with provable upper bounds on the size of their selected subsets. Additionally, we shed light on the selection size of two other state-of-the-art algorithms, called MSS and FCNN, and compare them to the new algorithms.
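To make the condensation problem concrete, the following is a minimal sketch of Hart's classic Condensed Nearest Neighbor (CNN) rule. It is not RSS, VSS, MSS, or FCNN, whose details are not given in this abstract. The function names (`nearest_label`, `condense`, `is_consistent`) and the toy data are illustrative assumptions, and the sketch assumes Euclidean distance and no identical points carrying different labels.

```python
from math import dist  # Euclidean distance, Python 3.8+


def nearest_label(subset, query):
    """Return the label of the point in `subset` closest to `query` (1-NN rule)."""
    return min(subset, key=lambda item: dist(item[0], query))[1]


def condense(training_set):
    """Hart's CNN rule (illustrative, not the paper's RSS/VSS).

    Grows a subset until every training point is classified correctly
    by the 1-NN rule over that subset.
    """
    subset = [training_set[0]]  # seed with an arbitrary point
    changed = True
    while changed:
        changed = False
        for point, label in training_set:
            if (point, label) in subset:
                continue  # already selected; also guards against contradictory duplicates
            if nearest_label(subset, point) != label:
                subset.append((point, label))  # misclassified, so keep it
                changed = True
    return subset


def is_consistent(subset, training_set):
    """True if 1-NN over `subset` labels every training point correctly."""
    return all(nearest_label(subset, p) == y for p, y in training_set)


# Hypothetical toy data: two well-separated classes in the plane.
training_set = [
    ((0.0, 0.0), "a"), ((0.1, 0.2), "a"), ((0.2, 0.1), "a"), ((0.3, 0.3), "a"),
    ((2.0, 2.0), "b"), ((2.1, 1.9), "b"), ((1.9, 2.2), "b"), ((2.3, 2.1), "b"),
]
subset = condense(training_set)
```

On this toy set the selected subset shrinks from 8 points to 2 while still classifying every training point correctly. Heuristics of this kind offer no a priori guarantee on how small the subset will be, which is the gap that size bounds such as those described above are meant to address.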