A Binary Optimization Approach for Constrained K-Means Clustering

10/24/2018
by Huu Le, et al.

K-Means clustering still plays an important role in many computer vision problems. The conventional Lloyd method, which alternates between centroid update and cluster assignment, is the one primarily used in practice, but it may converge to a solution with empty clusters. Furthermore, some applications require the clusters to satisfy a specific set of constraints, e.g., on cluster sizes or must-link/cannot-link relations. Several methods have been introduced to solve constrained K-Means clustering. Due to the non-convex nature of K-Means, however, existing approaches may produce sub-optimal solutions that poorly approximate the true clusters. In this work, we provide a new perspective on this problem. In particular, we recast constrained K-Means as a binary optimization problem and propose a novel optimization scheme to search for feasible solutions in the binary domain. This approach allows us to solve constrained K-Means with multiple types of constraints enforced simultaneously. Experimental results on synthetic and real datasets show that our method provides better clustering accuracy with faster runtime compared to several commonly used techniques.
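To make the setting concrete, the following is a minimal sketch of the conventional Lloyd alternation mentioned above, with assignments stored as a binary matrix so that a cluster-size constraint can be checked directly. It is not the optimization scheme proposed in the paper; the function names `lloyd_binary` and `satisfies_sizes`, the fixed iteration count, and the toy data are assumptions made for illustration only.

```python
import numpy as np

def lloyd_binary(points, centroids, n_iters=10):
    """Plain Lloyd iterations; returns a binary assignment matrix and centroids.

    Illustrative helper (hypothetical name), not the paper's algorithm.
    """
    points = np.asarray(points, dtype=float)
    centroids = np.asarray(centroids, dtype=float).copy()
    n, k = len(points), len(centroids)
    assign = np.zeros((n, k), dtype=int)
    for _ in range(n_iters):
        # Assignment step: squared distance from every point to every centroid.
        d2 = ((points[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
        labels = d2.argmin(axis=1)
        # Binary encoding: assign[i, j] = 1 iff point i belongs to cluster j.
        assign = np.zeros((n, k), dtype=int)
        assign[np.arange(n), labels] = 1
        # Update step: move each non-empty cluster's centroid to its mean.
        # Empty clusters keep their old centroid, which is how plain Lloyd
        # can end up with a cluster that never receives any point.
        sizes = assign.sum(axis=0)
        for j in range(k):
            if sizes[j] > 0:
                centroids[j] = points[labels == j].mean(axis=0)
    return assign, centroids

def satisfies_sizes(assign, min_size):
    """Cardinality check (assumed form): every cluster has >= min_size points."""
    return bool((assign.sum(axis=0) >= min_size).all())

# Toy data: two tight groups and a third centroid initialized far away.
pts = np.array([[0.0, 0.0], [0.0, 1.0], [5.0, 5.0], [5.0, 6.0]])
init = np.array([[0.0, 0.0], [5.0, 5.0], [100.0, 100.0]])
assign, cents = lloyd_binary(pts, init)
print(assign.sum(axis=0))          # cluster sizes
print(satisfies_sizes(assign, 1))  # whether every cluster is non-empty
```

In this encoding each row of the assignment matrix sums to one, and column sums give the cluster sizes, so size constraints become linear conditions on binary variables. Pairwise constraints such as must-link could be expressed in a similar way, for example by requiring two rows to be equal.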

research
10/31/2018

On the Persistence of Clustering Solutions and True Number of Clusters in a Dataset

Typically clustering algorithms provide clustering solutions with prespe...
research
06/03/2019

Clustering by Orthogonal NMF Model and Non-Convex Penalty Optimization

The non-negative matrix factorization (NMF) model with an additional ort...
research
07/24/2019

Constrained K-means with General Pairwise and Cardinality Constraints

In this work, we study constrained clustering, where constraints are uti...
research
05/18/2021

On Convex Clustering Solutions

Convex clustering is an attractive clustering algorithm with favorable p...
research
05/22/2017

Size Matters: Cardinality-Constrained Clustering and Outlier Detection via Conic Optimization

Plain vanilla K-means clustering is prone to produce unbalanced clusters...
research
05/22/2017

Improved Clustering with Augmented k-means

Identifying a set of homogeneous clusters in a heterogeneous dataset is ...
research
09/28/2021

Clustering to the Fewest Clusters Under Intra-Cluster Dissimilarity Constraints

This paper introduces the equiwide clustering problem, where valid parti...
