Clustering with Fairness Constraints: A Flexible and Scalable Approach

by Imtiaz Masud Ziko, et al.

This study investigates a general variational formulation of fair clustering, which can integrate fairness constraints with a large class of clustering objectives. Unlike existing methods, our formulation can impose any desired (target) demographic proportions within each cluster. Furthermore, it allows control of the trade-off between fairness and the clustering objective. We derive an auxiliary function (tight upper bound) of our KL-based fairness penalty via its concave-convex decomposition and Lipschitz-gradient property. Our upper bound can be optimized jointly with various clustering objectives, including both prototype-based objectives such as K-means and graph-based objectives such as Normalized Cut. Interestingly, at each iteration, our general fair-clustering algorithm performs an independent update for each assignment variable while guaranteeing convergence. Therefore, it can be easily distributed for large-scale data sets. Such scalability is important because it makes it possible to explore different trade-off levels between fairness and clustering objectives. Unlike existing fairness-constrained spectral clustering, our formulation does not require storing an affinity matrix or computing its eigenvalue decomposition. Moreover, unlike existing prototype-based methods, our experiments reveal that fairness does not come at a significant cost to the clustering objective. In fact, several of our tests showed that our fairness penalty helped to avoid weak local minima of the clustering objective (i.e., with fairness, we obtained better clustering objectives). We demonstrate the flexibility and scalability of our algorithm with comprehensive evaluations over both synthetic and real-world data sets, many of which are much larger than those used in recent fair-clustering methods.
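To make the idea of a KL-based fairness penalty with target demographic proportions concrete, here is a minimal sketch. It is not taken from the paper: the abstract does not give the exact form of the penalty, so the code assumes it is the sum over clusters of the KL divergence between the target proportions and each cluster's observed group proportions. The function name `kl_fairness_penalty` and the arguments `S` (soft assignments), `V` (group indicators), `U` (target proportions), and `eps` are all hypothetical, chosen for illustration only.

```python
import numpy as np

def kl_fairness_penalty(S, V, U, eps=1e-12):
    """Illustrative KL-based fairness penalty (assumed form, not the paper's code).

    S : (N, K) array of soft cluster assignments, rows on the simplex.
    V : (N, J) binary array, V[i, j] = 1 if point i belongs to group j.
    U : (J,) array of target demographic proportions, summing to 1.
    Returns sum over clusters k of KL(U || P_k), where P_k holds the
    group proportions observed inside cluster k.
    """
    S = np.asarray(S, dtype=float)
    V = np.asarray(V, dtype=float)
    U = np.asarray(U, dtype=float)

    cluster_mass = S.sum(axis=0)                    # (K,) total mass per cluster
    group_mass = V.T @ S                            # (J, K) mass of group j in cluster k
    P = group_mass / np.maximum(cluster_mass, eps)  # (J, K) proportions per cluster

    # KL(U || P_k) for every cluster k; eps guards against log(0).
    log_ratio = np.log(U[:, None] + eps) - np.log(P + eps)
    kl_per_cluster = np.sum(U[:, None] * log_ratio, axis=0)
    return float(kl_per_cluster.sum())

# Four points, two groups, two clusters, target proportions 50/50.
V = np.array([[1, 0], [0, 1], [1, 0], [0, 1]])
U = np.array([0.5, 0.5])

# Each cluster contains one point from each group.
S_balanced = np.array([[1, 0], [1, 0], [0, 1], [0, 1]])
# Each cluster contains points from a single group only.
S_segregated = np.array([[1, 0], [0, 1], [1, 0], [0, 1]])

balanced_penalty = kl_fairness_penalty(S_balanced, V, U)
segregated_penalty = kl_fairness_penalty(S_segregated, V, U)
```

Under this assumed form, the penalty vanishes when every cluster matches the target proportions and grows as clusters become demographically homogeneous. In a full method, a term of this kind would be weighted and added to the clustering objective, with the weight setting the fairness trade-off described in the abstract.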




Scalable Laplacian K-modes

We advocate Laplacian K-modes for joint clustering and density mode find...

Fair Clustering Under a Bounded Cost

Clustering is a fundamental unsupervised learning problem where a datase...

Deep Fair Discriminative Clustering

Deep clustering has the potential to learn a strong representation and h...

Efficient Algorithms For Fair Clustering with a New Fairness Notion

We revisit the problem of fair clustering, first introduced by Chieriche...

Guarantees for Spectral Clustering with Fairness Constraints

Given the widespread popularity of spectral clustering (SC) for partitio...

Scalable Spectral Clustering with Group Fairness Constraints

There are synergies of research interests and industrial efforts in mode...
