Cluster-based Contrastive Disentangling for Generalized Zero-Shot Learning
Generalized Zero-Shot Learning (GZSL) aims to recognize both seen and unseen classes while training only on seen classes, a setting in which predictions for instances of unseen classes tend to be biased towards the seen classes. In this paper, we propose a Cluster-based Contrastive Disentangling (CCD) method that improves GZSL by alleviating the semantic gap and domain shift problems. Specifically, we first cluster the batch data to form several sets, each containing similar classes. We then disentangle the visual features into semantic-unspecific and semantic-matched variables, and further disentangle the semantic-matched variables into class-shared and class-unique variables according to the clustering results. The disentangled learning module, which uses random swapping and semantic-visual alignment, bridges the semantic gap. Moreover, we introduce contrastive learning on the semantic-matched and class-unique variables to learn high intra-set and intra-class similarity, as well as inter-set and inter-class discriminability. As a result, the generated visual features conform to the underlying characteristics of general images and carry strong discriminative information, which alleviates the domain shift problem. We evaluate the proposed method on four datasets and achieve state-of-the-art results in both the conventional and generalized settings.
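The contrastive objective described above can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a standard supervised contrastive (InfoNCE-style) loss, and the function name `supervised_contrastive_loss`, the `temperature` value, and the toy data are all hypothetical. Passing class ids as `labels` corresponds to the intra-class / inter-class case (class-unique variables); passing cluster (set) ids corresponds to the intra-set / inter-set case (semantic-matched variables).

```python
import numpy as np


def supervised_contrastive_loss(z, labels, temperature=0.1):
    """Hypothetical helper: supervised contrastive loss over a batch.

    z:      (n, d) array of nonzero embeddings, n >= 2.
    labels: (n,) integer group ids (class ids or cluster/set ids).
    Returns the mean loss over anchors that have at least one positive;
    the result is NaN if no anchor has a positive.
    """
    z = np.asarray(z, dtype=np.float64)
    labels = np.asarray(labels)

    # Cosine similarity between all pairs, scaled by the temperature.
    z = z / np.linalg.norm(z, axis=1, keepdims=True)
    sim = z @ z.T / temperature

    # Exclude each sample's similarity with itself from the softmax.
    n = z.shape[0]
    self_mask = np.eye(n, dtype=bool)
    sim = np.where(self_mask, -np.inf, sim)

    # Numerically stable log-softmax over the other samples in the batch.
    row_max = sim.max(axis=1, keepdims=True)
    shifted = sim - row_max
    log_prob = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))

    # Positives: other samples that share the anchor's group id.
    pos = (labels[:, None] == labels[None, :]) & ~self_mask
    n_pos = pos.sum(axis=1)
    valid = n_pos > 0

    # Average log-probability of the positives for each valid anchor.
    pos_log_prob = np.where(pos, log_prob, 0.0).sum(axis=1)
    return float(-(pos_log_prob[valid] / n_pos[valid]).mean())


# Toy batch (made-up values): two tight groups of two embeddings each.
toy_z = np.array([[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]])
loss_matching = supervised_contrastive_loss(toy_z, np.array([0, 0, 1, 1]))
loss_mismatched = supervised_contrastive_loss(toy_z, np.array([0, 1, 0, 1]))
```

In this toy batch, labels that agree with the geometric grouping give a much smaller loss than labels that cut across it, which is the behavior the contrastive term is meant to encourage: embeddings from the same class or set are pulled together and those from different ones are pushed apart.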