Self-supervised Visual Attribute Learning for Fashion Compatibility
Many self-supervised learning (SSL) methods learn semantically meaningful visual representations by solving pretext tasks. However, state-of-the-art SSL methods target object recognition or detection, which depend mainly on object shape, and their color distortion augmentation leads the learned representations to ignore visual attributes such as color and texture. For other vision tasks, such as fashion compatibility, learning these visual attributes could be more important than learning object shapes. To address this deficiency, we propose Self-supervised Tasks for Outfit Compatibility (STOC), which requires no supervision. Specifically, STOC aims to learn the colors and textures of fashion items and to embed similar items near one another. STOC outperforms state-of-the-art SSL by 9.5 on the completion task on our unsupervised benchmark.