Towards Harmonized Regional Style Transfer and Manipulation for Facial Images
Regional facial image synthesis conditioned on semantic masks has achieved great success using generative adversarial networks. However, the appearance of different regions can become inconsistent with one another when conducting regional image editing. In this paper, we focus on the problem of harmonized regional style transfer and manipulation for facial images. The proposed approach supports regional style transfer and manipulation simultaneously. Our work comprises a multi-scale encoder and style mapping networks: the encoder extracts regional styles from real faces, while the style mapping networks generate styles from random samples for all facial regions. As the key component of our work, we propose a multi-region style attention module that adapts the multiple regional style embeddings from a reference image to a target image, yielding harmonious and plausible results. Furthermore, we propose a new metric, the "harmony score", and conduct experiments in a challenging setting: three widely used face datasets are involved, and we test the model by transferring regional facial appearance across datasets. Because images from different datasets usually differ substantially, the inconsistency between target and reference regions becomes more pronounced. Results show that our model generates more reliable style transfer and multi-modal manipulation results than state-of-the-art methods. Finally, we demonstrate two face editing applications built on the proposed approach.
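The abstract does not specify how the multi-region style attention module is implemented. Below is a minimal sketch, assuming a standard cross-attention formulation in which the target image's per-region style embeddings act as queries and the reference image's regional styles act as keys and values; the class name, `style_dim`, `num_regions`, and the residual blending are all hypothetical choices for illustration, not the authors' implementation.

```python
import torch
import torch.nn as nn

class MultiRegionStyleAttention(nn.Module):
    """Hypothetical sketch: adapts per-region style embeddings from a
    reference image to a target image via cross-attention, so that
    transferred regions stay consistent with the target's appearance."""

    def __init__(self, style_dim: int = 512, num_regions: int = 19):
        super().__init__()
        self.query = nn.Linear(style_dim, style_dim)  # from target styles
        self.key = nn.Linear(style_dim, style_dim)    # from reference styles
        self.value = nn.Linear(style_dim, style_dim)
        self.scale = style_dim ** -0.5

    def forward(self, target_styles: torch.Tensor,
                reference_styles: torch.Tensor) -> torch.Tensor:
        # Both inputs: (batch, num_regions, style_dim).
        q = self.query(target_styles)
        k = self.key(reference_styles)
        v = self.value(reference_styles)
        # Attention over reference regions for each target region.
        attn = torch.softmax(q @ k.transpose(-2, -1) * self.scale, dim=-1)
        # Residual blend: keep target styles, mix in attended reference styles.
        return target_styles + attn @ v
```

The adapted style embeddings would then condition the generator (e.g., via per-region modulation) in place of the raw reference styles, which is one plausible way to obtain the harmonized results the abstract describes.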