The effect of loss function on conditional generative adversarial networks
The Conditional Generative Adversarial Network (cGAN) is a general-purpose approach to image-to-image translation, which aims to translate images from one domain to another while producing high-quality results. In this paper, the loss function of the cGAN model is modified by combining the adversarial losses of state-of-the-art Generative Adversarial Network (GAN) models with a new combination of non-adversarial loss functions to improve model performance and generate more realistic images. Specifically, the effects of the Wasserstein GAN (WGAN), the WGAN with Gradient Penalty (WGAN-GP), and the Least Squares GAN (LSGAN) adversarial losses are explored. Several comparisons are performed to select an optimal combination of the L1 loss with structure, gradient, content-based, Kullback-Leibler divergence, and softmax non-adversarial losses. The Facades dataset is used for the image-to-image translation experiments. Peak Signal-to-Noise Ratio (PSNR), Structural Similarity Index (SSIM), Universal Quality Index (UQI), and Visual Information Fidelity (VIF) are used to evaluate the translated images quantitatively. According to the experimental results, the best loss combination for image-to-image translation on the Facades dataset is the WGAN adversarial loss together with the L1 and content non-adversarial losses. The resulting model generates finely structured images and captures both high- and low-frequency details of the translated images. Image inpainting and lesion segmentation are also investigated to demonstrate the practicality of the proposed work.
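The best-performing objective described above can be sketched as a weighted sum of a WGAN adversarial term, a pixel-wise L1 term, and a feature-space content term. The following minimal NumPy sketch illustrates this structure only; the loss weights (`lam_l1`, `lam_content`) and the choice of feature extractor are illustrative assumptions, not the paper's exact settings.

```python
import numpy as np

def wgan_generator_loss(critic_scores_fake):
    # WGAN generator loss: maximize the critic's score on generated
    # images, i.e. minimize its negation.
    return -np.mean(critic_scores_fake)

def l1_loss(fake, real):
    # Pixel-wise L1 distance between translated and target images.
    return np.mean(np.abs(fake - real))

def content_loss(fake_feats, real_feats):
    # Content loss: L1 distance in a feature space (features would
    # typically come from a pretrained network; the extractor is
    # abstracted away here and is an assumption of this sketch).
    return np.mean(np.abs(fake_feats - real_feats))

def combined_generator_loss(critic_scores_fake, fake, real,
                            fake_feats, real_feats,
                            lam_l1=100.0, lam_content=10.0):
    # Total generator loss = adversarial + weighted non-adversarial
    # terms; the weights shown are common choices, not the paper's.
    return (wgan_generator_loss(critic_scores_fake)
            + lam_l1 * l1_loss(fake, real)
            + lam_content * content_loss(fake_feats, real_feats))
```

Swapping `wgan_generator_loss` for a WGAN-GP or LSGAN generator term leaves the rest of the objective unchanged, which is what makes the adversarial-loss comparison in the paper straightforward to run.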