Generating Synthetic Clinical Data that Capture Class Imbalanced Distributions with Generative Adversarial Networks: Example using Antiretroviral Therapy for HIV

08/18/2022
by   Nicholas I-Hsien Kuo, et al.
0

Clinical data usually cannot be freely distributed due to their highly confidential nature and this hampers the development of machine learning in the healthcare domain. One way to mitigate this problem is by generating realistic synthetic datasets using generative adversarial networks (GANs). However, GANs are known to suffer from mode collapse and thus creating outputs of low diveristy. In this paper, we extend the classic GAN setup with an external memory to replay features from real samples. Using antiretroviral therapy for human immunodeficiency virus (ART for HIV) as a case study, we show that our extended setup increases convergence and more importantly, it is effective in capturing the severe class imbalanced distributions common to real world clinical data.

READ FULL TEXT

Please sign up or login with your details

Forgot password? Click here to reset