Stacked Autoencoder Based Multi-Omics Data Integration for Cancer Survival Prediction
Cancer survival prediction is important for developing personalized treatments and inducing disease-causing mechanisms. Multi-omics data integration is attracting widespread interest in cancer research for providing information for understanding cancer progression at multiple genetic levels. Many works, however, are limited because of the high dimensionality and heterogeneity of multi-omics data. In this paper, we propose a novel method to integrate multi-omics data for cancer survival prediction, called Stacked AutoEncoder-based Survival Prediction Neural Network (SAEsurv-net). In the cancer survival prediction for TCGA cases, SAEsurv-net addresses the curse of dimensionality with a two-stage dimensionality reduction strategy and handles multi-omics heterogeneity with a stacked autoencoder model. The two-stage dimensionality reduction strategy achieves a balance between computation complexity and information exploiting. The stacked autoencoder model removes most heterogeneities such as data's type and size in the first group of autoencoders, and integrates multiple omics data in the second autoencoder. The experiments show that SAEsurv-net outperforms models based on a single type of data as well as other state-of-the-art methods.
READ FULL TEXT