Biologically-primed deep neural network improves colorectal Cancer Molecular subtypes prediction from H E stained images
Colorectal cancer (CRC) molecular subtypes play a crucial role in determining treatment options. Immunotherapy is effective for the microsatellite instability (MSI) subtype of CRC, but not for the microsatellite stability (MSS) subtype. Recently, convolutional neural networks (CNNs) have been proposed for automated determination of CRC subtypes from H&E stained histopathological images. However, previous CNN architectures only consider binary outcomes of MSI or MSS, and do not account for additional biological cues that may affect the histopathological imaging phenotype. In this study, we propose a biologically-primed CNN (BP-CNN) architecture for CRC subtype classification from H&E stained images. Our BP-CNN accounts for additional biological cues by casting the binary classification outcome into a biologically-informed multi-class outcome. We evaluated the BP-CNN approach using a 5-fold cross-validation experimental setup for model development on the TCGA-CRC-DX cohort, comparing it to a baseline binary classification CNN. Our BP-CNN achieved superior performance when using either single-nucleotide-polymorphism (SNP) molecular features (AUC: 0.824±0.02 vs. 0.761±0.04, paired t-test, p<0.05) or CpG-Island methylation phenotype (CIMP) molecular features (AUC: 0.834±0.01 vs. 0.787±0.03, paired t-test, p<0.05). A combination of CIMP and SNP models further improved classification accuracy (AUC: 0.847±0.01 vs. 0.787±0.03, paired t-test, p=0.01). Our BP-CNN approach has the potential to provide insight into the biological cues that influence cancer histopathological imaging phenotypes and to improve the accuracy of deep-learning-based methods for determining cancer subtypes from histopathological imaging data.
READ FULL TEXT