AffinityNet: semi-supervised few-shot learning for disease type prediction
Motivation: While deep learning has achieved great success in computer vision and other fields, it currently does not work well on genomic data because of the "big p, small n" problem (i.e., a relatively small number of samples with high-dimensional features). To make deep learning work with a small amount of training data, we have to design new models that facilitate few-shot learning. In this paper we focus on developing data-efficient deep learning models that learn from a limited number of training examples and generalize well. Results: We developed two deep learning modules, a feature attention layer and a k-Nearest-Neighbor (kNN) attention pooling layer, to make our model much more data efficient than conventional deep learning models. The feature attention layer can directly select important features that are useful for patient classification. The kNN attention pooling layer is based on the graph attention model and is well suited to semi-supervised few-shot learning. Experiments on both synthetic data and cancer genomic data from TCGA projects show that our method generalizes better than a conventional neural network model. Availability: We have implemented our method using the PyTorch deep learning framework (https://pytorch.org). The code is freely available at https://github.com/BeautyOfWeb/AffinityNet.
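To make the two modules named in the abstract more concrete, the following is a minimal pure-Python sketch of the general ideas, not the authors' PyTorch implementation. It assumes that feature attention scales each input feature by a softmax-normalized weight, and that kNN attention pooling replaces each sample with an attention-weighted average of its k nearest samples, with attention given by a softmax over negative squared Euclidean distance. The function names `feature_attention` and `knn_attention_pooling`, and these specific choices of normalization and similarity, are illustrative assumptions.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def feature_attention(x, raw_weights):
    """Scale each feature of x by a non-negative weight; weights sum to 1.

    Assumption: raw_weights stand in for learnable parameters, and the
    softmax normalization is one plausible choice, not the paper's.
    """
    weights = softmax(raw_weights)
    return [w * xi for w, xi in zip(weights, x)]


def squared_distance(a, b):
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))


def knn_attention_pooling(samples, k):
    """Replace each sample by an attention-weighted mean of its k nearest samples.

    Each sample counts as its own nearest neighbor (distance 0). Attention
    is a softmax over negative squared distance (an assumed kernel).
    """
    pooled = []
    for xi in samples:
        dists = [squared_distance(xi, xj) for xj in samples]
        # Indices of the k smallest distances.
        neighbors = sorted(range(len(samples)), key=lambda j: dists[j])[:k]
        attn = softmax([-dists[j] for j in neighbors])
        pooled.append([
            sum(a * samples[j][d] for a, j in zip(attn, neighbors))
            for d in range(len(xi))
        ])
    return pooled
```

Because pooling uses only feature vectors and not labels, unlabeled samples can contribute to the representations of labeled ones, which is one way to read the semi-supervised claim in the abstract.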