Semi-Supervised Prediction of Gene Regulatory Networks Using Machine Learning Algorithms
Use of computational methods to predict gene regulatory networks (GRNs) from gene expression data is a challenging task. Many studies have been conducted using unsupervised methods to fulfill the task; however, such methods usually yield low prediction accuracies due to the lack of training data. In this article, we propose semi-supervised methods for GRN prediction by utilizing two machine learning algorithms, namely support vector machines (SVM) and random forests (RF). The semi-supervised methods make use of unlabeled data for training. We investigate inductive and transductive learning approaches, both of which adopt an iterative procedure to obtain reliable negative training data from the unlabeled data. We then apply our semi-supervised methods to gene expression data of Escherichia coli and Saccharomyces cerevisiae, and evaluate the performance of our methods using the expression data. Our analysis indicated that the transductive learning approach outperformed the inductive learning approach for both organisms. However, there was no conclusive difference identified in the performance of SVM and RF. Experimental results also showed that the proposed semi-supervised methods performed better than existing supervised methods for both organisms.
READ FULL TEXT