Sparse group factor analysis for biclustering of multiple data sources

by Kerstin Bunte, et al.

Motivation: Modelling methods that find structure in data are necessary given the current large volumes of genomic data, and there have been various efforts to find subsets of genes that exhibit consistent patterns over subsets of treatments. These biclustering techniques have focused on a single data source, often gene expression data. We present a Bayesian approach for joint biclustering of multiple data sources, extending a recent method, Group Factor Analysis (GFA), to have a biclustering interpretation with additional sparsity assumptions. The resulting method enables data-driven detection of linear structure present in parts of the data sources.

Results: Our simulation studies show that the proposed method reliably infers biclusters from heterogeneous data sources. We tested the method on data from the NCI-DREAM drug sensitivity prediction challenge, resulting in excellent prediction accuracy. Moreover, the predictions are based on several biclusters that provide insight into the data sources, in this case gene expression, DNA methylation, protein abundance, exome sequence, functional connectivity fingerprints and drug sensitivity.
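The "biclustering interpretation" of a sparse factor model can be illustrated with a toy sketch. It assumes, as a simplification not spelled out in the abstract, that each data source is modelled as a product of shared sample scores Z and source-specific loadings W, and that a component's nonzero scores and nonzero loadings pick out the rows and columns of one bicluster. The function name `biclusters_from_factors`, the tolerance `tol` and the toy matrices are hypothetical and chosen only for illustration; the sketch reads biclusters off already-sparse factors and does not perform the Bayesian inference described above.

```python
import numpy as np

def biclusters_from_factors(Z, W_views, tol=1e-8):
    """Read biclusters off sparse factors (illustrative helper, not the paper's code).

    Z        : (n_samples, K) array of component scores, assumed sparse.
    W_views  : list of (d_m, K) loading arrays, one per data source, assumed sparse.
    Returns one (rows, cols_per_view) pair per component k, where rows are the
    samples with nonzero score and cols_per_view[m] are the features of data
    source m with nonzero loading.
    """
    result = []
    for k in range(Z.shape[1]):
        rows = np.flatnonzero(np.abs(Z[:, k]) > tol)
        cols = [np.flatnonzero(np.abs(W[:, k]) > tol) for W in W_views]
        result.append((rows, cols))
    return result

# Toy factors: 4 samples, 2 components, two data sources with 3 and 2 features.
Z = np.array([[1.0, 0.0],
              [2.0, 0.0],
              [0.0, 1.5],
              [0.0, 0.0]])
W1 = np.array([[0.5, 0.0],
               [0.0, 0.0],
               [1.0, 2.0]])
W2 = np.array([[0.0, 1.0],
               [0.0, 3.0]])

# Noise-free data generated by the linear model X_m = Z W_m^T.
X1 = Z @ W1.T   # shape (4, 3)
X2 = Z @ W2.T   # shape (4, 2)

found = biclusters_from_factors(Z, [W1, W2])
for k, (rows, cols) in enumerate(found):
    print(f"component {k}: samples {rows.tolist()}, "
          f"features per source {[c.tolist() for c in cols]}")
```

In this toy setup the first component is active only in the first data source, while the second spans both, which is the kind of source-specific versus shared structure a joint biclustering of multiple sources is meant to expose.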




