Graph-based regularization for regression problems with highly-correlated designs

03/20/2018
by Yuan Li, et al.

Sparse models for high-dimensional linear regression and machine learning have received substantial attention over the past two decades. Model selection, or determining which features or covariates are the best explanatory variables, is critical to the interpretability of a learned model. Much of the current literature assumes that covariates are only mildly correlated. However, in modern applications ranging from functional MRI to genome-wide association studies, covariates are highly correlated and do not satisfy key assumptions such as the restricted eigenvalue condition or the restricted isometry property (RIP). This paper considers a high-dimensional regression setting in which a graph governs both the correlations among the covariates and the similarity among regression coefficients. Using side information about the strength of correlations among features, we form a graph with edge weights corresponding to pairwise covariances. This graph is used to define a graph total variation regularizer that promotes similar weights for highly correlated features. The graph structure encapsulated by this regularizer helps precondition correlated features so that provably accurate estimates can be obtained. Using graph-based regularizers to develop theoretical guarantees for highly correlated covariates has not been previously examined. We show that, by imposing additional structure on the estimate β̂ that encourages alignment with the covariance graph, the proposed graph-based regularization yields mean-squared error guarantees for a broad range of covariance graph structures and correlation strengths, and in many cases these guarantees are optimal. Our proposed approach outperforms other state-of-the-art methods for highly correlated designs in a variety of experiments on simulated and real fMRI data.
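
The snippet below is a minimal sketch of the graph-total-variation idea described in the abstract, not the paper's exact estimator: it assumes edge weights given by absolute empirical feature covariances above a threshold, and it solves a least-squares problem with a weighted graph TV penalty, min_β ||y − Xβ||² + λ Σ_{(i,j)∈E} w_ij |β_i − β_j|, via cvxpy. The threshold, the regularization parameter lam, and the toy data are illustrative choices.

```python
# Minimal sketch of graph-total-variation-regularized regression (assumed
# formulation, not necessarily the paper's exact estimator):
#   min_beta ||y - X beta||_2^2 + lam * sum_{(i,j) in E} w_ij * |beta_i - beta_j|
# with edge weights w_ij = |cov(x_i, x_j)| for pairs above a threshold.
import numpy as np
import cvxpy as cp

def graph_tv_regression(X, y, lam=1.0, cov_threshold=0.5):
    n, p = X.shape
    # Build the covariance graph from the empirical feature covariances.
    S = np.cov(X, rowvar=False)
    edges = [(i, j, abs(S[i, j]))
             for i in range(p) for j in range(i + 1, p)
             if abs(S[i, j]) > cov_threshold]

    beta = cp.Variable(p)
    fit = cp.sum_squares(y - X @ beta)
    # Graph total variation: penalize coefficient differences across correlated pairs.
    tv = sum(w * cp.abs(beta[i] - beta[j]) for i, j, w in edges)
    cp.Problem(cp.Minimize(fit + lam * tv)).solve()
    return beta.value

# Toy usage: two blocks of highly correlated features, where the true signal
# uses the first block only, so the graph TV penalty ties coefficients within each block.
rng = np.random.default_rng(0)
z = rng.standard_normal((100, 2))
X = np.hstack([z[:, [0]] + 0.05 * rng.standard_normal((100, 3)),
               z[:, [1]] + 0.05 * rng.standard_normal((100, 3))])
y = X[:, :3].sum(axis=1) + 0.1 * rng.standard_normal(100)
print(graph_tv_regression(X, y, lam=5.0))
```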

