Estimation and Inference for High Dimensional Generalized Linear Models: A Splitting and Smoothing Approach

by Zhe Fei, et al.

To better understand the molecular causes of lung cancer, the Boston Lung Cancer Study (BLCS) has generated comprehensive molecular data from both lung cancer cases and controls. Modeling such high-dimensional data with non-linear outcomes, and attaching accurate uncertainty measures to the resulting estimators, has been challenging. To properly infer cancer risks at the molecular level, we propose a novel inference framework for generalized linear models and use it to estimate the high-dimensional SNP effects and their potential interactions with smoking. We use multi-sample splitting and smoothing to reduce the high-dimensional problem to a series of low-dimensional maximum likelihood estimations. Unlike other methods, the proposed estimator does not involve penalization or regularization and thus avoids the drawbacks these introduce when making inferences. Our estimator is asymptotically unbiased and normal, and yields confidence intervals with proper coverage. To facilitate hypothesis testing and inference on predetermined contrasts, our method can be applied to infer any fixed low-dimensional parameters in the presence of high-dimensional nuisance parameters. To demonstrate the advantages of the method, we conduct extensive simulations and analyze the BLCS SNP data, obtaining some biologically meaningful results.
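The splitting-and-smoothing idea described above can be illustrated with a minimal sketch. It is not the authors' implementation: the marginal-correlation selector `marginal_screen`, the logistic link, and the parameter names (`n_splits`, `k`) are assumptions made only for illustration, and no variance or confidence-interval estimate is included. For a target predictor `j`, each random split uses one half of the data to pick a small set of predictors, fits an unpenalized low-dimensional maximum likelihood model on the other half, and the final estimate averages the coefficient of `j` over all splits.

```python
import numpy as np

def fit_logistic_mle(X, y, n_iter=25, tol=1e-8):
    """Unpenalized logistic MLE by Newton-Raphson; X must include an intercept column."""
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        eta = np.clip(X @ beta, -30, 30)           # guard against overflow in exp
        mu = 1.0 / (1.0 + np.exp(-eta))
        w = mu * (1.0 - mu)
        grad = X.T @ (y - mu)                      # score vector
        hess = X.T @ (X * w[:, None])              # observed information
        step = np.linalg.solve(hess + 1e-8 * np.eye(X.shape[1]), grad)
        beta += step
        if np.max(np.abs(step)) < tol:
            break
    return beta

def marginal_screen(X, y, k):
    """Hypothetical selector (an assumption): keep the k predictors most correlated with y."""
    Xc = X - X.mean(axis=0)
    score = np.abs(Xc.T @ (y - y.mean())) / (np.linalg.norm(Xc, axis=0) + 1e-12)
    return np.argsort(score)[-k:]

def split_and_smooth(X, y, j, n_splits=50, k=5, seed=0):
    """Average the low-dimensional MLE of predictor j over random half-splits."""
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    estimates = np.empty(n_splits)
    for b in range(n_splits):
        perm = rng.permutation(n)
        sel_idx, fit_idx = perm[: n // 2], perm[n // 2:]
        # Select on one half only, so the fit on the other half is not
        # contaminated by the selection step.
        selected = marginal_screen(X[sel_idx], y[sel_idx], k)
        cols = np.union1d(selected, [j])           # sorted, always contains j
        design = np.column_stack([np.ones(fit_idx.size), X[fit_idx][:, cols]])
        beta = fit_logistic_mle(design, y[fit_idx])
        pos = int(np.searchsorted(cols, j))        # position of j within cols
        estimates[b] = beta[1 + pos]               # skip the intercept
    return estimates.mean()                        # smoothing step: average over splits
```

Calling `split_and_smooth(X, y, j)` once per predictor of interest gives one smoothed estimate per coefficient. Each fit involves at most `k + 1` predictors plus an intercept, so no penalty term is needed at the estimation stage.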




