A Partially Functional Linear Modeling Framework for Integrating Genetic, Imaging, and Clinical Data
This paper is motivated by the joint analysis of genetic, imaging, and clinical (GIC) data collected in many large-scale biomedical studies, such as the UK Biobank study and the Alzheimer's Disease Neuroimaging Initiative (ADNI) study. We propose a regression framework based on partially functional linear regression models to map high-dimensional GIC-related pathways for phenotypes of interest. We develop a joint model selection and estimation procedure by embedding the imaging data in a reproducing kernel Hilbert space and imposing an ℓ_0 penalty on the coefficients of the scalar variables. We systematically investigate the theoretical properties of the scalar and functional efficient estimators, including non-asymptotic error bounds, minimax error bounds, and asymptotic normality. We apply the proposed method to the ADNI dataset to identify important features from several million genetic polymorphisms and to study the effects of a selected set of informative genetic variants and the hippocampus surface on thirteen cognitive variables.
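For orientation, a generic partially functional linear model with an ℓ_0-constrained scalar part and a reproducing-kernel penalty on the functional part can be sketched as below. All notation here (Y, X, Z, α, β, the domain 𝒯, the kernel K, the sparsity level s, and the tuning parameter λ) is assumed for illustration and is not taken from the paper, whose exact formulation may differ.

```latex
% Illustrative sketch only; the symbols are assumptions, not the paper's notation.
% Model: scalar response Y, scalar covariates X in R^p (e.g., genetic and
% clinical variables), functional covariate Z(t) on a domain T (e.g., imaging).
\[
  Y_i = X_i^{\top}\alpha + \int_{\mathcal{T}} Z_i(t)\,\beta(t)\,dt + \varepsilon_i,
  \qquad i = 1,\dots,n .
\]
% Joint selection and estimation: sparsity in alpha via an l_0 constraint,
% smoothness of beta via the norm of a reproducing kernel Hilbert space H(K).
\[
  (\hat{\alpha},\hat{\beta})
  = \operatorname*{arg\,min}_{\alpha \in \mathbb{R}^{p},\; \beta \in \mathcal{H}(K)}
    \frac{1}{n}\sum_{i=1}^{n}
    \Bigl( Y_i - X_i^{\top}\alpha - \int_{\mathcal{T}} Z_i(t)\,\beta(t)\,dt \Bigr)^{2}
    + \lambda\,\lVert \beta \rVert_{K}^{2}
  \quad \text{subject to } \lVert \alpha \rVert_{0} \le s .
\]
```

In this sketch the constraint on ‖α‖_0 limits how many scalar covariates enter the model, while λ controls the smoothness of the estimated coefficient function β.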