Large-scale machine learning-based phenotyping significantly improves genomic discovery for optic nerve head morphology

by Babak Alipanahi, et al.

Genome-wide association studies (GWAS) require accurate cohort phenotyping, but expert labeling can be costly, time-intensive, and variable. Here we develop a machine learning (ML) model to predict glaucomatous optic nerve head features from color fundus photographs. We used the model to predict vertical cup-to-disc ratio (VCDR), a diagnostic parameter and cardinal endophenotype for glaucoma, in 65,680 Europeans in the UK Biobank (UKB). A GWAS of ML-based VCDR identified 299 independent genome-wide significant (GWS; P≤5×10^-8) hits in 156 loci. The ML-based GWAS replicated 62 of 65 GWS loci from a recent VCDR GWAS in the UKB for which two ophthalmologists manually labeled images for 67,040 Europeans. The ML-based GWAS also identified 92 novel loci, significantly expanding our understanding of the genetic etiologies of glaucoma and VCDR. Pathway analyses support the biological significance of the novel hits to VCDR, with select loci near genes involved in neuronal and synaptic biology or known to cause severe Mendelian ophthalmic disease. Finally, the ML-based GWAS results significantly improve polygenic prediction of VCDR and primary open-angle glaucoma in the independent EPIC-Norfolk cohort.
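To make the quoted figures concrete, here is a minimal sketch of how a genome-wide significance threshold is applied and how a replication fraction such as 62 of 65 is computed. It is not the authors' pipeline: the function names, the record layout, and the toy p-values are hypothetical, and only the threshold (P≤5×10^-8) and the counts 62 and 65 come from the abstract.

```python
# Illustrative only: names and toy data are assumptions, not from the paper.
GWS_THRESHOLD = 5e-8  # genome-wide significance threshold stated in the abstract


def genome_wide_significant(results):
    """Return the association records whose p-value satisfies P <= 5e-8."""
    return [r for r in results if r["p"] <= GWS_THRESHOLD]


def replication_rate(replicated, total):
    """Fraction of previously reported loci recovered by the new GWAS."""
    return replicated / total


# Hypothetical association results (variant identifiers are placeholders).
toy_results = [
    {"variant": "var_a", "p": 3e-12},
    {"variant": "var_b", "p": 5e-8},   # exactly at the threshold, so it counts
    {"variant": "var_c", "p": 1e-6},   # suggestive, but not genome-wide significant
]

hits = genome_wide_significant(toy_results)
rate = replication_rate(62, 65)  # counts reported in the abstract

print([h["variant"] for h in hits])
print(f"{rate:.1%}")
```

With the abstract's counts, the replication fraction works out to roughly 95%. A real analysis would additionally clump variants into independent hits and loci, which this sketch does not attempt.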




