Considerations of automated machine learning in clinical metabolic profiling: Altered homocysteine plasma concentration associated with metformin exposure

by Alena Orlenko, et al.

With the maturation of metabolomics science and the proliferation of biobanks, clinical metabolic profiling is an increasingly opportune frontier for advancing translational clinical research. Automated Machine Learning (AutoML) approaches provide an exciting opportunity to guide feature selection in agnostic metabolic profiling endeavors, where potentially thousands of independent data points must be evaluated. In previous research, AutoML applied to high-dimensional data of varying types has proven robust, outperforming traditional approaches. However, considerations for its application in clinical metabolic profiling remain to be evaluated, particularly regarding the robustness of AutoML in identifying and adjusting for common clinical confounders. In this study, we present a focused case study of AutoML considerations for using the Tree-based Pipeline Optimization Tool (TPOT) in metabolic profiling of exposure to metformin in a biobank cohort. First, we propose a tandem rank-accuracy measure to guide agnostic feature selection and corresponding threshold determination in clinical metabolic profiling endeavors. Second, while AutoML with default parameters showed a potential lack of sensitivity to low-effect confounding clinical covariates, we demonstrate residual training and adjustment of metabolite features as an easily applicable approach to ensure that AutoML adjusts for potential confounding characteristics. Finally, we present increased homocysteine with long-term exposure to metformin as a potentially novel, non-replicated metabolite association suggested by TPOT; this association was not identified in parallel clinical metabolic profiling endeavors. While certain considerations are recommended, including adjustment approaches for clinical confounders, AutoML presents an exciting tool to enhance clinical metabolic profiling and advance translational research endeavors.
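The residual adjustment of metabolite features mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not specify the exact procedure, so the following assumes a plain ordinary least squares regression of each metabolite on the clinical covariates, with the residuals kept as adjusted features. The function name `residualize`, the covariates (age, BMI), and the synthetic data are all hypothetical and chosen only for illustration.

```python
import numpy as np

def residualize(features, covariates):
    """Remove the linear effect of covariates from each feature column.

    features:   (n_samples, n_metabolites) array of metabolite levels
    covariates: (n_samples, n_covariates) array of potential confounders
    Assumption: a linear OLS adjustment; the study may have used another model.
    """
    features = np.asarray(features, dtype=float)
    covariates = np.asarray(covariates, dtype=float)
    # Design matrix: intercept column followed by the covariates
    design = np.column_stack([np.ones(len(covariates)), covariates])
    # One least squares fit covers all metabolite columns at once
    coef, *_ = np.linalg.lstsq(design, features, rcond=None)
    # Residuals are the part of each metabolite not explained by the covariates
    return features - design @ coef

# Synthetic illustration only (not data from the study)
rng = np.random.default_rng(0)
age = rng.normal(60, 10, size=200)
bmi = rng.normal(28, 4, size=200)
covariates = np.column_stack([age, bmi])
metabolites = np.column_stack([
    0.05 * age + rng.normal(size=200),  # metabolite confounded by age
    rng.normal(size=200),               # metabolite unrelated to covariates
])
adjusted = residualize(metabolites, covariates)
```

The adjusted matrix would then stand in for the raw metabolite features when running the AutoML search. In a real analysis the coefficients would typically be estimated on the training split only and then applied to held-out samples, to avoid information leaking across the split.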




Causal inference in the context of an error prone exposure: air pollution and mortality

We propose a new approach for estimating causal effects when the exposur...

A Causal Exposure Response Function with Local Adjustment for Confounding

In the last two decades, ambient levels of air pollution have declined s...

A potential outcomes approach to selection bias

Selection bias occurs when the association between exposure and disease ...

Sensitivity Analysis for Unmeasured Confounding via Effect Extrapolation

Inferring the causal effect of a non-randomly assigned exposure on an ou...

"The Human Body is a Black Box": Supporting Clinical Decision-Making with Deep Learning

Machine learning technologies are increasingly developed for use in heal...

Net benefit separation and the determination curve: a probabilistic framework for cost-effectiveness estimation

Considerations regarding clinical effectiveness and cost are essential i...

RPBA – Robust Parallel Bundle Adjustment Based on Covariance Information

A core component of all Structure from Motion (SfM) approaches is bundle...
