Machine learning reveals how metabolite profiles predict aging and health

Metabolite data and AI combine to redefine how we measure aging and predict life expectancy.

Study: Metabolomic Age (MileAge) Predicts Health and Lifespan: Comparison of Multiple Machine Learning Algorithms. Image credit: Sergei Tarasov/Shutterstock

In a recent study published in the journal Science Advances, researchers from King’s College London explored metabolomic aging clocks using machine learning models trained on plasma metabolite data from the UK Biobank. The study assessed how well these clocks predict health outcomes and lifespan by evaluating their accuracy, robustness, and relevance to biological indicators of aging beyond chronological age.

Background

Biological aging, as distinct from chronological age, reflects molecular and cellular damage that influences health and susceptibility to disease. Chronological age alone cannot account for the variability of physiological states linked to aging in individuals. However, recent advances in omics technologies, particularly metabolomics, have provided insight into biological aging through molecular profiling.

Metabolites, the small molecules produced by metabolic pathways, can provide assessments of physiological health and are linked to aging-related outcomes such as chronic disease and mortality. Previous studies have correlated metabolomics data with aging but were constrained by small sample sizes and limited numbers of markers.

Recent efforts to derive “aging clocks” from omics data using machine learning have demonstrated significant predictive power for health outcomes. However, challenges remain in optimizing the accuracy and interpretability of these models, particularly for metabolomics.

The current study

The current study used nuclear magnetic resonance (NMR) spectroscopy to analyze plasma metabolite data from the UK Biobank, involving 225,212 participants aged 37 to 73 years. Exclusion criteria included pregnancy, data inconsistencies, and extreme metabolite values. The dataset included 168 metabolites representing lipid profiles, amino acids, and glycolysis products.

The researchers applied 17 machine learning algorithms, including linear regression, tree models, and ensemble techniques, to the dataset to develop metabolomic aging clocks. They also used a rigorous nested cross-validation approach to ensure robust model evaluation.
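As a rough illustration of what nested cross-validation means in practice, the sketch below generates the index splits only. The fold counts (five outer, three inner), the helper names, and the toy sample size are assumptions made for illustration; the article does not state the settings used in the study. The point it shows is that tuning happens on inner folds drawn solely from the outer training data, so each outer test fold stays untouched until the final evaluation.

```python
import random


def k_fold_indices(indices, k, seed=0):
    """Shuffle the indices and split them into k nearly equal folds."""
    shuffled = list(indices)
    random.Random(seed).shuffle(shuffled)
    return [shuffled[i::k] for i in range(k)]


def nested_cv_splits(n_samples, outer_k=5, inner_k=3, seed=0):
    """Yield (inner_splits, outer_train, outer_test) for each outer fold.

    inner_splits is a list of (inner_train, inner_val) pairs built only
    from outer_train, so the outer test fold never influences tuning.
    Fold counts are illustrative assumptions, not the study's settings.
    """
    outer_folds = k_fold_indices(range(n_samples), outer_k, seed)
    for i, outer_test in enumerate(outer_folds):
        outer_train = [
            idx for j, fold in enumerate(outer_folds) if j != i for idx in fold
        ]
        inner_folds = k_fold_indices(outer_train, inner_k, seed + 1 + i)
        inner_splits = []
        for m, inner_val in enumerate(inner_folds):
            inner_train = [
                idx for n, fold in enumerate(inner_folds) if n != m for idx in fold
            ]
            inner_splits.append((inner_train, inner_val))
        yield inner_splits, outer_train, outer_test


# Toy run on 20 hypothetical participants
splits = list(nested_cv_splits(n_samples=20, outer_k=5, inner_k=3))
```

In a full pipeline, each candidate model would be tuned on the inner splits, refit on the outer training data, and scored once on the outer test fold.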

Key preprocessing steps included handling metabolite outliers and correcting the age-prediction biases inherent in the models. The models estimated chronological age from metabolite profiles, and the difference between predicted and actual age was defined as the “MileAge delta”. Statistical corrections were applied to remove systematic biases and improve prediction accuracy, particularly for younger and older age groups.
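To make the delta and the bias correction concrete, here is a minimal sketch using made-up ages. It assumes one correction that is common in the aging-clock literature: regress the raw delta on chronological age and keep the residual. The article does not say which correction the study applied, so this should be read as an illustration of the idea rather than the authors' procedure. The function and variable names are hypothetical.

```python
def fit_line(x, y):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


# Toy values (hypothetical, not study data): the model overestimates
# age in younger people and underestimates it in older people.
chronological_age = [40.0, 50.0, 60.0, 70.0]
predicted_age = [47.0, 52.0, 59.0, 64.0]

# Raw delta: predicted age minus chronological age
raw_delta = [p - a for p, a in zip(predicted_age, chronological_age)]

# Assumed correction: remove the linear dependence of the delta on age
intercept, slope = fit_line(chronological_age, raw_delta)
corrected_delta = [
    d - (intercept + slope * a) for d, a in zip(raw_delta, chronological_age)
]
```

In this toy data the raw delta falls as age rises, which is the kind of systematic bias the correction targets. After the adjustment, the delta no longer trends with chronological age.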

The models were evaluated for their predictive accuracy using metrics such as mean absolute error (MAE), root mean square error (RMSE), and correlation coefficients. For example, the cubist regression model achieved an MAE of 5.31 years, outperforming other models like multivariate adaptive regression splines (MAE = 6.36 years). Further analysis adjusted the predictions to remove systematic biases and improve their alignment with chronological age.
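For readers unfamiliar with these metrics, the following sketch computes all three on a handful of made-up ages. The values are hypothetical and bear no relation to the study's data; the function names are chosen here for illustration.

```python
import math


def mean_absolute_error(actual, predicted):
    """Average size of the prediction errors, in years."""
    return sum(abs(p - a) for a, p in zip(actual, predicted)) / len(actual)


def root_mean_square_error(actual, predicted):
    """Like MAE, but squaring penalizes large errors more heavily."""
    return math.sqrt(
        sum((p - a) ** 2 for a, p in zip(actual, predicted)) / len(actual)
    )


def pearson_r(x, y):
    """Pearson correlation between two equally long sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    return sxy / math.sqrt(sxx * syy)


# Toy values (hypothetical, not study data)
actual_age = [40.0, 50.0, 60.0, 70.0]
predicted_age = [47.0, 52.0, 59.0, 64.0]

mae = mean_absolute_error(actual_age, predicted_age)
rmse = root_mean_square_error(actual_age, predicted_age)
r = pearson_r(actual_age, predicted_age)
```

Note that a high correlation can coexist with sizeable errors: the toy predictions track age closely in rank order while still missing by several years at the extremes, which is why the metrics are reported together.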

Study design and overview. (A) Overview of the nested cross-validation approach. MAE, mean absolute error; RMSE, root mean square error. (B) Histogram of the chronological age distribution of the analytical sample. The statistical mode (age, 61 years) is displayed in red. (C) Distribution of metabolite levels by chronological age, showing scatterplots of all observations and smooth curves (note the difference in y-axis scale). Smooth curves were estimated using generalized additive models, with shaded areas corresponding to 95% confidence intervals (CI). GlycA, acetylated glycoproteins. (D) Scatterplot showing the hazard ratio (HR) for all-cause mortality and beta for chronological age associated with a one standard deviation difference in metabolite levels. Metabolites that had statistically significant associations with chronological age and all-cause mortality are shown in purple.

Results

The results indicated that metabolomic aging clocks developed from plasma metabolite profiles could effectively differentiate biological from chronological aging. Among the different models tested in the study, the cubist rule-based regression model provided the strongest predictive associations with health markers and mortality and outperformed other algorithms in terms of accuracy and robustness.

Additionally, positive MileAge delta values, which indicated accelerated aging, were linked to frailty, shorter telomeres, higher morbidity and increased mortality risk. Specifically, a one-year increase in MileAge delta corresponded to a 4% increase in all-cause mortality risk, with hazard ratios (HRs) exceeding 1.5 in extreme cases.
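The per-year figure can be put in perspective with some simple arithmetic. The sketch below assumes the 4% increase corresponds to a hazard ratio of 1.04 per year of MileAge delta and that the effect is log-linear, as in a standard proportional hazards model, so that ratios multiply across years. Both points are assumptions made for illustration; the article does not describe how the study's reported hazard ratios above 1.5 were derived.

```python
import math

# Assumed per-year hazard ratio, taken from the 4% figure quoted above
HR_PER_YEAR = 1.04


def hazard_ratio(delta_years, hr_per_year=HR_PER_YEAR):
    """Hazard ratio implied by a delta of delta_years, assuming the
    effect compounds multiplicatively (log-linear in the delta)."""
    return hr_per_year ** delta_years


def years_to_reach(target_hr, hr_per_year=HR_PER_YEAR):
    """Delta, in years, at which the implied hazard ratio equals target_hr."""
    return math.log(target_hr) / math.log(hr_per_year)


hr_at_5_years = hazard_ratio(5)
hr_at_10_years = hazard_ratio(10)
delta_for_hr_1_5 = years_to_reach(1.5)
```

Under these assumptions, a metabolomic age roughly a decade above chronological age would imply a hazard ratio close to 1.5.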

The study also showed that people with accelerated aging were more likely to report poorer health and to have chronic diseases. Associations with frailty and telomere attrition were particularly pronounced, with some differences amounting to an 18-year disparity in frailty index scores. Interestingly, women had slightly higher MileAge deltas than men in most models.

The study also confirmed the nonlinear nature of metabolite-age relationships and highlighted the usefulness of statistical corrections for improving prediction accuracy. Furthermore, comparison with existing aging markers showed that metabolomic aging clocks captured unique health-related signals and often outperformed simpler predictors. However, decelerated aging (negative MileAge deltas) did not consistently translate into better health outcomes, underscoring the complexity of measuring biological aging.

Conclusions

Overall, the study demonstrated the utility of metabolomic aging clocks in predicting biological aging and associated health outcomes. Across the machine learning algorithms compared, the cubist rule-based model performed best in relating metabolite-derived ages to health markers and mortality.

The results suggest that metabolomic aging clocks hold potential for proactive health management and risk stratification and highlight the need for further validation across diverse populations and longitudinal data for broader clinical application. This study sets a new benchmark for algorithm development, illustrating how metabolomic profiles can offer actionable insights into aging and health.
