Matthias Unterhuber (1), Karl-Patrik Kresoja (1), Karl-Philipp Rommel (1), Christian Besler (1), Andrea Baragetti (2), Nora Klöting (3), Uta Ceglarek (4), Matthias Blüher (3), Markus Scholz (5), Alberico L Catapano (2), Holger Thiele (1), Philipp Lurz (6).
1. Department of Cardiology, Heart Center Leipzig at University Leipzig, Leipzig, Germany.
2. Department of Pharmacological and Biomolecular Sciences, University of Milan, and I.R.C.C.S MultiMedica, Milan, Italy.
3. Medical Department III - Endocrinology, Nephrology, Rheumatology, University of Leipzig Medical Center, Leipzig, Germany; Helmholtz Institute for Metabolic, Obesity and Vascular Research (HI-MAG) of the Helmholtz, Leipzig, Germany.
4. Institute of Laboratory Medicine, Clinical Chemistry and Molecular Diagnostics, Leipzig University, Leipzig, Germany.
5. Institute for Medical Informatics, Statistics and Epidemiology, Medical Faculty, University of Leipzig, Leipzig, Germany; LIFE Research Center of Civilization Diseases, Leipzig, Germany.
6. Department of Cardiology, Heart Center Leipzig at University Leipzig, Leipzig, Germany. Electronic address: Philipp.Lurz@medizin.uni-leipzig.de.
Abstract
BACKGROUND: Individualized risk prediction represents a prerequisite for providing personalized medicine.
OBJECTIVES: This study compared proteomics-enabled machine-learning (ML) algorithms with classical and clinical risk prediction methods for all-cause mortality in cohorts of patients with cardiovascular risk factors in the LIFE-Heart Study, followed by validation in the PLIC (Progressione della Lesione Intimale Carotidea) study.
METHODS: Using the OLINK-Cardiovascular-II panel, 92 proteins were measured in a cohort of 1,998 individuals from the LIFE-Heart Study (derivation) and 772 subjects from the PLIC cohort (external validation). We constructed protein-based mortality prediction models using eXtreme Gradient Boosting (XGBoost) and a neural network, comparing the prediction performance with classical clinical risk scores (Systemic Coronary Risk Evaluation, Framingham), logistic and Cox regression models.
RESULTS: All-cause mortality occurred in 156 (8%) patients in the internal validation and 68 (9%) patients in the external validation cohort, within a median follow-up of 10 and 11 years, respectively. On internal and external validation, the Framingham Risk Score achieved areas under the curve (AUCs) of 0.64 (95% CI: 0.59-0.68) and 0.65 (95% CI: 0.58-0.74), logistic regression AUCs of 0.65 (95% CI: 0.57-0.73) and 0.67 (95% CI: 0.59-0.74), Cox regression AUCs of 0.55 (95% CI: 0.51-0.59) and 0.65 (95% CI: 0.57-0.73), the XGBoost classifier AUCs of 0.83 (95% CI: 0.79-0.87) and 0.91 (95% CI: 0.86-0.95), the XGBoost survival estimator AUCs of 0.83 (95% CI: 0.79-0.87) and 0.93 (95% CI: 0.88-0.97), and the neural network AUCs of 0.87 (95% CI: 0.83-0.91) and 0.94 (95% CI: 0.90-0.98), respectively (modern vs classical ML: P < 0.001).
CONCLUSIONS: ML-driven multiprotein risk models outperform classical regression models and clinical scores for prediction of all-cause mortality in patients at increased cardiovascular risk.
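For readers unfamiliar with the metric reported above, the following is a minimal sketch of how an AUC and a 95% confidence interval can be computed for a binary outcome such as all-cause mortality. The abstract does not state how the intervals were derived; the percentile bootstrap used here is an assumption, and the function names `auc` and `bootstrap_auc_ci` are hypothetical and not taken from the study.

```python
import random


def auc(labels, scores):
    """AUC as the probability that a randomly chosen positive case
    receives a higher score than a randomly chosen negative case
    (ties count as one half)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("both outcome classes are required")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def bootstrap_auc_ci(labels, scores, n_boot=1000, alpha=0.05, seed=0):
    """Point estimate plus a percentile-bootstrap confidence interval.
    ASSUMPTION: the study's actual CI method is not given in the abstract."""
    point = auc(labels, scores)  # also validates that both classes exist
    rng = random.Random(seed)
    n = len(labels)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in idx]
        if len(set(ys)) < 2:
            continue  # skip resamples that contain only one class
        stats.append(auc(ys, [scores[i] for i in idx]))
    stats.sort()
    lower = stats[int((alpha / 2) * n_boot)]
    upper = stats[min(n_boot - 1, int((1 - alpha / 2) * n_boot))]
    return point, lower, upper


if __name__ == "__main__":
    # Synthetic illustration only; these are not study data.
    rng = random.Random(1)
    y = [1 if rng.random() < 0.1 else 0 for _ in range(300)]
    risk = [rng.gauss(1.0 if yi else 0.0, 1.0) for yi in y]
    print(bootstrap_auc_ci(y, risk, n_boot=200))
```

The pairwise formulation is quadratic in cohort size and is chosen here for readability; a rank-based computation gives the same value more efficiently on cohorts of the size described above.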