Henry Völzke*, Glenn Fung*, Till Ittermann, Shipeng Yu, Sebastian E Baumeister, Marcus Dörr, Wolfgang Lieb, Uwe Völker, Allan Linneberg, Torben Jørgensen, Stephan B Felix, Rainer Rettig, Bharat Rao, Heyo K Kroemer.
Affiliations: (a) Institute for Community Medicine, Ernst Moritz Arndt University, Greifswald, Germany; (b) Siemens Healthcare, Malvern, Pennsylvania, USA; (c) Clinic of Internal Medicine B, Ernst Moritz Arndt University, Greifswald; (d) Institute of Epidemiology, Christian Albrechts University, Kiel; (e) Interfaculty Institute of Functional Genomics, Ernst Moritz Arndt University, Greifswald, Germany; (f) Research Centre for Prevention and Health, Glostrup University Hospital, Glostrup; (g) Faculty of Health Sciences, University of Copenhagen, Copenhagen, Denmark; (h) Institute of Physiology, University Medicine, Ernst Moritz Arndt University, Greifswald; (i) University Medical Center, Göttingen, Germany.
*Henry Völzke and Glenn Fung contributed equally to the writing of this article.
Abstract
OBJECTIVE: Data mining offers an alternative approach to identifying new predictors of multifactorial diseases. This work aimed to build an accurate predictive model for incident hypertension using data mining procedures.

METHODS: The primary study population consisted of 1605 normotensive individuals aged 20-79 years with 5-year follow-up from the population-based Study of Health in Pomerania (SHIP). This initial set was randomly split into a training set and a validation set. We used a probabilistic graphical model, a Bayesian network, to create a predictive model for incident hypertension and compared its predictive performance with that of the established Framingham risk score for hypertension. Finally, the model was validated in 2887 participants from INTER99, a Danish community-based intervention study.

RESULTS: In the SHIP training set, the Bayesian network used a small subset of relevant baseline features: age, mean arterial pressure, rs16998073, serum glucose and urinary albumin concentrations. Furthermore, we detected relevant interactions between age and serum glucose as well as between rs16998073 and urinary albumin concentrations (area under the receiver operating characteristic curve [AUC], 0.76). The model was confirmed in the SHIP validation set (AUC 0.78) and externally replicated in INTER99 (AUC 0.77). Compared with the established Framingham risk score for hypertension, the predictive performance of the new model was similar in the SHIP validation set and moderately better in INTER99.

CONCLUSION: Data mining procedures identified a predictive model for incident hypertension that included innovative and easy-to-measure variables. The findings suggest broad applicability in screening settings and clinical practice.
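The evaluation design described in the abstract (a random split into training and validation data, then scoring by AUC) can be illustrated with a minimal sketch. This is not the authors' code: the function names, the `test_fraction` default, and the toy scores are assumptions made for illustration, since the abstract reports neither the split ratio nor the software used. The AUC is computed here through its pairwise interpretation, the probability that a randomly chosen incident case receives a higher predicted risk than a randomly chosen non-case.

```python
import random


def train_test_split(records, test_fraction=0.5, seed=0):
    """Randomly split records into (train, test).

    test_fraction is a hypothetical default; the abstract does not
    state the ratio used for the SHIP data.
    """
    rng = random.Random(seed)
    shuffled = list(records)
    rng.shuffle(shuffled)
    n_test = int(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


def auc(scores, labels):
    """Area under the ROC curve via pairwise comparison.

    scores: predicted risks (higher means more likely to develop hypertension)
    labels: 1 for incident hypertension, 0 otherwise
    Ties between a case and a non-case count as half a win.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one case and one non-case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    # Toy predicted risks and outcomes, invented for illustration only.
    toy_scores = [0.9, 0.8, 0.3, 0.2]
    toy_labels = [1, 0, 1, 0]
    print(auc(toy_scores, toy_labels))
```

Comparing two models, such as a new model against an established risk score, then amounts to calling `auc` with each model's predicted risks on the same held-out participants. The pairwise loop is quadratic in the number of participants, which is acceptable for cohorts of a few thousand; a rank-based formulation would scale better.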