Elsie Gyang Ross (1,2), Kenneth Jung (2), Joel T Dudley (3), Li Li (3,4), Nicholas J Leeper (1), Nigam H Shah (2).
1. Division of Vascular Surgery (E.G.R., N.J.L.), Stanford University School of Medicine, Stanford, CA.
2. Center for Biomedical Informatics Research (K.J., N.H.S., E.G.R.), Stanford University School of Medicine, Stanford, CA.
3. Icahn School of Medicine at Mount Sinai, New York, NY (J.T.D., L.L.).
4. Sema4, a Mount Sinai Venture, Stamford, CT (L.L.).
Abstract
BACKGROUND: Patients with peripheral artery disease (PAD) are at risk of major adverse cardiac and cerebrovascular events. There are no readily available risk scores that can accurately identify which patients are most likely to sustain an event, making it difficult to identify those who might benefit from more aggressive intervention. Thus, we aimed to develop a novel predictive model, using machine learning methods on electronic health record data, to identify which PAD patients are most likely to develop major adverse cardiac and cerebrovascular events.
METHODS AND RESULTS: Data were derived from patients diagnosed with PAD at 2 tertiary care institutions. Predictive models were built using a common data model that allowed the use of both structured (coded) and unstructured (text) data. Only data from time of entry into the health system up to PAD diagnosis were used for modeling. Models were developed and tested using nested cross-validation. A total of 7686 patients were included in learning our predictive models. Using almost 1000 variables, our best predictive model identified which PAD patients would go on to develop major adverse cardiac and cerebrovascular events with an area under the curve of 0.81 (95% CI, 0.80-0.83).
CONCLUSIONS: Machine learning algorithms applied to data in the electronic health record can learn models that accurately identify PAD patients at risk of future major adverse cardiac and cerebrovascular events, highlighting the potential of electronic health records to provide automated risk stratification for cardiovascular diseases. Common data models that enable cross-institution research and technology development could be an important factor in the widespread adoption of newer risk-stratification models.
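The abstract states that models were developed and tested with nested cross-validation and scored by area under the curve. As a rough illustration of that general technique only, the sketch below tunes a hyperparameter on inner folds and estimates the AUC on outer folds that were never used for tuning. Everything specific here is an assumption made for the example and does not come from the study: the toy nearest-neighbour scorer, the synthetic one-feature data, the fold counts, the hyperparameter grid, and all function names.

```python
import random

def auc(labels, scores):
    """Area under the ROC curve by pairwise comparison (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def k_folds(indices, k):
    """Split a list of indices into k disjoint folds (round-robin)."""
    return [indices[i::k] for i in range(k)]

def knn_score(train, x, k):
    """Toy model (assumption): share of positives among the k nearest rows."""
    nearest = sorted(train, key=lambda row: abs(row[0] - x))[:k]
    return sum(y for _, y in nearest) / len(nearest)

def evaluate(train, test, k):
    """Fit on train (trivially, for k-NN) and return the AUC on test."""
    scores = [knn_score(train, x, k) for x, _ in test]
    return auc([y for _, y in test], scores)

def nested_cv(data, outer_k=5, inner_k=3, grid=(1, 5, 15), seed=0):
    """Return one held-out AUC per outer fold; tuning uses inner folds only."""
    idx = list(range(len(data)))
    random.Random(seed).shuffle(idx)
    outer = k_folds(idx, outer_k)
    outer_aucs = []
    for i, test_idx in enumerate(outer):
        train_idx = [j for f, fold in enumerate(outer) if f != i for j in fold]
        inner = k_folds(train_idx, inner_k)

        def inner_auc(k):
            # Mean validation AUC across inner folds for one candidate k.
            total = 0.0
            for v, val_idx in enumerate(inner):
                fit_idx = [j for f, fold in enumerate(inner) if f != v for j in fold]
                total += evaluate([data[j] for j in fit_idx],
                                  [data[j] for j in val_idx], k)
            return total / inner_k

        best_k = max(grid, key=inner_auc)
        # The outer test fold is touched only here, after tuning is finished.
        outer_aucs.append(evaluate([data[j] for j in train_idx],
                                   [data[j] for j in test_idx], best_k))
    return outer_aucs

# Synthetic data (assumption): one feature, classes shifted by 2 units.
rng = random.Random(42)
data = [(rng.gauss(2.0 * (i % 2), 1.0), i % 2) for i in range(300)]
aucs = nested_cv(data)
print("mean outer-fold AUC:", sum(aucs) / len(aucs))
```

Keeping hyperparameter selection inside the inner loop means the outer-fold AUC is not inflated by tuning on the same data used for evaluation, which is the usual motivation for the nested design.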