Jihad S Obeid, Ali Khalifa, Brandon Xavier, Halim Bou-Daher, Don C Rockey
Department of Public Health Sciences and Division of Gastroenterology and Hepatology, Medical University of South Carolina; Digestive Disease Research Center, Medical University of South Carolina, Charleston, SC.
Abstract
GOAL: The goal of this study was to evaluate an artificial intelligence approach, namely deep learning, applied to clinical text in electronic health records (EHRs) to identify patients with cirrhosis.
BACKGROUND AND AIMS: Accurate identification of cirrhosis in the EHR is important for epidemiological, health services, and outcomes research. Currently, such efforts depend on International Classification of Diseases (ICD) codes, with limited success.
MATERIALS AND METHODS: We trained several machine learning models using discharge summaries from patients with known cirrhosis drawn from a patient registry and from random controls who, based on ICD codes, had neither cirrhosis nor its complications. Models were validated on a gold standard test set of patients whose discharge summaries had been manually reviewed. We tested Naive Bayes and Random Forest as baseline models, and a deep learning model using word embedding and a convolutional neural network (CNN).
RESULTS: The training set included 446 cirrhosis patients and 689 controls, while the gold standard test set included 139 cirrhosis patients and 152 controls. Among the machine learning models, the CNN achieved the highest area under the receiver operating characteristic curve (AUC; 0.993), with a precision of 0.965 and a recall of 0.978. The Naive Bayes and Random Forest baselines achieved AUCs of 0.879 and 0.981, precisions of 0.787 and 0.958, and recalls of 0.878 and 0.827, respectively. Identification by ICD codes for cirrhosis had a precision of 0.883 and a recall of 0.978.
CONCLUSIONS: A CNN model trained on discharge summaries identified cirrhosis patients with high precision and recall. This approach for phenotyping cirrhosis in the EHR may provide a more accurate assessment of disease burden in a variety of studies.
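To make the reported precision and recall figures concrete, the sketch below computes both metrics from confusion-matrix counts. The abstract does not report true-positive or false-positive counts; the counts used here are hypothetical values back-calculated to be consistent with the rounded figures and the 139 cirrhosis patients in the test set, and the function names are illustrative only.

```python
def precision_recall(tp, fp, fn):
    """Return (precision, recall) from confusion-matrix counts."""
    precision = tp / (tp + fp)  # share of flagged patients who truly have cirrhosis
    recall = tp / (tp + fn)     # share of cirrhosis patients who were flagged
    return precision, recall


# Reported in the abstract: 139 cirrhosis patients in the gold standard test set.
N_CIRRHOSIS = 139

# ASSUMPTION: (true positives, false positives) per method. These counts are NOT
# reported in the abstract; they are back-calculated so that the rounded metrics
# match the published precision and recall values.
ASSUMED_COUNTS = {
    "CNN": (136, 5),
    "Naive Bayes": (122, 33),
    "Random Forest": (115, 5),
    "ICD codes": (136, 18),
}


def summarize(counts, n_positive):
    """Map each method name to (precision, recall), rounded to 3 decimals."""
    rows = {}
    for name, (tp, fp) in counts.items():
        fn = n_positive - tp  # cirrhosis patients the method missed
        p, r = precision_recall(tp, fp, fn)
        rows[name] = (round(p, 3), round(r, 3))
    return rows


if __name__ == "__main__":
    for name, (p, r) in summarize(ASSUMED_COUNTS, N_CIRRHOSIS).items():
        print(f"{name:14s} precision={p:.3f} recall={r:.3f}")
```

Under these assumed counts, the CNN and ICD codes would miss the same number of cirrhosis patients (equal recall), and the CNN's higher precision would come from flagging fewer controls in error.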