OBJECTIVE: To identify patients with heart failure (HF) by using language contained in the electronic medical record (EMR). METHODS: We validated 2 methods of identifying HF through the EMR, which offers transcription of clinical notes within 24 hours of the encounter. The first method was natural language processing (NLP) of the EMR text. The second method was predictive modeling based on machine learning, using the text of clinical reports. Natural language processing was compared with both manual record review and billing records. Predictive modeling was compared with manual record review. RESULTS: Natural language processing identified 2904 HF cases; billing records independently identified 1684 HF cases, 252 (15%) of them not identified by NLP. Review of a random sample of these 252 cases did not identify HF, yielding 100% sensitivity (95% confidence interval [CI] = 86, 100) and 97.8% specificity (95% CI = 97.7, 97.9) for NLP. Manual review confirmed 1107 of the 2904 cases identified by NLP, yielding a positive predictive value (PPV) of 38% (95% CI = 36, 40). Predictive modeling yielded a PPV of 82% (95% CI = 73, 93), 56% sensitivity (95% CI = 46, 67), and 96% specificity (95% CI = 94, 99). CONCLUSIONS: The EMR can be used to identify HF via 2 complementary approaches. Natural language processing may be more suitable for studies requiring the highest sensitivity, whereas predictive modeling may be more suitable for studies requiring higher PPV.
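The PPV arithmetic in the RESULTS can be checked from the reported counts. The sketch below is illustrative only: the abstract does not state how its confidence intervals were computed, so the normal-approximation (Wald) interval and the helper name `proportion_with_ci` are assumptions, not the authors' method.

```python
import math


def proportion_with_ci(successes, total, z=1.96):
    """Return (proportion, lower, upper) for a binomial proportion.

    Uses a normal-approximation (Wald) interval. This is an assumed
    method; the abstract does not say which interval was used.
    """
    p = successes / total
    half_width = z * math.sqrt(p * (1 - p) / total)
    # Clamp to [0, 1] so the bounds stay valid proportions.
    return p, max(0.0, p - half_width), min(1.0, p + half_width)


# Counts reported in the abstract: 1107 of 2904 NLP-identified cases
# were confirmed by manual review.
ppv, lower, upper = proportion_with_ci(1107, 2904)
print(f"NLP PPV: {ppv:.0%} (95% CI {lower:.0%}, {upper:.0%})")

# Share of the 1684 billing-identified cases that NLP did not flag.
missed_share = 252 / 1684
print(f"Billing cases not identified by NLP: {missed_share:.0%}")
```

Under this assumption the counts give a PPV of about 38% with an interval of roughly 36 to 40, and a 15% share of billing cases not identified by NLP, consistent with the reported figures. The sensitivity and specificity intervals cannot be checked the same way because the abstract does not report their denominators.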
Authors: Jennifer H Garvin; Scott L DuVall; Brett R South; Bruce E Bray; Daniel Bolton; Julia Heavirland; Steve Pickard; Paul Heidenreich; Shuying Shen; Charlene Weir; Matthew Samore; Mary K Goldstein Journal: J Am Med Inform Assoc Date: 2012-03-21 Impact factor: 4.497
Authors: Henk Harkema; Wendy W Chapman; Melissa Saul; Evan S Dellon; Robert E Schoen; Ateev Mehrotra Journal: J Am Med Inform Assoc Date: 2011-09-21 Impact factor: 4.497
Authors: Jonathan S Schildcrout; Melissa A Basford; Jill M Pulley; Daniel R Masys; Dan M Roden; Deede Wang; Christopher G Chute; Iftikhar J Kullo; David Carrell; Peggy Peissig; Abel Kho; Joshua C Denny Journal: J Biomed Inform Date: 2010-08-03 Impact factor: 6.317