Alison M Uyeda, J Randall Curtis, Ruth A Engelberg, Lyndia C Brumback, Yue Guo, James Sibley, William B Lober, Trevor Cohen, Janaki Torrence, Joanna Heywood, Sudiptho R Paul, Erin K Kross, Robert Y Lee. Affiliations (all at the University of Washington, Seattle, WA): Department of Medicine (A.M.U., J.R.C., R.A.E., J.T., J.H., S.R.P., E.K.K., R.Y.L.); Cambia Palliative Care Center of Excellence at UW Medicine (all authors); Division of Pulmonary, Critical Care, and Sleep Medicine, Department of Medicine, Harborview Medical Center (J.R.C., R.A.E., J.T., J.H., S.R.P., E.K.K., R.Y.L.); Department of Biostatistics (L.C.B.); Department of Biomedical Informatics and Medical Education (Y.G., W.B.L., T.C.); Department of Biobehavioral Nursing and Health Informatics (J.R.C., J.S., W.B.L.). Electronic address (J.R.C.): jrc@u.washington.edu.
Abstract
CONTEXT: Documented goals-of-care discussions are an important quality metric for patients with serious illness. Natural language processing (NLP) is a promising approach for identifying goals-of-care discussions in the electronic health record (EHR).
OBJECTIVES: To compare three NLP modeling approaches for identifying EHR documentation of goals-of-care discussions and to generate hypotheses about differences in their performance.
METHODS: We conducted a mixed-methods study evaluating performance and misclassification for three NLP featurization approaches modeled with regularized logistic regression: bag-of-words (BOW), rule-based, and a hybrid of the two. From a prospective cohort of 150 patients hospitalized with serious illness between 2018 and 2020, we collected 4391 inpatient EHR notes, of which 99 (2.3%) contained documented goals-of-care discussions. We used leave-one-out cross-validation to estimate performance, comparing pooled NLP predictions to human abstraction with receiver-operating-characteristic (ROC) and precision-recall (PR) analyses. We then qualitatively examined a purposive sample of 70 NLP-misclassified notes, using content analysis to identify linguistic features associated with misclassification and to generate hypotheses about its causes.
RESULTS: All three modeling approaches discriminated between notes with and without goals-of-care discussions (AUCROC: BOW, 0.907; rule-based, 0.948; hybrid, 0.965). Precision and recall were only moderate (precision at 70% recall: BOW, 16.2%; rule-based, 50.4%; hybrid, 49.3%; AUCPR: BOW, 0.505; rule-based, 0.579; hybrid, 0.599). Qualitative analysis revealed patterns underlying the performance differences between the BOW and rule-based approaches.
CONCLUSION: NLP holds promise for identifying EHR-documented goals-of-care discussions. However, the rarity of goals-of-care content in EHR data limits performance. Our findings highlight opportunities to optimize NLP modeling and support further exploration of alternative approaches for identifying goals-of-care discussions.
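To make the abstract's bag-of-words pipeline concrete, the following is a minimal sketch, not the authors' code: unigram count featurization, L2-regularized logistic regression, leave-one-out cross-validation with pooled out-of-fold predictions, and ROC/PR summaries computed with scikit-learn. The toy notes, labels, and hyperparameters (e.g., CountVectorizer, C=1.0) are illustrative assumptions; the abstract does not specify the study's exact featurization or regularization settings, and the study's cross-validation may have been performed at the patient rather than the note level.

```python
import numpy as np
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import average_precision_score, roc_auc_score
from sklearn.model_selection import LeaveOneOut

# Toy notes invented for illustration; label 1 = documented goals-of-care discussion.
notes = [
    "Family meeting held to discuss prognosis and goals of care.",
    "Discussed goals of care; patient prefers comfort-focused measures.",
    "Patient tolerated the procedure well. No acute events overnight.",
    "Continue diuresis; recheck electrolytes in the morning.",
    "Ambulating with physical therapy; discharge planning in progress.",
    "Plan: titrate oxygen, repeat chest radiograph tomorrow.",
]
labels = np.array([1, 1, 0, 0, 0, 0])

# Pool out-of-sample predicted probabilities across leave-one-out folds,
# refitting the vectorizer inside each fold to avoid information leakage.
pooled = np.zeros(len(notes))
for train_idx, test_idx in LeaveOneOut().split(notes):
    vectorizer = CountVectorizer(lowercase=True)
    X_train = vectorizer.fit_transform([notes[i] for i in train_idx])
    X_test = vectorizer.transform([notes[i] for i in test_idx])
    clf = LogisticRegression(penalty="l2", C=1.0, max_iter=1000)
    clf.fit(X_train, labels[train_idx])
    pooled[test_idx] = clf.predict_proba(X_test)[:, 1]

print("AUCROC:", roc_auc_score(labels, pooled))
print("AUCPR (average precision):", average_precision_score(labels, pooled))
```

The PR analysis matters here because, as the abstract notes, positive notes are rare (2.3%); under such class imbalance a model can achieve a high AUCROC while precision at a clinically useful recall remains modest.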
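The abstract does not describe the rule-based or hybrid featurizations in detail. As a hedged illustration, the sketch below constructs binary rule features from invented regular-expression patterns (stand-ins, not the study's actual rules); a hybrid model could concatenate such indicators with the bag-of-words matrix before fitting the same regularized logistic regression.

```python
import re

# Invented stand-in patterns; the abstract does not specify the study's rules.
GOC_PATTERNS = [
    r"goals?\s+of\s+care",
    r"family\s+meeting",
    r"comfort[-\s]+(?:focused|care|measures)",
    r"code\s+status",
]

def rule_features(note):
    # One binary indicator per pattern: 1 if the note matches it, else 0.
    return [int(bool(re.search(p, note, flags=re.IGNORECASE))) for p in GOC_PATTERNS]

print(rule_features("Held a family meeting to readdress goals of care."))  # [1, 1, 0, 0]

# For the hybrid approach, these indicators could be appended to the BOW
# features, e.g. scipy.sparse.hstack([X_train, rule_matrix]) for sparse data.
```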
Journal: J Pain Symptom Manage