Theresa A Koleck (1), Caitlin Dreisbach (2,3), Philip E Bourne (3), Suzanne Bakken (1,4,5).
1. School of Nursing, Columbia University, New York, New York, USA.
2. School of Nursing, University of Virginia, Charlottesville, Virginia, USA.
3. Data Science Institute, University of Virginia, Charlottesville, Virginia, USA.
4. Department of Biomedical Informatics, Columbia University, New York, New York, USA.
5. Data Science Institute, Columbia University, New York, New York, USA.
Abstract
OBJECTIVE: Natural language processing (NLP) of symptoms from electronic health records (EHRs) could contribute to the advancement of symptom science. We aim to synthesize the literature on the use of NLP to process or analyze symptom information documented in EHR free-text narratives.
MATERIALS AND METHODS: Our search of 1964 records from PubMed and EMBASE was narrowed to 27 eligible articles. Data related to the purpose, free-text corpus, patients, symptoms, NLP methodology, evaluation metrics, and quality indicators were extracted for each study.
RESULTS: Symptom-related information was presented as a primary outcome in 14 studies. EHR narratives represented various inpatient and outpatient clinical specialties, with general, cardiology, and mental health occurring most frequently. Studies encompassed a wide variety of symptoms, including shortness of breath, pain, nausea, dizziness, disturbed sleep, constipation, and depressed mood. NLP approaches included previously developed NLP tools, classification methods, and manually curated rule-based processing. Only one-third (n = 9) of studies reported patient demographic characteristics.
DISCUSSION: NLP is used to extract information from EHR free-text narratives written by a variety of healthcare providers on an expansive range of symptoms across diverse clinical specialties. The current focus of this field is on the development of methods to extract symptom information and the use of symptom information for disease classification tasks rather than the examination of symptoms themselves.
CONCLUSION: Future NLP studies should concentrate on the investigation of symptoms and symptom documentation in EHR free-text narratives. Efforts should be undertaken to examine patient characteristics and make symptom-related NLP algorithms or pipelines and vocabularies openly available.