Ben J Marafino (1), Jason M Davies (2), Naomi S Bardach (3), Mitzi L Dean (1), R Adams Dudley (4)
1. Philip R. Lee Institute for Health Policy Studies, School of Medicine, University of California, San Francisco, California, USA.
2. Philip R. Lee Institute for Health Policy Studies, School of Medicine, University of California, San Francisco, California, USA; Department of Neurosurgery, University of California, San Francisco, California, USA.
3. Philip R. Lee Institute for Health Policy Studies, School of Medicine, University of California, San Francisco, California, USA; Department of Pediatrics, University of California, San Francisco, California, USA.
4. Philip R. Lee Institute for Health Policy Studies, School of Medicine, University of California, San Francisco, California, USA; Center for Healthcare Value, University of California, San Francisco, California, USA; Department of Epidemiology and Biostatistics, University of California, San Francisco, California, USA; Department of Medicine, University of California, San Francisco, California, USA.
Abstract
BACKGROUND: Existing risk adjustment models for intensive care unit (ICU) outcomes rely on manual abstraction of patient-level predictors from medical charts. Developing an automated method for abstracting these data from free text might reduce cost and data collection times.
OBJECTIVE: To develop a support vector machine (SVM) classifier capable of identifying a range of procedures and diagnoses in ICU clinical notes for use in risk adjustment.
MATERIALS AND METHODS: We selected notes from 2001-2008 for 4191 neonatal ICU (NICU) and 2198 adult ICU patients from the MIMIC-II database from the Beth Israel Deaconess Medical Center. Using these notes, we developed an implementation of the SVM classifier to identify procedures (mechanical ventilation and phototherapy in NICU notes) and diagnoses (jaundice in NICU and intracranial hemorrhage (ICH) in adult ICU). On the jaundice classification task, we also compared classifier performance using n-gram features to unigrams with application of a negation algorithm (NegEx).
RESULTS: Our classifier accurately identified mechanical ventilation (accuracy=0.982, F1=0.954) and phototherapy use (accuracy=0.940, F1=0.912), as well as jaundice (accuracy=0.898, F1=0.884) and ICH diagnoses (accuracy=0.938, F1=0.943). Including bigram features improved performance on the jaundice (accuracy=0.898 vs 0.865) and ICH (0.938 vs 0.927) tasks, and outperformed NegEx-derived unigram features (accuracy=0.898 vs 0.863) on the jaundice task.
DISCUSSION: Overall, a classifier using n-gram support vectors displayed excellent performance characteristics. The classifier generalizes to diverse patient populations, diagnoses, and procedures.
CONCLUSIONS: SVM-based classifiers can accurately identify procedure status and diagnoses among ICU patients, and including n-gram features improves performance, compared to existing methods.
Published by the BMJ Publishing Group Limited.
For permission to use (where not already granted under a licence) please go to http://group.bmj.com/group/rights-licensing/permissions.
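The abstract compares unigram features with bigram (n-gram) features and reports accuracy and F1. As a rough illustration of why bigrams can help with negated findings, and of how the two metrics are computed, here is a minimal sketch in plain Python. It is not the authors' implementation: the function names `ngram_features` and `accuracy_and_f1`, the tokenization rule, and the note fragments are all assumptions made for this example, and no SVM is trained here.

```python
from collections import Counter
import re


def ngram_features(text, max_n=2):
    """Count word n-grams of length 1..max_n in a note (bag-of-n-grams)."""
    # Assumed tokenization: lowercase alphanumeric runs; the paper may differ.
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    counts = Counter()
    for n in range(1, max_n + 1):
        for i in range(len(tokens) - n + 1):
            counts[" ".join(tokens[i:i + n])] += 1
    return counts


def accuracy_and_f1(y_true, y_pred):
    """Accuracy and F1 for binary labels, with 1 as the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    correct = sum(1 for t, p in pairs if t == p)
    accuracy = correct / len(pairs)
    denom = 2 * tp + fp + fn
    f1 = 2 * tp / denom if denom else 0.0
    return accuracy, f1


# Hypothetical note fragments, invented for illustration only.
positive = ngram_features("Infant with jaundice, started phototherapy.")
negated = ngram_features("No jaundice noted on exam.")

# The unigram "jaundice" occurs in both notes; only the bigram
# "no jaundice" distinguishes the negated mention.
print(positive["jaundice"], negated["jaundice"], negated["no jaundice"])

# Toy labels and predictions, also invented for illustration.
print(accuracy_and_f1([1, 0, 1, 1, 0], [1, 0, 0, 1, 1]))
```

In a full pipeline, count dictionaries like these would be turned into sparse vectors and passed to an SVM; the point of the sketch is only that a bigram such as "no jaundice" gives the classifier a feature that a unigram vocabulary cannot represent without a separate negation step.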