
Use of a support vector machine for categorizing free-text notes: assessment of accuracy across two institutions.

Adam Wright, Allison B McCoy, Stanislav Henkin, Abhivyakti Kale, Dean F Sittig.

Abstract

BACKGROUND: Electronic health record (EHR) users must regularly review large amounts of data in order to make informed clinical decisions, and such review is time-consuming and often overwhelming. Technologies like automated summarization tools, EHR search engines and natural language processing have been shown to help clinicians manage this information.
OBJECTIVE: To develop a support vector machine (SVM)-based system for identifying EHR progress notes pertaining to diabetes, and to validate it at two institutions.
MATERIALS AND METHODS: We retrieved 2000 EHR progress notes from patients with diabetes at the Brigham and Women's Hospital (1000 for training and 1000 for testing) and another 1000 notes from the University of Texas Physicians (for validation). We manually annotated all notes and trained an SVM using a bag-of-words approach. We then applied the SVM to the testing and validation sets and evaluated its performance with the area under the curve (AUC) and F statistics.
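As a rough illustration of the approach described in the methods, the following minimal Python sketch turns each note into bag-of-words counts and fits a linear SVM by subgradient descent on the regularized hinge loss. It is not the authors' implementation: the abstract does not state the kernel, tokenization, or software used, so the linear model, the regular-expression tokenizer, the hyperparameters (`epochs`, `eta`, `lam`), the function names, and the four toy notes are all assumptions made for illustration.

```python
import re
from collections import Counter


def bag_of_words(text):
    """Count lowercase word tokens in a note; word order is discarded."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def decision(weights, bias, text):
    """Signed score of a note: positive means 'diabetes-related'."""
    counts = bag_of_words(text)
    return sum(weights.get(w, 0.0) * c for w, c in counts.items()) + bias


def train_linear_svm(notes, labels, epochs=20, eta=0.1, lam=0.01):
    """Fit a linear SVM by subgradient descent on the L2-regularized hinge loss.

    labels must be +1 (diabetes-related) or -1 (unrelated).
    Returns (weights, bias), where weights maps each word to its coefficient.
    """
    weights, bias = {}, 0.0
    for _ in range(epochs):
        for text, y in zip(notes, labels):
            margin = y * decision(weights, bias, text)
            # Regularization step: shrink every weight slightly.
            for w in weights:
                weights[w] *= 1.0 - eta * lam
            # Hinge-loss step: only notes inside the margin move the model.
            if margin < 1.0:
                for w, c in bag_of_words(text).items():
                    weights[w] = weights.get(w, 0.0) + eta * y * c
                bias += eta * y
    return weights, bias


# Hypothetical toy notes, invented for illustration only (not study data).
train_notes = [
    "patient with diabetes on insulin glucose elevated",
    "a1c checked metformin continued for diabetes",
    "knee pain after fall xray negative",
    "cough and fever chest xray clear",
]
train_labels = [1, 1, -1, -1]

weights, bias = train_linear_svm(train_notes, train_labels)
```

Calling `decision(weights, bias, note)` on an unseen note gives a signed score; its sign serves as the predicted class, and the raw score can be used to rank notes when computing the AUC.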
RESULTS: The model accurately identified diabetes-related notes in both the Brigham and Women's Hospital testing set (AUC=0.956, F=0.934) and the external University of Texas Faculty Physicians validation set (AUC=0.947, F=0.935).
DISCUSSION: Overall, the model we developed was quite accurate. Furthermore, it generalized, without loss of accuracy, to another institution with a different EHR and a distinct patient and provider population.
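For readers unfamiliar with the two reported metrics, the sketch below shows one common way to compute them in Python. It assumes that "F" denotes the balanced F-measure (the harmonic mean of precision and recall), which the abstract does not state explicitly, and it computes the AUC through the pairwise-ranking (Mann-Whitney) formulation. The function names, scores, labels, and the 0.5 threshold are hypothetical and are not taken from the study.

```python
def auc(scores, labels):
    """Area under the ROC curve via the pairwise-ranking formulation.

    Returns the fraction of (positive, negative) pairs in which the
    positive note gets the higher score; ties count as half a win.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def f_measure(predicted, labels):
    """Balanced F-measure: harmonic mean of precision and recall."""
    tp = sum(1 for p, y in zip(predicted, labels) if p == 1 and y == 1)
    fp = sum(1 for p, y in zip(predicted, labels) if p == 1 and y == 0)
    fn = sum(1 for p, y in zip(predicted, labels) if p == 0 and y == 1)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Hypothetical classifier scores and gold labels (1 = diabetes-related).
scores = [0.9, 0.8, 0.35, 0.6, 0.2, 0.1]
labels = [1, 1, 1, 0, 0, 0]

# Turn scores into hard predictions with an assumed threshold of 0.5.
predicted = [1 if s >= 0.5 else 0 for s in scores]

auc_value = auc(scores, labels)          # 8 of 9 pairs ranked correctly
f_value = f_measure(predicted, labels)   # precision = recall = 2/3
```

The AUC depends only on how the notes are ranked, whereas the F-measure depends on the chosen threshold, which is why a single classifier is often reported with both.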
CONCLUSIONS: It is possible to use an SVM-based classifier to identify EHR progress notes pertaining to diabetes, and the model generalizes well.

Keywords:  electronic health record; natural language processing; search; support vector machine

Year: 2013; PMID: 23543111; PMCID: PMC3756266; DOI: 10.1136/amiajnl-2012-001576

Source DB: PubMed; Journal: J Am Med Inform Assoc; ISSN: 1067-5027; Impact factor: 4.497

