Learning regular expressions for clinical text classification.

Duy Duc An Bui, Qing Zeng-Treitler.

Abstract

OBJECTIVES: Natural language processing (NLP) applications typically use regular expressions that have been developed manually by human experts. Our goal is to automate both the creation and utilization of regular expressions in text classification.
METHODS: We designed a novel regular expression discovery (RED) algorithm and implemented two text classifiers based on RED. The RED+ALIGN classifier combines RED with an alignment algorithm, and RED+SVM combines RED with a support vector machine (SVM) classifier. Two clinical datasets were used for testing and evaluation: the SMOKE dataset, containing 1091 text snippets describing smoking status; and the PAIN dataset, containing 702 snippets describing pain status. We performed 10-fold cross-validation to calculate accuracy, precision, recall, and F-measure metrics. In the evaluation, an SVM classifier was trained as the control.
RESULTS: The two RED classifiers achieved 80.9-83.0% in overall accuracy on the two datasets, which is 1.3-3% higher than SVM's accuracy (p<0.001). Similarly, small but consistent improvements have been observed in precision, recall, and F-measure when RED classifiers are compared with SVM alone. More significantly, RED+ALIGN correctly classified many instances that were misclassified by the SVM classifier (8.1-10.3% of the total instances and 43.8-53.0% of SVM's misclassifications).
CONCLUSIONS: Machine-generated regular expressions can be effectively used in clinical text classification. The regular expression-based classifier can be combined with other classifiers, like SVM, to improve classification performance. Published by the BMJ Publishing Group Limited.
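
As a rough illustration of the general idea in the abstract (regular expressions used as evidence for a class label, scored with precision, recall, and F-measure), the following minimal Python sketch may help. It is not the RED algorithm, RED+ALIGN, or RED+SVM: the patterns are hand-written stand-ins for machine-generated ones, and the names `PATTERNS`, `classify`, and `precision_recall_f1`, the labels, and the four example snippets are all hypothetical and do not come from the SMOKE or PAIN datasets.

```python
import re

# Hypothetical, hand-written patterns standing in for learned expressions.
PATTERNS = {
    "smoker": [
        re.compile(r"\bsmokes?\b", re.I),
        re.compile(r"\b\d+\s*(?:packs?|ppd)\b", re.I),
    ],
    "non-smoker": [
        re.compile(r"\b(?:denies|never)\s+(?:smok\w*|tobacco)", re.I),
        re.compile(r"\bnon-?smoker\b", re.I),
    ],
}


def classify(text, default="unknown"):
    """Pick the label whose patterns match the snippet most often."""
    scores = {
        label: sum(1 for pattern in patterns if pattern.search(text))
        for label, patterns in PATTERNS.items()
    }
    # Ties resolve to the label listed first in PATTERNS.
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else default


def precision_recall_f1(gold, predicted, positive):
    """Compute precision, recall, and F-measure for one target label."""
    pairs = list(zip(gold, predicted))
    tp = sum(g == positive and p == positive for g, p in pairs)
    fp = sum(g != positive and p == positive for g, p in pairs)
    fn = sum(g == positive and p != positive for g, p in pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


# Invented example snippets with invented gold labels.
SNIPPETS = [
    ("Patient smokes 2 packs per day.", "smoker"),
    ("She denies smoking or alcohol use.", "non-smoker"),
    ("Lifelong non-smoker.", "non-smoker"),
    ("Former smoker, quit in 2005.", "smoker"),
]

gold = [label for _, label in SNIPPETS]
predicted = [classify(text) for text, _ in SNIPPETS]
accuracy = sum(g == p for g, p in zip(gold, predicted)) / len(gold)
precision, recall, f1 = precision_recall_f1(gold, predicted, "smoker")
```

In this toy setting the last snippet matches no pattern and falls back to the default label, which lowers recall for the "smoker" class. A combined approach of the kind the abstract describes would presumably hand such cases to another classifier; how the authors do this is not specified here.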

Keywords:  Machine Learning; Natural Language Processing; Regular Expressions; Support Vector Machines; Text Classification

Year: 2014; PMID: 24578357; PMCID: PMC4147608; DOI: 10.1136/amiajnl-2013-002411

Source DB: PubMed; Journal: J Am Med Inform Assoc; ISSN: 1067-5027; Impact factor: 4.497


