Ontology-guided feature engineering for clinical text classification.

Vijay N Garla, Cynthia Brandt.

Abstract

In this study, we present novel feature engineering techniques that leverage the biomedical domain knowledge encoded in the Unified Medical Language System (UMLS) to improve machine-learning-based clinical text classification. Critical steps in clinical text classification include identifying the features and passages relevant to the classification task, and representing clinical text so that documents of different classes can be discriminated. We developed novel information-theoretic techniques that use the taxonomical structure of the UMLS to improve feature ranking, and we developed a semantic similarity measure that projects clinical text into a feature space that improves classification. We evaluated these methods on the 2008 Informatics for Integrating Biology and the Bedside (i2b2) obesity challenge. The methods we developed improve upon the results of this challenge's top machine-learning-based system, and may improve the performance of other machine-learning-based clinical text classification systems. We have released all tools developed as part of this study as open source, available at http://code.google.com/p/ytex.
Copyright © 2012 Elsevier Inc. All rights reserved.
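
The abstract does not spell out how the taxonomy enters the feature ranking. As a rough illustration only, the sketch below shows one generic way a concept hierarchy can inform an information-theoretic ranking: each document's concepts are expanded with their ancestors, and every concept is then scored by information gain against the class labels. This is an assumption made for illustration, not the authors' algorithm; the `PARENT` map, the concept names, and all function names are hypothetical and do not come from the UMLS or from YTEX.

```python
import math
from collections import Counter

# Hypothetical toy taxonomy (child -> parent). Not real UMLS content.
PARENT = {
    "type_1_diabetes": "diabetes",
    "type_2_diabetes": "diabetes",
    "diabetes": "endocrine_disorder",
    "asthma": "respiratory_disorder",
}


def with_ancestors(concepts):
    """Return the given concepts plus every ancestor reachable via PARENT."""
    expanded = set()
    for concept in concepts:
        while concept is not None and concept not in expanded:
            expanded.add(concept)
            concept = PARENT.get(concept)
    return expanded


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return -sum((k / n) * math.log2(k / n) for k in Counter(labels).values())


def information_gain(docs, labels, feature):
    """Reduction in label entropy from knowing whether `feature` is present."""
    present = [y for d, y in zip(docs, labels) if feature in d]
    absent = [y for d, y in zip(docs, labels) if feature not in d]
    n = len(labels)
    conditional = (len(present) / n) * entropy(present) \
        + (len(absent) / n) * entropy(absent)
    return entropy(labels) - conditional


def rank_features(raw_docs, labels):
    """Expand each document with ancestor concepts, then rank all concepts."""
    docs = [with_ancestors(d) for d in raw_docs]
    vocabulary = set().union(*docs)
    scores = {f: information_gain(docs, labels, f) for f in vocabulary}
    # Highest information gain first; ties broken alphabetically.
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))


if __name__ == "__main__":
    # Made-up documents, each represented as a set of concept identifiers.
    raw_docs = [{"type_2_diabetes"}, {"type_1_diabetes"}, {"asthma"}, {"asthma"}]
    labels = ["pos", "pos", "neg", "neg"]
    for feature, score in rank_features(raw_docs, labels):
        print(f"{feature}: {score:.3f}")
```

In this toy data set, the parent concept `diabetes` separates the two classes perfectly, while each leaf concept covers only one positive document, so the parent ranks above its children. That is the kind of effect a taxonomy-aware ranking is meant to capture when relevant evidence is spread across several specific concepts.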

Year: 2012; PMID: 22580178; PMCID: PMC3431438; DOI: 10.1016/j.jbi.2012.04.010

Source DB: PubMed; Journal: J Biomed Inform; ISSN: 1532-0464; Impact factor: 6.317

