
The effect of feature representation on MEDLINE document classification.

Meliha Yetisgen-Yildiz, Wanda Pratt

Abstract

This work explores the effect of text representation techniques on the overall performance of medical text classification. To accomplish this goal, we developed a text classification system that supports both a basic word representation (bag-of-words) and a more complex medical phrase representation (bag-of-phrases). We also combined the word and phrase representations (hybrid) for further analysis. Our system extracts medical phrases from text by incorporating a medical knowledge base and natural language processing techniques. We conducted experiments to evaluate the effects of the different representations by measuring the change in classification performance on MEDLINE documents from the OHSUMED dataset. We measured classification performance with three information retrieval metrics: precision (p), recall (r), and F1-score (F1). In our experiments, the hybrid approach achieved better classification performance (p=0.87, r=0.46, F1=0.60) than the bag-of-words approach (p=0.85, r=0.44, F1=0.58) and the bag-of-phrases approach (p=0.87, r=0.42, F1=0.57).
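As an illustration only, the three representations and the F1 figures reported above can be sketched in a few lines of Python. This is not the authors' system: the phrase set `PHRASES` is a hypothetical stand-in for the medical knowledge base, the function names are invented for this sketch, and phrase matching is reduced to exact lookup of adjacent word pairs. The F1-score is the standard harmonic mean of precision and recall, F1 = 2pr / (p + r).

```python
from collections import Counter

# Hypothetical phrase lexicon; a stand-in for the medical knowledge base
# mentioned in the abstract, not the resource the authors used.
PHRASES = {("heart", "failure"), ("blood", "pressure")}


def bag_of_words(tokens):
    """Count each individual word."""
    return Counter(tokens)


def bag_of_phrases(tokens, phrases=PHRASES):
    """Count adjacent word pairs that appear in the phrase lexicon."""
    found = Counter()
    for i in range(len(tokens) - 1):
        pair = (tokens[i], tokens[i + 1])
        if pair in phrases:
            found["_".join(pair)] += 1
    return found


def hybrid(tokens, phrases=PHRASES):
    """Combine word features and phrase features into one feature bag."""
    return bag_of_words(tokens) + bag_of_phrases(tokens, phrases)


def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    tokens = "patient with heart failure and high blood pressure".split()
    print(hybrid(tokens))

    # Recompute F1 from the precision and recall reported in the abstract.
    reported = {
        "hybrid": (0.87, 0.46),
        "bag-of-words": (0.85, 0.44),
        "bag-of-phrases": (0.87, 0.42),
    }
    for name, (p, r) in reported.items():
        print(name, round(f1_score(p, r), 2))
```

Recomputing F1 from the reported precision and recall reproduces the abstract's rounded values (0.60, 0.58, and 0.57), so the three reported metrics are mutually consistent.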

Year:  2005        PMID: 16779160      PMCID: PMC1560754     

Source DB:  PubMed          Journal:  AMIA Annu Symp Proc        ISSN: 1559-4076


