
Unsupervised Medical Subject Heading Assignment Using Output Label Co-occurrence Statistics and Semantic Predications.

Ramakanth Kavuluru, Zhenghao He.

Abstract

Librarians at the National Library of Medicine (NLM) tag each biomedical abstract to be indexed by the PubMed information system with terms from the Medical Subject Headings (MeSH) terminology. The MeSH terminology has over 26,000 terms, and indexers look at each article's full text to assign the set of most suitable terms for indexing it. Several recent automated attempts have focused on using the article title and abstract text to identify MeSH terms for the corresponding article. Most of these approaches used supervised machine learning techniques that rely on already indexed articles and their corresponding MeSH terms. In this paper, we present a novel unsupervised approach using named entity recognition, relationship extraction, and output label co-occurrence frequencies of MeSH term pairs from the existing set of 22 million articles already indexed with MeSH terms by librarians at NLM. The main goal of our study is to gauge the potential of output label co-occurrence statistics and relationships extracted from free text in unsupervised indexing approaches. In biomedical domains especially, output label co-occurrences are generally easier to obtain than training data involving document and label set pairs, owing to the sensitive nature of textual documents containing protected health information. Our methods achieve a micro F-score comparable to those obtained using supervised machine learning techniques with training data consisting of document and label set pairs. Baseline comparisons reveal strong prospects for further research in exploiting label co-occurrences and relationships extracted from free text in recommending terms for indexing biomedical articles.
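To make the two quantities named in the abstract concrete, the following is a minimal sketch, not the authors' method. It assumes that co-occurrence statistics are simple pair counts over the label sets of already indexed articles, and that candidate terms are ranked by their average conditional co-occurrence frequency with a few seed terms. The function names, the scoring heuristic, and the toy label sets are all hypothetical; the paper's actual pipeline also involves named entity recognition and relationship extraction, which are not modeled here.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_counts(label_sets):
    """Count single terms and unordered term pairs across indexed articles."""
    term_counts = Counter()
    pair_counts = Counter()
    for labels in label_sets:
        unique = sorted(set(labels))  # sorted so each pair has one canonical order
        term_counts.update(unique)
        pair_counts.update(combinations(unique, 2))
    return term_counts, pair_counts


def rank_candidates(seed_terms, term_counts, pair_counts, top_k=3):
    """Rank non-seed terms by the mean of count(candidate, seed) / count(seed).

    This scoring rule is an illustrative assumption, not taken from the paper.
    """
    seeds = set(seed_terms)
    if not seeds:
        return []
    scores = {}
    for candidate in term_counts:
        if candidate in seeds:
            continue
        total = 0.0
        for seed in seeds:
            if term_counts[seed] == 0:
                continue  # seed never seen in the indexed collection
            pair = tuple(sorted((candidate, seed)))
            total += pair_counts[pair] / term_counts[seed]
        scores[candidate] = total / len(seeds)
    # Highest score first; ties broken alphabetically for determinism.
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))[:top_k]


def micro_f1(predicted, gold):
    """Micro-averaged F-score: pool true positives over all documents."""
    tp = sum(len(set(p) & set(g)) for p, g in zip(predicted, gold))
    n_pred = sum(len(set(p)) for p in predicted)
    n_gold = sum(len(set(g)) for g in gold)
    if tp == 0:
        return 0.0
    precision = tp / n_pred
    recall = tp / n_gold
    return 2 * precision * recall / (precision + recall)


# Toy label sets standing in for already indexed articles (hypothetical data).
indexed = [
    {"Humans", "Neoplasms", "Antineoplastic Agents"},
    {"Humans", "Neoplasms"},
    {"Humans", "Diabetes Mellitus"},
    {"Animals", "Neoplasms", "Antineoplastic Agents"},
]
term_counts, pair_counts = cooccurrence_counts(indexed)
ranked = rank_candidates(["Neoplasms"], term_counts, pair_counts)
```

Only label sets are needed to build the counts, which matches the abstract's point that co-occurrence statistics can be gathered without access to the underlying documents.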


Year: 2013   PMID: 28758168   PMCID: PMC5527755   DOI: 10.1007/978-3-642-38824-8_15

Source DB: PubMed   Journal: Nat Lang Process Inf Syst



1.  Analyzing the Moving Parts of a Large-Scale Multi-Label Text Classification Pipeline: Experiences in Indexing Biomedical Articles.

Authors:  Anthony Rios; Ramakanth Kavuluru
Journal:  IEEE Int Conf Healthc Inform       Date:  2015-12-10

2.  Leveraging output term co-occurrence frequencies and latent associations in predicting medical subject headings.

Authors:  Ramakanth Kavuluru; Yuan Lu
Journal:  Data Knowl Eng       Date:  2014-09-18       Impact factor: 1.992

