
Leveraging output term co-occurrence frequencies and latent associations in predicting medical subject headings.

Ramakanth Kavuluru, Yuan Lu.

Abstract

Trained indexers at the National Library of Medicine (NLM) manually tag each biomedical abstract with the most suitable terms from the Medical Subject Headings (MeSH) terminology for indexing in the PubMed information system. MeSH has over 26,000 terms, and indexers consult each article's full text when assigning them. Recent automated attempts have focused on using the article title and abstract text to identify MeSH terms for the corresponding article. Most of these approaches use supervised machine learning techniques trained on already indexed articles and their MeSH terms. In this paper, we present a new indexing approach that leverages term co-occurrence frequencies and latent term associations computed from the MeSH term sets of nearly 18 million articles already indexed by NLM indexers. The main goal of our study is to gauge the potential of output label co-occurrences, latent associations, and relationships extracted from free text in both unsupervised and supervised indexing approaches. Using a novel and purely unsupervised approach, we achieve a micro-F-score comparable to those obtained with supervised machine learning techniques. By incorporating term co-occurrence and latent association features into a supervised learning framework, we also improve over the best results published on two public datasets.
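
As a rough illustration of two quantities the abstract mentions, the sketch below counts pairwise term co-occurrences over a collection of term sets and computes a micro-averaged F-score for predicted term sets. It is a minimal sketch under stated assumptions, not the authors' method: the function names `cooccurrence_counts` and `micro_f1` and the toy term sets are hypothetical, and the latent association (reflective random indexing) component is not covered.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_counts(term_sets):
    """Count how often each unordered pair of terms shares a term set.

    Hypothetical helper for illustration; pairs are stored as sorted tuples.
    """
    pair_counts = Counter()
    for terms in term_sets:
        for a, b in combinations(sorted(set(terms)), 2):
            pair_counts[(a, b)] += 1
    return pair_counts


def micro_f1(gold_sets, predicted_sets):
    """Micro-averaged F-score: pool true/false positives over all articles."""
    tp = fp = fn = 0
    for gold, pred in zip(gold_sets, predicted_sets):
        gold, pred = set(gold), set(pred)
        tp += len(gold & pred)   # terms correctly predicted
        fp += len(pred - gold)   # predicted but not assigned by indexers
        fn += len(gold - pred)   # assigned by indexers but missed
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Toy term sets standing in for already indexed articles (made-up data).
indexed = [
    {"Humans", "Neoplasms", "Mice"},
    {"Humans", "Neoplasms"},
    {"Humans", "Algorithms"},
]
counts = cooccurrence_counts(indexed)

# Toy gold and predicted term sets for two articles (made-up data).
gold = [{"Humans", "Neoplasms"}, {"Humans"}]
pred = [{"Humans", "Mice"}, {"Humans"}]
score = micro_f1(gold, pred)
```

Micro-averaging pools the counts across all articles before computing precision and recall, so frequently assigned terms weigh more heavily than rare ones.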


Keywords:  Medical subject headings; Multi-label classification; Output label associations; Reflective random indexing

Year:  2014        PMID: 28747808      PMCID: PMC5524140          DOI: 10.1016/j.datak.2014.09.002

Source DB:  PubMed          Journal:  Data Knowl Eng        ISSN: 0169-023X            Impact factor:   1.992


  5 in total

1.  Immune modulators in disease: integrating knowledge from the biomedical literature and gene expression.

Authors:  Nophar Geifman; Sanchita Bhattacharya; Atul J Butte
Journal:  J Am Med Inform Assoc       Date:  2015-12-11       Impact factor: 4.497

2.  Analyzing the Moving Parts of a Large-Scale Multi-Label Text Classification Pipeline: Experiences in Indexing Biomedical Articles.

Authors:  Anthony Rios; Ramakanth Kavuluru
Journal:  IEEE Int Conf Healthc Inform       Date:  2015-12-10

3.  Predicting mental conditions based on "history of present illness" in psychiatric notes with deep neural networks.

Authors:  Tung Tran; Ramakanth Kavuluru
Journal:  J Biomed Inform       Date:  2017-06-10       Impact factor: 6.317

4.  Biomedical articles share annotations with their citation neighbors.

Authors:  Raul Rodriguez-Esteban
Journal:  BMC Bioinformatics       Date:  2021-02-26       Impact factor: 3.169

5.  A consideration of publication-derived immune-related associations in Coronavirus and related lung damaging diseases.

Authors:  Nophar Geifman; Anthony D Whetton
Journal:  J Transl Med       Date:  2020-08-03       Impact factor: 5.531

