Improving MeSH classification of biomedical articles using citation contexts.

Bader Aljaber, David Martinez, Nicola Stokes, James Bailey.

Abstract

Medical Subject Headings (MeSH) are used to index the majority of databases generated by the National Library of Medicine. Essentially, MeSH terms are designed to make information, such as scientific articles, more retrievable and accessible to users of systems such as PubMed. This paper proposes a novel method for automating the assignment of MeSH terms to biomedical publications that takes advantage of citation references to these publications. Our findings show that analysing the citation references that point to a document can provide a useful source of terms that are not present in the document. The use of these citation contexts, as they are known, can thus help to provide a richer document feature representation, which in turn can help improve text mining and information retrieval applications, in our case MeSH term classification. In this paper, we also explore new methods of selecting and utilising citation contexts. In particular, we assess the effect of weighting the importance of citation terms (found in the citation contexts) according to two aspects: (i) the section of the paper they appear in and (ii) their distance to the citation marker. We conduct intrinsic and extrinsic evaluations of citation term quality. For the intrinsic evaluation, we rely on the UMLS Metathesaurus conceptual database to explore the semantic characteristics of the mined citation terms. We also analyse the "informativeness" of these terms using a class-entropy measure. For the extrinsic evaluation, we run a series of automatic document classification experiments over MeSH terms. Our experimental evaluation shows that citation contexts contain terms that are related to the original document, and that the integration of this knowledge results in better classification performance compared to two state-of-the-art MeSH classification systems: MeSHUP and MTI.
Our experiments also demonstrate that the consideration of Section and Distance factors can lead to statistically significant improvements in citation feature quality, thus opening the way for better document feature representation in other biomedical text processing applications.
Copyright © 2011 Elsevier Inc. All rights reserved.
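
The abstract names two weighting factors for citation terms (section and distance to the citation marker) and a class-entropy measure of informativeness, but gives no formulas. The following is only an illustrative sketch of how such quantities might be computed, not the authors' method. The section weights, the inverse-distance decay, the `[CIT]` marker token, and the function names are all assumptions introduced for this example.

```python
import math
import re
from collections import defaultdict

# Hypothetical section weights (assumption: the abstract gives no values).
SECTION_WEIGHTS = {
    "introduction": 1.0,
    "methods": 0.5,
    "results": 0.75,
    "discussion": 0.75,
}


def weighted_citation_terms(contexts, marker="[CIT]", default_section_weight=0.5):
    """Score terms found in citation contexts.

    contexts: iterable of (section_name, sentence) pairs, where the sentence
    contains `marker` as a separate whitespace-delimited token.
    Each term receives section_weight / token_distance_to_marker
    (an assumed inverse-distance decay), summed over all contexts.
    """
    weights = defaultdict(float)
    for section, sentence in contexts:
        tokens = sentence.split()
        if marker not in tokens:
            continue  # no citation marker, so distance is undefined
        marker_pos = tokens.index(marker)
        section_weight = SECTION_WEIGHTS.get(section.lower(), default_section_weight)
        for i, token in enumerate(tokens):
            if i == marker_pos:
                continue
            term = re.sub(r"\W+", "", token).lower()
            if not term:
                continue
            weights[term] += section_weight / abs(i - marker_pos)
    return dict(weights)


def class_entropy(class_counts):
    """Shannon entropy (in bits) of a term's distribution over classes.

    class_counts maps a class label (e.g. a MeSH term) to the number of
    documents of that class in which the term occurs. Lower entropy means
    the term is concentrated in fewer classes.
    """
    total = sum(class_counts.values())
    if total == 0:
        return 0.0
    return sum(
        (count / total) * math.log2(total / count)
        for count in class_counts.values()
        if count > 0
    )


example_contexts = [
    ("Introduction", "MeSH indexing [CIT] improves retrieval"),
    ("Methods", "we used [CIT] indexing"),
]
example_weights = weighted_citation_terms(example_contexts)
```

Under these assumptions, a term adjacent to the marker in an introduction sentence contributes more than a distant term in a methods sentence, and a term spread evenly over many classes has higher entropy than one concentrated in a single class.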

Year:  2011        PMID: 21683802     DOI: 10.1016/j.jbi.2011.05.007

Source DB:  PubMed          Journal:  J Biomed Inform        ISSN: 1532-0464            Impact factor:   6.317


  4 in total

1.  Do peers see more in a paper than its authors?

Authors:  Anna Divoli; Preslav Nakov; Marti A Hearst
Journal:  Adv Bioinformatics       Date:  2012-11-27

2.  Reengineering of MeSH thesauri for term selection to optimize literature retrieval and knowledge reconstruction in support of stem cell research.

Authors:  Yan Su; James Andrews; Hong Huang; Yue Wang; Liangliang Kong; Peter Cannon; Ping Xu
Journal:  BMC Med Inform Decis Mak       Date:  2016-05-23       Impact factor: 2.796

3.  Surveillance for the prevention of chronic diseases through information association.

Authors:  Juliana Tarossi Pollettini; José Augusto Baranauskas; Evandro Seron Ruiz; Maria da Graça Pimentel; Alessandra Alaniz Macedo
Journal:  BMC Med Genomics       Date:  2014-01-30       Impact factor: 3.063

4.  Network-based approach highlighting interplay among anti-hypertensives: target coding-genes: diseases.

Authors:  Reetu Sharma
Journal:  Sci Rep       Date:  2020-11-19       Impact factor: 4.379
