
Concept-based query expansion for retrieving gene related publications from MEDLINE.

Sérgio Matos, Joel P Arrais, João Maia-Rodrigues, José Luis Oliveira.

Abstract

BACKGROUND: Advances in biotechnology and in high-throughput methods for gene analysis have contributed to an exponential increase in the number of scientific publications in these fields of study. While much of the data and results described in these articles are entered and annotated in the various existing biomedical databases, the scientific literature is still the major source of information. There is, therefore, a growing need for text mining and information retrieval tools to help researchers find the relevant articles for their study. To tackle this, several tools have been proposed to provide alternative solutions for specific user requests.
RESULTS: This paper presents QuExT, a new PubMed-based document retrieval and prioritization tool that, from a given list of genes, searches for the most relevant results from the literature. QuExT follows a concept-oriented query expansion methodology to find documents containing concepts related to the genes in the user input, such as protein and pathway names. The retrieved documents are ranked according to user-definable weights assigned to each concept class. By changing these weights, users can modify the ranking of the results in order to focus on documents dealing with a specific concept. The method's performance was evaluated using data from the 2004 TREC genomics track, producing a mean average precision of 0.425, with an average of 4.8 and 31.3 relevant documents within the top 10 and 100 retrieved abstracts, respectively.
CONCLUSIONS: QuExT implements a concept-based query expansion scheme that leverages gene-related information available on a variety of biological resources. The main advantage of the system is to give the user control over the ranking of the results by means of a simple weighting scheme. Using this approach, researchers can effortlessly explore the literature regarding a group of genes and focus on the different aspects relating to these genes.
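The weighting scheme described in the abstract can be illustrated with a minimal sketch. This is not QuExT's actual scoring formula, which the abstract does not give. It assumes each document's score is a weighted sum of concept matches per class. The class names, weights, match counts, and function names below are all hypothetical.

```python
# Illustrative sketch only: a weighted sum over concept classes is an
# assumption, not the scoring function published for QuExT.

def score_document(concept_hits, weights):
    """Weighted sum of per-class concept match counts for one document."""
    return sum(weights.get(cls, 0.0) * count for cls, count in concept_hits.items())


def rank_documents(docs, weights):
    """Return document IDs by descending score; ties are broken by ID."""
    scored = [(score_document(hits, weights), doc_id) for doc_id, hits in docs.items()]
    return [doc_id for _, doc_id in sorted(scored, key=lambda p: (-p[0], p[1]))]


# Hypothetical match counts per concept class for three retrieved abstracts.
docs = {
    "doc_a": {"gene": 3, "pathway": 0},
    "doc_b": {"gene": 1, "pathway": 4},
    "doc_c": {"protein": 3},
}

# Two hypothetical weight settings a user might choose.
gene_focused = {"gene": 1.0, "protein": 0.5, "pathway": 0.25}
pathway_focused = {"gene": 0.2, "protein": 0.1, "pathway": 1.0}

print(rank_documents(docs, gene_focused))     # doc_a first
print(rank_documents(docs, pathway_focused))  # doc_b first
```

Changing only the weights reorders the same retrieved set, which is how a user would shift focus from one concept class to another.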

Year:  2010        PMID: 20426836      PMCID: PMC2873540          DOI: 10.1186/1471-2105-11-212

Source DB:  PubMed          Journal:  BMC Bioinformatics        ISSN: 1471-2105            Impact factor:   3.169


  9 in total

1.  Leveraging concept-based approaches to identify potential phyto-therapies.

Authors:  Vivekanand Sharma; Indra Neil Sarkar
Journal:  J Biomed Inform       Date:  2013-05-09       Impact factor: 6.317

2.  GO2PUB: Querying PubMed with semantic expansion of gene ontology terms.

Authors:  Charles Bettembourg; Christian Diot; Anita Burgun; Olivier Dameron
Journal:  J Biomed Semantics       Date:  2012-09-07

3.  Studying PubMed usages in the field for complex problem solving: Implications for tool design.

Authors:  Barbara Mirel; Jean Song; Jennifer Steiner Tonks; Fan Meng; Weijian Xuan; Rafiqa Ameziane
Journal:  J Am Soc Inf Sci Technol       Date:  2013-05-01

4.  PESCADOR, a web-based tool to assist text-mining of biointeractions extracted from PubMed queries.

Authors:  Adriano Barbosa-Silva; Jean-Fred Fontaine; Elisa R Donnard; Fernanda Stussi; J Miguel Ortega; Miguel A Andrade-Navarro
Journal:  BMC Bioinformatics       Date:  2011-11-09       Impact factor: 3.307

5.  SIDEKICK: Genomic data driven analysis and decision-making framework.

Authors:  Mark S Doderer; Kihoon Yoon; Kay A Robbins
Journal:  BMC Bioinformatics       Date:  2010-12-30       Impact factor: 3.169

6.  pubmed2ensembl: a resource for mining the biological literature on genes.

Authors:  Joachim Baran; Martin Gerner; Maximilian Haeussler; Goran Nenadic; Casey M Bergman
Journal:  PLoS One       Date:  2011-09-29       Impact factor: 3.240

7.  A modular framework for biomedical concept recognition.

Authors:  David Campos; Sérgio Matos; José Luís Oliveira
Journal:  BMC Bioinformatics       Date:  2013-09-24       Impact factor: 3.169

8.  Exploring the Unexplored: Identifying Implicit and Indirect Descriptions of Biomedical Terminologies Based on Multifaceted Weighting Combinations.

Authors:  Sung-Pil Choi
Journal:  Comput Math Methods Med       Date:  2016-09-06       Impact factor: 2.238

9.  Research on Literature Clustering Algorithm for Massive Scientific and Technical Literature Query Service.

Authors:  Chen Zhang
Journal:  Comput Intell Neurosci       Date:  2022-08-21
