
How to Interpret PubMed Queries and Why It Matters.

Lana Yeganova, Donald C Comeau, Won Kim, W John Wilbur.

Abstract

A significant fraction of queries in PubMed™ are multiterm queries without parsing instructions. Generally, search engines interpret such queries as collections of terms, and handle them as a Boolean conjunction of these terms. However, analysis of queries in PubMed™ indicates that many such queries are meaningful phrases, rather than simple collections of terms. In this study, we examine whether it makes a difference, in terms of retrieval quality, if such queries are interpreted as a phrase or as a conjunction of query terms and, if it does, what the optimal way of searching with such queries is. To address the question, we developed an automated retrieval evaluation method, based on machine learning techniques, that enables us to evaluate and compare various retrieval outcomes. We show that the class of records that contain all the search terms, but not the phrase, qualitatively differs from the class of records containing the phrase. We also show that the difference is systematic, depending on the proximity of query terms to each other within the record. Based on these results, one can establish the best retrieval order for the records. Our findings are consistent with studies in proximity searching.
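The distinction drawn in the abstract between phrase matches, conjunction-only matches, and term proximity can be illustrated with a small sketch. This is not the study's evaluation method, which relies on machine learning; it is only a toy illustration under simple assumptions. The tokenizer, the function names (`contains_phrase`, `min_span`, `rank`), and the ordering rule (phrase matches first, then conjunction-only matches by the smallest token window covering all query terms) are hypothetical choices made here for clarity.

```python
import re
from itertools import product


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens (a simplifying assumption)."""
    return re.findall(r"[a-z0-9]+", text.lower())


def contains_phrase(tokens, query):
    """True if the query terms occur contiguously and in order."""
    n = len(query)
    return any(tokens[i:i + n] == query for i in range(len(tokens) - n + 1))


def contains_all_terms(tokens, query):
    """True if every query term occurs somewhere (Boolean conjunction)."""
    return set(query) <= set(tokens)


def min_span(tokens, query):
    """Size, in tokens, of the smallest window covering every query term, or None."""
    positions = [[i for i, t in enumerate(tokens) if t == q] for q in query]
    if any(not p for p in positions):
        return None
    # Brute force over one position per term; fine for short queries and records.
    return min(max(c) - min(c) + 1 for c in product(*positions))


def rank(records, query_text):
    """Hypothetical ordering: phrase matches first, then closer term proximity."""
    query = tokenize(query_text)
    scored = []
    for rec in records:
        toks = tokenize(rec)
        if not contains_all_terms(toks, query):
            continue  # fails the conjunction, so it is not retrieved at all
        key = (0 if contains_phrase(toks, query) else 1, min_span(toks, query))
        scored.append((key, rec))
    return [rec for _, rec in sorted(scored, key=lambda pair: pair[0])]


records = [
    "Gene therapy for cystic fibrosis.",
    "Therapy outcomes; a gene was studied later in the cohort.",
    "Gene-based therapy trial",
    "Unrelated text about therapy",
]
ranked = rank(records, "gene therapy")
```

Here the first record contains the phrase, the second and third contain both terms without the phrase (at different distances), and the fourth fails the conjunction and is excluded.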


Year:  2008        PMID: 29456459      PMCID: PMC5815840          DOI: 10.1002/asi.20979

Source DB:  PubMed          Journal:  J Am Soc Inf Sci Technol        ISSN: 1532-2882


  2 in total

1.  Optimal training sets for Bayesian prediction of MeSH assignment.

Authors:  Sunghwan Sohn; Won Kim; Donald C Comeau; W John Wilbur
Journal:  J Am Med Inform Assoc       Date:  2008-04-24       Impact factor: 4.497

2.  Relemed: sentence-level search engine with relevance score for the MEDLINE database of biomedical articles.

Authors:  Mir S Siadaty; Jianfen Shu; William A Knaus
Journal:  BMC Med Inform Decis Mak       Date:  2007-01-10       Impact factor: 2.796

  2 in total

1.  Identifying well-formed biomedical phrases in MEDLINE® text.

Authors:  Won Kim; Lana Yeganova; Donald C Comeau; W John Wilbur
Journal:  J Biomed Inform       Date:  2012-06-08       Impact factor: 6.317

2.  PubMed Phrases, an open set of coherent phrases for searching biomedical literature.

Authors:  Sun Kim; Lana Yeganova; Donald C Comeau; W John Wilbur; Zhiyong Lu
Journal:  Sci Data       Date:  2018-06-12       Impact factor: 6.444

  2 in total
