Abstract
Physicians ask many complex questions during the patient encounter. Information retrieval systems that can provide immediate and relevant answers to these questions can be invaluable aids to the practice of evidence-based medicine. In this study, we first automatically identify topic keywords from ad hoc clinical questions with a Conditional Random Field model that is trained on thousands of manually annotated clinical questions. We then report on a linear model that assigns weights to query terms based on their automatically identified semantic roles: topic keywords, domain-specific terms, and their synonyms. Our evaluation shows that this weighted keyword model improves information retrieval on the Text Retrieval Conference (TREC) Genomics track data.
Year: 2009 PMID: 21347188 PMCID: PMC3041568
Source DB: PubMed Journal: Summit Transl Bioinform ISSN: 2153-6430
Average MAP scores (standard deviations in parentheses) of four systems for document retrieval for question answering using the TREC Genomics data.
| Original Words | Query expansion | Reweight | Expansion & Reweight |
|---|---|---|---|
| .042 (.085) | .054 (.117) | .046 (.092) | .053 (.116) |
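For readers unfamiliar with the metric, the following is a minimal sketch of how a mean average precision and its standard deviation across questions could be computed. It assumes standard document-level average precision and the sample standard deviation; the source does not state which TREC Genomics MAP variant or which standard deviation formula was used, and the function names are illustrative.

```python
from statistics import mean, stdev


def average_precision(ranked_ids, relevant_ids):
    """Average precision for one question.

    Averages precision@k over every rank k that holds a relevant document,
    dividing by the total number of relevant documents.
    """
    relevant = set(relevant_ids)
    if not relevant:
        return 0.0
    hits = 0
    precision_sum = 0.0
    for rank, doc_id in enumerate(ranked_ids, start=1):
        if doc_id in relevant:
            hits += 1
            precision_sum += hits / rank
    return precision_sum / len(relevant)


def map_with_std(runs):
    """Return (MAP, sample standard deviation) over (ranking, relevant set) pairs."""
    scores = [average_precision(ranking, relevant) for ranking, relevant in runs]
    spread = stdev(scores) if len(scores) > 1 else 0.0
    return mean(scores), spread
```

Under these assumptions, each cell in the table would correspond to one call to `map_with_std` over the per-question rankings of one system.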
Improvement in MAP scores of three systems (query expansion, reweight, and expansion & reweight) over the original words system.
| | Query Expansion | Reweight | Expansion & Reweight |
|---|---|---|---|
| Average MAP (St. Dev) | .012 (.051) | .004 (.009) | .011 (.054) |
| p-value | .119 | .183 |
Figure 1: The mean average precision (MAP) scores of 19 TREC Genomics questions for four systems. The original words system takes all non-stop words of an ad hoc question as a bag-of-words query to return relevant documents. Reweight is built on top of the original words system; it increases the weights of terms that are identified as keywords of the question. Query expansion incorporates synonyms from the UMLS. Expansion & reweight assigns different weights to different groups of query terms as described in Section 3.2.
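The role-based weighting described for the reweight and expansion & reweight systems can be illustrated with a minimal sketch. The weight values, the role labels, and the term-frequency scoring below are hypothetical stand-ins chosen for illustration; the source does not report the weights or the retrieval scoring function it used.

```python
from collections import Counter

# Hypothetical role weights; the source does not report the values it used.
ROLE_WEIGHTS = {"keyword": 3.0, "domain": 2.0, "synonym": 1.0, "other": 1.0}


def build_weighted_query(terms_with_roles):
    """Map each query term to the weight of its role.

    If a term appears under several roles, the highest weight is kept.
    """
    query = {}
    for term, role in terms_with_roles:
        weight = ROLE_WEIGHTS[role]
        query[term] = max(query.get(term, 0.0), weight)
    return query


def score(query, doc_tokens):
    """Linear score: sum of term weight times term frequency in the document."""
    tf = Counter(doc_tokens)
    return sum(weight * tf[term] for term, weight in query.items())


if __name__ == "__main__":
    query = build_weighted_query(
        [("head", "keyword"), ("trauma", "keyword"), ("injury", "synonym"), ("child", "other")]
    )
    doc = "child with head trauma and secondary injury".split()
    print(score(query, doc))
```

In this sketch, setting every weight to the same value reduces the scoring to the unweighted bag-of-words baseline, which is one way to see how reweighting differs from the original words system.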
Figure 2: AskHERMES system components
Figure 3: The outputs of two models, with and without weighted keywords, in response to a sample clinical question. The keyword “head trauma” was automatically identified by AskHERMES. Each answer can be linked to its source page. “Human” indicates that the source page is a human study.