Yanqing Ji, Hao Ying, John Tran, Peter Dews, R Michael Massanari.
Abstract
BACKGROUND: Finding highly relevant articles in biomedical databases is challenging for two reasons: it is often difficult to express a user's underlying intention accurately through keywords, and a keyword-based query normally returns a long list of hits, many of which the user does not want. This paper proposes a novel biomedical literature search system, called BiomedSearch, which supports complex queries and relevance feedback.
Keywords: Association mining; Biomedical literature search; Relevance feedback; UMLS
Year: 2016 PMID: 27453982 PMCID: PMC4959361 DOI: 10.1186/s12859-016-1129-z
Source DB: PubMed Journal: BMC Bioinformatics ISSN: 1471-2105 Impact factor: 3.169
Fig. 1 BiomedSearch System Architecture
Fig. 2 Example Phrases and Their Matched Concepts in a MetaMap File
Example article selected as feedback by a user
| Sentence | Concept IDs |
|---|---|
| S1 | 1, 3, 4, 3, 5 |
| S2 | 4, 5, 5, 1 |
| S3 | 3, 5, 1, 3, 1, 6 |
| S4 | 1, 5, 4, 4, 1 |
| S5 | 5, 2, 4, 6, 2 |
Counts and frequencies given the example article and Q
| Sentence | | | |
|---|---|---|---|
| S1 | 1/3 | 1 | 1/3 |
| S2 | 0 | 1 | 0 |
| S3 | 2/3 | 1 | 2/3 |
| S4 | 0 | 1 | 0 |
| S5 | 2/3 | 0 | 0 |
| Sum | 5/3 | 4 | 1 |
Step-by-step calculation of given P = {2, 3, 1, 6, 8} and
| Depth | | | |
|---|---|---|---|
| 1 | 1 | 1/1 = 1 | (0.9)^0 × 1 = 1 |
| 2 | 1 | 1/2 = 0.5 | (0.9)^1 × 0.5 = 0.45 |
| 3 | 2 | 2/3 = 0.67 | (0.9)^2 × 0.67 = 0.54 |
| 4 | 3 | 3/4 = 0.75 | (0.9)^3 × 0.75 = 0.55 |
| 5 | 3 | 3/5 = 0.6 | (0.9)^4 × 0.6 = 0.39 |
| | (1 − 0.9) × (1 + 0.45 + 0.54 + 0.55 + 0.39) = 0.29 | | |
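
The arithmetic in the table above can be reproduced with a short sketch. The column symbols and the name of the measure were lost in extraction, so the reading used here is an assumption inferred from the cell values: the second column is taken as a cumulative count at each depth, the third as that count divided by the depth, and the fourth as the third weighted by 0.9 raised to (depth − 1). The function name and the parameter `p` are hypothetical.

```python
def discounted_score(cumulative_counts, p=0.9):
    """Return (1 - p) * sum over depths d of p**(d - 1) * (count_d / d).

    cumulative_counts[d - 1] is the cumulative count at depth d, as read
    from the second column of the table (an assumed interpretation).
    """
    total = 0.0
    for depth, count in enumerate(cumulative_counts, start=1):
        ratio = count / depth             # third column: count / depth
        total += p ** (depth - 1) * ratio # fourth column: weighted ratio
    return (1 - p) * total


# Second-column values for depths 1..5, copied from the table.
counts_by_depth = [1, 1, 2, 3, 3]
score = discounted_score(counts_by_depth)
print(round(score, 2))
```

Rounded to two decimals, the result agrees with the 0.29 in the last row of the table; the table rounds each term before summing, while the sketch rounds only at the end.
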
Number of relevant documents in top 10 and 20 for each topic in the initial, 2nd-round, and 3rd-round search (k = 30)
| Topic ID | Top 10: Initial | Top 10: 2nd | Top 10: 3rd | Top 20: Initial | Top 20: 2nd | Top 20: 3rd |
|---|---|---|---|---|---|---|
| 1 | 7 | 10 | 10 | 14 | 20 | 20 |
| 2 | 1 | 2 | 1 | 2 | 4 | 4 |
| 3 | 4 | 7 | 8 | 8 | 13 | 14 |
| 4 | 1 | 6 | 6 | 1 | 6 | 7 |
| 5 | 1 | 2 | 2 | 1 | 2 | 2 |
| 6 | 5 | 5 | 5 | 12 | 13 | 13 |
| 7 | 10 | 10 | 10 | 18 | 19 | 19 |
| 8 | 6 | 6 | 6 | 10 | 11 | 12 |
| 12 | 1 | 2 | 2 | 2 | 3 | 3 |
| 13 | 1 | 1 | 1 | 2 | 2 | 2 |
| 14 | 0 | N/A | N/A | 1 | 2 | 2 |
| 15 | 9 | 10 | 10 | 19 | 20 | 20 |
| 16 | 3 | 3 | 3 | 6 | 8 | 9 |
| 17 | 2 | 3 | 3 | 2 | 4 | 4 |
| 18 | 0 | N/A | N/A | 1 | 2 | 2 |
| 19 | 6 | 9 | 9 | 12 | 19 | 20 |
| MAP | 0.605 | 0.842 | 0.866 | 0.467 | 0.728 | 0.731 |
MAP@10 and MAP@20 for 2nd-round and 3rd-round search when k takes different values
| Measure | Round | | | k = 30 | | |
|---|---|---|---|---|---|---|
| MAP@10 | 2nd | 0.807 | 0.827 | 0.842 | 0.813 | 0.806 |
| MAP@10 | 3rd | 0.816 | 0.840 | 0.866 | 0.824 | 0.812 |
| MAP@20 | 2nd | 0.703 | 0.708 | 0.728 | 0.715 | 0.703 |
| MAP@20 | 3rd | 0.717 | 0.719 | 0.731 | 0.721 | 0.716 |
Specific ranks of the documents relevant to topic 4 in the 2nd-round search when k takes different values
| Document ID | Ranks of the Documents Relevant to Topic 4 | | | | |
|---|---|---|---|---|---|
| 15003956 | 1 | 1 | 1 | 1 | 1 |
| 1528178 | 2 | 2 | 2 | 2 | 2 |
| 9302273 | 4 | 5 | 4 | 4 | 4 |
| 12867662 | 6 | 3 | 6 | 6 | 6 |
| 9328480 | 7 | 14 | 7 | 8 | 8 |
| 9700208 | 11 | 10 | 10 | 10 | 11 |
| 9516475 | 96 | 111 | 24 | 24 | 25 |
| 15452128 | 173 | 192 | 174 | 174 | 176 |
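
As a rough cross-check, the ranks above can be turned into top-10 and top-20 counts of the kind reported in the per-topic tables. This is only a sketch: the k value behind each column was lost in extraction, so the columns are indexed by position, and the helper name `count_within` is hypothetical.

```python
# Ranks of the eight topic-4 documents, one list per table column
# (left to right); the k value of each column is not recoverable here.
ranks_by_column = [
    [1, 2, 4, 6, 7, 11, 96, 173],
    [1, 2, 5, 3, 14, 10, 111, 192],
    [1, 2, 4, 6, 7, 10, 24, 174],
    [1, 2, 4, 6, 8, 10, 24, 174],
    [1, 2, 4, 6, 8, 11, 25, 176],
]


def count_within(ranks, cutoff):
    """Count how many documents are ranked at or above the cutoff."""
    return sum(1 for rank in ranks if rank <= cutoff)


# One (top-10 count, top-20 count) pair per column.
counts = [(count_within(col, 10), count_within(col, 20))
          for col in ranks_by_column]
for position, (top10, top20) in enumerate(counts, start=1):
    print(f"column {position}: top 10 = {top10}, top 20 = {top20}")
```

The third and fourth columns each place six documents in both the top 10 and the top 20, which agrees with the 2nd-round entries for topic 4 in the k = 30 tables. The other columns place five in the top 10 and six in the top 20.
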
Number of relevant documents in top 10 and 20 for each topic in the initial, 2nd-round, and 3rd-round search using highly relevant documents as gold standard (k = 30)
| Topic ID | Top 10: Initial | Top 10: 2nd | Top 10: 3rd | Top 20: Initial | Top 20: 2nd | Top 20: 3rd |
|---|---|---|---|---|---|---|
| 1 | 7 | 10 | 10 | 14 | 20 | 20 |
| 2 | 1 | 2 | 2 | 2 | 4 | 4 |
| 3 | 4 | 7 | 8 | 8 | 13 | 14 |
| 4 | 1 | 6 | 6 | 1 | 6 | 7 |
| 5 | 1 | 2 | 2 | 1 | 2 | 2 |
| 6 | 5 | 5 | 5 | 12 | 13 | 13 |
| 7 | 10 | 10 | 10 | 18 | 19 | 19 |
| 8 | 6 | 6 | 6 | 10 | 11 | 12 |
| 12 | 1 | 2 | 2 | 1 | 3 | 3 |
| 13 | 1 | 1 | 1 | 2 | 2 | 2 |
| 14 | 0 | N/A | N/A | 1 | 2 | 2 |
| 15 | 9 | 10 | 10 | 19 | 20 | 20 |
| 16 | 3 | 3 | 3 | 6 | 8 | 9 |
| 17 | 2 | 3 | 3 | 2 | 4 | 4 |
| 18 | 0 | N/A | N/A | 1 | 2 | 2 |
| 19 | 6 | 9 | 9 | 12 | 19 | 20 |
| MAP | 0.605 | 0.842 | 0.866 | 0.467 | 0.728 | 0.731 |