Heiko Dietze, Michael Schroeder.
Abstract
BACKGROUND: Current search engines are keyword-based. Semantic technologies promise a next generation of semantic search engines, which will be able to answer questions. Current approaches either apply natural language processing to unstructured text or they assume the existence of structured statements over which they can reason.
Year: 2009 PMID: 19796404 PMCID: PMC2755828 DOI: 10.1186/1471-2105-10-S10-S7
Source DB: PubMed Journal: BMC Bioinformatics ISSN: 1471-2105 Impact factor: 3.169
Comparison of semantic search engines
| Column | Codes |
|---|---|
| ontologies | (1) implicit through RDF, (2) GO, (3) MeSH |
| textmining | (4) NLP, (5) label extraction, (6) Ontology terminology, (7) biomedical entities, (8) Wikipedia terminology |
| type of documents | (9) RDF related, (10) web pages, (11) snippets, (12) abstracts, (13) fulltext |
| clustering of results | (14) RDF types, (15) extracted categories, (16) textual labels, (17) ontology, (18) answers, (19) query aspects |
| result type | (20) RDF resource, (21) extracted text, (22) answer, (23) snippet, (24) sentence, (25) fulltext, (26) cluster, (27) induced ontology, (28) abstract |

| Semantic Search Engines | structured/unstructured | ontologies | textmining | number of documents | type of documents | clustering of results | result type | highlighting | scientifically evaluated |
|---|---|---|---|---|---|---|---|---|---|
| Swoogle | rdf | 1 | ≫Mio | 9 | 20 | yes | |||
| SWSE | rdf | 1 | ≫Mio | 9 | 14 | 20 | yes | ||
| Sindice | rdf | 1 | ≫Mio | 9 | 20 | yes | |||
| Watson | rdf | 1 | ≫Mio | 9 | 20 | yes | |||
| Falcons | rdf | 1 | ≫Mio | 9 | 14 | 20 | yes | yes | |
| CORESE | rdf | 1 | ≫Mio | 9 | 20 | yes | |||
| WikiDB | rdf | 1 | ≫Mio | 9 | 20 | ||||
| Hakia | txt | 4 | ≫Bio | 10 | 15 | 21 | yes | ||
| START | txt | 4 | ≫Bio | 10 | 22 | yes | |||
| Ask.com | txt | 4 | ≫Bio | 10 | 23 | ||||
| BrainBoost | txt | 4 | ≫Bio | 10 | 24 | yes | |||
| AnswerBus | txt | 4 | ≫Bio | 10 | 25 | yes | |||
| Cuil | txt | 4,8 | ≫Bio | 10 | 15 | 21 | yes | ||
| Clusty | txt | 5 | ≫Bio | 10 | 16 | 23,26 | yes | ||
| Carrot | txt | 5 | ≫Bio | 11 | 16 | 23,26 | yes | yes | |
| PowerSet | wiki | 4,8 | ≫Mio | 10 | 15 | 23,25 | yes | ||
| QuAliM | wiki/txt | 4,8 | ≫Mio | 11,10 | 22 | yes | |||
| GoWeb | txt | 2,3 | 6,7,8 | ≫Bio | 11 | 17 | 23,27 | yes | yes |
| askMedline | xml | 3 | ≫Mio | 12 | 28 | yes | |||
| EAGLi | xml | 2 | 4,6 | ≫Mio | 12 | 18 | 22,28 | yes | yes |
| GoPubMed | xml | 2,3 | 6,7,8 | ≫Mio | 12 | 17 | 23,27,28 | yes | yes |
| ClusterMed | xml | 3 | 5 | ≫Mio | 12 | 16 | 26,28 | yes | yes |
| IHop | xml | 3 | 6,7 | ≫Mio | 12 | 19 | 24,28 | yes | yes |
| EBIMed | xml | 2,3 | 6,7 | ≫Mio | 12 | 17 | 24,27 | yes | yes |
| XplorMed | xml | 3 | 5,6 | ≫Mio | 12 | 17 | 21,28 | yes | yes |
| Textpresso | xml | 2 | 6 | ≫Mio | 13 | 17 | 28 | yes | yes |
| Chilibot | xml | 7 | ≫Mio | 12 | 24 | yes | yes | ||
Figure 1. GoWeb screen shot, shown with the example query Fgf8 and the selected term "Zebrafish". On the left of the GoWeb website are the semantic filters in the where-what-who-when panels. In the what panel, GO and MeSH are shown in a tree representation. For this example, the MeSH branch "Organisms" is open. The most relevant concepts in this branch are listed, for instance "Mice" and "Zebrafish". The number of matching search results is given in brackets and is illustrated with a small bar chart. The bar indicates the fraction of the overall result set: the wider the bar, the more often the concept occurs in the search results. On the right side are the search results, with the query field and a summary on top. The search summary contains the current and the overall number of search results. The individual search results are presented as a list. Each result has a title and a short text extract; in both, keywords and terms are highlighted. The number in front of the title is the original result position.
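The bar shown next to each concept can be read as a simple fraction of annotated results. The following is a minimal sketch of that computation, assuming each search result carries a set of concept annotations; the function name `concept_bars`, the `bar_width` parameter and the sample data are hypothetical and are not taken from GoWeb.

```python
from collections import Counter

def concept_bars(result_annotations, bar_width=20):
    """Return {concept: (count, fraction, bar)} for a list of per-result concept sets."""
    total = len(result_annotations)
    # Count each concept at most once per result.
    counts = Counter(c for concepts in result_annotations for c in set(concepts))
    bars = {}
    for concept, count in counts.items():
        fraction = count / total if total else 0.0
        bars[concept] = (count, fraction, "#" * round(fraction * bar_width))
    return bars

# Hypothetical annotations for four search results (illustrative only).
results = [{"Mice"}, {"Mice", "Zebrafish"}, {"Zebrafish"}, {"Mice"}]
bars = concept_bars(results)
for concept, (count, fraction, bar) in sorted(bars.items()):
    print(f"{concept} ({count}) {bar}")
```

A concept that annotates three of four results gets a bar three quarters of the full width, which mirrors the "wider bar, more frequent concept" reading of the caption.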
Figure 2. GoWeb workflow. General workflow for GoWeb, showing the main components and the interactions with the external services. The workflow starts with the user submitting a query via the search input field on the GoWeb page (1). The search request is parsed and transformed into a search request for the external Yahoo! BOSS service (2). The service returns a list of results (snippets). The textual content is annotated by GoWeb (3) and, additionally, by the external OpenCalais service (4). The search keywords and the entities identified by the annotation are highlighted in the search results. The results are then rendered and sent to the browser (5). Based on the annotations and the ontology structure, the tree representation is induced; the top concepts are selected and sent to the browser (6).
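The steps of this workflow can be sketched roughly as follows. This is an illustration under stated assumptions, not GoWeb's implementation: `fetch_snippets` is a canned stand-in that does not call Yahoo! BOSS or OpenCalais, `TERMINOLOGY` is a hypothetical three-entry dictionary standing in for GO and MeSH labels, and annotation is reduced to a plain word lookup.

```python
import re

# Hypothetical mini-terminology: lowercase label -> concept name.
TERMINOLOGY = {"zebrafish": "Zebrafish", "mice": "Mice", "morphogenesis": "Morphogenesis"}

def fetch_snippets(query):
    """Stand-in for the external web search call (step 2); returns canned snippets."""
    return [
        {"title": "Fgf8 in zebrafish", "text": "Fgf8 signalling in zebrafish morphogenesis."},
        {"title": "Fgf8 knockout", "text": "Fgf8 mutant mice show limb defects."},
    ]

def annotate(text):
    """Step 3 (simplified): look up terminology labels among the words of the text."""
    words = set(re.findall(r"\w+", text.lower()))
    return {concept for label, concept in TERMINOLOGY.items() if label in words}

def highlight(text, terms):
    """Wrap query keywords and matched labels in <b> tags, ignoring case."""
    if not terms:
        return text
    alternatives = "|".join(re.escape(t) for t in terms)
    pattern = re.compile(r"\b(" + alternatives + r")\b", re.IGNORECASE)
    return pattern.sub(r"<b>\1</b>", text)

def search(query):
    """Steps 1 to 5: fetch snippets, annotate them, highlight, keep original position."""
    results = []
    for position, hit in enumerate(fetch_snippets(query), start=1):
        concepts = annotate(hit["title"] + " " + hit["text"])
        labels = [label for label, concept in TERMINOLOGY.items() if concept in concepts]
        results.append({
            "position": position,
            "concepts": concepts,
            "html": highlight(hit["text"], query.split() + labels),
        })
    return results

results = search("Fgf8")
for r in results:
    print(r["position"], sorted(r["concepts"]), r["html"])
```

The sketch stops before step 6; inducing the concept tree would additionally need the ontology structure.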
Figure 3. GoWeb workflow (2). Workflow for a request containing a concept selected from the result tree in the user's browser. When a user clicks on a concept in the tree on the GoWeb website (1), the browser sends a request to update the search results (2). GoWeb filters all search results: if a result is annotated with the concept or with a child of this concept, it is included in the new result list (3). Additionally, the results are reranked and the highlighting of the selected concept and its children is updated. The new result list is finally sent to the browser.
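The filtering step (3) amounts to a subtree test on the ontology. Below is a minimal sketch, assuming a parent-to-children mapping and results that carry a set of concept annotations; the `CHILDREN` fragment, the function names and the sample results are hypothetical. The reranking mentioned in the caption is not specified there, so the sketch simply keeps the original order.

```python
# Hypothetical ontology fragment: parent concept -> direct children.
CHILDREN = {
    "Organisms": ["Animals"],
    "Animals": ["Fishes", "Mice"],
    "Fishes": ["Zebrafish", "Cichlids"],
}

def descendants(concept):
    """Return the concept together with all concepts below it in the tree."""
    found, stack = set(), [concept]
    while stack:
        current = stack.pop()
        if current not in found:
            found.add(current)
            stack.extend(CHILDREN.get(current, []))
    return found

def filter_results(results, selected):
    """Keep results annotated with the selected concept or one of its children."""
    wanted = descendants(selected)
    return [r for r in results if wanted & r["concepts"]]

results = [
    {"position": 1, "concepts": {"Mice"}},
    {"position": 2, "concepts": {"Zebrafish"}},
    {"position": 3, "concepts": set()},
    {"position": 4, "concepts": {"Cichlids", "Mice"}},
]
fishes = filter_results(results, "Fishes")
print([r["position"] for r in fishes])
```

Selecting "Fishes" keeps the results annotated with "Zebrafish" or "Cichlids", while the unannotated result and the one annotated only with "Mice" drop out.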
Figure 4. GoWeb screen shot, shown with the example query Fgf8 and the selected term "Cichlids". In addition to the top terms offered as answers, the concept "Cichlids", for instance, is listed under "Organisms". When it is selected, GoWeb retrieves the matching snippets. For this example search, the result set is reduced to three articles, which were formerly at positions 265, 739 and 943. The snippets discuss the use of cichlids in the study of tooth and jaw morphogenesis. To learn more about cichlids, the user can read the concept definitions in the tooltips or follow the related links to Wikipedia.
Overview of the GoWeb results for the symptoms and diseases benchmark
| Case | Query | GoWeb | Count |
|---|---|---|---|
| 5 | Acute "Aortic regurgitation" depression abscess | Tree: Endocarditis, Bacterial | 7 (1000) |
| 6 | oesophageal cancer hiccup nausea vomiting | Tree: Adenocarcinoma AND Intestinal Obstruction | 2 (1000) |
| 7 | hypertension "adrenal mass" | Top categories: Cushing Syndrome | 41 (1000) |
| 8 | "hip lesion" child | no, bmj article | 258 |
| 9 | HRCT centrilobular nodule "acute respiratory failure" | Finds the case studies this analysis relies on | 15 |
| 10 | fever bilateral "thigh pain" weakness | no, bmj article | 500 |
| 11 | fever "anterior mediastinal mass" central necrosis | Top categories: Lymphoma | 66 (323) |
| 12 | multiple "spinal tumors" "skin tumors" | Top categories: Neurofibromatoses | 21 (240) |
| 14 | "ulcerative colitis" "blurred vision" fever | Tree: Vasculitis | 2 (1000) |
| 15 | nephrotic syndrome "Bence Jones" ventricular failure | Top categories: Amyloidosis | 20 (247) |
| 16 | hypertension papilledema headache "renal mass" | Tree: Pheochromocytoma | 1 (31) |
| 17 | "sickle cell" pulmonary infiltrates "back pain" | Top5 snippet is ACS | 1000 |
| 18 | fibroma astrocytoma tumor leiomyoma scoliosis | no, bmj article | 1 (47) |
| 19 | pulmonary infiltrates "cns lesion" OR "Central nervous system lesion" | no | 87 |
| 22 | CLL encephalitis | Tree: West Nile Fever | 3 (1000) |
| 25 | "portal vein thrombosis" cancer | Tree: Phlebitis | 9 (1000) |
| 26 | "cardiac arrest" exercise young | Top categories: Cardiomyopathy, Hypertrophic | 22 (1000) |
| 27 | ataxia confusion insomnia death | Tree: CJD | 17 (1000) |
| 28 | ANCA haematuria haemoptysis | Top categories: Churg-Strauss Syndrome | 3 (126) |
| 29 | myopathy neoplasia dysphagia rash periorbital swelling | Top categories: Dermatomyositis | 4 (32) |
| 30 | "renal transplant" fever cat lymphadenopathy | Top categories: Cat-Scratch Disease | 13 (322) |
| 31 | "buttock rash" "renal failure" edema | no | 120 |
| 33 | polyps telangiectasia epistaxis anemia | Top categories: Telangiectasia, Hereditary Hemorrhagic | 33 (1000) |
| 34 | "bullous skin" "respiratory failure" carbamazepine | Top categories: Epidermal Necrolysis, Toxic | 4 (25) |
| 36 | seizure confusion dysphasia lesions | no | 1000 |
| 37 | cardiac arrest sleep | Tree: Brugada Syndrome | 3 (1000) |
Comparison of Google, GoPubMed and GoWeb for symptoms and diseases benchmark
| Case | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 | 14 | 15 | 16 | 17 | 18 | 19 | 22 | 25 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Google | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ||||||||
| GoPubMed | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | |||||||||
| GoWeb | ✓ | ✓ | ✓ | | ✓ | | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | | | ✓ | ✓ |
| Case | 26 | 27 | 28 | 29 | 30 | 31 | 33 | 34 | 36 | 37 | Count | Ratio | ||||
| Google | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | 16 | 62% | ||||||
| GoPubMed | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | 13 | 50% | ||||||||
| GoWeb | ✓ | ✓ | ✓ | ✓ | ✓ | | ✓ | ✓ | | ✓ | 20 | 77% | ||||
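
The Ratio column follows from dividing each Count by the 26 benchmark cases listed in the two header rows. A short check of that arithmetic, using the three engines named in the table title (the variable names are illustrative):

```python
TOTAL_CASES = 26  # 16 cases in the first header row plus 10 in the second

solved = {"Google": 16, "GoPubMed": 13, "GoWeb": 20}
ratios = {engine: round(100 * count / TOTAL_CASES) for engine, count in solved.items()}
print(ratios)  # {'Google': 62, 'GoPubMed': 50, 'GoWeb': 77}
```
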
Summary of TREC Genomics 2006 answering capabilities of GoWeb
| Question | 160 | 161 | 162 | 163 | 164 | 165 | 166 | 167 | 168 | 169 |
|---|---|---|---|---|---|---|---|---|---|---|
| Answered | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ |
| Filter | ✓ | ✓ | ✓ | ✓ | ✓ | |||||
| Question | 170 | 171 | 172 | 173 | 174 | 175 | 176 | 177 | 178 | 179 |
| Answered | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ||||
| Filter | ✓ | ✓ | ✓ | |||||||
| Question | 180 | 181 | 182 | 183 | 184 | 185 | 186 | 187 | Count | |
| Answered | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | 22 | |||
| Filter | ✓ | ✓ | ✓ | ✓ | ✓ | 13 | ||||
TREC Genomics 2006 questions and keywords
| No. | Question | Keywords |
|---|---|---|
| 160 | What is the role of PrnP in mad cow disease? | PrnP |
| 161 | What is the role of IDE in Alzheimer's disease? | IDE Alzheimer |
| 162 | What is the role of MMS2 in cancer? | MMS2 |
| 163 | What is the role of APC (adenomatous polyposis coli) in colon cancer? | APC adenomatous polyposis coli |
| 164 | What is the role of Nurr-77 in Parkinson's disease? | Nurr-77 |
| 165 | How do Cathepsin D (CTSD) and apolipoprotein E (ApoE) interactions contribute to Alzheimer's disease? | "Cathepsin D" "apolipoprotein E" |
| 166 | What is the role of Transforming growth factor-beta1 (TGF-beta1) in cerebral amyloid angiopathy (CAA)? | TGF-beta1 cerebral amyloid angiopathy |
| 167 | How does nucleoside diphosphate kinase (NM23) contribute to tumor progression? | NM23 tumor progression |
| 168 | How does BARD1 regulate BRCA1 activity? | BARD1 BRCA1 |
| 169 | How does APC (adenomatous polyposis coli) protein affect actin assembly? | adenomatous polyposis coli actin assembly |
| 170 | How does COP2 contribute to CFTR export from the endoplasmic reticulum? | COP2 CFTR |
| 171 | How does Nurr-77 delete T cells before they migrate to the spleen or lymph nodes and how does this impact autoimmunity? | Nurr-77 T cell |
| 172 | How does p53 affect apoptosis? | p53 apoptosis |
| 173 | How do alpha7 nicotinic receptor subunits affect ethanol metabolism? | alpha7 nicotinic receptor ethanol |
| 174 | How does BRCA1 ubiquitinating activity contribute to cancer? | BRCA1 ubiquitinating |
| 175 | How does L2 interact with L1 to form HPV11 viral capsids? | L1 L2 HPV11 |
| 176 | How does Sec61-mediated CFTR degradation contribute to cystic fibrosis? | Sec61 CFTR |
| 177 | How do Bop-Pes interactions affect cell growth? | Bop Pes cell growth |
| 178 | How do interactions between insulin-like GFs and the insulin receptor affect skin biology? | insulin-like GF insulin receptor |
| 179 | How do interactions between HNF4 and COUP-TF1 suppress liver function? | HNF4 COUP-TF1 |
| 180 | How do Ret-GDNF interactions affect liver development? | Ret GDNF liver |
| 181 | How do mutations in the Huntingtin gene affect Huntington's disease? | Huntingtin gene |
| 182 | How do mutations in Sonic Hedgehog genes affect developmental disorders? | Sonic Hedgehog gene |
| 183 | How do mutations in the NM23 gene affect tracheal development? | NM23 tracheal development |
| 184 | How do mutations in the Pes gene affect cell growth? | Pes gene cell growth |
| 185 | How do mutations in the hypocretin receptor 2 gene affect narcolepsy? | hypocretin receptor 2 narcolepsy |
| 186 | How do mutations in the Presenilin-1 gene affect Alzheimer's disease? | Presenilin-1 Alzheimer |
| 187 | How do mutations in familial hemiplegic migraine type 1 (FHM1) gene affect calcium ion influx in hippocampal neurons? | FHM1 calcium neuron |
Answers for TREC Genomics 2006 questions 160 to 164
| Concept | original Pos | Evidence |
|---|---|---|
| 160 Encephalopathy, Bovine Spongiform | | Transmissible Spongiform Encephalopathy (BSE) Bovine spongiform encepalopathy is a transmissible, ... Mutations in the PRNP gene cause prion disease. ... |
| 161 | | Insulin-Degrading Enzyme as a Downstream Target of Insulin Receptor ... effect relationship between insulin signaling and IDE upregulation. ... P85) was correlated with reduced IDE in Alzheimer's disease (AD) brains and in ... |
| | | Insulin degrading enzyme – Wikipedia, the free encyclopedia 1 IDE and Alzheimer's Disease. 2 IDE Structure and Function. 3 References. 4 External links ... between IDE, A |
| 162 DNA Damage | | ... concerted action of RAD5 with UBC13 and MMS2 in DNA damage repair is given by ... Finally, it is shown that MMS2, like UBC13 and many other repair genes, is ... |
| 163 | | The official name of this gene is "adenomatous polyposis coli." APC is the gene's official symbol. ... adenomatous polyposis – caused by mutations in the APC ... |
| 164 Parkinson Disease | | The aetiology of idiopathic Parkinson's disease Nurr 1 was first recognised as a transcription factor that was primarily ... Its close structural relation to Nur 77 led to its identification in stimulated ... |
| | | Concise Review: Therapeutic Strategies for Parkinson Disease Based on ... nuclear related receptor 1 (Nurr-1), thereby withdrawing the cells of the cell ... in the SVZ and the substantia nigra of the healthy adult rat brain [77, 98] ... |
| | | Parkinson's disease: piecing together a genetic jigsaw – Dekker et al ... study decreased rapidly with later onset: 77% of patients with onset of disease ... agenesis of mesencephalic dopaminergic neurons in Nurr-1 deficient mice. ... |