Dean Cheng, Craig Knox, Nelson Young, Paul Stothard, Sambasivarao Damaraju, David S Wishart.
Abstract
A particular challenge in biomedical text mining is to find ways of handling 'comprehensive' or 'associative' queries such as 'Find all genes associated with breast cancer'. Given that many queries in genomics, proteomics or metabolomics involve these kinds of comprehensive searches, we believe that a web-based tool that could support these searches would be quite useful. In response to this need, we have developed the PolySearch web server. PolySearch supports >50 different classes of queries against nearly a dozen different types of text, scientific abstract or bioinformatic databases. The typical query supported by PolySearch is 'Given X, find all Y's' where X or Y can be diseases, tissues, cell compartments, gene/protein names, SNPs, mutations, drugs and metabolites. PolySearch also exploits a variety of techniques in text mining and information retrieval to identify, highlight and rank informative abstracts, paragraphs or sentences. PolySearch's performance has been assessed in tasks such as gene synonym identification, protein-protein interaction identification and disease gene identification using a variety of manually assembled 'gold standard' text corpora. Its f-measure on these tasks is 88, 81 and 79%, respectively. These values are between 5 and 50% better than those of other published tools. The server is freely available at http://wishart.biology.ualberta.ca/polysearch.
Year: 2008 PMID: 18487273 PMCID: PMC2447794 DOI: 10.1093/nar/gkn296
Source DB: PubMed Journal: Nucleic Acids Res ISSN: 0305-1048 Impact factor: 16.971
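The abstract reports PolySearch's performance as an f-measure. As a reminder of what that metric combines, here is a minimal sketch that assumes the balanced form (F1, the harmonic mean of precision and recall); the abstract does not state which variant was used. The function name `f_measure` and the counts below are hypothetical and are not taken from the PolySearch evaluation.

```python
def f_measure(true_positives: int, false_positives: int, false_negatives: int) -> float:
    """Balanced F-measure (F1): harmonic mean of precision and recall.

    Assumption: the balanced variant is used; the abstract does not say which.
    """
    predicted = true_positives + false_positives  # items the tool returned
    actual = true_positives + false_negatives     # items in the gold standard
    if predicted == 0 or actual == 0:
        return 0.0
    precision = true_positives / predicted
    recall = true_positives / actual
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Hypothetical counts for illustration only.
score = f_measure(true_positives=80, false_positives=20, false_negatives=20)
print(f"{score:.2f}")  # prints 0.80
```

With these illustrative counts, precision and recall are both 0.8, so the f-measure is also 0.8. A tool that returns many spurious hits or misses many gold-standard entries is penalised on one side or the other, which is why a single f-measure is a common summary for text-mining tasks like those listed in the abstract.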
Figure 1. A screenshot montage of PolySearch's query interface and result display showing: (A) the PolySearch query interface; (B) the query refinement page; (C) the PolySearch result table where both PubMed and OMIM were searched; and (D) the sentence and keyword display view obtained by clicking on the PubMed citation links in the result table.
A detailed listing of all allowed ‘basic’ queries in PolySearch
| Find (rows) / Given (columns) | Disease | Gene/protein | Drug | Metabolite | Text word | Pathway | Tissue | SNP (RS#) | Gene/protein sequence |
|---|---|---|---|---|---|---|---|---|---|
| Disease | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | |
| Gene/protein | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ |
| Drug | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | |
| Metabolite | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ||
| Tissue | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | |
| Organ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | |
| Subcellular localization | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | |
| Pathway | ✓ | ✓ | ✓ | ✓ | ✓ | ||||
| Text word | ✓ | ||||||||
| SNP | ✓ | ✓ | |||||||
| PCR primers | ✓ | ✓ | |||||||