Graciela Gonzalez, Juan C Uribe, Brock Armstrong, Wendy McDonough, Michael E Berens.
Abstract
With the overwhelming volume of genomic and molecular information now available across many databases, researchers need more from bioinformaticians than encouragement to refine their searches. We present GeneRanker, an online system that lets researchers obtain a ranked list of genes potentially related to a specific disease or biological process. It combines gene-disease (or gene-biological process) associations with protein-protein interactions extracted from the literature, and uses computational analysis of the protein network topology to rank the predicted associations more accurately. GeneRanker was evaluated in the context of brain cancer research and is freely available online at http://www.generanker.org.
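The abstract does not say which topology measure GeneRanker uses, so the following is only a minimal sketch of the general idea: start from a seed set of genes associated with a disease, add their interaction partners, and rank every gene by how many seed genes it interacts with. The function name `rank_by_seed_connectivity`, the scoring rule, and the placeholder gene names are all assumptions made for illustration, not the published method.

```python
from collections import defaultdict


def rank_by_seed_connectivity(seed_genes, interactions):
    """Rank genes by the number of distinct seed genes they interact with.

    seed_genes   -- genes already associated with the disease (hypothetical input)
    interactions -- iterable of (gene_a, gene_b) protein-protein interaction pairs
    This scoring rule is an illustrative assumption, not GeneRanker's algorithm.
    """
    seeds = set(seed_genes)
    partners = defaultdict(set)
    for a, b in interactions:
        if a == b:
            continue  # ignore self-interactions
        partners[a].add(b)
        partners[b].add(a)

    scores = {gene: 0 for gene in seeds}
    for gene, linked in partners.items():
        seed_links = linked & seeds
        # Keep seed genes and any gene directly linked to at least one seed.
        if gene in seeds or seed_links:
            scores[gene] = len(seed_links)

    # Highest score first; ties broken alphabetically for a stable order.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


# Toy data with placeholder names (not real gene symbols or real interactions).
seeds = ["GENE_A", "GENE_B", "GENE_C"]
ppi = [
    ("GENE_A", "GENE_B"),
    ("GENE_A", "GENE_C"),
    ("GENE_B", "GENE_D"),
    ("GENE_C", "GENE_D"),
    ("GENE_D", "GENE_E"),
]
ranked = rank_by_seed_connectivity(seeds, ppi)
for gene, score in ranked:
    print(gene, score)
```

In this toy network, `GENE_D` is not in the seed set but ranks alongside the best-connected seed because it interacts with two seed genes, which is the kind of predicted association a topology-based ranking is meant to surface.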
Year: 2008 PMID: 21347122 PMCID: PMC3041521
Source DB: PubMed Journal: Summit Transl Bioinform ISSN: 2153-6430
Figure 1. Initial set of genes. After the user types a disease or biological process, the GeneRanker web interface (available at www.generanker.org) displays an initial set of genes obtained from relevant gene-disease associations extracted by CBioC from the biomedical literature. Accuracy is estimated at 75% for this initial set.
Figure 3. GeneRanker, showing the top-ranked genes for glioblastoma. The interface allows users to apply the method to any disease or biological process, obtaining a ranked list of potentially related genes. Precision reaches 91 to 94% for the top 100 genes in the list.
Results of comparing GeneRanker to text extraction in finding genes associated with a specific disease. True positives (TP) are genes that are either associated with the disease in OMIM or that were found to co-occur with it in PubMed abstracts with an overrepresentation index greater than 1. GeneRanker exceeds the performance of text extraction by up to 17%.
| List | TP | FP | Precision |
|---|---|---|---|
| Text extraction list 1 | 153 | 47 | 77% |
| Text extraction list 2 | 159 | 41 | 80% |
| Text extraction list 3 | 150 | 50 | 75% |
| Mean (SD) | 154.0 (4.6) | 46.0 (4.6) | |
Evaluation against a glioblastoma (GBM) dataset. The percentage of probes with 2-fold differential expression (up or down) in the GBM dataset is noted for (a) a set of 10 random lists of 300 genes, (b) a list of genes obtained from gene ontology (GO) annotations for cell-cell adhesion, and (c) the top 300 genes from GeneRanker. The effect size of the latter with respect to the GO list (and therefore with respect to the random lists) is highly significant.
| Gene list | Probes with 2-fold differential expression |
|---|---|
| Random gene list 1 | 16.4% |
| Random gene list 2 | 13.6% |
| Random gene list 3 | 17.4% |
| Random gene list 4 | 14.5% |
| Random gene list 5 | 14.8% |
| Random gene list 6 | 17.9% |
| Random gene list 7 | 11.6% |
| Random gene list 8 | 18.8% |
| Random gene list 9 | 16.1% |
| Random gene list 10 | 14.8% |
| GO list for cell-cell adhesion | 19.5% |
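To put the listed percentages in context, the random lists can be summarized and the GO list compared against them. The source does not state how the effect size was computed, so the standardized difference used here (distance from the random mean in units of the random lists' sample standard deviation) is an assumption chosen for illustration. The GeneRanker value does not appear in the rows above, so it is left out rather than guessed.

```python
from statistics import mean, stdev

# Percentage of probes with 2-fold differential expression, as listed above.
random_lists = [16.4, 13.6, 17.4, 14.5, 14.8, 17.9, 11.6, 18.8, 16.1, 14.8]
go_cell_cell_adhesion = 19.5

random_mean = mean(random_lists)
random_sd = stdev(random_lists)  # sample standard deviation (n - 1)


def standardized_difference(value, baseline_mean, baseline_sd):
    """Distance of a value from a baseline mean, in baseline SD units.

    Used here as an illustrative effect-size measure (an assumption).
    """
    return (value - baseline_mean) / baseline_sd


go_vs_random = standardized_difference(go_cell_cell_adhesion, random_mean, random_sd)

print(f"Random lists: mean = {random_mean:.2f}%, SD = {random_sd:.2f}")
print(f"GO list vs. random lists: {go_vs_random:.2f} SD above the random mean")
```

The same function could be applied to the GeneRanker percentage once that value is available.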