Tobias Wittkop, Emily TerAvest, Uday S Evani, K Mathew Fleisch, Ari E Berman, Corey Powell, Nigam H Shah, Sean D Mooney.
Abstract
BACKGROUND: Gene Ontology (GO) enrichment analysis remains one of the most common methods for hypothesis generation from high-throughput datasets. However, we believe that researchers strive to test other hypotheses that fall outside of GO. Here, we developed and evaluated a tool for hypothesis generation from gene or protein lists using ontological concepts present in manually curated text that describes those genes and proteins.
Year: 2013 PMID: 23409969 PMCID: PMC3635999 DOI: 10.1186/1471-2105-14-53
Source DB: PubMed Journal: BMC Bioinformatics ISSN: 1471-2105 Impact factor: 3.169
Figure 1. Overview of the computational workflow of STOP. The left side illustrates the backend of the STOP software, i.e. the automatic annotation pipeline: (1) the genomes and proteomes of all included species are retrieved from Entrez Gene and UniProt, respectively; (2) descriptions for all genes/proteins are collected from UniProt and Entrez Gene; and (3) these descriptions are submitted to the NCBO Annotator web service. The resulting annotations are stored in a MySQL database and can be accessed by the frontend, which is displayed on the right side. The real-time analysis pipeline takes a list of genes as input and calculates, for each term of the 200+ ontologies, whether it is enriched in the given gene list. The results are then presented as a tag cloud or in list form.
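The caption does not say which statistical test the real-time pipeline applies to each ontology term. As a minimal sketch, assuming a one-sided hypergeometric test (a common choice for enrichment analysis, not confirmed as STOP's method), the per-term calculation could look like the following. The function names and the toy annotation data are hypothetical.

```python
from math import comb


def hypergeom_enrichment_p(background_size, background_hits, list_size, list_hits):
    """Upper-tail hypergeometric probability P(X >= list_hits).

    Models drawing `list_size` genes without replacement from a background of
    `background_size` genes, `background_hits` of which carry the term.
    """
    max_hits = min(background_hits, list_size)
    total = comb(background_size, list_size)
    tail = sum(
        comb(background_hits, k)
        * comb(background_size - background_hits, list_size - k)
        for k in range(list_hits, max_hits + 1)
    )
    return tail / total


def term_p_values(gene_list, background, term_to_genes):
    """Return {term: p-value} for every term hit by at least one input gene.

    `term_to_genes` maps a term to the set of genes annotated with it
    (a hypothetical in-memory stand-in for the annotation database).
    """
    bg = set(background)
    genes = set(gene_list) & bg
    result = {}
    for term, annotated in term_to_genes.items():
        annotated_bg = set(annotated) & bg
        hits = len(genes & annotated_bg)
        if hits == 0:
            continue  # a term with no hits cannot be enriched
        result[term] = hypergeom_enrichment_p(
            len(bg), len(annotated_bg), len(genes), hits
        )
    return result


# Toy data: 10 background genes, two terms (all names are made up).
background = [f"g{i}" for i in range(10)]
term_to_genes = {
    "term_A": {"g0", "g1", "g2"},
    "term_B": {"g5", "g6", "g7", "g8", "g9"},
}
p_values = term_p_values(["g0", "g1", "g2"], background, term_to_genes)
```

In this toy case only `term_A` is returned, because none of the input genes carries `term_B`. With 200+ ontologies the resulting p-values would still need a multiple-testing correction before being ranked.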
Summary of comparison between STOP and GO annotations
| Entrez Gene/Entrez Gene | 0.993 | 0.678 | 0.806 |
| Entrez Gene/GOA | 0.979 | 0.674 | 0.798 |
| UniProt/GOA | 0.998 | 0.608 | 0.756 |
| Entrez Gene/Entrez Gene | 0.990 | 0.791 | 0.879 |
| Entrez Gene/MGI | 0.990 | 0.791 | 0.879 |
| UniProt/GOA | 0.999 | 0.746 | 0.854 |
| Entrez Gene/Entrez Gene | 0.987 | 0.724 | 0.835 |
| Entrez Gene/RGD | 0.959 | 0.713 | 0.818 |
| UniProt/GOA | 0.999 | 0.736 | 0.847 |
| Entrez Gene/Entrez Gene | 0.987 | 0.767 | 0.863 |
| Entrez Gene/FlyBase | 0.978 | 0.762 | 0.857 |
| UniProt/GOA | 0.992 | 0.751 | 0.855 |
| Entrez Gene/Entrez Gene | 0.998 | 0.783 | 0.878 |
| Entrez Gene/WormBase | 0.998 | 0.783 | 0.878 |
| UniProt/GOA | 0.999 | 0.788 | 0.881 |
| Entrez Gene/Entrez Gene | 0.994 | 0.798 | 0.885 |
| Entrez Gene/SGD | 0.994 | 0.798 | 0.885 |
| UniProt/GOA | 0.998 | 0.630 | 0.773 |
| Entrez Gene/Entrez Gene | 1.000 | 0.611 | 0.758 |
| Entrez Gene/EcoCyc | 0.340 | 0.354 | 0.347 |
| UniProt/GOA | 0.964 | 0.826 | 0.890 |
STOP annotations based on Entrez Gene descriptions are compared against the gene2go annotations from Entrez Gene and against species-specific databases, whose annotations were downloaded from http://www.geneontology.org. STOP annotations based on UniProt descriptions are compared against GOA annotations. Recall and Precision are calculated for each gene and subsequently averaged. The F-measure is the harmonic mean of these average Recall and Precision values.
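The scoring described in this note can be sketched as follows. This is an illustration under stated assumptions, not the authors' code: the note does not say which genes enter the average or how empty annotation sets are handled, so the sketch averages over genes present in both inputs and scores an empty set as 0. All function names and the toy data are hypothetical.

```python
def precision_recall(predicted, reference):
    """Precision and recall of one gene's predicted term set against its reference set."""
    true_positives = len(predicted & reference)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(reference) if reference else 0.0
    return precision, recall


def f_measure(avg_precision, avg_recall):
    """Harmonic mean of the averaged precision and recall."""
    total = avg_precision + avg_recall
    return 2 * avg_precision * avg_recall / total if total > 0 else 0.0


def averaged_scores(predicted_by_gene, reference_by_gene):
    """Average per-gene precision and recall, then combine them into an F-measure.

    Assumption: only genes present in both mappings are scored.
    """
    genes = sorted(set(predicted_by_gene) & set(reference_by_gene))
    if not genes:
        return 0.0, 0.0, 0.0
    pairs = [precision_recall(predicted_by_gene[g], reference_by_gene[g]) for g in genes]
    avg_precision = sum(p for p, _ in pairs) / len(pairs)
    avg_recall = sum(r for _, r in pairs) / len(pairs)
    return avg_precision, avg_recall, f_measure(avg_precision, avg_recall)


# Toy example with two genes and made-up term identifiers.
predicted = {"g1": {"a", "b"}, "g2": {"c"}}
reference = {"g1": {"a"}, "g2": {"c", "d"}}
scores = averaged_scores(predicted, reference)  # (0.75, 0.75, 0.75)

# Consistency check against the first table row: the harmonic mean of
# 0.993 and 0.678 reproduces the third column.
first_row_f = round(f_measure(0.993, 0.678), 3)  # 0.806
```

Because the F-measure is taken over the averaged values rather than averaged per gene, it can be recomputed directly from any row of the table, as the last line does.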
Figure 2. Top 30 enriched terms for DAVID and STOP analysis of HTT-interacting proteins and STOP analysis of Parkinson’s genes. Fifty-nine genes from the HPRD database known to interact with the human Huntingtin (HTT) gene were analyzed using STOP and DAVID (GO). Fourteen proteins known to be involved in Parkinson’s disease were analyzed with STOP. (A) The list of HTT-interacting proteins was submitted to DAVID, and enrichment analysis was carried out with GO_all using SwissProt Human as the background. The top 30 annotations are shown. (B) The same proteins were also submitted to STOP with the same background, and the results were limited to annotations from the preferred ontologies. (C) The Parkinson’s-related proteins were similarly analyzed with STOP, again limited to annotations from the preferred ontologies. The top 30 categories are shown along with their significance. Significance is defined as -log10 of the Benjamini-Hochberg corrected p-value. For reference, p = 0.01 is equivalent to a significance of 2.
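The significance score in this caption can be reproduced with a short sketch. It uses the standard Benjamini-Hochberg step-up adjustment and a base-10 logarithm; the base is inferred from the caption's statement that p = 0.01 corresponds to 2. The function names and the example p-values are hypothetical.

```python
from math import log10


def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the original input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])  # indices by ascending p
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so that monotonicity can be enforced.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def significance(adjusted_p):
    """Significance score: negative base-10 logarithm of an adjusted p-value."""
    return -log10(adjusted_p)


# Made-up raw p-values for four terms.
raw = [0.01, 0.04, 0.03, 0.005]
adjusted = benjamini_hochberg(raw)            # approximately [0.02, 0.04, 0.04, 0.02]
scores = [significance(p) for p in adjusted]  # larger score = more significant
reference_score = significance(0.01)          # approximately 2, as in the caption
```

Ranking terms by this score in descending order and keeping the first 30 would give a list of the kind shown in each panel.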
Figure 3. The STOP website showing results in a bar graph. On the left, the navigation interface with previously performed jobs is shown; on the right, the enriched categories for the Huntingtin primary interactors that are present in our list of preferred networks are displayed.