Eran Eden, Roy Navon, Israel Steinfeld, Doron Lipson, Zohar Yakhini.
Abstract
BACKGROUND: Since the inception of the GO annotation project, a variety of tools have been developed that support exploring and searching the GO database. In particular, a variety of tools that perform GO enrichment analysis are currently available. Most of these tools require as input a target set of genes and a background set and seek enrichment in the target set compared to the background set. A few tools also exist that support analyzing ranked lists. The latter typically rely on simulations or on union-bound correction for assigning statistical significance to the results.
Year: 2009 PMID: 19192299 PMCID: PMC2644678 DOI: 10.1186/1471-2105-10-48
Source DB: PubMed Journal: BMC Bioinformatics ISSN: 1471-2105 Impact factor: 3.169
Figure 1. How to use the GOrilla web interface. To use the GOrilla web interface, the user performs four simple steps: (i) choose an organism; (ii) choose a running mode (either flexible threshold or fixed threshold); (iii) copy and paste a list of genes (or upload a file) in flexible threshold mode, or two lists of genes, a target and a background, in fixed threshold mode; (iv) choose an ontology.
Figure 2. An example of the GOrilla output. 14,565 genes from the van't Veer dataset were ranked according to their differential expression and given as input to GOrilla. The resulting enriched GO terms are visualized as a DAG, with color coding reflecting their degree of enrichment. Nodes in the graph are clickable and give additional information on the GO terms and the genes contributing to the enrichment. N is the total number of genes; B is the total number of genes associated with a specific GO term; n is the flexible cutoff, i.e. the automatically determined number of genes in the 'target set'; and b is the number of genes in the 'target set' that are associated with a specific GO term. Enrichment is defined as (b/n)/(B/N).
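The quantities N, B, n and b from the Figure 2 caption can be illustrated with a short Python sketch. The enrichment formula is taken directly from the caption. Everything else is an assumption made for illustration: the function names are hypothetical, and the cutoff search assumes that the flexible cutoff n is the one minimizing the hypergeometric tail probability over all prefixes of the ranked list (the usual definition of a minimum-hypergeometric score). The exact mHG p-value, which corrects this score for having tried every cutoff, is not computed here, and this is not GOrilla's own code.

```python
from math import comb


def enrichment(N, B, n, b):
    """Enrichment as defined in the caption: (b/n) / (B/N)."""
    return (b / n) / (B / N)


def hypergeom_tail(N, B, n, b):
    """P(X >= b) when n genes are drawn without replacement from N genes,
    B of which are associated with the GO term."""
    upper = min(n, B)
    hits = sum(comb(B, i) * comb(N - B, n - i) for i in range(b, upper + 1))
    return hits / comb(N, n)


def mhg_scan(labels):
    """Scan every cutoff n = 1..N of a ranked 0/1 list (1 = gene associated
    with the GO term) and return (score, n, b) for the cutoff with the
    smallest hypergeometric tail. Assumed definition; the score is NOT a
    corrected p-value."""
    N = len(labels)
    B = sum(labels)
    best = (1.0, 0, 0)
    b = 0
    for n, label in enumerate(labels, start=1):
        b += label
        p = hypergeom_tail(N, B, n, b)
        if p < best[0]:
            best = (p, n, b)
    return best


# Toy ranked list of 10 genes, 4 of which carry the GO term.
ranked = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]
score, n, b = mhg_scan(ranked)
print(score, n, b, enrichment(len(ranked), sum(ranked), n, b))
```

For this toy list the scan selects n = 5 with b = 4, giving an enrichment of 2. The direct summation is fine for short lists but would be slow for a list of thousands of genes, where an incremental update of the tail probability would be preferable.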
A comparison of web-based GO enrichment tools.
| Tool | Statistical method | | | | Running time |
|---|---|---|---|---|---|
| GOrilla | Exact mHG p-value computation (no need for simulations) | + | + | + | 7 sec |
| Fatiscan | Fisher exact (FDR corrected for number of thresholds) | + | - | + | 30 min |
| GO-stat | Wilcoxon rank-sum/Kolmogorov-Smirnov | + | - | + | 2 min |
| GOEAST | Hypergeometric | - | + | + | 20 min |
| SGD | Hypergeometric | - | + | - | 2 min |
| DAVID | Modified Fisher exact | - | - | + | 2 min |
| GOTM | Hypergeometric | - | + | + | 2 min |
| GoMiner | Fisher exact | - | - | + | 7 min |
Different GO enrichment tools employ a wide range of statistics and differ in performance. The main features of seven web-based tools are compared to those of GOrilla. To enable a fair comparison, all tools were run with default parameters via their web interfaces and applied to the van't Veer dataset. The one exception is the SGD tool, which only runs on yeast data; for running time characterization it was therefore tested on a set of 543 yeast genes with the default background. The GOrilla running time for this yeast dataset was also 7 seconds. Running time was measured for the entire analysis, including uploading files and retrieving the results.