Pedro Carmona-Saez, Monica Chagoyen, Francisco Tirado, Jose M Carazo, Alberto Pascual-Montano.
Abstract
We present GENECODIS, a web-based tool that integrates different sources of information to search for annotations that frequently co-occur in a set of genes and rank them by statistical significance. The analysis of concurrent annotations provides significant information for the biologic interpretation of high-throughput experiments and may outperform the results of standard methods for the functional analysis of gene lists. GENECODIS is publicly available at http://genecodis.dacya.ucm.es/.
Year: 2007 PMID: 17204154 PMCID: PMC1839127 DOI: 10.1186/gb-2007-8-1-r3
Source DB: PubMed Journal: Genome Biol ISSN: 1474-7596 Impact factor: 13.583
Figure 1. Overview of the methodology. (a) Annotations from several sources are assigned to genes in the input list. (b) The apriori algorithm is applied to find sets of annotations that frequently co-occur in the input list. (c) The statistical significance of each annotation or set of concurrent annotations is calculated based on its frequency in the input and reference sets. The figure illustrates an example in which a list of yeast genes is annotated with Gene Ontology (GO) terms for 'cellular component' and KEGG pathways. In the output table only the annotations that co-occur in more than five genes are shown.
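Step (b) of the methodology can be illustrated with a small apriori-style search over annotation sets. This is a minimal sketch, not the GENECODIS implementation: the function name `frequent_annotation_sets`, the `min_support` parameter, and the toy gene and annotation identifiers are all hypothetical, and the support threshold is treated here as "at least `min_support` genes".

```python
from itertools import combinations

def frequent_annotation_sets(gene_annotations, min_support):
    """Map each frequent annotation set to the genes that carry all of its annotations.

    gene_annotations: dict of gene -> set of annotation labels (hypothetical input shape).
    min_support: minimum number of genes that must share the whole set.
    """
    # Level 1: collect the genes supporting each single annotation.
    support = {}
    for gene, annots in gene_annotations.items():
        for a in annots:
            support.setdefault(frozenset([a]), set()).add(gene)
    current = {s: g for s, g in support.items() if len(g) >= min_support}
    result = dict(current)

    k = 2
    while current:
        candidates = {}
        for s1, s2 in combinations(list(current), 2):
            union = s1 | s2
            if len(union) != k or union in candidates:
                continue
            # Apriori pruning: every (k-1)-subset must itself be frequent.
            if any(frozenset(sub) not in current
                   for sub in combinations(union, k - 1)):
                continue
            genes = current[s1] & current[s2]
            if len(genes) >= min_support:
                candidates[union] = genes
        result.update(candidates)
        current = candidates
        k += 1
    return result

# Toy input: labels are placeholders, not real GO or KEGG identifiers.
toy = {
    "g1": {"GO:A", "KEGG:X"},
    "g2": {"GO:A", "KEGG:X"},
    "g3": {"GO:A", "KEGG:X", "GO:B"},
    "g4": {"GO:B"},
}
result = frequent_annotation_sets(toy, min_support=3)
for annots, genes in result.items():
    print(sorted(annots), sorted(genes))
```

With this toy input, the pair of annotations shared by g1, g2 and g3 is reported alongside each single annotation, while the annotation carried by only two genes is dropped.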
Figure 2. Screenshot depicting results of the analysis of yeast genes. The 'Annotation/s' column represents the Gene Ontology codes of annotations found in the list. The '# list' and '# reference' columns represent the number of genes in the input list and reference list for a given annotation, respectively. The 'Genes' column represents the set of genes in the input list showing a given annotation. The 'Description/s' column represents the textual description of annotations. CC refers to 'cellular component' and BP to 'biological process' categories. Only annotations with corrected P values ≤ 0.05 are shown. P values were calculated using the hypergeometric distribution and were corrected using the simulation-based approach.
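The hypergeometric P value mentioned in this caption can be sketched from the '# list' and '# reference' counts. The snippet below is an illustration under stated assumptions: it computes a one-sided enrichment tail, assumes the input list is drawn from the reference set, and uses hypothetical names (`hypergeom_pvalue`, `k`, `n`, `K`, `N`). The simulation-based correction is not reproduced, since its details are not given here.

```python
from math import comb

def hypergeom_pvalue(k, n, K, N):
    """Upper-tail hypergeometric probability P(X >= k).

    k: genes in the input list with the annotation (or annotation set)
    n: size of the input list
    K: genes in the reference set with the annotation
    N: size of the reference set
    """
    upper = min(n, K)
    total = comb(N, n)
    # Sum the probabilities of observing k, k+1, ..., upper annotated genes.
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return tail / total

# Toy numbers, chosen only to make the arithmetic easy to check by hand.
p = hypergeom_pvalue(k=5, n=5, K=5, N=10)
print(p)
```

In this toy case every gene in the list carries the annotation, so the tail has a single term, 1 / C(10, 5) = 1/252.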
Figure 3. Screenshot depicting results of the analysis of human genes. GENECODIS results from the analysis of Gene Ontology CC ('cellular component') and InterPro motifs in the human gene set. Only annotations with corrected P values ≤ 0.05 are shown.