Jüri Reimand, Tambet Arak, Jaak Vilo.
Abstract
Functional interpretation of candidate gene lists is an essential task in modern biomedical research. Here, we present the 2011 update of g:Profiler (http://biit.cs.ut.ee/gprofiler/), a popular collection of web tools for functional analysis. g:GOSt and g:Cocoa combine comprehensive methods for interpreting gene lists, ordered lists and list collections in the context of biomedical ontologies, pathways, transcription factor and microRNA regulatory motifs and protein-protein interactions. Additional tools, namely the biomolecule ID mapping service (g:Convert), gene expression similarity searcher (g:Sorter) and gene homology searcher (g:Orth) provide numerous ways for further analysis and interpretation. In this update, we have implemented several features of interest to the community: (i) functional analysis of single nucleotide polymorphisms and other DNA polymorphisms is supported by chromosomal queries; (ii) network analysis identifies enriched protein-protein interaction modules in gene lists; (iii) functional analysis covers human disease genes; and (iv) improved statistics and filtering provide more concise results. g:Profiler is a regularly updated resource that is available for a wide range of species, including mammals, plants, fungi and insects.
Year: 2011 PMID: 21646343 PMCID: PMC3125778 DOI: 10.1093/nar/gkr378
Source DB: PubMed Journal: Nucleic Acids Res ISSN: 0305-1048 Impact factor: 16.971
Summary of g:Profiler features in 2007 and novel additions in 2011
| Feature | g:Profiler 2007 | Added in g:Profiler 2011 |
|---|---|---|
| Supported species | 31 species (Ensembl) | 85 species (Ensembl and Ensembl Genomes): mammals, fungi, insects, plants, etc. |
| Biological evidence considered | Gene Ontology; pathway databases (KEGG, Reactome); transcription factor motifs (Transfac); gene expression similarity search (GEO) | microRNA target sites (MicroCosm); disease genes (HPO); protein–protein interactions (BioGrid); gene expression similarity search (ArrayExpress) |
| Input for enrichment analysis | Gene sets; ordered gene lists; functional groups | Chromosomal regions; multiple gene lists or regions |
| Methods | Hypergeometric test; multiple testing correction (g:SCS, FDR, Bonferroni) | Customized background set; enrichment of interaction network modules |
| Tools | g:GOSt (gene group functional profiling); g:Convert (gene ID converter); g:Sorter (expression similarity search); g:Orth (orthology search) | g:Cocoa (compact comparison of gene annotations) |
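The multiple testing options named in the Methods row can be illustrated with a short sketch. The Python below implements the textbook Bonferroni and Benjamini-Hochberg (FDR) adjustments. It is not g:Profiler's own code: the function names and example p-values are invented for illustration, and g:SCS is left out because the table does not describe how it works.

```python
def bonferroni(p_values):
    """Multiply each p-value by the number of tests, capped at 1."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]


def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical raw p-values for four annotation terms.
raw = [0.001, 0.01, 0.03, 0.2]
bonf = bonferroni(raw)
fdr = benjamini_hochberg(raw)
```

Bonferroni is the stricter of the two: every adjusted value in `bonf` is at least as large as the corresponding value in `fdr`.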
Figure 1. g:Profiler tools (A–E) and data streams (arrows) for constructing analysis pipelines. (A) g:GOSt: functional profiling of gene lists, with sources of input shown on the left [also see (F)]. Output: genes are shown horizontally and annotations vertically; colours denote classes of functional evidence. (B) g:Cocoa: functional profiling of multiple gene lists. Output: gene lists are shown horizontally and annotations vertically; colours (intensity of red) denote strength of enrichment. (C) g:Convert: ID mapping service for various kinds of molecules and databases. (D) g:Sorter: gene-based expression similarity search across microarray data. (E) g:Orth: mapping of homologous genes between related species. (F) Identification of enriched protein–protein interaction modules in gene lists. A fragment of g:GOSt output includes a module of interacting query genes (core, red) and non-query genes that interact with the core module (neighbourhood, black).
Figure 2. Functional analysis of positively selected regions in the modern human genome versus the Neanderthal genome. Input genomic loci were grouped by chromosome and analysed with g:Cocoa. R was used for visualization. Enriched annotations are shown vertically and chromosomes horizontally. Coloured cells indicate statistically significant enrichments of the corresponding functions (P < 0.05; red tones represent greater significance). Annotation axis labels are grouped and coloured according to data source. To avoid redundant annotations, only the most significant function of any hierarchically related group is shown.
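The redundancy filter mentioned in the caption can be sketched as follows. This is one possible reading, not the authors' implementation: it assumes that a "hierarchically related group" is a set of enriched terms connected through parent-child links, and it keeps the term with the smallest P-value in each group. The function name, term names, hierarchy and P-values are all hypothetical.

```python
def most_significant_per_group(p_values, parents):
    """Keep the lowest-p term from each group of hierarchically linked terms.

    p_values: dict mapping term -> p-value (enriched terms only).
    parents:  dict mapping term -> list of its parent terms.
    """
    # Union-find structure over the enriched terms.
    root = {term: term for term in p_values}

    def find(term):
        while root[term] != term:
            root[term] = root[root[term]]  # path halving
            term = root[term]
        return term

    # Merge terms that are linked by a parent-child relation.
    for child, term_parents in parents.items():
        for parent in term_parents:
            if child in root and parent in root:
                root[find(child)] = find(parent)

    # Pick the most significant term in each merged group.
    best = {}
    for term, p in p_values.items():
        group = find(term)
        if group not in best or p < p_values[best[group]]:
            best[group] = term
    return sorted(best.values(), key=p_values.get)


# Hypothetical enriched terms and hierarchy, for illustration only.
p_values = {
    "cell cycle": 1e-3,
    "mitotic cell cycle": 1e-4,
    "M phase": 1e-2,
    "translation": 2e-2,
}
parents = {
    "mitotic cell cycle": ["cell cycle"],
    "M phase": ["mitotic cell cycle"],
}
kept = most_significant_per_group(p_values, parents)
```

Here the three cell cycle terms collapse into one group represented by "mitotic cell cycle", while "translation" has no relatives among the enriched terms and is kept as is.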
Figure 3. Comparison of the standard, global background gene list (left plot) and a customized background list (right plot) in functional enrichment analysis of nine core cell cycle transcription factors in yeast. The top 10 enriched categories are shown vertically and log-scale P-values horizontally. R was used for visualization. The global analysis reveals general functional enrichments of transcriptional regulation (black bars), while the focused analysis with the custom background of all yeast TFs shows specific cell cycle-related terms (grey bars).
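The effect of the background set on a hypergeometric enrichment test can be reproduced with a toy example. The sketch below is a minimal pure-Python version of the one-sided hypergeometric test, not g:Profiler's code. The gene identifiers, set sizes and the function name `hypergeom_enrichment_p` are hypothetical and chosen only to mirror the situation in the figure: a query of transcription factors tested against a global background and against a TF-only background.

```python
from math import comb


def hypergeom_enrichment_p(query, term_genes, background):
    """One-sided hypergeometric p-value for over-representation.

    Genes outside the background are ignored in both query and term.
    """
    background = set(background)
    query = set(query) & background
    term = set(term_genes) & background
    N, K, n = len(background), len(term), len(query)
    k = len(query & term)
    # P(X >= k) when drawing n genes from N, of which K carry the term.
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, min(K, n) + 1))
    return tail / comb(N, n)


# Hypothetical toy genome: 100 genes, of which the first 20 are TFs.
all_genes = [f"g{i}" for i in range(100)]
all_tfs = all_genes[:20]
query = all_genes[:5]  # five TFs of interest

# Hypothetical annotation terms.
regulation = set(all_tfs)                          # every TF is annotated
cell_cycle = set(all_genes[:4]) | set(all_genes[50:56])

p_reg_global = hypergeom_enrichment_p(query, regulation, all_genes)
p_reg_custom = hypergeom_enrichment_p(query, regulation, all_tfs)
p_cc_custom = hypergeom_enrichment_p(query, cell_cycle, all_tfs)
```

Against the global background the generic regulation term looks strongly enriched, but against the TF-only background it covers every gene and its P-value becomes 1, leaving the more specific cell cycle term as the informative result.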