ClueGO: a Cytoscape plug-in to decipher functionally grouped gene ontology and pathway annotation networks.

Gabriela Bindea, Bernhard Mlecnik, Hubert Hackl, Pornpimol Charoentong, Marie Tosolini, Amos Kirilovsky, Wolf-Herman Fridman, Franck Pagès, Zlatko Trajanoski, Jérôme Galon.

Abstract

We have developed ClueGO, an easy-to-use Cytoscape plug-in that strongly improves biological interpretation of large lists of genes. ClueGO integrates Gene Ontology (GO) terms as well as KEGG/BioCarta pathways and creates a functionally organized GO/pathway term network. It can analyze one or compare two lists of genes and comprehensively visualizes functionally grouped terms. A one-click update option allows ClueGO to automatically download the most recent GO/KEGG release at any time. ClueGO provides an intuitive representation of the analysis results and can optionally be used in conjunction with the GOlorize plug-in.

Year:  2009        PMID: 19237447      PMCID: PMC2666812          DOI: 10.1093/bioinformatics/btp101

Source DB:  PubMed          Journal:  Bioinformatics        ISSN: 1367-4803            Impact factor:   6.937


1 INTRODUCTION

Because the number of genes that can be analyzed in high-throughput experiments far exceeds what a single person can interpret, several efforts have been made to capture biological information and systematically organize the wealth of data. For example, Gene Ontology (GO) (Ashburner et al., 2000) annotates genes to biological/cellular/molecular terms in a hierarchically structured way, whereas the Kyoto Encyclopedia of Genes and Genomes (KEGG) (Kanehisa et al., 2002) and BioCarta assign genes to functional pathways. Several functional enrichment analysis tools (e.g. Boyle et al., 2004; Huang et al., 2007; Maere et al., 2005; Ramos et al., 2008; Zeeberg et al., 2003) and algorithms (e.g. Li et al., 2008) were developed to enhance data interpretation. As most of these tools present their results mainly as long lists or complex hierarchical trees, we aimed to develop ClueGO, a Cytoscape (Shannon et al., 2003) plug-in, to facilitate biological interpretation and to visualize functionally grouped terms in the form of networks and charts. Other tools such as BiNGO (Maere et al., 2005) or PIPE (Ramos et al., 2008) assess overrepresented GO terms and reconstruct the hierarchical ontology tree, whereas ClueGO uses kappa statistics to link the terms in the network. In contrast to the approach of Ramos et al. (2008), which creates an in silico annotation network based on pathways and protein interaction data and maps the gene list of interest onto it afterwards, ClueGO generates a dynamic network structure by taking the gene lists of interest into account from the start. ClueGO integrates GO terms as well as KEGG/BioCarta pathways and creates a functionally organized GO/pathway term network. A variety of flexible restriction criteria allow for visualizations at different levels of specificity. In addition, ClueGO can compare clusters of genes and visualize their functional differences.
ClueGO takes advantage of Cytoscape's versatile visualization framework and can be used in conjunction with the GOlorize plug-in (Garcia et al., 2007).

2 METHODS AND IMPLEMENTATION

ClueGO has two major features: it can be used either for the visualization of terms corresponding to a list of genes, or for the comparison of functional annotations of two clusters.

2.1 Data import

Gene identifier sets can be directly uploaded in simple text format or interactively derived from gene network graphs visualized in Cytoscape. ClueGO supports several gene identifiers and organisms by default and is easily extendable to additional ones in a plug-in-like manner (Supplementary Material).

2.2 Annotation sources

To allow a fast analysis, ClueGO uses precompiled annotation files including GO, KEGG and BioCarta for a wide range of organisms. A one-click update feature automatically downloads the latest ontology and annotation sources and creates new precompiled files that are added to the existing ones. This ensures an up-to-date functional analysis. Additionally, ClueGO can easily integrate new annotation sources in a plug-in-like manner (Supplementary Material).

2.3 Enrichment tests

ClueGO can calculate enrichment/depletion tests for terms and groups as right-sided (Enrichment), left-sided (Depletion) or two-sided (Enrichment/Depletion) tests based on the hypergeometric distribution. Furthermore, it provides options to calculate mid-P-values and doubling for two-sided tests to deal with discreteness and conservatism effects, as suggested by Rivals et al. (2007). To correct the P-values for multiple testing, several standard correction methods are offered (Bonferroni, Bonferroni step-down and Benjamini-Hochberg).
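
The tests named above can be illustrated with a minimal Python sketch. It is not ClueGO's implementation: the function names and the parameter names (N for the number of genes in the reference set, K for the reference genes annotated to a term, n for the size of the gene list, k for the list genes annotated to the term) are assumptions made for this example. Enrichment is taken here as the upper tail P(X ≥ k) of the hypergeometric distribution, the mid-P variant counts only half of P(X = k), and of the correction methods only Bonferroni and Benjamini-Hochberg are shown.

```python
from math import comb


def hypergeom_pmf(k, N, K, n):
    """P(X = k) when n genes are drawn from N, of which K carry the term."""
    return comb(K, k) * comb(N - K, n - k) / comb(N, n)


def enrichment_p(k, N, K, n, mid_p=False):
    """Upper-tail P(X >= k); with mid_p, only half of P(X = k) is counted."""
    upper = min(K, n)
    p = sum(hypergeom_pmf(i, N, K, n) for i in range(k, upper + 1))
    if mid_p:
        p -= 0.5 * hypergeom_pmf(k, N, K, n)
    return min(p, 1.0)


def bonferroni(pvals):
    """Multiply each p-value by the number of tests, capped at 1."""
    m = len(pvals)
    return [min(1.0, p * m) for p in pvals]


def benjamini_hochberg(pvals):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvals[i] * m / rank)
        adjusted[i] = running_min
    return adjusted
```

For instance, enrichment_p(2, 10, 3, 2) evaluates the chance that both genes of a two-gene list carry a term annotated to 3 of 10 reference genes; passing mid_p=True halves the contribution of the observed count.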

2.4 Network generation and visualization

To create the annotation network, ClueGO provides predefined functional analysis settings ranging from general to very specific ones. Furthermore, the user can adjust the analysis parameters to focus on terms, e.g. in certain GO level intervals, with particular evidence codes or with a certain number and percentage of associated genes. An optional redundancy reduction feature (Fusion) assesses GO terms in a parent–child relation sharing similar associated genes and preserves the more representative parent or child term. The relationship between the selected terms is defined based on their shared genes in a similar way as described by Huang et al. (2007). ClueGO first creates a binary gene-term matrix with the selected terms and their associated genes. Based on this matrix, a term–term similarity matrix is calculated using chance-corrected kappa statistics to determine the association strength between the terms. Since the term–term matrix is of categorical origin, the kappa statistic was found to be the most suitable method. Finally, the created network represents the terms as nodes, which are linked based on a predefined kappa score level. The kappa score threshold can initially be adjusted on a positive scale from 0 to 1 to restrict the network connectivity in a customized way. The size of the nodes reflects the enrichment significance of the terms. The network is automatically laid out using the Organic layout algorithm supported by Cytoscape. The functional groups are created by iterative merging of initially defined groups based on the predefined kappa score threshold. The final groups are assigned fixed or random colors and overlaid on the network. Functional groups represented by their most significant (leading) term are visualized in the network, providing an insightful view of their interrelations. Other ways of selecting the group leading term, e.g. based on the number or percentage of genes per term, are also provided.
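
The path from the binary gene-term matrix to the linked term network can be sketched as follows. This is an illustrative Python example under stated assumptions, not ClueGO's code: the functions kappa and term_network are hypothetical names, the gene universe is assumed to be the union of the genes associated with the selected terms, and the default threshold of 0.3 is borrowed from the kappa score level quoted in the caption of Fig. 1. The kappa score is computed as Cohen's kappa, i.e. (observed agreement − chance agreement) / (1 − chance agreement).

```python
from itertools import combinations


def kappa(a, b):
    """Cohen's kappa for two equal-length 0/1 vectors (one entry per gene)."""
    n = len(a)
    if n == 0:
        return 0.0
    both = sum(1 for x, y in zip(a, b) if x == 1 and y == 1)
    neither = sum(1 for x, y in zip(a, b) if x == 0 and y == 0)
    observed = (both + neither) / n
    pa, pb = sum(a) / n, sum(b) / n
    expected = pa * pb + (1 - pa) * (1 - pb)
    if expected == 1.0:
        return 0.0  # degenerate case: both terms hold all genes or none
    return (observed - expected) / (1 - expected)


def term_network(gene_term, threshold=0.3):
    """gene_term maps each term to its set of genes.

    Returns a list of (term1, term2, kappa) edges with kappa >= threshold.
    """
    # Assumption: the universe is the union of genes over the selected terms.
    genes = sorted(set().union(*gene_term.values()))
    # One binary row per term: 1 if the gene is associated with the term.
    rows = {t: [1 if g in gs else 0 for g in genes] for t, gs in gene_term.items()}
    edges = []
    for t1, t2 in combinations(sorted(rows), 2):
        score = kappa(rows[t1], rows[t2])
        if score >= threshold:
            edges.append((t1, t2, round(score, 3)))
    return edges
```

Raising the threshold towards 1 leaves only terms with nearly identical gene sets connected, while a threshold close to 0 links any pair of terms that agree more often than expected by chance.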
As an alternative to kappa score grouping, the GO hierarchy with its parent–child relationships can be used to create functional groups. When comparing two gene clusters, another original feature of ClueGO allows switching the visualization from the groups on the network to the cluster distribution over the terms. Besides the network, ClueGO provides overview charts showing the groups and their leading terms as well as detailed term histograms for both cluster-specific and common terms. Like BiNGO, ClueGO can be used in conjunction with GOlorize for functional analysis of a Cytoscape gene network. The created networks, charts and analysis results can be saved as a project in a specified folder and used for further analysis.
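
The cluster distribution over a term can be pictured with a small Python sketch. The text does not specify how ClueGO computes or colors this distribution, so everything here is an assumption for illustration: the function names, the use of raw gene counts per cluster, and the linear color ramp from green through white to red.

```python
def cluster_share(term_genes, cluster1, cluster2):
    """Fraction of a term's cluster-assigned genes that belong to cluster 1.

    Returns None when the term has no gene in either cluster.
    """
    in1 = len(term_genes & cluster1)
    in2 = len(term_genes & cluster2)
    total = in1 + in2
    return in1 / total if total else None


def share_to_rgb(share):
    """Map a share in [0, 1] to a color: 1 is red, 0.5 white, 0 green."""
    if share >= 0.5:
        fade = 1 - (share - 0.5) * 2  # 1 at white, 0 at pure red
        return (255, round(255 * fade), round(255 * fade))
    fade = 1 - (0.5 - share) * 2  # 1 at white, 0 at pure green
    return (round(255 * fade), 255, round(255 * fade))
```

A term whose genes come only from the first cluster would be drawn in the first cluster's color, and a term shared equally by both clusters would appear white.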

3 CASE STUDY

To demonstrate how ClueGO assesses and compares biological functions for clusters of genes, we selected up- and down-regulated natural killer (NK) cell genes in healthy donors from an expression profile of human peripheral blood lymphocytes (GSE6887, Gene Expression Omnibus). For upregulated NK genes, ClueGO revealed specific terms like ‘Natural killer cell mediated cytotoxicity’ in the group ‘Cellular defense response’. Downregulated in NK cells compared with the reference (a pool of all immune cell types) were genes involved in the innate immune response (macrophages), but also in the adaptive immune response (T and B cells). The common functionality refers to characteristics of leukocytes (chemotaxis), besides other terms involved in cell division and metabolism (Fig. 1).
Fig. 1.

ClueGO example analysis of up- and down-regulated NK cell genes in peripheral blood from healthy human donors. (a) GO/pathway terms specific for upregulated genes. The bars represent the number of genes associated with the terms. The percentage of genes per term is shown as bar label. (b) Overview chart with functional groups including specific terms for upregulated genes. (c) Functionally grouped network with terms as nodes linked based on their kappa score level (≥0.3), where only the label of the most significant term per group is shown. The node size represents the term enrichment significance. Functionally related groups partially overlap. Not grouped terms are shown in white. (d) The distribution of two clusters visualized on network (c). Terms with up/downregulated genes are shown in red/green, respectively. The color gradient shows the gene proportion of each cluster associated with the term. Equal proportions of the two clusters are represented in white.


4 SUMMARY

ClueGO is a user-friendly Cytoscape plug-in to analyze interrelations of terms and functional groups in biological networks. A variety of flexible adjustments allow for a profound exploration of gene clusters in annotation networks. Our tool is easily extendable to new organisms and identifier types as well as new annotation sources, which can be included in a transparent, plug-in-like manner. Furthermore, the one-click update feature of ClueGO ensures an up-to-date analysis at any time.

REFERENCES

1.  Gene ontology: tool for the unification of biology. The Gene Ontology Consortium.

Authors:  M Ashburner; C A Ball; J A Blake; D Botstein; H Butler; J M Cherry; A P Davis; K Dolinski; S S Dwight; J T Eppig; M A Harris; D P Hill; L Issel-Tarver; A Kasarskis; S Lewis; J C Matese; J E Richardson; M Ringwald; G M Rubin; G Sherlock
Journal:  Nat Genet       Date:  2000-05       Impact factor: 38.330

2.  The KEGG databases at GenomeNet.

Authors:  Minoru Kanehisa; Susumu Goto; Shuichi Kawashima; Akihiro Nakaya
Journal:  Nucleic Acids Res       Date:  2002-01-01       Impact factor: 16.971

3.  Cytoscape: a software environment for integrated models of biomolecular interaction networks.

Authors:  Paul Shannon; Andrew Markiel; Owen Ozier; Nitin S Baliga; Jonathan T Wang; Daniel Ramage; Nada Amin; Benno Schwikowski; Trey Ideker
Journal:  Genome Res       Date:  2003-11       Impact factor: 9.043

4.  GO::TermFinder--open source software for accessing Gene Ontology information and finding significantly enriched Gene Ontology terms associated with a list of genes.

Authors:  Elizabeth I Boyle; Shuai Weng; Jeremy Gollub; Heng Jin; David Botstein; J Michael Cherry; Gavin Sherlock
Journal:  Bioinformatics       Date:  2004-08-05       Impact factor: 6.937

5.  BiNGO: a Cytoscape plugin to assess overrepresentation of gene ontology categories in biological networks.

Authors:  Steven Maere; Karel Heymans; Martin Kuiper
Journal:  Bioinformatics       Date:  2005-06-21       Impact factor: 6.937

6.  Enrichment or depletion of a GO category within a class of genes: which test?

Authors:  Isabelle Rivals; Léon Personnaz; Lieng Taing; Marie-Claude Potier
Journal:  Bioinformatics       Date:  2006-12-20       Impact factor: 6.937

7.  GOlorize: a Cytoscape plug-in for network visualization with Gene Ontology-based layout and coloring.

Authors:  Olivier Garcia; Cosmin Saveanu; Melissa Cline; Micheline Fromont-Racine; Alain Jacquier; Benno Schwikowski; Tero Aittokallio
Journal:  Bioinformatics       Date:  2006-11-24       Impact factor: 6.937

8.  The protein information and property explorer: an easy-to-use, rich-client web application for the management and functional analysis of proteomic data.

Authors:  H Ramos; P Shannon; R Aebersold
Journal:  Bioinformatics       Date:  2008-07-16       Impact factor: 6.937

9.  A global pathway crosstalk network.

Authors:  Yong Li; Pankaj Agarwal; Dilip Rajagopalan
Journal:  Bioinformatics       Date:  2008-04-23       Impact factor: 6.937

10.  The DAVID Gene Functional Classification Tool: a novel biological module-centric algorithm to functionally analyze large gene lists.

Authors:  Da Wei Huang; Brad T Sherman; Qina Tan; Jack R Collins; W Gregory Alvord; Jean Roayaei; Robert Stephens; Michael W Baseler; H Clifford Lane; Richard A Lempicki
Journal:  Genome Biol       Date:  2007       Impact factor: 13.583
