
Redundancy control in pathway databases (ReCiPa): an application for improving gene-set enrichment analysis in Omics studies and "Big data" biology.

Juan C Vivar, Priscilla Pemu, Ruth McPherson, Sujoy Ghosh.

Abstract

Unparalleled technological advances have fueled an explosive growth in the scope and scale of biological data and have propelled life sciences into the realm of "Big Data" that cannot be managed or analyzed by conventional approaches. Big Data in the life sciences are driven primarily by a diverse collection of 'omics'-based technologies, including genomics, proteomics, metabolomics, transcriptomics, metagenomics, and lipidomics. Gene-set enrichment analysis is a powerful approach for interrogating large 'omics' datasets, leading to the identification of biological mechanisms associated with observed outcomes. While several factors influence the results of such analysis, the impact of the contents of pathway databases is often under-appreciated. Pathway databases often contain variously named pathways that overlap with one another to varying degrees. Ignoring such redundancies during pathway analysis can lead to several pathways being designated as significant because of high content similarity, rather than because they represent truly independent biological mechanisms. Statistically, such dependencies also result in correlated p values and overdispersion, leading to biased results. We investigated the level of redundancy in multiple pathway databases and observed large discrepancies in the nature and extent of pathway overlap. This prompted us to develop the application ReCiPa (Redundancy Control in Pathway Databases), which controls redundancies in pathway databases based on user-defined thresholds. Analysis of genomic and genetic datasets, using ReCiPa-generated overlap-controlled versions of KEGG and Reactome pathways, led to a reduction in redundancy among the top-scoring gene-sets and allowed for the inclusion of additional gene-sets representing possibly novel biological mechanisms. Using obesity as an example, bioinformatic analysis further demonstrated that gene-sets identified from overlap-controlled pathway databases show stronger evidence of prior association with obesity than pathways identified from the original databases.
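The idea of threshold-based redundancy control can be illustrated with a small sketch. This is not the ReCiPa implementation: the abstract does not specify how overlap is measured or how redundant pathways are handled, so the code below assumes (hypothetically) that overlap is the fraction of the smaller gene-set shared with the other, and that pathway pairs at or above a user-defined threshold are greedily merged into one gene-set. The function names `overlap` and `merge_redundant` and the toy gene identifiers are illustrative only.

```python
from itertools import combinations


def overlap(a: set, b: set) -> float:
    """Assumed overlap measure: shared genes as a fraction of the smaller set."""
    if not a or not b:
        return 0.0
    return len(a & b) / min(len(a), len(b))


def merge_redundant(pathways: dict, threshold: float) -> dict:
    """Greedily merge the most-overlapping pair until no pair meets the threshold.

    Assumption: merging is one plausible way to 'control' redundancy;
    the original tool may behave differently.
    """
    merged = {name: set(genes) for name, genes in pathways.items()}
    while True:
        best = None
        # Scan all pairs and remember the one with the highest overlap.
        for p, q in combinations(sorted(merged), 2):
            score = overlap(merged[p], merged[q])
            if score >= threshold and (best is None or score > best[0]):
                best = (score, p, q)
        if best is None:
            return merged
        _, p, q = best
        # Replace the two pathways with their union under a combined name.
        merged[p + "+" + q] = merged.pop(p) | merged.pop(q)


# Toy database: A and B share 3 of 4 genes, C is independent.
toy = {
    "A": {"g1", "g2", "g3", "g4"},
    "B": {"g2", "g3", "g4", "g5"},
    "C": {"g6", "g7"},
}
controlled = merge_redundant(toy, threshold=0.7)
unchanged = merge_redundant(toy, threshold=0.8)
```

With a threshold of 0.7 the toy pathways A and B (overlap 0.75) collapse into a single gene-set, so they can no longer both appear as separate "significant" hits in a downstream enrichment analysis; with a threshold of 0.8 the database is left unchanged.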

Year:  2013        PMID: 23758478      PMCID: PMC3727566          DOI: 10.1089/omi.2012.0083

Source DB:  PubMed          Journal:  OMICS        ISSN: 1536-2310


  14 in total

Review 1.  H3Africa and the African life sciences ecosystem: building sustainable innovation.

Authors:  Collet Dandara; Farah Huzair; Alexander Borda-Rodriguez; Shadreck Chirikure; Ikechi Okpechi; Louise Warnich; Collen Masimirembwa
Journal:  OMICS       Date:  2014-12

2.  Network analyses of sperm-egg recognition and binding: ready to rethink fertility mechanisms?

Authors:  Nicola Bernabò; Alessandra Ordinelli; Raffaele Di Agostino; Mauro Mattioli; Barbara Barboni
Journal:  OMICS       Date:  2014-12

3.  Graph Algorithms for Condensing and Consolidating Gene Set Analysis Results.

Authors:  Sara R Savage; Zhiao Shi; Yuxing Liao; Bing Zhang
Journal:  Mol Cell Proteomics       Date:  2019-05-29       Impact factor: 5.911

4.  SIGNAL: A web-based iterative analysis platform integrating pathway and network approaches optimizes hit selection from genome-scale assays.

Authors:  Samuel Katz; Jian Song; Kyle P Webb; Nicolas W Lounsbury; Clare E Bryant; Iain D C Fraser
Journal:  Cell Syst       Date:  2021-03-24       Impact factor: 11.091

Review 5.  Beyond standard pipeline and p < 0.05 in pathway enrichment analyses.

Authors:  Wentian Li; Andrew Shih; Yun Freudenberg-Hua; Wen Fury; Yaning Yang
Journal:  Comput Biol Chem       Date:  2021-02-12       Impact factor: 3.737

6.  Pathway Network Analyses for Autism Reveal Multisystem Involvement, Major Overlaps with Other Diseases and Convergence upon MAPK and Calcium Signaling.

Authors:  Ya Wen; Mohamad J Alshikho; Martha R Herbert
Journal:  PLoS One       Date:  2016-04-07       Impact factor: 3.240

7.  Detecting gene subnetworks under selection in biological pathways.

Authors:  Alexandre Gouy; Joséphine T Daub; Laurent Excoffier
Journal:  Nucleic Acids Res       Date:  2017-09-19       Impact factor: 16.971

8.  A study on multi-omic oscillations in Escherichia coli metabolic networks.

Authors:  Francesco Bardozzo; Pietro Lió; Roberto Tagliaferri
Journal:  BMC Bioinformatics       Date:  2018-07-09       Impact factor: 3.169

9.  The Pathway Coexpression Network: Revealing pathway relationships.

Authors:  Yered Pita-Juárez; Gabriel Altschuler; Sokratis Kariotis; Wenbin Wei; Katjuša Koler; Claire Green; Rudolph E Tanzi; Winston Hide
Journal:  PLoS Comput Biol       Date:  2018-03-19       Impact factor: 4.475

10.  Using set theory to reduce redundancy in pathway sets.

Authors:  Ruth Alexandra Stoney; Jean-Marc Schwartz; David L Robertson; Goran Nenadic
Journal:  BMC Bioinformatics       Date:  2018-10-19       Impact factor: 3.169
