Tianlei Xu1, Peng Jin2, Zhaohui S Qin3. 1. Department of Mathematics and Computer Science, Emory University, Atlanta, GA 30322, USA. 2. Department of Human Genetics, Emory University School of Medicine, Atlanta, GA 30322, USA. 3. Department of Biostatistics and Bioinformatics, Rollins School of Public Health, Emory University, Atlanta, GA 30322, USA.
Abstract
MOTIVATION: Annotating a given genomic locus or a set of genomic loci is an important yet challenging task. This is especially true for the non-coding part of the genome, which is enormous yet poorly understood. Since gene set enrichment analysis has proven to be an effective approach for annotating a set of genes, the same idea can be extended to explore the enrichment of functional elements or features in a set of genomic intervals to reveal potential functional connections. RESULTS: In this study, we describe a novel computational strategy named loci2path that takes advantage of newly emerged, genome-wide and tissue-specific expression quantitative trait loci (eQTL) information to help annotate a set of genomic intervals in terms of transcription regulation. By checking the presence or absence of millions of eQTLs in a set of input genomic intervals, and by grouping eQTLs according to the pathways or gene sets that their target genes belong to, loci2path builds a bridge connecting genomic intervals to functional pathways and pre-defined, biologically meaningful gene sets, revealing potential regulatory connections. Our method has two key advantages over existing methods: first, it no longer relies on proximity to link a locus to a gene, which has been shown to be unreliable; second, eQTLs allow us to provide regulatory annotation in the context of specific tissue types. To demonstrate its utility, we apply loci2path to sets of genomic intervals harboring disease-associated variants as queries. Using 1 702 612 eQTLs discovered by the Genotype-Tissue Expression (GTEx) project across 44 tissues and 6320 pathways or gene sets cataloged in MSigDB as annotation resources, our method successfully identifies highly relevant biological pathways and reveals disease mechanisms for psoriasis and other immune-related diseases.
Tissue specificity analysis of the associated eQTLs provides additional evidence of the distinct roles that different tissues play in the disease mechanisms. AVAILABILITY AND IMPLEMENTATION: loci2path is published as an open source Bioconductor package, and it is available at http://bioconductor.org/packages/release/bioc/html/loci2path.html. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
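The strategy described in RESULTS (find the eQTLs that fall inside the query intervals, group them by the pathways of their target genes, and ask whether any pathway is over-represented) can be illustrated with a minimal Python sketch. This is not the loci2path implementation, which is an R/Bioconductor package. The abstract does not name the statistical test, so the one-sided hypergeometric test here is an assumption, and the function names, gene names (G1 to G5), pathway names and coordinates are all hypothetical toy values. Tissue stratification is omitted for brevity.

```python
from math import comb

def overlaps(chrom, pos, intervals):
    """True if (chrom, pos) lies inside any (chrom, start, end) interval, ends inclusive."""
    return any(c == chrom and s <= pos <= e for c, s, e in intervals)

def hypergeom_sf(k, N, K, n):
    """P(X >= k) when drawing n items from N, of which K are 'successes'."""
    upper = min(K, n)
    return sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1)) / comb(N, n)

def pathway_enrichment(eqtls, pathways, query):
    """Assumed test: hypergeometric enrichment of pathway-linked eQTLs in the query.

    eqtls:    list of (chrom, pos, target_gene)
    pathways: dict mapping pathway name -> set of member genes
    query:    list of (chrom, start, end) genomic intervals
    Returns a dict: pathway name -> (eQTLs in query linked to pathway, p-value).
    """
    N = len(eqtls)                                   # all eQTLs in the catalog
    hits = [e for e in eqtls if overlaps(e[0], e[1], query)]
    n = len(hits)                                    # eQTLs inside the query intervals
    results = {}
    for name, genes in pathways.items():
        K = sum(1 for e in eqtls if e[2] in genes)   # eQTLs targeting pathway genes
        k = sum(1 for e in hits if e[2] in genes)    # ...that also fall in the query
        results[name] = (k, hypergeom_sf(k, N, K, n))
    return results

# Toy, made-up inputs (not GTEx or MSigDB data).
eqtls = [
    ("chr1", 100, "G1"),
    ("chr1", 150, "G2"),
    ("chr1", 900, "G3"),
    ("chr2", 120, "G1"),
    ("chr2", 500, "G4"),
    ("chr3", 50, "G5"),
]
pathways = {"toy_pathway_A": {"G1", "G2"}, "toy_pathway_B": {"G3", "G4", "G5"}}
query = [("chr1", 90, 200), ("chr2", 100, 130)]

results = pathway_enrichment(eqtls, pathways, query)
for name, (k, p) in results.items():
    print(f"{name}: {k} eQTLs in query, p = {p:.3f}")
```

Note that the link from locus to gene comes from the eQTL's target gene rather than from genomic proximity, which is the first advantage the abstract claims. In a tissue-aware version, the same test would presumably be repeated per tissue using that tissue's eQTL set.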