HiView: an integrative genome browser to leverage Hi-C results for the interpretation of GWAS variants.

Zheng Xu, Guosheng Zhang, Qing Duan, Shengjie Chai, Baqun Zhang, Cong Wu, Fulai Jin, Feng Yue, Yun Li, Ming Hu.

Abstract

BACKGROUND: Genome-wide association studies (GWAS) have identified thousands of genetic variants associated with complex traits and diseases. However, most of them are located in the non-protein coding regions, and therefore it is challenging to hypothesize the functions of these non-coding GWAS variants. Recent large efforts such as the ENCODE and Roadmap Epigenomics projects have predicted a large number of regulatory elements. However, the target genes of these regulatory elements remain largely unknown. Chromatin conformation capture based technologies such as Hi-C can directly measure the chromatin interactions and have generated an increasingly comprehensive catalog of the interactome between the distal regulatory elements and their potential target genes. Leveraging such information revealed by Hi-C holds the promise of elucidating the functions of genetic variants in human diseases.
RESULTS: In this work, we present HiView, the first integrative genome browser to leverage Hi-C results for the interpretation of GWAS variants. HiView is able to display Hi-C data and statistical evidence for chromatin interactions in genomic regions surrounding any given GWAS variant, enabling straightforward visualization and interpretation.
CONCLUSIONS: We believe that, as the first GWAS-variant-centered Hi-C genome browser, HiView is a useful tool for guiding post-GWAS functional genomics studies. HiView is freely accessible at http://www.unc.edu/~yunmli/HiView.

Keywords:  GWAS variants; Hi-C data; Integrative genome browser

Year:  2016        PMID: 26969411      PMCID: PMC4788823          DOI: 10.1186/s13104-016-1947-0

Source DB:  PubMed          Journal:  BMC Res Notes        ISSN: 1756-0500


Findings

The eukaryotic genome is organized at multiple levels, ranging from chromosomal territories to topologically associated domains. Such hierarchical three-dimensional organization is closely related to genome function [1]. Historically, the study of genome organization has relied on microscopy-based techniques, which suffer from low resolution and low throughput. Recently, a series of technologies based on chromatin conformation capture (3C) [2], such as Hi-C [3] and in situ Hi-C [4], have been developed, enabling a high-resolution, genome-wide view of chromosomal architecture.

Data from 3C-based technologies can shed light on structural and functional mechanisms, including those of non-coding variants identified through genome-wide association studies (GWAS) of complex traits. GWAS have been resoundingly successful, identifying thousands of variants associated with complex traits. However, only a small proportion (7–12 %) of these variants fall into protein-coding regions [5], making the interpretation of non-coding variants imperative. With the help of 3C-based technologies, a recent study [6] identified long-range (megabase-scale) interactions between obesity-associated intronic variants in the FTO gene and the promoter region of the homeobox gene IRX3, demonstrating that it is the expression of IRX3, rather than FTO, that is directly linked to body mass and composition. This study showcased the power of 3C-based technologies for elucidating the functional mechanisms of genetic variants implicated by GWAS.

As 3C-derived technologies have become increasingly widely used, multiple visualization tools have been devised, such as the Hi-C data browser [3] and the 3D genome browser [7]. In addition, the WashU EpiGenome Browser is widely used for simultaneous visualization of Hi-C and other epigenetic data from the Roadmap Epigenomics project [8]. Most recently, Juicebox has been developed for visualizing in situ Hi-C data [4]. Meanwhile, HiBrowse [9] has been developed to facilitate statistical analysis of Hi-C data. Although many useful visualization tools have been developed, none of them is able to display 3C-based data with a focus on GWAS variant interpretation, preventing researchers from fully mining the rich information, generating testable hypotheses, and visually validating biological findings. In addition, few of them incorporate peak calling results from 3C-based data or show the magnitude of statistical evidence, making it challenging to interpret the statistical significance of 3C-based data.

To fill these gaps, we present HiView, the first genome browser for GWAS-variant-centered visualization of Hi-C data. Additional file 1: Figure S1 shows the user interface of HiView. Users can select a GWAS variant and extract its genomic annotation by choosing the marker type and specifying the marker name. HiView displays raw and expected count data, and measures of statistical significance from several state-of-the-art Hi-C peak callers, such as AFC [10], Fit-Hi-C [11] and a hidden Markov random field (HMRF) based Hi-C peak caller [12]. By combining peak calling results from different approaches into an ensemble, users can reach more robust interpretations of the data. For gene annotation, HiView incorporates three tracks: (1) Ensembl genes, (2) UCSC genes and (3) RefSeq genes. Users can configure HiView for customized visualization in many ways (detailed in the online tutorial), including but not limited to (1) selecting tracks to display, (2) specifying the order of displayed tracks, (3) moving the viewing window upstream and downstream, zooming in and out, and specifying the range of the viewing window, (4) specifying the genomic regions to highlight, (5) specifying the text and color used for each track and (6) specifying the picture size and width.

HiView also provides a downloadable table of numerical values of the Hi-C data and peak calling results. Figure 1 and Additional file 1: Figure S2 show an example HiView figure and HiView table, respectively. A detailed tutorial for generating Fig. 1 can be found in Additional file 1: Section S1.
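
The idea of combining several peak callers can be illustrated with a small sketch that counts how many callers support each interaction in a downloaded table. Everything concrete here is an assumption made for illustration: the table layout, the example scores, the 0.05 threshold, the majority-vote rule and the function names are hypothetical, and none of them is taken from HiView's actual table format or from the methods of the cited peak callers.

```python
from typing import Dict, List

ALPHA = 0.05  # assumed significance threshold, not a value from the paper


def supporting_callers(row: Dict[str, float], alpha: float = ALPHA) -> List[str]:
    """Return the callers whose p-value-like score falls below alpha."""
    return [caller for caller, score in row.items() if score < alpha]


def majority_call(row: Dict[str, float], alpha: float = ALPHA) -> bool:
    """True when more than half of the callers support the interaction."""
    return len(supporting_callers(row, alpha)) * 2 > len(row)


# Hypothetical rows: one per target fragment, keyed by caller name.
# The scores are invented and are treated as p-value-like for simplicity.
table = {
    "fragment_1": {"AFC": 0.002, "Fit-Hi-C": 0.010, "HMRF": 0.200},
    "fragment_2": {"AFC": 0.300, "Fit-Hi-C": 0.040, "HMRF": 0.600},
}

for fragment, row in table.items():
    print(fragment, supporting_callers(row), majority_call(row))
```

A vote across callers is only one possible way to form an ensemble; in practice the callers report different kinds of statistics, so their scores may need to be put on a comparable scale first.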
Fig. 1

HiView snapshot of GWAS variant rs1447295. The left and right light blue bars highlight the location of GWAS variant rs1447295 and gene MYC, respectively. Using Hi-C data from human IMR90 cells, we observe five paired-end reads spanning between rs1447295 and the transcription start site of gene MYC, while the expected contact frequency is 0.8281. Such long-range chromatin interaction is statistically significant, with a p-value of 0.0016. Therefore, we hypothesize that gene MYC is a potential target of this likely regulatory GWAS variant rs1447295.
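
As a rough plausibility check on the numbers in this caption, the sketch below computes the upper-tail Poisson probability of observing at least five reads when 0.8281 are expected. The Poisson null model is an assumption made here for illustration: the source does not state which peak caller or null distribution produced the reported p-value, although this simple model gives a value close to it. The function name `poisson_upper_tail` is hypothetical.

```python
import math


def poisson_upper_tail(observed: int, expected: float) -> float:
    """Return P(X >= observed) for X ~ Poisson(expected)."""
    # Sum P(X = k) for k = 0 .. observed - 1, then take the complement.
    below = sum(
        math.exp(-expected) * expected**k / math.factorial(k)
        for k in range(observed)
    )
    return 1.0 - below


# Observed and expected counts as given in the Fig. 1 caption.
p_value = poisson_upper_tail(observed=5, expected=0.8281)
print(f"{p_value:.4f}")  # prints 0.0016
```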

Here is an example of using HiView to leverage Hi-C results for the interpretation of GWAS variants. Multiple studies [13, 14] have identified rs1447295 as associated with the risk of prostate cancer. Although rs1447295 has been mapped as an intronic variant in the CASC8 lncRNA, its functional mechanisms are still unknown. Both RegulomeDB [15] and HaploReg [16] identify this variant as an enhancer in multiple cell lines, indicating its potential regulatory role. Using the high-resolution, fragment-level Hi-C data from human IMR90 lung fibroblast cells [10], we observed statistically significant long-range chromatin interactions between rs1447295 and the transcription start site of the MYC gene, with a p-value of 0.0016 (Fig. 1). Therefore, we hypothesized that the MYC gene is a potential target of this likely regulatory GWAS variant rs1447295 [17]. In this work, the Hi-C data and the GWAS variant were collected from different cell types. It would be more informative to integrate Hi-C data and GWAS variants from the same cancer cell line to fully understand the mechanistic relationship. As Hi-C data from more tissues and cell types are generated, we will gain a more comprehensive understanding of tissue- or cell-type-specific target genes.

The HiView interface is implemented using PHP, HTML and Cascading Style Sheets (CSS). Hi-C and GWAS data are stored in a MySQL database on a UNC Linux server. HiView is compatible with Internet Explorer, Chrome and Firefox. HiView also allows users to upload their own Hi-C datasets for customized comparison and visualization.

In summary, we present HiView, a visualization tool that integrates raw Hi-C data and chromatin interactions identified by various peak callers for the interpretation of GWAS variants. HiView is the first GWAS-variant-centered visualization tool for Hi-C data. The resulting one-dimensional view allows close examination of interactions between each GWAS variant and all genes in the region where the variant resides. We believe that HiView will facilitate the interpretation of GWAS variants, particularly the identification of their potential target genes.
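
The one-dimensional, variant-centered view can be sketched as follows: fix the fragment that contains the variant as an anchor, then report the contact count between that anchor and every other fragment overlapping a viewing window. The data layout (a sorted list of fragment intervals and a dictionary of pairwise counts), the toy coordinates and the function names are all hypothetical; this is not HiView's PHP/MySQL implementation.

```python
from bisect import bisect_right
from typing import Dict, List, Tuple

Fragment = Tuple[int, int]  # (start, end), half-open, sorted, non-overlapping


def fragment_index(fragments: List[Fragment], position: int) -> int:
    """Return the index of the fragment containing position."""
    starts = [start for start, _ in fragments]
    i = bisect_right(starts, position) - 1
    if i < 0 or position >= fragments[i][1]:
        raise ValueError(f"position {position} is not inside any fragment")
    return i


def anchored_profile(
    fragments: List[Fragment],
    contacts: Dict[Tuple[int, int], int],
    variant_pos: int,
    window: int,
) -> List[Tuple[Fragment, int]]:
    """Contact counts between the variant's fragment and each other fragment
    overlapping [variant_pos - window, variant_pos + window]."""
    anchor = fragment_index(fragments, variant_pos)
    lo, hi = variant_pos - window, variant_pos + window
    profile = []
    for j, (start, end) in enumerate(fragments):
        if j == anchor or end <= lo or start > hi:
            continue  # skip the anchor itself and fragments outside the window
        key = (min(anchor, j), max(anchor, j))  # counts are stored once per pair
        profile.append(((start, end), contacts.get(key, 0)))
    return profile


# Toy data: four fragments and read counts for three fragment pairs.
fragments = [(0, 100), (100, 250), (250, 400), (400, 900)]
contacts = {(0, 1): 7, (1, 2): 3, (1, 3): 5}
print(anchored_profile(fragments, contacts, variant_pos=120, window=200))
# prints [((0, 100), 7), ((250, 400), 3)]
```

Anchoring on a single fragment reduces the two-dimensional contact matrix to one row, which is what makes it possible to draw the interactions as a track alongside gene annotation.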

Availability and requirements

Project name: HiView. Project home page: http://www.unc.edu/~yunmli/HiView. Operating system(s): Platform independent. Programming language: PHP, HTML and Cascading Style Sheets (CSS). Other requirements: a web browser such as Internet Explorer, Chrome or Firefox. License: GNU GPL (version 3, 06/29/2007). Any restriction to use by non-academics: none.

Availability of supporting data

Original raw data used in Fig. 1, Additional file 1: Figures S1 and S2 were retrieved from the NCBI Gene Expression Omnibus repository (GSE43070: http://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE43070).

References (10 of 16 listed)

1.  A hidden Markov random field-based Bayesian method for the detection of long-range chromosomal interactions in Hi-C data.

Authors:  Zheng Xu; Guosheng Zhang; Fulai Jin; Mengjie Chen; Terrence S Furey; Patrick F Sullivan; Zhaohui Qin; Ming Hu; Yun Li
Journal:  Bioinformatics       Date:  2015-11-04       Impact factor: 6.937

2.  Potential etiologic and functional implications of genome-wide association loci for human diseases and traits.

Authors:  Lucia A Hindorff; Praveen Sethupathy; Heather A Junkins; Erin M Ramos; Jayashri P Mehta; Francis S Collins; Teri A Manolio
Journal:  Proc Natl Acad Sci U S A       Date:  2009-05-27       Impact factor: 11.205

3.  Gene regulation in the third dimension.

Authors:  Job Dekker
Journal:  Science       Date:  2008-03-28       Impact factor: 47.728

4.  Capturing chromosome conformation.

Authors:  Job Dekker; Karsten Rippe; Martijn Dekker; Nancy Kleckner
Journal:  Science       Date:  2002-02-15       Impact factor: 47.728

5.  Genome-wide association study identifies a second prostate cancer susceptibility variant at 8q24.

Authors:  Julius Gudmundsson; Patrick Sulem; Andrei Manolescu; Laufey T Amundadottir; Daniel Gudbjartsson; Agnar Helgason; Thorunn Rafnar; Jon T Bergthorsson; Bjarni A Agnarsson; Adam Baker; Asgeir Sigurdsson; Kristrun R Benediktsdottir; Margret Jakobsdottir; Jianfeng Xu; Thorarinn Blondal; Jelena Kostic; Jielin Sun; Shyamali Ghosh; Simon N Stacey; Magali Mouy; Jona Saemundsdottir; Valgerdur M Backman; Kristleifur Kristjansson; Alejandro Tres; Alan W Partin; Marjo T Albers-Akkers; Javier Godino-Ivan Marcos; Patrick C Walsh; Dorine W Swinkels; Sebastian Navarrete; Sarah D Isaacs; Katja K Aben; Theresa Graif; John Cashy; Manuel Ruiz-Echarri; Kathleen E Wiley; Brian K Suarez; J Alfred Witjes; Mike Frigge; Carole Ober; Eirikur Jonsson; Gudmundur V Einarsson; Jose I Mayordomo; Lambertus A Kiemeney; William B Isaacs; William J Catalona; Rosa B Barkardottir; Jeffrey R Gulcher; Unnur Thorsteinsdottir; Augustine Kong; Kari Stefansson
Journal:  Nat Genet       Date:  2007-04-01       Impact factor: 38.330

6.  Comprehensive mapping of long-range interactions reveals folding principles of the human genome.

Authors:  Erez Lieberman-Aiden; Nynke L van Berkum; Louise Williams; Maxim Imakaev; Tobias Ragoczy; Agnes Telling; Ido Amit; Bryan R Lajoie; Peter J Sabo; Michael O Dorschner; Richard Sandstrom; Bradley Bernstein; M A Bender; Mark Groudine; Andreas Gnirke; John Stamatoyannopoulos; Leonid A Mirny; Eric S Lander; Job Dekker
Journal:  Science       Date:  2009-10-09       Impact factor: 47.728

7.  Annotation of functional variation in personal genomes using RegulomeDB.

Authors:  Alan P Boyle; Eurie L Hong; Manoj Hariharan; Yong Cheng; Marc A Schaub; Maya Kasowski; Konrad J Karczewski; Julie Park; Benjamin C Hitz; Shuai Weng; J Michael Cherry; Michael Snyder
Journal:  Genome Res       Date:  2012-09       Impact factor: 9.043

8.  HaploReg: a resource for exploring chromatin states, conservation, and regulatory motif alterations within sets of genetically linked variants.

Authors:  Lucas D Ward; Manolis Kellis
Journal:  Nucleic Acids Res       Date:  2011-11-07       Impact factor: 16.971

9.  Topological domains in mammalian genomes identified by analysis of chromatin interactions.

Authors:  Jesse R Dixon; Siddarth Selvaraj; Feng Yue; Audrey Kim; Yan Li; Yin Shen; Ming Hu; Jun S Liu; Bing Ren
Journal:  Nature       Date:  2012-04-11       Impact factor: 49.962

10.  A high-resolution map of the three-dimensional chromatin interactome in human cells.

Authors:  Fulai Jin; Yan Li; Jesse R Dixon; Siddarth Selvaraj; Zhen Ye; Ah Young Lee; Chia-An Yen; Anthony D Schmitt; Celso A Espinoza; Bing Ren
Journal:  Nature       Date:  2013-10-20       Impact factor: 49.962

