Wujuan Zhong, Weifang Liu, Jiawen Chen, Quan Sun, Ming Hu, Yun Li.
Abstract
Genome-wide association studies (GWAS) have identified a vast number of variants associated with complex human diseases and traits. However, most of these GWAS variants reside in non-coding regions, which do not encode proteins, making their interpretation a daunting challenge. Prior evidence indicates that a subset of non-coding variants located within or near cis-regulatory elements (e.g., promoters, enhancers, silencers, and insulators) might play a key role in disease etiology by regulating gene expression. Advanced sequencing- and imaging-based technologies, together with powerful computational methods, enable comprehensive characterization of regulatory DNA interactions and have substantially improved our understanding of three-dimensional (3D) genome architecture. The recent literature offers many examples in which chromosome conformation capture (3C)-based technologies successfully link non-coding variants to their target genes and prioritize relevant tissues or cell types. These examples illustrate the critical role of 3D genome organization in annotating non-coding GWAS variants. This review discusses how 3D genome organization information contributes to elucidating the potential roles of non-coding GWAS variants in disease etiology.
Keywords: 3D genome organization; FIREs; GWAS variants; Hi-C; TADs; chromatin interactions; non-coding DNA variation
Year: 2022 PMID: 36060805 PMCID: PMC9437546 DOI: 10.3389/fcell.2022.957292
Source DB: PubMed Journal: Front Cell Dev Biol ISSN: 2296-634X
FIGURE 1. Illustrations of sequencing- and microscopy-based methods. (A) [Adapted from Figure 1A in (Fang et al., 2016)] The sequencing-based PLAC-seq method captures chromatin interactions mediated by a protein of interest; (B) [Adapted from Figure 1A in (Su et al., 2020)] The microscopy-based DNA MERFISH method allows multiplexed genome-scale imaging. Each square to the left of the arrow represents one round of imaging, and each circle represents one imaged locus. In each round, multiple loci are imaged simultaneously. After many rounds, genome-scale imaging is obtained. Note that the number of rounds required to image a given number of loci is inversely proportional to the number of loci imaged simultaneously, so far fewer rounds are needed than with a sequential imaging strategy in which only one locus is imaged per round.
Review papers and collections of computational approaches for chromatin interactions and domains.
| Title | Category | Description | Year | References |
|---|---|---|---|---|
| A critical assessment of topologically associating domain prediction tools | TADs | Compared seven TAD calling methods | 2017 | |
| Comparison of computational methods for Hi-C data analysis | TADs and chromatin interactions | Compared seven TAD calling methods and six chromatin interaction callers | 2017 | |
| Comparison of computational methods for the identification of topologically associating domains | TADs | Compared 20 TAD calling methods | 2018 | |
| Computational methods for analyzing genome-wide chromosome conformation capture data | General pipeline | Reviewed pipelines and methods for 3C-based data | 2018 | |
| Computational methods for assessing chromatin hierarchy | General pipeline | Reviewed computational tools for assessing chromatin hierarchy | 2018 | |
| Computational methods for analyzing and modeling genome structure and organization | General pipeline | Reviewed analytic and modeling techniques for 3C-based methods | 2018 | |
| Hi-C analysis: from data generation to integration | General pipeline | Reviewed methods for Hi-C data analysis | 2019 | |
| Comparison of computational methods for 3D genome analysis at single-cell Hi-C level | General pipeline | Compared the performance of Hi-C methods on ultra-sparse Hi-C data | 2020 | |
| Computational methods for the prediction of chromatin interaction and organization using sequence and epigenomic profiles | Prediction | Summarized 48 computational methods for predicting chromatin interactions and spatial organization features | 2021 | |
| A comparison of topologically associating domain callers over mammals at high resolution | TADs | Compared 27 TAD calling methods | 2022 | |
| A comparison of topologically associating domain callers based on Hi-C data | TADs | Compared 26 TAD calling methods | 2022 | |
| Bacon: a comprehensive computational benchmarking framework for evaluating targeted chromatin conformation capture-specific methodologies | Chromatin interactions | Benchmarked 12 computational pipelines for HiChIP/PLAC-seq and/or ChIA-PET data | 2022 | |
| Hi-C data analysis tools and papers | General pipeline | A collection of Hi-C tools and papers | Accessed on 05/27/2022 | |
| 4DN Software | General pipeline | A collection of data analysis and visualization tools for studying the 3D genome | Accessed on 05/27/2022 | |
FIGURE 2. Different types of TAD boundary alteration and the EPHA4 example. (A) Wild type (WT), removal, inversion, and duplication of a TAD boundary. (B) The normal state of TAD boundaries at the EPHA4 locus. (C) In F-syndrome patients carrying an inversion variant, aberrant TAD boundaries were observed at the EPHA4 locus. The enhancer and TAD boundary to the left of EPHA4 are inverted, resulting in repression of EPHA4 expression and activation of WNT6 expression.
FIGURE 3. Triglycerides GWAS signals near a liver-specific FIRE region. (A) LocusZoom plot of GWAS results for triglycerides (Willer et al., 2013). (B) FIRE scores across 21 human cell lines and primary tissues examined in Schmitt et al. Each color represents a tissue or cell line. GM12878: the GM12878 lymphoblastoid cell line (LCL), H1: the H1 human embryonic stem cell line, IMR90: the IMR90 human lung fibroblast cell line, MES: the human mesendoderm cell line, MSC: the human mesenchymal stem cell line, NPC: the human neural progenitor cell line, TRO: the human trophoblast-like cell line, AD: the human adrenal gland tissue, AO: the human aorta tissue, BL: the human bladder tissue, CO: the human dorsolateral prefrontal cortex tissue, HC: the human hippocampus tissue, LG: the human lung tissue, LI: the human liver tissue, LV: the human left ventricle tissue, OV: the human ovary tissue, PA: the human pancreas tissue, PO: the human psoas muscle tissue, RV: the human right ventricle tissue, SB: the human small bowel tissue, SX: the human spleen tissue.
FIGURE 4. (A) Chromatin interaction between rs12740374, an LDL GWAS variant, and the promoter of the SORT1 gene, reported by Fulco et al. (2019); (B) Virtual 4C plot from HUGIn (Martin et al., 2017) for the same region as in Panel A, showing a significant chromatin interaction in human liver tissue between the anchor bin harboring rs12740374 (gray highlight) and the promoter of the SORT1 gene (green highlight). The top panel shows gene expression levels, and the bottom panel includes three lines quantifying chromatin interactions between the anchor bin and all other bins in the region: the black line denotes the observed counts, the red line denotes the expected counts, and the blue line denotes the -log10 (p value).
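The three virtual 4C tracks in Panel B (observed counts, expected counts, and -log10 p value) can be related with a short sketch. It assumes, purely for illustration, that each bin's observed count is tested against its expected count with a one-sided Poisson test; this is not necessarily the model behind the HUGIn tracks, and the function names and bin counts below are hypothetical.

```python
import math


def poisson_upper_tail(observed, expected):
    """P(X >= observed) for X ~ Poisson(expected), summed directly over the tail."""
    if expected <= 0:
        raise ValueError("expected count must be positive")
    if observed <= 0:
        return 1.0
    i = observed
    # Poisson probability mass at i, computed in log space for stability.
    term = math.exp(-expected + i * math.log(expected) - math.lgamma(i + 1))
    total = 0.0
    while True:
        total += term
        i += 1
        term *= expected / i  # recurrence: pmf(i) = pmf(i - 1) * mu / i
        if term <= total * 1e-15:
            break
    return min(total, 1.0)


def neg_log10_p(observed, expected):
    """-log10 of the one-sided Poisson p value (assumed model, see lead-in)."""
    p = poisson_upper_tail(observed, expected)
    return math.inf if p == 0.0 else abs(math.log10(p))


# Hypothetical bins around an anchor: (label, observed count, expected count).
profile = [("bin_1", 6, 5.0), ("bin_2", 30, 5.0), ("bin_3", 4, 4.5)]
for label, observed, expected in profile:
    print(label, observed, expected, round(neg_log10_p(observed, expected), 2))
```

A bin whose observed count greatly exceeds its expected count, like the hypothetical `bin_2`, produces a peak in the -log10 p value track, which is how a candidate interaction with the anchor bin stands out in such a plot.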
FIGURE 5. eSCAN workflow. (A) eSCAN takes genotype and phenotype data as well as a list of predefined enhancer regions (En1-En6 in the illustration) as input. (B) Aggregation-based association tests are performed in the enhancer-screening step to identify significant enhancer(s). In this illustration, En2 (green), En3 (yellow), and En6 (turquoise) are deemed significant. (C) eSCAN performs dynamic sliding window scanning within the significant enhancer region(s) to further narrow down the associated region. For example, En2* is the associated sub-region within En2 after narrowing down via dynamic scanning. The same applies to En3* and En6*.
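The two-step logic in the caption (screen enhancers, then narrow down within the significant ones) can be illustrated with a toy sketch. It is not the eSCAN implementation: a simple burden statistic (correlation between summed allele counts and phenotype) with a normal approximation stands in for the aggregation tests, an exhaustive enumeration of windows stands in for the dynamic scanning, and all names and simulated data are hypothetical.

```python
import math
import random


def pearson_r(x, y):
    """Pearson correlation; returns 0.0 when either input is constant."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return 0.0 if sxx == 0 or syy == 0 else sxy / math.sqrt(sxx * syy)


def burden_stat(genotypes, phenotype, start, end):
    """Statistic for the summed allele count over variant columns [start, end)."""
    burden = [sum(row[start:end]) for row in genotypes]
    r = pearson_r(burden, phenotype)
    return r * math.sqrt((len(phenotype) - 2) / max(1.0 - r * r, 1e-12))


def two_sided_p(stat):
    """Two-sided p value under a standard normal approximation (assumption)."""
    return math.erfc(abs(stat) / math.sqrt(2))


def screen_and_scan(genotypes, phenotype, enhancers, alpha=0.05):
    """Step 1: Bonferroni screening per enhancer. Step 2: best window inside hits."""
    threshold = alpha / len(enhancers)
    narrowed = {}
    for name, (start, end) in enhancers.items():
        if two_sided_p(burden_stat(genotypes, phenotype, start, end)) >= threshold:
            continue  # enhancer not significant, skip the scanning step
        windows = [(s, e) for s in range(start, end) for e in range(s + 1, end + 1)]
        narrowed[name] = max(
            windows, key=lambda w: abs(burden_stat(genotypes, phenotype, w[0], w[1]))
        )
    return narrowed


# Simulated data: 200 samples, 14 variants; variants 4 and 5 drive the phenotype.
rng = random.Random(0)
genotypes = [[(rng.random() < 0.3) + (rng.random() < 0.3) for _ in range(14)]
             for _ in range(200)]
phenotype = [row[4] + row[5] + rng.gauss(0, 0.1) for row in genotypes]
enhancers = {"En1": (0, 4), "En2": (4, 10), "En3": (10, 14)}
narrowed = screen_and_scan(genotypes, phenotype, enhancers)
print(narrowed)
```

In this simulation the hypothetical enhancer `En2` spans variant columns 4 to 9, and the narrowed window plays the role of En2* in the figure.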
FIGURE 6. Cell deconvolution methods take bulk Hi-C contact matrices as input to infer cell-type proportion in each sample and cell-type-specific profiles.
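A minimal sketch of the mixture idea behind deconvolution follows. It makes a strong simplifying assumption that the methods in the figure do not require: the cell-type-specific contact profiles are already known, only two cell types are present, and the bulk matrix is their weighted sum, so that only the proportion has to be estimated (here by least squares). All function names and matrices are hypothetical.

```python
def upper_triangle(matrix):
    """Flatten the upper triangle (including diagonal) of a symmetric matrix."""
    n = len(matrix)
    return [matrix[i][j] for i in range(n) for j in range(i, n)]


def mix(profile_a, profile_b, proportion_a):
    """Simulate a bulk matrix as proportion_a * A + (1 - proportion_a) * B."""
    return [
        [proportion_a * a + (1 - proportion_a) * b for a, b in zip(row_a, row_b)]
        for row_a, row_b in zip(profile_a, profile_b)
    ]


def estimate_proportion(bulk, profile_a, profile_b):
    """Least-squares estimate of p in bulk ~ p * A + (1 - p) * B, clipped to [0, 1]."""
    x = upper_triangle(bulk)
    a = upper_triangle(profile_a)
    b = upper_triangle(profile_b)
    diff = [ai - bi for ai, bi in zip(a, b)]
    denom = sum(d * d for d in diff)
    if denom == 0:
        raise ValueError("reference profiles are identical; proportion is not identifiable")
    p = sum((xi - bi) * d for xi, bi, d in zip(x, b, diff)) / denom
    return min(max(p, 0.0), 1.0)


# Hypothetical 3-bin contact profiles for two cell types.
cell_type_a = [[10.0, 4.0, 1.0], [4.0, 8.0, 3.0], [1.0, 3.0, 9.0]]
cell_type_b = [[6.0, 1.0, 2.0], [1.0, 12.0, 5.0], [2.0, 5.0, 7.0]]
bulk = mix(cell_type_a, cell_type_b, 0.3)
estimated = estimate_proportion(bulk, cell_type_a, cell_type_b)
print(estimated)
```

Methods that also infer the cell-type-specific profiles, as the caption describes, must solve a harder problem in which both factors of the mixture are unknown.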