Guangyu Wang, Qingshu Meng, Bo Xia, Shuo Zhang, Jie Lv, Dongyu Zhao, Yanqiang Li, Xin Wang, Lili Zhang, John P Cooke, Qi Cao, Kaifu Chen.
Abstract
We present TADsplimer, the first computational tool to systematically detect topologically associating domain (TAD) splits and mergers across the genome between Hi-C samples. TADsplimer recaptures splits and mergers of TADs with high accuracy in simulation analyses and defines hundreds of TAD splits and mergers between pairs of different cell types, such as endothelial cells and fibroblasts. Our work reveals a key role for TAD remodeling in epigenetic regulation of transcription and delivers the first tool for the community to perform dynamic analysis of TAD splits and mergers in numerous biological and disease models.
Keywords: Bioinformatics; Chromatin conformation; Computational biology; Epigenomics; Hi-C; Histone modification; Topologically associating domains
Year: 2020 PMID: 32241291 PMCID: PMC7114812 DOI: 10.1186/s13059-020-01992-7
Source DB: PubMed Journal: Genome Biol ISSN: 1474-7596 Impact factor: 13.583
Fig. 1 Develop the TADsplimer algorithm to detect TAD splits and mergers with high accuracy. a–d Heatmaps showing the chromatin interactions in a fibroblast TAD that was split in HUVEC (a, b) and in a HUVEC TAD that was split in fibroblast (c, d). In each heatmap, the top right triangle shows data for IMR90 fibroblasts, and the bottom left triangle shows data for HUVEC. HUVEC data generated by the same lab for 3 donors are shown in the 3 heatmaps in a and c. HUVEC data generated by an additional lab are shown in b and d. All heatmaps in a and b show the same genomic region, whereas all heatmaps in c and d show another genomic region. The blue circle indicates chromatin loops that were not disrupted by the TAD splits. Color scales for each heatmap are indicated in the top right and bottom left corners. e Cartoons showing steps I to IV for TAD identification in TADsplimer. f Cartoons showing the two steps to define a TAD split or merger in one sample relative to another sample in TADsplimer. g ROC curve showing the performance of four alternative methods in TADsplimer for scoring TAD splits. h ROC curve showing the influence of five TAD identification methods on the detection of TAD splits. i Heatmaps showing the simulated frequency of chromatin interaction at a sequencing depth of 400 million (top) or 25 million (bottom) reads. j ROC curve distance to the top left corner plotted against Hi-C sequencing depth to show the performance of the four alternative methods in TADsplimer for scoring TAD splits. k ROC curve distance to the top left corner plotted against Hi-C sequencing depth to show the influence of the five TAD identification methods on the detection of TAD splits.
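Panels j and k of Fig. 1 summarize each ROC curve by its distance to the top left corner, the point where the false positive rate is 0 and the true positive rate is 1. The record does not state how this distance was computed. The Python sketch below assumes the common definition, the minimum Euclidean distance from any point on the curve to (0, 1), so smaller values mean better performance. The function names and example scores are hypothetical and are not taken from TADsplimer.

```python
import math


def roc_points(scores, labels):
    """Return (FPR, TPR) points from sweeping a threshold over the scores.

    scores: higher means "more likely a true TAD split" (assumption).
    labels: 1 for a true split, 0 otherwise.
    """
    pos = sum(labels)
    neg = len(labels) - pos
    if pos == 0 or neg == 0:
        raise ValueError("need at least one positive and one negative label")

    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    points = [(0.0, 0.0)]
    tp = fp = 0
    i = 0
    while i < len(ranked):
        threshold = ranked[i][0]
        # Consume all candidates tied at this score before emitting a point.
        while i < len(ranked) and ranked[i][0] == threshold:
            if ranked[i][1]:
                tp += 1
            else:
                fp += 1
            i += 1
        points.append((fp / neg, tp / pos))
    return points


def distance_to_top_left(points):
    """Minimum Euclidean distance from the ROC points to the corner (0, 1)."""
    return min(math.hypot(fpr, 1.0 - tpr) for fpr, tpr in points)


# Hypothetical scores for four candidate split sites, two of them true splits.
example = roc_points([0.9, 0.8, 0.2, 0.1], [1, 1, 0, 0])
print(distance_to_top_left(example))
```

Computing this value once per simulated sequencing depth gives one point per depth, which is the kind of curve plotted in panels j and k.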
Fig. 2 TADsplimer successfully detected TAD splits. a Chromosome map showing the genomic locations of fibroblast (IMR90) TADs that were split in HUVEC (blue) and HUVEC TADs that were split in IMR90 (red). b Heatmaps showing the average frequency of chromatin interaction in 6 aggregates: merged in HUVEC (top left) and split in IMR90 (top right), split in HUVEC (middle left) and merged in IMR90 (middle right), and all adjoint TADs in HUVEC (bottom left) and IMR90 (bottom right). c Violin plot of TAD sizes for merged, split, and regular TADs from HUVEC and IMR90 cells. d Boxplots showing the binding frequency of CTCF at individual groups of TAD boundaries. e Heatmaps showing the number of split TADs between cell types. f Boxplots showing the Jaccard index of split TADs between replicates. Results are plotted for individual TAD split identification methods. g Boxplots showing the Jaccard index of identified TADs between replicates. Results are plotted for individual TAD calling methods. h Enrichment of representative pathways in genes associated with split or merged TADs. P value was determined by Fisher’s exact test and adjusted by the B-H method. i Heatmaps and barplot showing the enriched pathways and the number of enriched pathways, respectively, for genes in split and merged TADs defined by alternative methods. P value was determined by Fisher’s exact test and adjusted by the B-H method.
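Panels f and g of Fig. 2 use the Jaccard index to measure how well calls agree between replicates. As a minimal sketch, the Python below computes the index for two sets of split sites. It assumes each site is represented as a (chromosome, bin index) pair at a fixed resolution and that two sites match only when the pair is identical. The record does not describe the matching rule actually used, and the example coordinates are hypothetical.

```python
def jaccard_index(calls_a, calls_b):
    """Jaccard index |A ∩ B| / |A ∪ B| between two collections of calls.

    Returns 0.0 when both collections are empty (a convention chosen here,
    not something specified in the source).
    """
    a, b = set(calls_a), set(calls_b)
    union = a | b
    if not union:
        return 0.0
    return len(a & b) / len(union)


# Hypothetical split sites called in two replicates, as (chromosome, bin).
replicate_1 = {("chr1", 10), ("chr1", 25), ("chr2", 7)}
replicate_2 = {("chr1", 10), ("chr2", 7), ("chr3", 4)}
print(jaccard_index(replicate_1, replicate_2))
```

Exact bin matching is the strictest choice. Allowing a tolerance of a few bins around each site would raise the index and is a design decision to make explicit when comparing methods.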
Fig. 3 TAD splits and mergers are associated with changes in chromosome state. a Heatmaps of chromatin interactions determined by Hi-C (top panels) and Genome Browser tracks of ChIP-Seq signal for histone modifications (bottom panels) in HUVEC and IMR90. Vertical dashed lines indicate TAD split sites. Color scales for each heatmap are indicated in the top right and bottom left corners. b Boxplot showing the difference in each histone modification between the two sides of TAD boundaries in HUVEC and IMR90 cells. c Heatmaps of chromatin interactions determined by Hi-C and DNase-Seq signal around a TAD split site at five stages of T cell lineage specification. d Heatmap of TAD split score (top panels) and fold difference of DNase-Seq signal between the two sides (bottom panels) of individual TAD merge sites (left panels) and split sites (right panels). e, f Percentage of TAD split sites associated with each category of histone modification change between IMR90 and HUVEC (e) or DNase-Seq signal change across the 8 stages of T cell lineage specification (f). “↑,” “↓,” and “−” denote increase, decrease, and no change of a histone modification or DNase-Seq signal at one side of the split site in response to the splitting. Each category of change is defined by changes at the two sides of the split site. For T cell lineage specification, the TAD splits are defined between DP and HSPC cells.
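Panels e and f of Fig. 3 assign each split site a category built from the change (“↑,” “↓,” or “−”) on each of its two sides. The Python sketch below shows one way such a category could be derived. The fold-change threshold of 1.5, the pseudocount of 1.0, and the function names are assumptions made for illustration; the record gives no thresholds.

```python
UP, DOWN, SAME = "↑", "↓", "−"


def side_change(before, after, fold=1.5, pseudocount=1.0):
    """Classify the signal change on one side of a split site.

    before/after: signal (e.g. a histone mark or DNase-Seq) before and after
    the split. fold and pseudocount are illustrative defaults (assumptions).
    """
    ratio = (after + pseudocount) / (before + pseudocount)
    if ratio >= fold:
        return UP
    if ratio <= 1.0 / fold:
        return DOWN
    return SAME


def split_site_category(left_before, left_after, right_before, right_after,
                        fold=1.5, pseudocount=1.0):
    """Combine the per-side changes into one category such as '↑/↓'."""
    left = side_change(left_before, left_after, fold, pseudocount)
    right = side_change(right_before, right_after, fold, pseudocount)
    return left + "/" + right


# Hypothetical signals: the left side gains signal, the right side loses it.
print(split_site_category(1.0, 5.0, 5.0, 1.0))
```

Counting how many split sites fall into each combined category, then dividing by the total number of sites, yields percentages of the kind shown in panels e and f.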
Fig. 4 TAD splits and mergers are associated with changes in gene expression. a Boxplot indicating the expression level of genes at the two sides of TAD split sites before and after splitting. b Arc plot showing the chromatin interactions (top) and gene expression (bottom) at 5 stages of T cell lineage specification. One gene from each side of the split site is indicated. c Heatmaps of split scores for individual split sites (top panels) and fold difference of gene expression between the two sides of the split site (bottom panels) at 8 stages of T cell lineage specification. Genes with an expression value (FPKM) larger than 1 were analyzed.
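Panel c of Fig. 4 reports the fold difference of gene expression between the two sides of a split site, restricted to genes with FPKM larger than 1. The Python sketch below illustrates that calculation under stated assumptions: the filter is applied per gene within a single sample, each side is summarized by its mean FPKM, and the result is reported as a log2 ratio. The record specifies none of these details, and the gene names are hypothetical.

```python
import math


def expressed(genes, min_fpkm=1.0):
    """Keep genes whose FPKM is strictly larger than min_fpkm."""
    return {name: fpkm for name, fpkm in genes.items() if fpkm > min_fpkm}


def log2_fold_difference(left_genes, right_genes, min_fpkm=1.0):
    """log2 ratio of mean FPKM between the left and right side of a split site.

    Returns None when either side has no gene passing the FPKM filter.
    """
    left = expressed(left_genes, min_fpkm)
    right = expressed(right_genes, min_fpkm)
    if not left or not right:
        return None
    mean_left = sum(left.values()) / len(left)
    mean_right = sum(right.values()) / len(right)
    return math.log2(mean_left / mean_right)


# Hypothetical FPKM values for genes on each side of one split site.
left_side = {"GENE_A": 8.0, "GENE_B": 0.5}   # GENE_B is filtered out
right_side = {"GENE_C": 2.0}
print(log2_fold_difference(left_side, right_side))
```

Repeating the calculation for each stage of T cell lineage specification gives one value per split site and stage, which matches the layout of the bottom panels in c.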