Stephan Busche, Xiaojian Shao, Maxime Caron, Tony Kwan, Fiona Allum, Warren A Cheung, Bing Ge, Susan Westfall, Marie-Michelle Simon, Amy Barrett, Jordana T Bell, Mark I McCarthy, Panos Deloukas, Mathieu Blanchette, Guillaume Bourque, Timothy D Spector, Mark Lathrop, Tomi Pastinen, Elin Grundberg.
Abstract
BACKGROUND: CpG methylation variation is involved in human trait formation and disease susceptibility. Analyses within populations have been biased towards CpG-dense regions through the application of targeted arrays. We generate whole-genome bisulfite sequencing data for approximately 30 adipose and blood samples from monozygotic and dizygotic twins for the characterization of non-genetic and genetic effects at single-site resolution.
Year: 2015 PMID: 26699896 PMCID: PMC4699357 DOI: 10.1186/s13059-015-0856-1
Source DB: PubMed Journal: Genome Biol ISSN: 1474-7596 Impact factor: 13.583
Fig. 1 Comparison of sequencing-derived and array-derived methylation data. 450 K methylation array data were available for adipose tissue, and methylation values for CpG-sites jointly interrogated by both whole-genome bisulfite sequencing and the array were extracted for each individual. Shown are Pearson’s correlation coefficients derived from comparing array to sequencing methylation data for jointly interrogated CpG-sites for (a) samples at indicated mean genome coverage (black diamonds), with blue crosses indicating the number of jointly detected sites per sample, (b) CpG-sites at indicated coverage across all samples, (c) CpG-sites displaying 0 to 20 % (blue crosses), >20 to 40 % (red diamonds), >40 to 60 % (green points), or >60 to 100 % (purple triangles) methylation difference between forward and reverse strands at indicated sequencing coverage
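The comparison described in Fig. 1 can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the dictionary layouts, the function names `pearson_r` and `joint_site_correlation`, and the `min_coverage` parameter are assumptions made for illustration. Sequencing methylation is taken as methylated reads divided by total reads, and only sites present in both datasets are correlated.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson's correlation coefficient for two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for constant input")
    return sxy / sqrt(sxx * syy)


def joint_site_correlation(wgbs, array, min_coverage=1):
    """Correlate sequencing and array methylation at jointly detected sites.

    wgbs  : dict mapping site id -> (methylated_reads, total_reads)  [assumed layout]
    array : dict mapping site id -> array methylation value in [0, 1] [assumed layout]
    Returns (r, number_of_joint_sites).
    """
    seq_values, array_values = [], []
    for site, (methylated, total) in wgbs.items():
        # Keep only sites interrogated by both platforms at sufficient coverage.
        if site in array and total >= min_coverage:
            seq_values.append(methylated / total)
            array_values.append(array[site])
    return pearson_r(seq_values, array_values), len(seq_values)
```

Calling `joint_site_correlation` repeatedly with increasing `min_coverage` would give one correlation per coverage threshold, which is the kind of series plotted in panel (b).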
Fig. 2 DNA methylation footprint in adipose tissue and blood. Datasets were merged within tissues, keeping only sites covered ≥12-fold, and the MethylseekR software [26] was employed to identify unmethylated and low-methylated footprints in adipose and blood. Shown are Venn diagrams describing tissue-specificity of identified (a) unmethylated regions (UMR) and (b) low-methylated regions (LMR). For overlapping areas, numbers indicate percentage of overlap based on adipose tissue and blood as indicated. (c) Histograms display the number of tissue-shared and adipose-specific UMRs overlapping with ranked H3K4me3 bins (see text; top 5 % are shown). (d) Adipose LMRs and adipose-specific UMRs overlapped with ranked H3K4me1 bins (top 5 % are shown)
Fig. 3 Impact of confounding sequence variants on differential methylation, and population differentially methylated CpGs (pDMCs) as a percentage of total CpGs. (a) Significant DMCs (Fisher’s exact test p < 0.05), determined in filtered datasets from which blacklisted regions had been removed but which still contained SNPs, were overlapped with dbSNP137-annotated variants for all samples. Shown is the percentage of DMCs overlapping with annotated SNPs in indicated differential methylation increments. (b) The bar plot displays pDMCs as a percentage of total detected CpGs for adipose and blood in variant-removed datasets. Shared pDMCs are indicated in purple
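The Fisher's exact test threshold named in Fig. 3 can be illustrated with a minimal Python sketch using only the standard library. The 2×2 table layout (methylated and unmethylated read counts at one CpG in two samples) and the function names `fisher_exact_two_sided` and `is_dmc` are assumptions for illustration; the caption itself states only the test and the p < 0.05 cutoff, and the study's additional filters are not reproduced here.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test p-value for the 2x2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of all tables with the same margins
    that are no more probable than the observed table.
    """
    row1, row2 = a + b, c + d
    col1 = a + c
    total = row1 + row2
    denom = comb(total, col1)

    def prob(x):
        # Probability that the top-left cell equals x given fixed margins.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    p_observed = prob(a)
    # Small relative tolerance guards against floating-point ties.
    p_value = sum(
        prob(x)
        for x in range(lowest, highest + 1)
        if prob(x) <= p_observed * (1 + 1e-7)
    )
    return min(1.0, p_value)


def is_dmc(meth_1, unmeth_1, meth_2, unmeth_2, alpha=0.05):
    """Flag a CpG as differentially methylated between two samples.

    Assumed input: methylated / unmethylated read counts per sample.
    """
    return fisher_exact_two_sided(meth_1, unmeth_1, meth_2, unmeth_2) < alpha
```

For example, `is_dmc(10, 0, 0, 10)` compares a fully methylated site in one sample with a fully unmethylated site in the other at 10-fold coverage each.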
Fig. 4 Genomic feature association of adipose-specific population differentially methylated regions (pDMRs), and transcription factor binding site (TFBS) motif analysis for adipose-specific low-methylated pDMRs. We determined pDMRs using a novel algorithm that weights the variance of CpG methylation across individuals as well as the consistency of CpGs within the region (see “Methods”). (a) We associated the top 10 % of adipose-specific pDMRs with genomic features. The fold change of pDMR versus background (all CpGs detected in three or more individuals) is shown for each genomic feature. The red line marks the border between enrichment (relative change > 1) and depletion (relative change < 1). (b) TFBS analysis was carried out for adipose-specific low-methylated pDMRs using the Homer software [33]. Shown are selected TFBS motifs including overall rank in the Homer analysis, p-value, forward consensus binding sequence, associated function, and references. The complete TFBS motif analysis result is available in Additional file 1: Table S6
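One plausible reading of the fold change in Fig. 4a is the fraction of pDMRs overlapping a feature divided by the fraction of background CpGs overlapping the same feature. The Python sketch below assumes that definition; the function name `fold_enrichment` and its count-based inputs are hypothetical, and the exact computation used in the study is described in its Methods, not here.

```python
def fold_enrichment(pdmr_in_feature, pdmr_total, background_in_feature, background_total):
    """Fold change of pDMR overlap versus background overlap for one feature.

    Assumed definition: (pDMR fraction in feature) / (background fraction in feature).
    A value above 1 indicates enrichment, a value below 1 indicates depletion.
    """
    if pdmr_total <= 0 or background_total <= 0:
        raise ValueError("totals must be positive")
    if background_in_feature <= 0:
        raise ValueError("background overlap must be positive")
    pdmr_fraction = pdmr_in_feature / pdmr_total
    background_fraction = background_in_feature / background_total
    return pdmr_fraction / background_fraction
```

With this definition, the red line in panel (a) corresponds to a return value of exactly 1.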
Fig. 5 Characterization of differentially methylated CpGs of non-shared environmental origin (eDMR). (a) eDMRs identified in adipose were associated with genes by overlapping with RefSeq-annotated genes (defined as genic regions including 10 kb downstream of the transcription start site); DeSeq-normalized total stranded RNA-seq data were used to determine fold expression changes for RefSeq genes between subcutaneous adipose tissue and T-cells, monocytes, and B-cells. The number of genes significantly (DeSeq-adjusted p < 0.05) overexpressed in adipose versus each hematopoietic cell type (>1-fold, >4-fold, >16-fold higher expression in adipose) was determined for eDMR-associated genes and for all RefSeq genes, the latter to identify dynamic adipose-specific genes. Shown is the ratio of eDMR-associated versus RefSeq genes averaged across the three hematopoietic cell types corrected for background at indicated fold overexpression levels in adipose. Stars indicate z-score-derived p-values <0.001. (b) Same as (a) but comparing eDMR-associated genes overexpressed in hematopoietic cells to adipose. (c) (top) UCSC-derived scheme of the HSPA12B RefSeq gene locus and CGIs in the gene region (hg19). (bottom) Zoom into HSPA12B gene region overlapping CGI CpG: 121. For each twin within MZ2, methylation levels are shown for each detected CpG site. The trend line was determined using a moving average (period = 2). Methylation values highlighted in black in co-twin 1 indicate significant eDMCs (Fisher’s exact test p < 0.05); the associated eDMR is highlighted in rose. (d) (top) UCSC-derived scheme of the FAM171A2 RefSeq gene locus and CGIs in the gene region (hg19). (bottom) Zoom into FAM171A2 gene region overlapping CGI CpG: 136. Illustration as in (c)
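The trend line in Fig. 5c and 5d is described as a moving average with period 2. A minimal Python sketch of a trailing moving average follows; the function name `moving_average` is hypothetical, and treating the average as trailing (each point averaged with the one before it, in genomic order) is an assumption, since the caption does not state the window alignment.

```python
def moving_average(values, period=2):
    """Trailing moving average over a sequence of methylation levels.

    Each output point is the mean of the current value and the preceding
    (period - 1) values, so the result has len(values) - period + 1 points.
    """
    if period < 1:
        raise ValueError("period must be at least 1")
    return [
        sum(values[i - period + 1 : i + 1]) / period
        for i in range(period - 1, len(values))
    ]
```

Applied to the per-CpG methylation levels of one co-twin, this yields one smoothed value for every CpG from the second detected site onward.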