Michael Scherer, Gilles Gasparoni, Souad Rahmouni, Tatiana Shashkova, Marion Arnoux, Edouard Louis, Arina Nostaeva, Diana Avalos, Emmanouil T Dermitzakis, Yurii S Aulchenko, Thomas Lengauer, Paul A Lyons, Michel Georges, Jörn Walter.
Abstract
BACKGROUND: Understanding the influence of genetic variants on DNA methylation is fundamental for the interpretation of epigenomic data in the context of disease. There is a need for systematic approaches not only for determining methylation quantitative trait loci (methQTL), but also for discriminating general from cell type-specific effects.
Keywords: Computational biology; DNA methylation; Quantitative trait loci; Tissue specificity
Year: 2021 PMID: 34530905 PMCID: PMC8444396 DOI: 10.1186/s13072-021-00415-6
Source DB: PubMed Journal: Epigenetics Chromatin ISSN: 1756-8935 Impact factor: 4.954
Fig. 1 Cell type-specific DNA methylation patterns in the discovery data set. A Heatmap (blue low, red high DNA methylation levels) of the 1000 most variably methylated genome-wide bins of size 5 kb. Hierarchical clustering of samples and bins was performed using Euclidean distance and complete linkage. B PCA plot of genome-wide DNA methylation data at the single-CpG level. The first two principal components are displayed. C Boxplots depicting the distributions of LUMP estimates for the overall immune cell content of the different cell types/tissues. The P-value was computed using a two-sided t-test
Fig. 2 Overview of MAGAR and methQTL results. A MAGAR is an R-package utilizing a two-stage protocol. After data import via established software packages, CpGs are clustered into CpG-correlation blocks in a four-step procedure. In the second stage, methQTLs are called for each correlation block separately. B Number of methQTLs identified by MAGAR for T cells, B cells, ileum, and rectum samples. Overlap between the methQTLs identified per tissue/cell type with methQTLs identified in the validation cohort (C) and in published methQTLs from blood [12] and fetal brain samples [38] (D). The methQTLs were reduced to those methQTLs affecting CpGs present on the 450k microarray
Fig. 3 Common and tissue-specific methQTLs identified through colocalization analysis. A To define tissue specificity, we employed MAGAR on the four tissues/cell types independently to obtain methQTL statistics. These were used in pairwise colocalization analyses to define common and tissue-specific methQTL, as well as methQTLs shared across several tissues. B Number of tissue-specific methQTLs per tissue and methQTLs shared across different tissues according to the colocalization analysis. Common methQTLs were shared according to the colocalization analysis and had methQTL P-values below the cutoff in all tissues. C Examples of four common methQTLs located in vicinity to PON1, LGR6, LCE3D, and RIBC2
Fig. 4 Validation of methQTL at PON1 locus using ultra-deep bisulfite sequencing. A Bisulfite sequencing read pattern maps for three individuals with genotypes homozygous for the reference allele (AA), heterozygous (AG), or homozygous for the alternative allele (GG) for B cells and T cells, respectively. Each line is a sequencing read, where the red color indicates a cytosine, i.e., a methylated cytosine before bisulfite conversion, and blue a thymine, i.e., an unmethylated cytosine before bisulfite conversion. All cytosines within the amplicon are shown in the pattern map and the CpG and CpA dinucleotides are marked. The genotype at rs705379 per sequencing read is indicated on the right. Shown is the common methQTL at the PON1 locus at chr7:94,953,722–94,954,184 (hg19). B Average DNA methylation levels across all samples of the same genotype and standard deviations across the samples. The barplots are shown for all 22 CpGs present in the amplicon. C Average DNA methylation levels across all sequencing reads per sample for the three CpGs that were associated with the SNP genotype in the microarray data analysis for B cells and T cells
Fig. 5 Properties of methQTLs shared across the tissues and tissue-specific methQTLs. A Distance between the CpG and the SNP, the effect size (slope of the regression) of the methQTL, and the negative common logarithm of the methQTL P-value are visualized. MethQTLs were classified as either shared or tissue-specific. B Enrichment analysis of shared (top) or tissue-specific methQTLs (bottom) in different functional annotations of the genome. Shown is the common logarithm of the odds ratio; the associated P-value was computed with Fisher's exact test. P-values below 0.01 are indicated by a bold outline. C LOLA [44] enrichment analysis of the methQTL SNPs for the shared and tissue-specific methQTLs, respectively. ESC embryonic stem cell, AML acute myeloid leukemia
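The enrichment statistic described for Fig. 5B (common logarithm of the odds ratio with a Fisher exact test P-value) can be illustrated with a minimal Python sketch. This is not the authors' code: the function name `fisher_exact_two_sided`, the 2x2 table layout, and the toy counts are assumptions made for illustration only.

```python
from math import comb, log10


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact test for the 2x2 table [[a, b], [c, d]].

    Assumed layout (hypothetical): a = methQTLs inside an annotation,
    b = methQTLs outside it, c = background sites inside, d = background outside.
    Returns (sample odds ratio, P-value). The P-value sums the hypergeometric
    probabilities of all tables with the same margins that are no more
    likely than the observed table.
    """
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(a + b + c + d, col1)

    def prob(x):
        # Probability of observing x in the top-left cell with fixed margins
        return comb(row1, x) * comb(row2, col1 - x) / denom

    lowest, highest = max(0, col1 - row2), min(row1, col1)
    p_obs = prob(a)
    # Small relative tolerance guards against floating-point ties
    p_value = sum(
        p for p in (prob(x) for x in range(lowest, highest + 1))
        if p <= p_obs * (1 + 1e-9)
    )
    odds_ratio = (a * d) / (b * c) if b * c else float("inf")
    return odds_ratio, min(1.0, p_value)


if __name__ == "__main__":
    # Toy counts, not taken from the study
    odds, p = fisher_exact_two_sided(3, 1, 1, 3)
    print(f"log10(odds ratio) = {log10(odds):.3f}, P = {p:.4f}")
```

A positive log10 odds ratio would indicate enrichment of methQTLs in the annotation and a negative value depletion; the caption's bold outline corresponds to P-values below 0.01.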
Details on bisulfite amplicons screened in the study
| Gene locus | chr | SNP position | PCR primers (5′–3′)ᵃ | CpG ID | MethQTL distance (bp) | MethQTL deltaᵇ |
|---|---|---|---|---|---|---|
| PON1 | 7 | 94,953,895 (rs705379) | TCTTTCCCTACACGACGCTCTTCCGATCTgattggtggtttttgaagagtgttagtttt GTGACTGGAGTTCAGACGTGTGCTCTTCCGATCTccataatcaaactactaaatctctctaaaac | cg01874867, cg20119798 | 164, 249 | +14.9%, +39.7% |
| | 19 | 44,488,352 (rs62116613) | TCTTTCCCTACACGACGCTCTTCCGATCTggttgataggttagaatttataggttt GTGACTGGAGTTCAGACGTGTGCTCTTCCGATCTcacatacttaactcaaaccacctt | cg23456212, cg20451226 | 182, 171 | +7.8%, +13.3% |
| | 5 | 139,340,779 (rs6580323) | TCTTTCCCTACACGACGCTCTTCCGATCTtttatgaattttgaagaagttgttaggt GTGACTGGAGTTCAGACGTGTGCTCTTCCGATCTcacatacaaaactaaaacctaaatcc | cg22710094 | 85 | −5.1%, −20.1% |
ᵃ Capital letters are NGS-compatible tags
ᵇ Absolute methylation change of homozygote minor versus major individuals
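As a rough illustration of how the table's quantities relate to the read-level encoding described for Fig. 4 (cytosine = methylated, thymine = unmethylated after bisulfite conversion), the following Python sketch computes a per-CpG methylation level from reads, a methQTL delta between homozygous genotype groups, and a CpG-SNP distance. The function names, genotype labels, and toy values are hypothetical, and the study may have derived its deltas differently (for example, from microarray data).

```python
from statistics import mean


def methylation_level(reads, cpg_index):
    """Fraction of reads showing C (methylated) at one CpG position.

    Reads that are too short or show a base other than C/T at that
    position are ignored.
    """
    calls = [
        read[cpg_index]
        for read in reads
        if len(read) > cpg_index and read[cpg_index] in ("C", "T")
    ]
    if not calls:
        raise ValueError("no informative reads at this position")
    return calls.count("C") / len(calls)


def methqtl_delta(levels_by_genotype, major, minor):
    """Mean methylation of homozygous-minor samples minus homozygous-major
    samples, in percentage points. Which genotype is 'major' or 'minor'
    must be supplied by the caller (an assumption of this sketch)."""
    return 100 * (mean(levels_by_genotype[minor]) - mean(levels_by_genotype[major]))


def methqtl_distance(cpg_position, snp_position):
    """Absolute distance in base pairs between a CpG and a SNP."""
    return abs(cpg_position - snp_position)


if __name__ == "__main__":
    # Toy reads and per-sample levels, not taken from the study
    toy_reads = ["ACGT", "ATGT", "ACGT", "ANGT"]
    print(methylation_level(toy_reads, 1))
    toy_levels = {"hom_major": [0.20, 0.30], "hom_minor": [0.60, 0.70]}
    print(methqtl_delta(toy_levels, major="hom_major", minor="hom_minor"))
    print(methqtl_distance(1000, 1164))
```

Reporting the delta in percentage points matches the sign convention of the table, where a positive value means higher methylation in homozygous-minor individuals.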