Rachel D Edgar, Meaghan J Jones, Wendy P Robinson, Michael S Kobor.
Abstract
BACKGROUND: Population-based epigenetic association studies of disease and exposures are becoming more common with the availability of economical genome-wide technologies for interrogation of the methylome, such as the Illumina 450K Human Methylation Array (450K). The small number of differentially methylated cytosine-guanine pairs (CpGs) typically expected in studies of the human methylome presents a statistical challenge, as the large number of CpGs measured on the 450K necessitates careful multiple test correction. While the 450K is a highly useful tool for population epigenetic studies, many of the CpGs tested are not variable and are thus of limited information content in the context of the study and tissue. CpGs with an observed lack of variability in the tissue under study could be removed to reduce the data dimensionality, limit the severity of multiple test correction and allow for improved detection of differential DNA methylation.
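The filtering idea described above can be illustrated with a minimal sketch in Python. It is not the authors' implementation: the variability rule used here (a range of at least 0.05 in beta value between the 10th and 90th percentiles), the function names and the toy data layout are all assumptions chosen for illustration, and Benjamini-Hochberg is used as one common FDR procedure. The point is only that testing fewer CpGs lowers the adjusted p values of those that remain.

```python
def percentile(sorted_vals, q):
    """Linear-interpolation percentile (q in [0, 100]) of a pre-sorted list."""
    if len(sorted_vals) == 1:
        return sorted_vals[0]
    pos = (len(sorted_vals) - 1) * q / 100.0
    lo = int(pos)
    hi = min(lo + 1, len(sorted_vals) - 1)
    return sorted_vals[lo] + (sorted_vals[hi] - sorted_vals[lo]) * (pos - lo)


def is_variable(betas, min_range=0.05, lower=10, upper=90):
    """True if the spread of beta values between two percentiles reaches
    min_range. The threshold and percentiles are illustrative assumptions."""
    s = sorted(betas)
    return percentile(s, upper) - percentile(s, lower) >= min_range


def benjamini_hochberg(pvals):
    """Benjamini-Hochberg FDR-adjusted p values, returned in input order."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvals[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def filter_then_adjust(beta_by_cpg, pval_by_cpg, **kwargs):
    """Drop non-variable CpGs, then FDR-correct only the CpGs that remain."""
    kept = [cpg for cpg, betas in beta_by_cpg.items()
            if is_variable(betas, **kwargs)]
    adjusted = benjamini_hochberg([pval_by_cpg[cpg] for cpg in kept])
    return dict(zip(kept, adjusted))
```

In this sketch the variability filter looks only at the methylation values, never at the phenotype or the raw p values, so the choice of which CpGs to test does not depend on the association being tested.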
Keywords: 450K; DNA methylation; Dimensionality reduction; Filter; Multiple-test correction; Non-variable; Power; Tissue
Year: 2017 PMID: 28184257 PMCID: PMC5290610 DOI: 10.1186/s13148-017-0320-z
Source DB: PubMed Journal: Clin Epigenetics ISSN: 1868-7075 Impact factor: 6.551
Fig. 1 Quality control of samples from GEO for each tissue type. a Heat maps showing sample-sample correlation values. Side colours show the study ID of each sample, and samples are ordered by study ID. b Plots of the average sample-sample correlation for each sample to show possible outliers and studies with overall low average sample-sample correlation
Fig. 2 Non-variable CpGs had similar characteristics in all tissues. a Venn diagram showing the overlap of non-variable CpGs between tissues. b Methylation levels of representative non-variable CpGs from each tissue
Fig. 3 Non-variable CpGs were enriched in CpG islands and promoters. All plots show the enrichment fold change of non-variable CpGs compared to all CpGs available for a tissue. Each pair of plots shows the fold changes in gene regions (top) and CpG resort features (bottom). a Blood non-variable CpGs. b Buccal epithelial cell non-variable CpGs. c Placenta non-variable CpGs
Fig. 4 Multiple test corrected p values were lower in the filtered EWAS. a Volcano plot of the differential methylation analysis between smoking and non-smoking samples, with no filtering of non-variable CpGs. Vertical lines indicate a DNAm difference of 0.1. The horizontal line represents an FDR corrected p value of 0.05. Points are coloured to highlight CpGs exceeding both the biological and statistical cutoffs. Points with a black outline are CpGs found to be non-variable in blood. b Multiple test corrected p values (FDR) for the two CpGs of interest in AHRR. Lines connect FDR values between paired permutation sub-samples to show the trend between paired cohorts. The horizontal line shows an FDR value of 0.05