Lisa M McEwen, Meaghan J Jones, David Tse Shen Lin, Rachel D Edgar, Lucas T Husquin, Julia L MacIsaac, Katia E Ramadori, Alexander M Morin, Christopher F Rider, Chris Carlsten, Lluís Quintana-Murci, Steve Horvath, Michael S Kobor.
Abstract
BACKGROUND: The capacity of technologies measuring DNA methylation (DNAm) is rapidly evolving, as are the options for applicable bioinformatics methods. The most commonly used DNAm microarray, the Illumina Infinium HumanMethylation450 (450K array), has recently been replaced by the Illumina Infinium HumanMethylationEPIC (EPIC array), nearly doubling the number of targeted CpG sites. Given that a subset of 450K CpG sites is absent on the EPIC array and that several tools for both data normalization and analyses were developed on the 450K array, it is important to assess their utility when applied to EPIC array data. One of the most commonly used 450K tools is the pan-tissue epigenetic clock, a multivariate predictor of biological age based on DNAm at 353 CpG sites. Of these CpGs, 19 are missing from the EPIC array, thus raising the question of whether EPIC data can be used to accurately estimate DNAm age. We also investigated a 71-CpG epigenetic age predictor, referred to as the Hannum method, which lacks 6 probes on the EPIC array. To evaluate these epigenetic clocks in EPIC data properly, a prior assessment of the effects of data preprocessing methods on DNAm age is also required.
Keywords: 450K; DNA methylation; DNA methylation age; EPIC; Epigenetic age; Epigenetic clock; Human; Microarray; Preprocessing
Year: 2018 PMID: 30326963 PMCID: PMC6192219 DOI: 10.1186/s13148-018-0556-2
Source DB: PubMed Journal: Clin Epigenetics ISSN: 1868-7075 Impact factor: 6.551
Fig. 1 DNA methylation age comparison between 450K and EPIC monocyte data across preprocessing methods. Identical samples were assayed on both the 450K and EPIC arrays, and each dataset was then preprocessed in one of four ways prior to calculating DNA methylation (DNAm) age: raw unprocessed, GenomeStudio color correction/background subtraction (GS), normal exponential out-of-band (noob) normalization, or quantile normalization. Each solid colored line is the regression line for the corresponding group. For each regression, the Pearson’s correlation coefficient, error (median absolute error between EPIC DNAm age and 450K DNAm age), R² value, and p value corresponding to the correlation coefficient are shown
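The agreement statistics named in this caption can be sketched as follows. This is a minimal illustration, assuming paired DNAm age estimates for the same samples on the two arrays; the function names and the example values are hypothetical and are not taken from the study.

```python
from math import sqrt
from statistics import mean, median


def median_absolute_error(ages_a, ages_b):
    """Median of the per-sample absolute differences between two age estimates."""
    return median([abs(a - b) for a, b in zip(ages_a, ages_b)])


def pearson_r(xs, ys):
    """Pearson's correlation coefficient between two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


# Made-up illustrative values (years), not data from the study.
age_450k = [30.0, 40.0, 50.0, 60.0]
age_epic = [31.0, 42.0, 49.0, 63.0]

error = median_absolute_error(age_epic, age_450k)
r = pearson_r(age_450k, age_epic)
r_squared = r ** 2  # for a simple linear regression, R² equals r squared
```

For a regression with a single predictor, R² is the square of Pearson's r, so the two values in the caption carry the same information about fit.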
Fig. 2 EPIC DNA methylation age estimated in control samples from the Diesel Exhaust III Study across three tissues. DNA methylation (DNAm) age was estimated using the EPIC 334-CpG model from GenomeStudio background-subtracted and color-channel-adjusted EPIC data. The linear regression line with 95% confidence intervals is shown in gray. Error is the median absolute difference between EPIC DNAm age and chronological age. Pearson’s correlation coefficients (r) and corresponding p values are shown for each tissue. BAL = bronchoalveolar lavage, PBMC = peripheral blood mononuclear cells, brush = bronchial brushing
Fig. 3 DNA methylation age acceleration variation across preprocessing methods. a Scatter plot of EPIC DNA methylation (DNAm) age calculated from raw data and from data processed with three different preprocessing methods: quantile, GenomeStudio (GS), and normal exponential out-of-band (noob) normalization. Colored regression lines and surrounding shaded gray areas represent the 95% confidence interval for each group. b Boxplot of estimated DNAm age minus chronological age (acceleration difference) for each preprocessing method. c Boxplot of residuals from a linear regression (DNAm age ~ chronological age) across methods. The median is illustrated by a horizontal line, with upper and lower hinges representing the 75th and 25th percentiles; upper and lower whiskers extend no further than 1.5 times the inter-quartile range. Colored data points represent individual samples for each group
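The two age acceleration measures described in panels b and c can be sketched as below. This is a minimal illustration under the assumption of an ordinary least-squares fit of DNAm age on chronological age; the function names and the example values are hypothetical and are not taken from the study.

```python
from statistics import mean


def acceleration_difference(dnam_age, chron_age):
    """Panel b measure: DNAm age minus chronological age, per sample."""
    return [d - c for d, c in zip(dnam_age, chron_age)]


def acceleration_residual(dnam_age, chron_age):
    """Panel c measure: residuals of an OLS fit, DNAm age ~ chronological age."""
    mx, my = mean(chron_age), mean(dnam_age)
    sxx = sum((x - mx) ** 2 for x in chron_age)
    sxy = sum((x - mx) * (y - my) for x, y in zip(chron_age, dnam_age))
    slope = sxy / sxx
    intercept = my - slope * mx
    return [y - (intercept + slope * x) for x, y in zip(chron_age, dnam_age)]


# Made-up illustrative values (years), not data from the study.
chron = [25.0, 32.0, 41.0, 50.0]
dnam = [27.5, 30.0, 45.0, 49.0]

diff = acceleration_difference(dnam, chron)
resid = acceleration_residual(dnam, chron)
```

The difference measure keeps any systematic offset between predicted and chronological age, whereas the residuals from a fit with an intercept sum to zero within the fitted group, so they describe only each sample's deviation from that group's trend.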
Fig. 4 Absolute difference between technical replicate pairs for each processing method. The y axis represents the absolute difference in DNA methylation (DNAm) age within each technical replicate pair. Each processing method is represented on the x axis. The median difference for each group is indicated by the red cross. Each color represents one technical replicate pair for ease of interpretation across methods. Error refers to the median absolute difference between DNAm age and chronological age