Sarah Dedeurwaerder, Matthieu Defrance, Martin Bizet, Emilie Calonne, Gianluca Bontempi, François Fuks.
Abstract
Infinium HumanMethylation450 beadarray is a popular technology to explore DNA methylomes in health and disease, and there is a current explosion in the use of this technique. Despite experience acquired from gene expression microarrays, analyzing Infinium Methylation arrays appeared more complex than initially thought and several difficulties have been encountered, as those arrays display specific features that need to be taken into consideration during data processing. Here, we review several issues that have been highlighted by the scientific community, and we present an overview of the general data processing scheme and an evaluation of the different normalization methods available to date to guide the 450K users in their analysis and data interpretation.
Keywords: Epigenomics; Genome-wide DNA methylation technology
Year: 2013 PMID: 23990268 PMCID: PMC4239800 DOI: 10.1093/bib/bbt054
Source DB: PubMed Journal: Brief Bioinform ISSN: 1467-5463 Impact factor: 11.622
Figure 1:Overview of the general Infinium HumanMethylation450 data processing scheme with highlights on the different points to check during the processing to ensure an accurate analysis and interpretation. DMP, differentially methylated positions; DMR, differentially methylated regions.
Figure 2:CpGs with high average signal intensity display lower concordance with BPS data. Plot illustrating the difference between methylation values obtained from Infinium HumanMethylation450 and BPS as a function of the average signal intensity and the β-value. The absolute difference between the two techniques is proportional to the circle radius (blue: type I probes; red: type II probes). The plot was generated using 450K and matched BPS data from 22 tissues described in [9] (352 points). A colour version of this figure is available at BIB online: http://bib.oxfordjournals.org.
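The two quantities plotted in Figure 2 can be made concrete with a minimal sketch. It assumes the standard Illumina β-value definition, β = M / (M + U + 100), and takes "average signal intensity" to be the mean of the methylated and unmethylated signals. The latter is an assumption, since the caption does not define it, and the function names are illustrative only.

```python
def beta_value(meth, unmeth, offset=100.0):
    """Illumina-style beta-value: methylated signal over total signal plus an offset.

    Negative intensities (possible after background subtraction) are clipped to 0.
    """
    meth = max(meth, 0.0)
    unmeth = max(unmeth, 0.0)
    return meth / (meth + unmeth + offset)


def average_intensity(meth, unmeth):
    """Mean of the two channel signals (assumed definition of 'average signal intensity')."""
    return (meth + unmeth) / 2.0


# Example: one probe with a strong methylated signal.
probe_beta = beta_value(900.0, 0.0)
probe_intensity = average_intensity(900.0, 0.0)
```

The offset keeps β stable when both signals are close to zero, which is why low-intensity probes are pulled toward 0 rather than producing erratic ratios.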
Freely available packages/pipelines for Infinium 450K data preprocessing and analysis
| Package | Description | References |
|---|---|---|
| | R package. Pipeline function available: allows probe filtering (detection p-values, SNPs, …) and identification of differentially methylated sites. Proposed normalization methods: type I/II bias correction PBC and quantile normalization between arrays (pipeline options). | |
| | Bioconductor R package. No pipeline function. Proposed normalization methods: background corrections (with or without negative controls), color bias adjustment and between-array normalization methods (‘smooth quantile’ or ‘shift and scaling normalization’). | |
| | Bioconductor R package. No pipeline function. Proposed normalization methods: type I/II bias correction SWAN and dye bias equalization (originally proposed in the Genome Studio software). | |
| | Bioconductor R package. Pipeline function available: pipeline proposed by Touleimat and Tost. Other proposed normalization methods: a wide range of within-array normalization methods, including the type I/II bias corrections PBC, BMIQ and SWAN, and between-array normalization methods, such as Nasen. | |
| | Bioconductor R package. No pipeline function. Proposed normalization methods: dye bias equalization and several background correction methods, including the recently developed Noob method. | |
| | R package. Pipeline function available: allows probe filtering, quality control, estimation of batch effect and identification of differentially methylated sites. Proposed normalization methods: type I/II bias correction SWAN and dye bias equalization. | |
| | MATLAB package dedicated to the identification of DNA methylation markers. No pipeline function. Proposed normalization methods: type I/II bias correction PBC. | |
Figure 3:Comparison of the different within-array normalization methods using BPS data as referential data. Boxplots show the distribution of the absolute difference between DNA methylation measurements obtained from Infinium HumanMethylation450 and BPS, when Infinium data are subjected (white) or not (dark gray) to within-array normalization, for HCT116 and Roessler’s data sets. Blue, orange and red indicate Infinium type I/II bias correction methods, color bias adjustment and background correction methods, respectively. Raw: Infinium raw data; IMA-PBC: PBC from the IMA package; Minfi-SWAN: Subset quantile Within-Array Normalization from the minfi package; Tost-SQN(within): categorical SQN from Touleimat and Tost pipeline (this boxplot is highlighted in light gray to indicate that each sample has been normalized individually to apply only the within-array normalization component of this method); BMIQ: Beta-Mixture Quantile Normalization; Lumi-Smooth: color bias adjustment from the lumi package (smooth quantile normalization); MethyLumi-NMLS: dye bias equalization (normalizeMethyLumiSet) of the methylumi package (method originally proposed in the Genome Studio software); Lumi-lumiMethyB: background correction from the lumi package; MethyLumi-Noob: background correction based on normal exponential convolution model using out-of-band intensities as controls from the methylumi package; MethyLumi-Normexp: same as MethyLumi-Noob but controls used are negative probes present on the array (*On Roessler’s data set, this method was applied instead of the ‘Noob’ method because we do not have access to the IDAT files of these samples). A colour version of this figure is available at BIB online: http://bib.oxfordjournals.org.
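The evaluation criterion behind these boxplots, the absolute difference between an Infinium β-value and the matched BPS measurement, can be sketched as follows. This is an illustration under stated assumptions, not the authors' code: both inputs are taken to be on the same 0 to 1 scale and already matched CpG by CpG, and the helper names are hypothetical.

```python
import statistics


def absolute_differences(infinium_betas, bps_values):
    """Per-CpG absolute difference between array and BPS methylation values."""
    if len(infinium_betas) != len(bps_values):
        raise ValueError("inputs must contain the same matched CpGs")
    return [abs(a - b) for a, b in zip(infinium_betas, bps_values)]


def boxplot_summary(values):
    """Quartiles of a distribution, i.e. the box drawn for one normalization method."""
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return {"q1": q1, "median": median, "q3": q3}


# Toy example with three matched CpGs (values are made up for illustration).
diffs = absolute_differences([0.1, 0.5, 0.9], [0.2, 0.5, 0.7])
summary = boxplot_summary(diffs)
```

A normalization method would be judged to perform better when its box sits lower than the box for the raw data, meaning the normalized values are closer to BPS.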
Figure 4:Comparison of the different between-array normalization methods using BPS data as referential data. Boxplots show the distribution of the absolute difference between DNA methylation measurements obtained from Infinium HumanMethylation450 and BPS, when Infinium data are subjected (white) or not (dark gray) to between-array normalization, for HCT116 and Roessler’s data sets. Raw: Infinium raw data; Lumi-Smooth: Smooth quantile normalization on intensities from the lumi package; Lumi-SSN: Shift and Scaling Normalization on the intensities from the lumi package; IMA-QN: Quantile normalization on β-values from the IMA package; Tost-SQN: categorical SQN from Touleimat and Tost pipeline (this boxplot is highlighted in light gray to indicate that the normalization method comprises a within-array normalization component in addition to the between-array component); wateRmelon-Nasen: Nasen method from the wateRmelon package.
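To illustrate what a between-array method does, here is a minimal sketch of plain quantile normalization applied to β-values. It is a generic textbook version and is not the implementation of IMA or of any other package named in the caption. It also assumes that every array reports the same probes in the same order and that ties are rare enough to ignore.

```python
def quantile_normalize(arrays):
    """Force every array to share the same empirical distribution.

    arrays: list of equal-length lists of beta-values, one list per array.
    Ties are broken by probe order rather than averaged (a simplification).
    """
    n_probes = len(arrays[0])
    if any(len(a) != n_probes for a in arrays):
        raise ValueError("all arrays must contain the same probes")

    # Reference distribution: mean of the k-th smallest value across arrays.
    sorted_arrays = [sorted(a) for a in arrays]
    reference = [sum(column) / len(arrays) for column in zip(*sorted_arrays)]

    normalized = []
    for a in arrays:
        # Probe indices ordered from lowest to highest value in this array.
        order = sorted(range(n_probes), key=lambda i: a[i])
        out = [0.0] * n_probes
        for rank, probe_index in enumerate(order):
            out[probe_index] = reference[rank]
        normalized.append(out)
    return normalized


# Toy example: two arrays, three probes.
norm_a, norm_b = quantile_normalize([[0.5, 0.1, 0.9], [0.2, 0.6, 0.4]])
```

After this step each array keeps its own probe ranking but all arrays take exactly the same set of values, which is a strong assumption when samples differ globally in methylation.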
Figure 5:Comparison of the different between-array normalization methods using the variation between technical replicates as criterion. Boxplots show the distribution of the median of the absolute differences between DNA methylation measurements obtained with Infinium HumanMethylation450 from three replicates of HCT116 WT cells (left panel) or three replicates of HCT116 DKO cells (right panel), when data are subjected (white) or not subjected (dark gray) to between-array normalization. Raw: Infinium raw data; Lumi-Smooth: Smooth quantile normalization on intensities from the lumi package; Lumi-SSN: Shift and Scaling Normalization on the intensities from the lumi package; IMA-QN: Quantile normalization on β-values from the IMA package; Tost-SQN: categorical SQN from Touleimat and Tost pipeline (this boxplot is highlighted in light gray to indicate that the normalization method comprises a within-array normalization component in addition to the between-array component); wateRmelon-Nasen: Nasen method from the wateRmelon package. For clarity, the boxplots are drawn using whiskers that extend to the most extreme data point that is no more than 1.5 times the interquartile range from the box.
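The replicate-based criterion and the whisker rule described in this caption can be sketched as below. The per-probe statistic is interpreted here as the median of the three pairwise absolute differences among the replicates. That reading is an assumption, and the function names are illustrative.

```python
import statistics
from itertools import combinations


def replicate_variation(replicate_betas):
    """Median of the pairwise absolute differences between replicates for one probe."""
    pairwise = [abs(a - b) for a, b in combinations(replicate_betas, 2)]
    return statistics.median(pairwise)


def tukey_whiskers(values):
    """Whisker ends: most extreme points within 1.5 x IQR of the box."""
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low_limit, high_limit = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    lower = min(v for v in values if v >= low_limit)
    upper = max(v for v in values if v <= high_limit)
    return lower, upper


# Toy example: one probe measured in three technical replicates.
variation = replicate_variation([0.10, 0.12, 0.20])
whiskers = tukey_whiskers([1, 2, 3, 4, 100])
```

Under this criterion a lower distribution of per-probe values indicates that the normalization reduced technical variation between replicate arrays.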
Figure 6:Small differences of methylation can be observed by chance due to technical variations. Density plot of the Δβ (difference of methylation) between two technical replicates of HCT116 WT cells (in gray) and between one HCT116 WT sample and one HCT116 DKO sample (in purple). The dashed region (<−0.09) indicates the area where random differences are lower than biological differences. A colour version of this figure is available at BIB online: http://bib.oxfordjournals.org.
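As a rough sketch of the comparison in Figure 6, the snippet below computes Δβ between two samples and the fraction of probes falling below a cutoff. The −0.09 default is taken from the caption. The sign convention (second sample subtracted from the first) and the helper names are assumptions made for illustration.

```python
def delta_beta(sample_a, sample_b):
    """Per-probe difference in methylation between two samples (a minus b)."""
    if len(sample_a) != len(sample_b):
        raise ValueError("samples must contain the same probes")
    return [a - b for a, b in zip(sample_a, sample_b)]


def fraction_below(deltas, threshold=-0.09):
    """Share of probes whose delta-beta is more negative than the threshold."""
    return sum(1 for d in deltas if d < threshold) / len(deltas)


# Toy example: made-up delta-beta values between two technical replicates.
technical_deltas = [0.01, -0.02, -0.10, 0.03]
technical_fraction = fraction_below(technical_deltas)
```

Comparing this fraction between a replicate pair and a biological pair gives a simple sense of how often a difference of a given size arises from technical noise alone.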