Ting Wang, Weihua Guan, Jerome Lin, Nadia Boutaoui, Glorisa Canino, Jianhua Luo, Juan Carlos Celedón, Wei Chen.
Abstract
DNA methylation plays an important role in disease etiology. The Illumina Infinium HumanMethylation450 (450K) BeadChip is a widely used platform in large-scale epidemiologic studies. This platform can efficiently and simultaneously measure methylation levels at ∼480,000 CpG sites in the human genome in multiple study samples. Due to the intrinsic chip design of 2 types of chemistry probes, data normalization or preprocessing is a critical step to consider before data analysis. To date, numerous methods and pipelines have been developed for this purpose, and some studies have been conducted to evaluate different methods. However, validation studies have often been limited to a small number of CpG sites to reduce the variability in technical replicates. In this study, we measured methylation on a set of samples using both whole-genome bisulfite sequencing (WGBS) and 450K chips. We used WGBS data as a gold standard of true methylation states in cells to compare the performances of 8 normalization methods for 450K data on a genome-wide scale. Analyses on our dataset indicate that the most effective methods are peak-based correction (PBC) and quantile normalization plus β-mixture quantile normalization (QN.BMIQ). To our knowledge, this is the first study to systematically compare existing normalization methods for Illumina 450K data using novel WGBS data. Our results provide a benchmark reference for the analysis of DNA methylation chip data, particularly in white blood cells.
Keywords: DNA methylation; Illumina 450K; epigenetics; microarray; normalization; whole-genome bisulfite sequencing (WGBS)
Year: 2015 PMID: 26036609 PMCID: PMC4623491 DOI: 10.1080/15592294.2015.1057384
Source DB: PubMed Journal: Epigenetics ISSN: 1559-2294 Impact factor: 4.528
Figure 1. Comparison of different normalization methods for 450K data using WGBS data at a single CpG level. (A) β distributions of WGBS methylation data for CpGs that overlapped with those obtained using the 450K platform. (B) Numbers of overlapped CpGs between WGBS and 450K data when choosing different cutoffs of sequencing depth. Mean correlations between WGBS and each normalized 450K data for all overlapped 450K CpGs (C); for Type I probe CpGs (E); and for Type II probe CpGs (G). Mean absolute differences between WGBS data and each normalized 450K data for all overlapped 450K CpGs (D); for Type I probe CpGs (F); and for Type II probe CpGs (H).
Figure 2. Comparison of CpG methylation differences using different normalization methods for 450K data. (A) Cluster dendrogram of CpG methylation differences. Pearson correlation coefficient (PCC) between WGBS- and 450K-based methylation differences for sample.378 (B) and sample.911 (C). Overlaps between top differentially methylated CpGs between WGBS and each 450K-based data for sample.378 (D) and sample.911 (E).
Figure 3. Comparison of different normalization methods for 450K data using WGBS data on a CpG regional level. β distributions, numbers of overlapped CpGs, mean correlations, and mean absolute differences between WGBS and 450K data, for CpG Island (A), N_Shelf (B), N_Shore (C), S_Shelf (D), and S_Shore (E) regions.
Figure 4. Comparison of variability reduction in technical replicates after using different normalization methods on 450K data. Pearson correlation coefficients (A) and mean absolute differences (B) between duplicate samples.
Figure 5. Comparison of the reduction of probe type bias after different normalizations. Density distributions of 2 probe type data.
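The two agreement measures named in the Figure 1 caption, correlation and mean absolute difference between WGBS and normalized 450K values at overlapping CpGs above a sequencing-depth cutoff, can be illustrated with a minimal sketch. This is not the authors' code. The function names, the data layout, the toy values, and the computation of a WGBS β value as methylated reads divided by total reads are all assumptions made for illustration.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / sqrt(vx * vy)


def compare_to_wgbs(wgbs, chip, min_depth):
    """Compare chip beta values with WGBS at shared, well-covered CpGs.

    wgbs: dict mapping CpG id -> (methylated_reads, total_reads)  (assumed layout)
    chip: dict mapping CpG id -> normalized 450K beta value       (assumed layout)
    Returns (number of overlapping CpGs, Pearson r, mean absolute difference).
    """
    # Keep only CpGs present in both data sets with enough WGBS coverage.
    shared = [c for c in chip if c in wgbs and wgbs[c][1] >= min_depth]
    wgbs_beta = [wgbs[c][0] / wgbs[c][1] for c in shared]
    chip_beta = [chip[c] for c in shared]
    mean_abs_diff = sum(abs(a - b) for a, b in zip(wgbs_beta, chip_beta)) / len(shared)
    return len(shared), pearson(wgbs_beta, chip_beta), mean_abs_diff


# Hypothetical toy data: cg04 fails the depth cutoff, cg05 has no WGBS coverage.
wgbs = {"cg01": (18, 20), "cg02": (2, 20), "cg03": (10, 20), "cg04": (3, 4)}
chip = {"cg01": 0.85, "cg02": 0.15, "cg03": 0.55, "cg05": 0.40}

n_cpgs, r, mad = compare_to_wgbs(wgbs, chip, min_depth=10)
print(n_cpgs, round(r, 3), round(mad, 3))
```

Raising `min_depth` trades the number of overlapping CpGs against the reliability of each WGBS estimate, which is the trade-off panel (B) of Figure 1 describes.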
Normalization methods for Illumina 450K data
| Method | Description, year | R Package |
|---|---|---|
| QN | Quantile normalization, 2003 | lumi |
| PBC | Peak-based correction, 2011 | wateRmelon |
| SWAN | Subset-quantile within array normalization, 2012 | minfi |
| SQN | Subset quantile normalization, 2012 | wateRmelon |
| Dasen | Data-driven separate normalization, 2013 | wateRmelon |
| BMIQ | Beta-mixture quantile normalization, 2013 | wateRmelon |
| ASMN | All sample mean normalization, 2013 | asmn |
| QN.BMIQ | QN then BMIQ, 2013 | lumi + wateRmelon |
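To make the first row of the table concrete, the sketch below shows the generic idea behind quantile normalization: every sample is forced onto a shared reference distribution, built from the mean of the values at each rank. It is a simplified illustration and not the implementation in the lumi package. The function name and toy values are hypothetical, and ties are broken by probe order rather than averaged.

```python
def quantile_normalize(samples):
    """Quantile-normalize a list of samples (each a list of per-probe values).

    All samples must have the same number of probes. Returns new lists in
    which every sample shares the same empirical distribution.
    """
    n_probes = len(samples[0])
    sorted_cols = [sorted(s) for s in samples]
    # Reference distribution: mean of the k-th smallest value across samples.
    reference = [
        sum(col[k] for col in sorted_cols) / len(samples) for k in range(n_probes)
    ]
    normalized = []
    for s in samples:
        # Probe indices ordered from lowest to highest value in this sample.
        order = sorted(range(n_probes), key=lambda i: s[i])
        out = [0.0] * n_probes
        for rank, probe in enumerate(order):
            out[probe] = reference[rank]  # replace value by the reference at its rank
        normalized.append(out)
    return normalized


# Hypothetical beta values for two samples measured at three probes.
sample_a = [0.2, 0.8, 0.5]
sample_b = [0.4, 0.9, 0.1]
norm_a, norm_b = quantile_normalize([sample_a, sample_b])
print(norm_a, norm_b)
```

After normalization both samples contain the same set of values, and each probe keeps its within-sample rank.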