Shuying Sun, Yi-Wen Huang, Pearlly S Yan, Tim Hm Huang, Shili Lin.
Abstract
BACKGROUND: DNA methylation plays an important role in silencing tumor suppressor genes in various tumor types. To gain a genome-wide understanding of how changes in methylation affect tumor growth, the differential methylation hybridization (DMH) protocol was developed, and large amounts of DMH microarray data have been generated. However, it remains unclear how to preprocess this type of microarray data, and how the background correction and normalization methods used for two-color gene expression arrays perform on methylation microarray data. In this paper, we report a set of internal control probes whose log ratios (M) are theoretically equal to zero under the DMH protocol. With the aid of these control probes, we propose two LOESS (or LOWESS, locally weighted scatter-plot smoothing) normalization methods that are novel and specific to DMH microarray data. Together with global LOESS and no normalization, this gives four normalization methods, which we compare. In addition, we compare five background correction methods.
Year: 2011 PMID: 21575229 PMCID: PMC3118966 DOI: 10.1186/1756-0381-4-13
Source DB: PubMed Journal: BioData Min ISSN: 1756-0381 Impact factor: 2.522
Examples of three types of DNA fragments.
| Before MSRE digestion | After MSRE digestion | Probe signals |
|---|---|---|
| 1) No MSRE cutting sites: ATCGTCCAGCCGATTTAAACCCGTATCGTA | Not cut; retained for hybridization | Contributes to the final probe signals |
| 2) All MSRE cutting sites are methylated: AT | Not cut; retained for hybridization | Contributes to the final probe signals |
| 3) At least one MSRE cutting site is not methylated: AT | Cut; not hybridized onto the array | Does not contribute to the final probe signals |
The first column gives examples of three types of DNA fragments before digestion by methylation-sensitive restriction enzymes (MSREs). In this column, all MSRE cutting sites are underlined. "CmG" denotes a methylated CG site; a plain "CG" denotes a CG site that has not been methylated. The second column gives the outcome of MSRE digestion. The third column states whether a fragment of that type contributes to the final signal of the probe it covers.
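The rule in the table can be sketched in a few lines of Python: a fragment contributes to the probe signal only if every MSRE cutting site it contains is methylated, and a fragment with no cutting sites is never cut. This is a minimal illustration, not the authors' code. The recognition sequences used here (`CCGG` and `GCGC`) are assumptions chosen for the example, since this excerpt does not name the enzymes; the function names are hypothetical, and the `CmG` notation follows the table note.

```python
import re

# Assumed recognition sites, for illustration only (the excerpt does not
# list the enzymes). "m?" allows the optional methylation mark used in
# the "CmG" notation.
MSRE_PATTERNS = [re.compile(r"CCm?GG"), re.compile(r"GCm?GC")]


def cutting_sites(fragment):
    """Return one boolean per MSRE cutting site: True if it is methylated."""
    sites = []
    for pattern in MSRE_PATTERNS:
        for match in pattern.finditer(fragment):
            sites.append("m" in match.group())
    return sites


def contributes_to_signal(fragment):
    """A fragment survives digestion only if every cutting site is methylated.

    A fragment with no cutting sites is never cut, which matches all([]) == True.
    """
    return all(cutting_sites(fragment))


# Type 1: the sequence from the table has no assumed cutting sites.
print(contributes_to_signal("ATCGTCCAGCCGATTTAAACCCGTATCGTA"))  # True
# Type 2 (hypothetical sequence): both sites methylated.
print(contributes_to_signal("ATCCmGGTTGCmGCAT"))  # True
# Type 3 (hypothetical sequence): second site unmethylated, so it is cut.
print(contributes_to_signal("ATCCmGGTTGCGCAT"))  # False
```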
Figure 1. MA plot of one array with three LOESS curves. The blue line is the LOESS curve based on all biological probes. The red dots are the 199 internal control probes. The red line is the LOESS curve fitted using only these internal control probes. The cyan line is the weighted (i.e., composite) LOESS curve based on both the biological probes and the 199 internal control probes.
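The control-probe curve in Figure 1 suggests a simple normalization: because the control probes should have M = 0, a LOESS curve fitted to them estimates the intensity-dependent bias, which can then be subtracted from every probe. The sketch below illustrates that idea in plain Python under stated assumptions. It uses a basic local linear fit with tricube weights rather than any particular package's LOESS; the function names and the `span` default are hypothetical; and the composite (weighted) curve is not implemented because its weighting scheme is not given in this excerpt.

```python
import math


def ma_values(red, green):
    """Return (M, A) lists from red and green channel intensities."""
    m = [math.log2(r) - math.log2(g) for r, g in zip(red, green)]
    a = [(math.log2(r) + math.log2(g)) / 2 for r, g in zip(red, green)]
    return m, a


def loess_at(x0, xs, ys, span=0.75):
    """Local linear fit at x0 using tricube weights on the nearest points."""
    n = len(xs)
    k = min(n, max(2, math.ceil(span * n)))
    # Bandwidth: distance from x0 to its k-th nearest neighbour.
    h = sorted(abs(x - x0) for x in xs)[k - 1] or 1e-12
    w = [(1 - min(abs(x - x0) / h, 1.0) ** 3) ** 3 for x in xs]
    sw = sum(w)
    if sw == 0:  # degenerate neighbourhood: fall back to the plain mean
        return sum(ys) / n
    xbar = sum(wi * x for wi, x in zip(w, xs)) / sw
    ybar = sum(wi * y for wi, y in zip(w, ys)) / sw
    sxx = sum(wi * (x - xbar) ** 2 for wi, x in zip(w, xs))
    sxy = sum(wi * (x - xbar) * (y - ybar) for wi, x, y in zip(w, xs, ys))
    slope = sxy / sxx if sxx > 1e-12 else 0.0
    return ybar + slope * (x0 - xbar)


def normalize_with_controls(m, a, control_idx, span=0.75):
    """Subtract the control-probe LOESS curve from the M value of every probe."""
    ctrl_a = [a[i] for i in control_idx]
    ctrl_m = [m[i] for i in control_idx]
    return [mi - loess_at(ai, ctrl_a, ctrl_m, span) for mi, ai in zip(m, a)]
```

After this step the control probes should sit near M = 0 across the intensity range, while the biological probes keep whatever deviation they had from the control curve. Fitting on controls alone can be unstable where few controls cover a given intensity range, which is one plausible motivation for the composite curve shown in the figure.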
Breast cancer T.stat measurement table
Each row is for one p-value cutoff point. Within each row, the first column contains a p-value cutoff point; the second column contains a sub-table of 20 numbers corresponding to the T.stat measurement results of 20 preprocessing methods. The underlined bold numbers are the top 3 largest numbers in each sub-table of the second column.
Breast cancer AUC measurement table
Each row is for one p-value cutoff point. Within each row, the first column contains a p-value cutoff point; the second column contains a sub-table of 20 numbers corresponding to the AUC measurement results of 20 preprocessing methods. The underlined bold numbers are the top 3 largest numbers in each sub-table of the second column.
Ovarian cancer T.stat measurement table
Each row is for one p-value cutoff point. Within each row, the first column contains a p-value cutoff point; the second column contains a sub-table of 20 numbers corresponding to the T.stat measurement results of 20 preprocessing methods. The underlined bold numbers are the top 3 largest numbers in each sub-table of the second column.
Ovarian cancer AUC measurement table
Each row is for one p-value cutoff point. Within each row, the first column contains a p-value cutoff point; the second column contains a sub-table of 20 numbers corresponding to the AUC measurement results of 20 preprocessing methods. The underlined bold numbers are the top 3 largest numbers in each sub-table of the second column.
Figure 2. Breast cancer mean differences of different normalization and background correction methods. The two plots in the top panel are the results of comparing four normalization methods using two statistical measurements. The two plots in the bottom panel are the results of comparing five background correction methods using two statistical measurements.
Figure 3. Ovarian cancer mean differences of different normalization and background correction methods. The two plots in the top panel are the results of comparing four normalization methods using two statistical measurements. The two plots in the bottom panel are the results of comparing five background correction methods using two statistical measurements.