Hailong Meng, Andrew R Joyce, Daniel E Adkins, Priyadarshi Basu, Yankai Jia, Guoya Li, Tapas K Sengupta, Barbara K Zedler, E Lenn Murrelle, Edwin J C G van den Oord.
Abstract
BACKGROUND: High-throughput DNA methylation arrays are likely to accelerate the pace of methylation biomarker discovery for a wide variety of diseases. A potential problem with a standard set of probes measuring the methylation status of CpG sites across the whole genome is that many sites may not show inter-individual methylation variation among the biosamples for the disease outcome being studied. Inclusion of these so-called "non-variable sites" will increase the risk of false discoveries and reduce statistical power to detect biologically relevant methylation markers.
Year: 2010 PMID: 20441598 PMCID: PMC2876131 DOI: 10.1186/1471-2105-11-227
Source DB: PubMed Journal: BMC Bioinformatics ISSN: 1471-2105 Impact factor: 3.169
Figure 1. The distribution of probe correlations. The distribution of probe-level correlations across technical replicates for each probe is shown. Pearson correlation coefficients were calculated for the 1505 CpG probes using 126 replicate biosamples distributed across five methylation matrices.
Figure 2. The distribution of sample correlations. The distribution of sample correlations across technical replicates for each probe is shown. Pearson correlation coefficients were calculated for the 1505 CpG probes using 126 replicate biosamples distributed across five methylation matrices.
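As a rough illustration of the probe-level correlations summarized in the captions above, the following sketch computes a Pearson coefficient per probe between two technical replicate runs. The data layout (a dict mapping a probe ID to one methylation value per replicated biosample), the function names, and the toy values are assumptions made for this example; they are not taken from the study.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")
    mx = sum(x) / n
    my = sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return float("nan")  # correlation is undefined for a constant series
    return sxy / sqrt(sxx * syy)

def probe_correlations(rep1, rep2):
    """Correlate each probe's values across two replicate runs.

    rep1, rep2: dict of probe ID -> list of methylation values, one per
    replicated biosample, in the same biosample order (hypothetical layout).
    """
    return {probe: pearson(rep1[probe], rep2[probe])
            for probe in rep1 if probe in rep2}

# Toy data (invented): one probe that varies between individuals and one
# probe whose values only differ by measurement noise.
rep1 = {"cg_variable": [0.10, 0.50, 0.90, 0.30],
        "cg_flat":     [0.02, 0.03, 0.02, 0.03]}
rep2 = {"cg_variable": [0.12, 0.48, 0.88, 0.33],
        "cg_flat":     [0.03, 0.03, 0.02, 0.02]}

correlations = probe_correlations(rep1, rep2)
```

In this toy case the variable probe reproduces well across replicates, while the flat probe's replicate correlation is close to zero, which is the pattern that makes replicate correlation usable as a screen for non-variable sites.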
Figure 3. (a) Posterior probability distribution from mixture model. The posterior probability distribution, indicating the likelihood of a probe belonging to the subset of highly correlated, informative probes. (b) The number of probes selected at different posterior probability thresholds. The number of probes (y-axis) that will remain at different posterior probability thresholds (x-axis) calculated from the two-class mixture model.
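The idea behind the two-class mixture model in the caption above can be sketched as follows. This example assumes both classes are Gaussian and fits them with a plain EM loop; the record does not state the component distributions or the fitting procedure used in the study, so the function names, starting values, and toy correlations here are all illustrative assumptions.

```python
from math import exp, pi, sqrt

def normal_pdf(x, mu, sd):
    """Density of a normal distribution with mean mu and standard deviation sd."""
    return exp(-0.5 * ((x - mu) / sd) ** 2) / (sd * sqrt(2 * pi))

def high_class_posteriors(values, n_iter=100):
    """Posterior probability that each value belongs to the higher-mean class.

    Assumption: a two-component Gaussian mixture fitted by EM, initialized
    from the lower and upper halves of the sorted values.
    """
    if len(values) < 4:
        raise ValueError("need at least 4 values")
    xs = sorted(values)
    half = len(xs) // 2
    mu = [sum(xs[:half]) / half, sum(xs[half:]) / (len(xs) - half)]
    sd = [0.1, 0.1]
    w = [0.5, 0.5]

    def posterior(x):
        lo = w[0] * normal_pdf(x, mu[0], sd[0])
        hi = w[1] * normal_pdf(x, mu[1], sd[1])
        return hi / (lo + hi) if lo + hi > 0 else 0.5

    for _ in range(n_iter):
        # E-step: responsibility of each class for each value
        post = [posterior(x) for x in values]
        resp = [[1 - p for p in post], post]
        # M-step: update weight, mean, and standard deviation of each class
        for k in (0, 1):
            total = sum(resp[k])
            if total < 1e-12:
                continue
            w[k] = total / len(values)
            mu[k] = sum(r * x for r, x in zip(resp[k], values)) / total
            var = sum(r * (x - mu[k]) ** 2 for r, x in zip(resp[k], values)) / total
            sd[k] = max(sqrt(var), 1e-3)  # floor avoids a collapsing component

    post = [posterior(x) for x in values]
    if mu[1] < mu[0]:  # keep "class 1" meaning the higher-correlation class
        post = [1 - p for p in post]
    return post

def probes_kept(posteriors, threshold):
    """Number of probes whose posterior probability meets the threshold."""
    return sum(1 for p in posteriors if p >= threshold)

# Toy replicate correlations (invented): six noisy probes, six reproducible ones.
toy_correlations = [0.02, -0.05, 0.10, 0.05, -0.02, 0.08,
                    0.85, 0.90, 0.92, 0.88, 0.95, 0.80]
posteriors = high_class_posteriors(toy_correlations)
kept_by_threshold = {t: probes_kept(posteriors, t) for t in (0.1, 0.5, 0.9)}
```

Raising the threshold can only keep the same number of probes or fewer, which is the trade-off panel (b) displays.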
p0 estimates using test results from regression analyses
| Outcome | Before probe selection | After probe selection |
|---|---|---|
| Age decline | 0.9996 | 0.9781 |
| Pack-years decline | 0.9992 | 0.9986 |
| CPD × Age decline | 0.9970 | 0.9715 |
| Baseline lung function | 1.0009 | 0.9904 |
CPD, cigarettes per day.
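For readers unfamiliar with p0, the proportion of tests for which the null hypothesis is true, the sketch below shows one common way to estimate it from a list of p-values. It uses Storey's lambda-based estimator as a stand-in; this record does not say which estimator produced the table values, and the function name, the choice of lambda, and the toy p-values are assumptions for illustration only.

```python
def estimate_p0(pvalues, lam=0.5):
    """Storey-style estimate of the proportion of true null hypotheses.

    P-values from true nulls are uniform on [0, 1], so the share of p-values
    above lam, rescaled by 1 / (1 - lam), approximates p0. The estimate is
    not truncated at 1 and can slightly exceed it by chance.
    """
    if not pvalues:
        raise ValueError("need at least one p-value")
    if not 0 <= lam < 1:
        raise ValueError("lam must be in [0, 1)")
    m = len(pvalues)
    n_above = sum(1 for p in pvalues if p > lam)
    return n_above / (m * (1 - lam))

# Toy p-values (invented): an evenly spread "all null" set, and a set in
# which 10 of 100 tests carry very small p-values.
all_null = [(i + 0.5) / 100 for i in range(100)]
with_signal = [0.0001] * 10 + [(i + 0.5) / 90 for i in range(90)]

p0_all_null = estimate_p0(all_null)
p0_with_signal = estimate_p0(with_signal)
```

A lower estimate indicates that a larger share of the tested probes is expected to carry real effects.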
Figure 4. pFDR analyses. (A) The number of significant probes detected at different q-value thresholds from the regression analyses between DNA methylation changes and four outcome measures of lung function or decline prior to probe selection described herein. (B) The number of significant probes detected at different q-value thresholds after probe selection. A greater number of significant probes was identified for a given q-value cutoff for Age decline, CPD × Age decline and Baseline lung function outcomes after probe selection.
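Counts like those plotted in the panels above can be sketched by converting p-values to q-values and counting how many fall at or below each cutoff. The example below uses the standard step-up q-value calculation with a supplied p0; the function names and toy p-values are assumptions for illustration, and the study's exact procedure may differ.

```python
def q_values(pvalues, p0=1.0):
    """q-value for each p-value, given an estimate p0 of the null proportion.

    For the p-value of rank r among m tests, the estimated FDR is
    p0 * m * p / r; the q-value is the smallest such estimate over all
    ranks at or above r, which keeps q-values monotone in p.
    """
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    q = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):  # walk from the largest p-value down
        i = order[rank - 1]
        running_min = min(running_min, p0 * m * pvalues[i] / rank)
        q[i] = running_min
    return q

def n_significant(qvals, threshold):
    """Number of tests with a q-value at or below the threshold."""
    return sum(1 for q in qvals if q <= threshold)

# Toy p-values (invented), evaluated with two different p0 estimates.
toy_pvalues = [0.001, 0.01, 0.02, 0.5, 0.9]
q_conservative = q_values(toy_pvalues, p0=1.0)
q_smaller_p0 = q_values(toy_pvalues, p0=0.5)
hits_at_005 = n_significant(q_conservative, 0.05)
```

Because each q-value scales with p0, a smaller p0 estimate lowers every q-value and can only increase the number of probes passing a given cutoff.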