Jerry Guintivano, Andrey A Shabalin, Robin F Chan, David R Rubinow, Patrick F Sullivan, Samantha Meltzer-Brody, Karolina A Aberg, Edwin J C G van den Oord.
Abstract
Recent years have seen a surge of methylome-wide association studies (MWAS). We observed that many of these studies suffer from test-statistic inflation, most likely because commonly used quality control (QC) pipelines do not go far enough in removing technical artefacts. To support this claim, we reanalysed GEO datasets with an improved QC pipeline, which reduced the test-statistic inflation parameter lambda from an original mean/median of 20.16/15.17 to 3.07/1.14. Furthermore, the mean/median number of methylome-wide significant findings fell by 65,688/57,805 loci after more thorough QC. To avoid such false positives, we argue for more extensive QC and for reporting lambda as standard in all MWAS, so that readers can better assess the risk of false discoveries.
Keywords: DNA methylation; epigenetics; reproducibility
Year: 2020 PMID: 32425094 PMCID: PMC7595582 DOI: 10.1080/15592294.2020.1758382
Source DB: PubMed Journal: Epigenetics ISSN: 1559-2294 Impact factor: 4.528
GEO Datasets included in systematic review
| GEO Accession | PMID | n | Platform | Tissue | Outcome | Initial Lambda | Recalculated Lambda | Permutation Median Lambda (95% CI) | Bonferroni Δ | Source of Initial Lambda |
|---|---|---|---|---|---|---|---|---|---|---|
| GSE56046 | 25404168 | 1202 | 450 k | Blood | age | 3.12 | - | - | - | MWAS of processed data |
| GSE87571 | 23826282 | 732 | 450 k | Blood | age | 17.68 | 7.69 | 0.97 (0.86–1.38) | 92,591 | Calculated from summary statistics |
| GSE72774 | 26655927 | 508 | 450 k | Blood (all samples) | age | 5.48 | - | - | - | MWAS of processed data |
| | | | | Blood (controls only) | age | 2.16 | - | - | - | |
| GSE72778 | 27479945 | 475 | 450 k | Brain (meta-analysis) | age | 7.30 | - | - | - | Reported in manuscript |
| | | | | Brain (frontal lobe) | age | 3.50 | - | - | - | |
| | | | | Brain (parietal lobe) | age | 3.00 | - | - | - | |
| | | | | Brain (occipital lobe) | age | 2.20 | - | - | - | |
| GSE55763 | 25853392 | 2711 | 450 k | Blood | Case-control | 1.00 | - | - | - | Reported in manuscript |
| GSE84727 | 27572077 | 847 | 450 k | Blood | Case-control | 1.53 | - | - | - | MWAS of processed data |
| GSE42861 | 23334450 | 689 | 450 k | Blood | Case-control | 15.17 | 1.14 | 1.00 (0.96–1.08) | 57,805 | MWAS of processed data |
| GSE74193 | 26619358 | 675 | 450 k | Brain | Case-control | 64.65 | 4.37 | 0.98 (0.78–1.35) | 178,241 | Calculated from summary statistics |
| GSE80417 | 27572077 | 675 | 450 k | Blood | Case-control | 1.32 | - | - | - | MWAS of processed data |
| GSE111629 | 28851441 | 572 | 450 k | Blood | Case-control | 1.77 | 1.01 | 1.00 (0.93–1.11) | 7 | MWAS of processed data |
| GSE87648 | 27886173 | 384 | 450 k | Blood | Case-control | 1.54 | 1.13 | 1.01 (0.82–1.35) | −202 | MWAS of processed data |
| GSE100264 | 29269866 | 386 | 450 k | Blood | Ordinal (drug use) | 1.08 | - | - | - | Reported in manuscript |
Figure 1. Effects of covariate inclusion on test-statistic inflation. Quantile-quantile (QQ) plots for the outcomes postpartum depression case status (a) and age (b). Each colour denotes a different level of quality control used in association testing: no covariates (red), a commonly used pipeline (green), and our proposed iterative quality control process (blue).
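
The lambda values reported in the abstract and table summarise how far the observed test statistics depart from their null distribution. The record itself does not give the formula, so the following is only a minimal sketch of the conventional definition: the median observed 1-df chi-square statistic divided by its expected median under the null (about 0.455). It assumes two-sided p-values from a 1-df test. The function names `inflation_lambda` and `bonferroni_hits` are hypothetical, and this is not the authors' QC pipeline.

```python
from statistics import NormalDist, median

_STD_NORMAL = NormalDist()
# Median of a 1-df chi-square distribution under the null (about 0.4549).
_EXPECTED_MEDIAN_CHI2 = _STD_NORMAL.inv_cdf(0.75) ** 2


def inflation_lambda(p_values):
    """Median observed 1-df chi-square statistic over its null median.

    Assumes two-sided p-values in (0, 1]; a value near 1 suggests no inflation.
    """
    chi2 = []
    for p in p_values:
        if not 0.0 < p <= 1.0:
            raise ValueError("p-values must lie in (0, 1]")
        # Convert a two-sided p-value back to a z-score, then square it.
        z = _STD_NORMAL.inv_cdf(p / 2.0)
        chi2.append(z * z)
    return median(chi2) / _EXPECTED_MEDIAN_CHI2


def bonferroni_hits(p_values, alpha=0.05):
    """Count tests passing a Bonferroni threshold of alpha / number of tests."""
    p_values = list(p_values)
    threshold = alpha / len(p_values)
    return sum(1 for p in p_values if p < threshold)


if __name__ == "__main__":
    n = 1001
    # Evenly spaced p-values mimic a well-behaved null distribution.
    null_p = [(i + 0.5) / n for i in range(n)]
    # Squaring pushes p-values toward zero, mimicking inflation.
    inflated_p = [p ** 2 for p in null_p]
    print(inflation_lambda(null_p))
    print(inflation_lambda(inflated_p))
    print(bonferroni_hits(inflated_p))
```

Running both functions on the p-values from an original and a re-processed analysis would give before-and-after values comparable in spirit to the Initial Lambda, Recalculated Lambda, and Bonferroni Δ columns, although the exact significance threshold used in the table is not stated in this record.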