Jack W O'Sullivan, John P A Ioannidis.
Abstract
With the establishment of large biobanks, discovery of single nucleotide variants (SNVs, also known as single nucleotide polymorphisms (SNPs)) associated with various phenotypes has accelerated. An open question is whether genome-wide significant SNVs identified in earlier genome-wide association studies (GWAS) are replicated in later GWAS conducted in biobanks. To address this, we examined a publicly available GWAS database and identified two independent GWAS on the same phenotype (an earlier "discovery" GWAS and a later "replication" GWAS done in the UK Biobank). The analysis evaluated 136,318,924 SNVs (of which 6289 reached P < 5e-8 in the discovery GWAS) from 4,397,962 participants across nine phenotypes. The overall replication rate was 85.0%, although it was lower for binary than for quantitative phenotypes (58.1% versus 94.8%, respectively). There was an 18.0% decrease in SNV effect size for binary phenotypes, but a 12.0% increase for quantitative phenotypes. Using the discovery SNV effect size, phenotype trait (binary or quantitative), and discovery P value, we built and validated a model that predicted SNV replication with an area under the receiver operating characteristic curve of 0.90. While non-replication may reflect lack of power rather than genuine false positives, these results provide insights about which discovered associations are likely to be replicated across subsequent GWAS.
Year: 2021 PMID: 34545148 PMCID: PMC8452698 DOI: 10.1038/s41598-021-97896-y
Source DB: PubMed Journal: Sci Rep ISSN: 2045-2322 Impact factor: 4.379
Replication across phenotypes.
| Phenotype | Total sample size | Number of genome-wide significant SNVs | Number of replicated SNVs (%) |
|---|---|---|---|
| Asthma | 225,309 | 889 | 494 (56%) |
| Systolic blood pressure (SBP) | 430,797 | 110 | 107 (97%) |
| Eczema | 330,142 | 640 | 337 (53%) |
| BMI | 613,900 | 1835 | 1756 (96%) |
| Waist circumference | 618,033 | 937 | 827 (89%) |
| Hip circumference | 598,925 | 1083 | 1043 (96%) |
| Coronary artery disease/IHD | 387,786 | 159 | 149 (94%) |
| Resting heart rate/pulse rate | 447,198 | 549 | 547 (99%) |
| Diastolic blood pressure (DBP) | 430,806 | 87 | 83 (95%) |
Total sample size is the sample size of the discovery and replication GWAS collectively.
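The pooled replication rates quoted in the abstract can be cross-checked against the counts in the table above. The following is a minimal sketch, not the authors' analysis code: it assumes that asthma, eczema, and coronary artery disease/IHD are the binary phenotypes and that the remaining six are quantitative (the table itself does not label them), and the names `TABLE` and `pooled_rate` are illustrative only.

```python
# Counts copied from the "Replication across phenotypes" table:
# phenotype -> (genome-wide significant SNVs, replicated SNVs, trait type).
# ASSUMPTION: the binary/quantitative labels are not given in the table.
TABLE = {
    "Asthma": (889, 494, "binary"),
    "SBP": (110, 107, "quantitative"),
    "Eczema": (640, 337, "binary"),
    "BMI": (1835, 1756, "quantitative"),
    "Waist circumference": (937, 827, "quantitative"),
    "Hip circumference": (1083, 1043, "quantitative"),
    "Coronary artery disease/IHD": (159, 149, "binary"),
    "Resting heart rate/pulse rate": (549, 547, "quantitative"),
    "DBP": (87, 83, "quantitative"),
}


def pooled_rate(rows):
    """Return replicated / significant, pooled over the given rows, in percent."""
    significant = sum(n_sig for n_sig, _, _ in rows)
    replicated = sum(n_rep for _, n_rep, _ in rows)
    return 100.0 * replicated / significant


rows = list(TABLE.values())
total_significant = sum(n_sig for n_sig, _, _ in rows)
overall = pooled_rate(rows)
binary = pooled_rate([row for row in rows if row[2] == "binary"])
quantitative = pooled_rate([row for row in rows if row[2] == "quantitative"])

print(f"significant SNVs: {total_significant}")
print(f"overall:          {overall:.1f}%")
print(f"binary:           {binary:.1f}%")
print(f"quantitative:     {quantitative:.1f}%")
```

Under that assumed grouping, simple pooling of the table counts gives 6289 significant SNVs and rates of 85.0% overall, 58.1% for binary, and 94.8% for quantitative phenotypes, in line with the figures reported in the abstract.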
Figure 1. Replication of SNVs across traits.
Figure 2. Replication of SNVs across P values.
Figure 3. Replication of SNVs across odds ratios.
Replication across P values and odds ratios.
| Metric | Category | Replication rate (95% CI) |
|---|---|---|
| P value | 5e−8 to > 5e−9 | 72% (69% to 74%) |
| P value | 5e−9 to > 5e−10 | 78% (75% to 80%) |
| P value | 5e−10 to > 5e−11 | 81% (77% to 83%) |
| P value | < 5e−11 | 94% (93% to 95%) |
| Odds ratio | 1–1.05 | 94.3% (93.5% to 95.0%) |
| Odds ratio | 1.05–1.1 | 70.0% (66.8% to 72.9%) |
| Odds ratio | 1.1–1.15 | 62.5% (59.4% to 65.6%) |
| Odds ratio | 1.15–1.2 | 69.3% (64.3% to 73.9%) |
| Odds ratio | 1.2–1.3 | 98.7% (91.0% to 99.8%) |
| Odds ratio | 1.3–1.4 | 100%* |
| Odds ratio | > 1.4 | 100%* |
*Paucity of data prevented formal meta-analysis.
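The per-category counts behind the intervals above are not given here, and the footnote suggests they come from a meta-analysis. As a rough illustration of how a 95% interval for a single replication proportion could be obtained, the sketch below applies the Wilson score interval, a substitute technique rather than the authors' method, to the asthma counts from the "Replication across phenotypes" table (494 of 889). The function name `wilson_interval` is hypothetical.

```python
from math import sqrt

Z_95 = 1.959964  # two-sided 95% quantile of the standard normal distribution


def wilson_interval(successes, total, z=Z_95):
    """Wilson score interval for a binomial proportion, as (low, high)."""
    if total <= 0:
        raise ValueError("total must be positive")
    p = successes / total
    denom = 1.0 + z * z / total
    centre = (p + z * z / (2.0 * total)) / denom
    half_width = z * sqrt(p * (1.0 - p) / total + z * z / (4.0 * total * total)) / denom
    # Clamp to [0, 1] to guard against floating-point overshoot at the extremes.
    return max(0.0, centre - half_width), min(1.0, centre + half_width)


# Asthma: 494 of 889 genome-wide significant SNVs replicated (from the table).
low, high = wilson_interval(494, 889)
print(f"asthma replication: {494 / 889:.1%} (95% CI {low:.1%} to {high:.1%})")
```

The Wilson interval is used here because it stays within 0 to 1 and remains defined when every SNV in a category replicates, which a plain normal-approximation interval handles poorly.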
Figure 4. Replication of SNVs across odds ratios between binary and quantitative traits.