Miriam S Udler, Mark I McCarthy, Jose C Florez, Anubha Mahajan.
Abstract
During the last decade, there have been substantial advances in the identification and characterization of DNA sequence variants associated with individual predisposition to type 1 and type 2 diabetes. As well as providing insights into the molecular, cellular, and physiological mechanisms involved in disease pathogenesis, these risk variants, when combined into a polygenic score, capture information on individual patterns of disease predisposition that have the potential to influence clinical management. In this review, we describe the various opportunities that polygenic scores provide: to predict diabetes risk, to support differential diagnosis, and to understand phenotypic and clinical heterogeneity. We also describe the challenges that will need to be overcome if this potential is to be fully realized.
Year: 2019 PMID: 31322649 PMCID: PMC6760294 DOI: 10.1210/er.2019-00088
Source DB: PubMed Journal: Endocr Rev ISSN: 0163-769X Impact factor: 19.871
Figure 1. How polygenic scores are derived. The orange dashed line in the graph represents the threshold for genome-wide significance in a GWAS. The filled red dots in the rsPS and gePS sections represent genetic variants reaching genome-wide significance, and the filled blue dots represent variants that have not reached genome-wide significance. In the pPS section, open dots reflect variants that have been assigned to one of the four groups of partitioned loci. For a full explanation, see text.
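The distinction the caption draws between rsPS and gePS can be illustrated with a minimal sketch, assuming a score is a sum of risk-allele dosages weighted by GWAS effect sizes and that the two scores differ only in the P value cutoff used to admit variants. The variant identifiers, effect sizes, dosages, and the 5 × 10⁻⁸ significance cutoff below are illustrative assumptions, not values taken from the figure.

```python
from dataclasses import dataclass

# Assumed conventional genome-wide significance cutoff (not stated in the caption).
GENOME_WIDE_P = 5e-8


@dataclass
class Variant:
    rsid: str       # hypothetical variant identifier
    beta: float     # per-allele effect size (log odds ratio) from the discovery GWAS
    p_value: float  # association P value from the discovery GWAS


def polygenic_score(variants, dosages, p_threshold):
    """Sum effect-size-weighted risk-allele dosages over variants with P <= p_threshold."""
    return sum(v.beta * dosages[v.rsid] for v in variants if v.p_value <= p_threshold)


# Toy data: two variants pass genome-wide significance, two do not.
variants = [
    Variant("rsA", 0.30, 1e-12),
    Variant("rsB", 0.10, 3e-9),
    Variant("rsC", 0.05, 2e-4),
    Variant("rsD", -0.02, 0.04),
]
dosages = {"rsA": 2, "rsB": 1, "rsC": 0, "rsD": 1}  # risk-allele counts for one person

# rsPS-style score: only variants reaching genome-wide significance.
rs_ps = polygenic_score(variants, dosages, GENOME_WIDE_P)
# gePS-style score: a more lenient cutoff also admits sub-threshold variants.
ge_ps = polygenic_score(variants, dosages, 0.1)
```

A pPS-style score would use the same weighted sum restricted to the variants assigned to one partitioned group.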
Comparison of Three Published Global, Extended Polygenic Scores for T2D
| | Study | Khera | Mahajan | 23andMe |
|---|---|---|---|---|
| Discovery GWAS | Number of cases | 26,676 | 55,005 | 80,792 |
| | Number of controls | 132,532 | 400,308 | 1,479,116 |
| | Reference | Scott | Mahajan | Multhaup |
| Optimization data set | Methods | LDpred | Pruning and thresholding | Predetermined cutoffs |
| | Number of cases | 2785 | 5639 | 48,028 |
| | Number of controls | 120,280 | 112,307 | 893,692 |
| | P value threshold | — | 0.1 | 1 × 10⁻⁵ |
| | LD pruning threshold | — | | 50-kb window |
| | Tuning parameter | | — | — |
| | Polymorphisms in risk score | 6,917,436 | 171,249 | 1244 |
| | Reference | UK Biobank | UK Biobank | 23andMe |
| Testing data set | Number of cases | 5853 | 13,480 | 9008 |
| | Number of controls | 288,978 | 311,390 | 167,622 |
| | Reference | UK Biobank | UK Biobank | 23andMe |
| AUROC in testing data set (Europeans) | Not adjusted for age and sex | 0.64 | 0.66 | 0.65 |
| | Adjusted for age and sex | 0.73 | 0.73 | — |
| OR of top 5% bin vs remainder population | | 2.75 | 2.75 without age and sex adjustment; 4.52 with age and sex adjustment | 2.76 |
For the LDpred algorithm, the tuning parameter ρ reflects the proportion of polymorphisms assumed to be causal for the disease. For the pruning and thresholding strategy, r² reflects the degree of independence from other variants in linkage disequilibrium, and the P value is the threshold used for selecting variants from the discovery GWAS.
Abbreviation: LD, linkage disequilibrium.
Discovery GWAS from Mahajan et al., 2018 (9) after removing UK Biobank samples. Note the difference in testing data set sample size from the published results in Mahajan et al., 2018 (9). Results presented here are based on reanalysis of data after splitting UK Biobank samples into optimization and testing sets.
Subset of GWAS samples.
Logistic model adjusted for other technical covariates such as principal components.
Obtained through private communication with authors.
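The pruning and thresholding strategy described in the table notes can be sketched as a greedy selection, assuming variants are visited from most to least significant and a variant is kept only if it passes the P value threshold and is not in strong LD with any variant already kept. The function name, the pairwise r² lookup, and the r² cutoff of 0.5 are hypothetical choices for illustration; only the P value threshold of 0.1 appears in the table.

```python
def prune_and_threshold(p_values, r2, p_threshold, r2_threshold):
    """Greedy pruning and thresholding (illustrative sketch).

    p_values: dict mapping variant id -> discovery GWAS P value
    r2: dict mapping frozenset({id1, id2}) -> pairwise r2; missing pairs count as 0
    """
    kept = []
    # Visit variants from smallest to largest P value.
    for v in sorted(p_values, key=p_values.get):
        if p_values[v] > p_threshold:
            break  # every remaining variant is less significant still
        # Keep v only if it is sufficiently independent of all kept variants.
        if all(r2.get(frozenset((v, k)), 0.0) < r2_threshold for k in kept):
            kept.append(v)
    return kept


# Toy data with hypothetical variant ids.
p_values = {"rs1": 1e-10, "rs2": 1e-6, "rs3": 0.05, "rs4": 0.3}
r2 = {
    frozenset(("rs1", "rs2")): 0.9,  # rs2 is in strong LD with rs1
    frozenset(("rs1", "rs3")): 0.1,
}

# P threshold 0.1 as in the table; the r2 cutoff 0.5 is an assumed value.
selected = prune_and_threshold(p_values, r2, p_threshold=0.1, r2_threshold=0.5)
```

In this toy input, rs2 is dropped because of its LD with the more significant rs1, and rs4 is dropped because its P value exceeds the threshold. LDpred takes a different approach, reweighting effect sizes under an assumed causal fraction ρ, and is not shown here.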
Figure 2. Comparison of rsPS and gePS for T2D using data from Mahajan et al. (9). rsPS and gePS were generated using a T2D GWAS meta-analysis of 455,313 European individuals and used to predict incident T2D in 13,480 cases and 311,390 controls from the UK Biobank. (a) AUROC curves for models predicting incident T2D: each model was adjusted for genotyping array and the first six principal components of ancestry. (b) Prevalence of T2D according to 40 groups binned according to the polygenic scores, with each group representing 2.5% of the population. (c) Distribution of rsPS and gePS in the cases and controls. The x-axis represents polygenic score, with values scaled to a mean of 0 and standard deviation of 1. Both rsPS and gePS in UK Biobank individuals are normally distributed, with a shift toward the right observed for T2D cases.
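The three panel summaries in the caption (AUROC, prevalence across 40 score bins, and scores scaled to mean 0 and standard deviation 1) can be sketched as follows. This is a minimal, unadjusted illustration on toy data: it omits the covariate adjustment mentioned in the caption, and all function names and inputs are assumptions rather than the authors' analysis code.

```python
from statistics import mean, pstdev


def standardize(scores):
    """Scale scores to mean 0 and standard deviation 1."""
    m, s = mean(scores), pstdev(scores)
    return [(x - m) / s for x in scores]


def prevalence_by_bin(scores, is_case, n_bins=40):
    """Rank individuals by score, split them into n_bins equal-sized groups,
    and return the fraction of cases in each group (lowest scores first).
    Assumes there are at least n_bins individuals."""
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    n = len(order)
    prevalence = []
    for b in range(n_bins):
        members = order[b * n // n_bins:(b + 1) * n // n_bins]
        prevalence.append(sum(is_case[i] for i in members) / len(members))
    return prevalence


def auroc(scores, is_case):
    """Probability that a random case outscores a random control (ties count half).
    Quadratic in sample size, so suitable only for small illustrations."""
    cases = [s for s, c in zip(scores, is_case) if c]
    controls = [s for s, c in zip(scores, is_case) if not c]
    wins = sum((a > b) + 0.5 * (a == b) for a in cases for b in controls)
    return wins / (len(cases) * len(controls))


# Toy cohort: 80 individuals, with cases concentrated at high scores.
raw_scores = [float(i) for i in range(80)]
case_flags = [i >= 60 for i in range(80)]

z_scores = standardize(raw_scores)
bin_prevalence = prevalence_by_bin(z_scores, case_flags)  # 40 bins of 2.5% each
small_auc = auroc([0.1, 0.4, 0.35, 0.8], [False, False, True, True])
```

With 40 bins, each group holds 2.5% of the cohort, matching the binning described for panel (b).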
Partitioned Polygenic Score Clusters Capturing Etiological Heterogeneity in T2D
| Physiological Impact | | Phenotypic Features | Cluster Name: Udler | Cluster Name: Mahajan | Examples of T2D Loci |
|---|---|---|---|---|---|
| Adverse impact on | High proinsulin | Low fasting insulin (+ high proinsulin) | | Insulin secretion 1 | |
| | Low proinsulin | Low fasting insulin (+ low proinsulin) | Proinsulin | Insulin secretion 2 | |
| Reduced insulin sensitivity | Mediation with fat distribution | High fasting insulin + low BMI + low WC + high TG | Lipodystrophy | Insulin action | |
| | Mediation via obesity | High fasting insulin + high BMI + high WC | Obesity | Adiposity | |
| | Mediation via lipid metabolism | Low TG | Liver/lipid | Dyslipidemia | |
| Undetermined | | No striking phenotype association | No assignment | Mixed features | |
Comparison of pPS clusters identified by Mahajan et al. (20) and Udler et al. (38).
Abbreviations: TG, triglyceride; WHR, waist/hip ratio.