Bent Müller, Arndt Wilcke, Anne-Laure Boulesteix, Jens Brauer, Eberhard Passarge, Johannes Boltze, Holger Kirsten.
Abstract
Reliable risk assessment of frequent but treatable diseases and disorders has considerable clinical and socio-economic relevance. However, as these conditions usually originate from a complex interplay between genetic and environmental factors, precise prediction remains a considerable challenge. Recent progress in genotyping technology has resulted in a substantial increase of knowledge regarding the genetic basis of such diseases and disorders. Consequently, common genetic risk variants are increasingly being included in epidemiological models to improve risk prediction. This work reviews recent high-quality publications targeting the prediction of common complex diseases. To be included in this review, articles had to report both numerical measures of prediction performance based on traditional (non-genetic) risk factors and measures of prediction performance when adding common genetic variants to the model. A systematic PubMed-based search identified 55 eligible studies. These studies were compared with respect to the chosen approach and methodology as well as results and clinical impact. Phenotypes analysed included tumours, diabetes mellitus, and cardiovascular diseases. All studies applied one or more statistical measures reporting on calibration, discrimination, or reclassification to quantify the benefit of including SNPs, but differed substantially regarding the methodological details that were reported. Several examples of improved risk assessment through the inclusion of disease-related SNPs were identified. Although the add-on benefit of including SNP genotyping data was mostly moderate, the strategy can be of clinical relevance and may, when paralleled by an even deeper understanding of disease-related genetics, further enable the development of enhanced predictive and diagnostic strategies for complex diseases.
Year: 2016 PMID: 26839113 PMCID: PMC4759222 DOI: 10.1007/s00439-016-1636-z
Source DB: PubMed Journal: Hum Genet ISSN: 0340-6717 Impact factor: 4.132
Fig. 1 Search strategy for the inclusion of studies in the analysis. This figure provides an overview of the search strategy and the numbers of eligible studies meeting the inclusion criteria
Overview of the number and type of genetic data used for prediction
| Predicted phenotype | Study | Type of genetic variants used |
|---|---|---|
| Breast cancer | Mealiffe et al. ( | 7 GWAS-SNPs with independent replication |
| | Wacholder et al. ( | 10 GWAS-SNPs |
| | Dite et al. ( | 7 GWAS-SNPs with independent replication |
| | Darabi et al. ( | 18 GWAS-SNPs |
| | Vachon et al. ( | 76 GWAS-SNPs |
| Prostate cancer | Zheng et al. ( | 5 SNPs from candidate association studies |
| | Nam et al. ( | 19 SNPs from candidate association studies |
| | Salinas et al. ( | 5 SNPs from GWAS/candidate studies showing independent cumulative association |
| | Aly et al. ( | 35 GWAS-SNPs with independent replication |
| | Johansson et al. ( | 33 GWAS-SNPs |
| | Kader et al. ( | 33 GWAS-SNPs with independent replication |
| | Klein et al. ( | 49 SNPs associated with prostate cancer or PS/1 SNP associated with breast cancer |
| | Lindström et al. ( | 25 GWAS-SNPs |
| | Helfand et al. ( | 4 SNPs from candidate association studies replicated in GWAS |
| | Butoescu et al. ( | 9 GWAS-SNPs |
| Esophageal squamous cell carcinoma (ESCC) | Chang et al. ( | 25 GWAS-SNPs |
| Melanoma | Fang et al. ( | 11 GWAS-SNPs |
| Cardiovascular disease (CVD) related events | Humphries et al. ( | 4 SNPs and 7 SNPs with interaction terms from candidate association studies |
| | Morrison et al. ( | 10 SNPs for whites and 11 SNPs for blacks from candidate association studies |
| | Kathiresan et al. ( | 9 GWAS-SNPs (associated with LDL and HDL) |
| | Paynter et al. ( | 1 SNP from a candidate association study with independent replication |
| | Davies et al. ( | 13 GWAS-SNPs |
| | Paynter et al. ( | 101 GWAS-SNPs associated with CVD or an intermediate phenotype and 12 GWAS-SNPs associated with CVD but no intermediate phenotype |
| | Ripatti et al. ( | 13 GWAS-SNPs |
| | Brautbar et al. ( | 13 SNPs associated with CVD but no intermediate phenotype from GWAS/candidate studies with independent replication |
| | Hernesniemi et al. ( | 24 GWAS-SNPs (associated with CAD) |
| | Hughes et al. ( | 11 SNPs and 2 haplotypes from GWAS with independent replication and 15 SNPs from GWAS with independent replication |
| | Lluis-Ganella et al. ( | 8 GWAS-SNPs associated with CVD but not with intermediate phenotypes |
| | Bolton et al. ( | 27 SNPs from meta-analysis of GWAS-SNPs |
| | Ganna et al. ( | 395 SNPs (associated with CHD and intermediate phenotypes) from GWAS and 46 SNPs (directly associated with CHD) from GWAS |
| | Isaacs et al. ( | 95 SNPs from meta-GWAS on TC, LDL-C, HDL-C, and TG |
| | Tikkanen et al. ( | 28 SNPs from GWAS |
| | Ibrahim-Verbaas et al. ( | 324 GWAS-SNPs (associated with all stroke risk domains) |
| | Beaney et al. ( | 13 GWAS-SNPs from CARDIoGRAMplusC4D and 19 SNPs from GWAS and candidate studies |
| | de Vries et al. ( | 49 genome-wide significant SNPs and 103 SNPs at FDR < 10% |
| Atrial fibrillation | Everett et al. ( | 12 GWAS-SNPs |
| | Tada et al. ( | 12 GWAS-SNPs |
| Venous thrombosis | de Haan et al. ( | 5 SNPs from GWAS/candidate studies with independent replication |
| | Bruzelius et al. ( | 7 SNPs from GWAS and candidate association studies |
| Type 2 diabetes | Balkau et al. ( | 2 SNPs from candidate association studies |
| | Lyssenko et al. ( | 11 GWAS-SNPs |
| | Meigs et al. ( | 18 GWAS-SNPs |
| | van Hoek et al. ( | 18 GWAS-SNPs with independent replication |
| | Lin et al. ( | 15 GWAS-SNPs |
| | Schulze et al. ( | 20 GWAS-SNPs |
| | Talmud et al. ( | 20 GWAS-SNPs with independent replication |
| | Wang et al. ( | 19 SNPs from candidate studies with independent replication |
| | de Miguel-Yanes et al. ( | 40 GWAS-SNPs |
| | Vassy et al. ( | 38 GWAS-SNPs |
| | Vassy et al. ( | 38 GWAS-SNPs |
| | Mühlenbruch et al. ( | 42 GWAS-SNPs |
| | Tam et al. ( | 14 GWAS-SNPs |
| | Vassy et al. ( | 62 GWAS-SNPs |
| | Walford et al. ( | 62 GWAS-SNPs |
| Parkinson disease | Hall et al. ( | 4 GWAS-SNPs |
“GWAS-SNPs” were identified in previous genome-wide association studies. In contrast, “SNPs” were derived from previous locus-wide association studies
“SNPs with independent replication” are defined as SNPs identified in a previous study, replicated in another previous association study, and finally used in the prediction study. Otherwise, SNPs resulted from at least one independent discovery study
Fig. 2 Overview of how genetic data were included in the prediction model. “Sum score”: a single predictor reflecting the genetic burden was created from all SNPs and used as a single parameter in the prediction model; “individual SNPs”: SNPs were included as individual covariates in the prediction model; “weighted”: risk alleles of SNPs were weighted according to the respective odds ratio; “unweighted”: risk alleles of SNPs were counted without weighting. Note that Brautbar et al. (2012), Everett et al. (2013) and Talmud et al. (2010) used weighted as well as unweighted sum scores in their analyses and thus appear in both categories in the figure
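The two sum-score variants named in this caption can be illustrated with a minimal Python sketch. It rests on assumptions that are not taken from the reviewed studies: the function names are hypothetical, each genotype is coded as the number of risk alleles carried (0, 1, or 2), and the weight is the natural logarithm of the odds ratio, which is one common way of weighting "according to the odds ratio" but not necessarily the scheme used in any particular study.

```python
import math


def unweighted_sum_score(risk_allele_counts):
    """Unweighted score: total number of risk alleles across all SNPs.

    Each entry is assumed to be 0, 1, or 2 (risk alleles carried at one SNP).
    """
    return sum(risk_allele_counts)


def weighted_sum_score(risk_allele_counts, odds_ratios):
    """Weighted score: risk-allele counts weighted by log(odds ratio).

    Using log(OR) as the weight is an assumption made for this sketch.
    """
    if len(risk_allele_counts) != len(odds_ratios):
        raise ValueError("need exactly one odds ratio per SNP")
    return sum(
        count * math.log(odds_ratio)
        for count, odds_ratio in zip(risk_allele_counts, odds_ratios)
    )


# Made-up genotype for one person at three SNPs, with made-up odds ratios.
counts = [2, 1, 0]
odds_ratios = [1.2, 1.5, 1.1]

print(unweighted_sum_score(counts))
print(weighted_sum_score(counts, odds_ratios))
```

In a "sum score" model, either value would enter the prediction model as one covariate next to the traditional risk factors, whereas the "individual SNPs" approach would pass each entry of `counts` as its own covariate.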
Fig. 3 Overview of the discrimination improvement due to inclusion of genetic data across all 100 included analyses. An AUC of 1.0 indicates perfect discrimination between cases and controls; 0.5 is equivalent to random guessing. Studies are stratified according to their predicted phenotype. Each reported analysis is depicted as an arrow, with the arrow start indicating the AUC when using traditional risk factors only and the arrowhead indicating the AUC of the model including genetic data. The colour of the arrow indicates the significance of reclassification measures: blue, statistically significant; orange, not statistically significant; grey, not tested. Solid lines indicate GWAS-derived SNPs and dashed lines all other SNPs. The figure illustrates that it is generally harder to improve the discrimination of a prediction model by including genetic data when the baseline model already performs well. Nevertheless, in some cases significant reclassification can be observed even for high baseline AUC values. For numbers and additional details on studies, please also refer to Supplemental Table 1
Fig. 4 Take-home messages for predicting complex diseases with common genetic markers
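The AUC comparison described in this caption can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `auc` is hypothetical, the predicted risks and case/control labels are made up and do not come from any reviewed study, and the AUC is computed as the fraction of case-control pairs in which the case receives the higher predicted risk (ties count as half), which is equivalent to the area under the ROC curve.

```python
def auc(predicted_risks, labels):
    """AUC as the probability that a random case outranks a random control.

    labels: 1 for cases, 0 for controls. Ties contribute 0.5.
    """
    cases = [r for r, y in zip(predicted_risks, labels) if y == 1]
    controls = [r for r, y in zip(predicted_risks, labels) if y == 0]
    if not cases or not controls:
        raise ValueError("need at least one case and one control")
    concordant = 0.0
    for case_risk in cases:
        for control_risk in controls:
            if case_risk > control_risk:
                concordant += 1.0
            elif case_risk == control_risk:
                concordant += 0.5
    return concordant / (len(cases) * len(controls))


# Made-up data: three cases followed by three controls.
labels = [1, 1, 1, 0, 0, 0]
baseline_risks = [0.8, 0.4, 0.3, 0.5, 0.2, 0.1]  # traditional risk factors only
genetic_risks = [0.8, 0.6, 0.3, 0.5, 0.2, 0.1]   # after adding genetic data

auc_start = auc(baseline_risks, labels)  # arrow start in the figure
auc_head = auc(genetic_risks, labels)    # arrowhead in the figure
print(auc_start, auc_head, auc_head - auc_start)
```

In this toy data the baseline model ranks 7 of the 9 case-control pairs correctly and the extended model 8 of 9, so the arrow would run from roughly 0.78 to roughly 0.89. For real analyses, an established implementation with confidence intervals would be preferable to this quadratic-time sketch.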