Yingjie Zhao, Gong Chen, Hongjie Yu, Lingna Hu, Yunmeng Bian, Dapeng Yun, Juxiang Chen, Ying Mao, Hongyan Chen, Daru Lu.
Abstract
Over 14 common single nucleotide polymorphisms (SNPs) have been consistently identified by genome-wide association studies (GWAS) as associated with glioma risk in populations of European background. The extent to which, and how, these genetic variants can improve the prediction of glioma risk has not been investigated. In this study, we employed three independent case-control datasets in Chinese populations: we tested GWAS signals in dataset1, validated the association results in dataset2, developed prediction models in dataset2 for the consistently replicated SNPs, refined the consistently replicated SNPs in dataset3, and developed tailored models for Chinese populations. For model construction, we aggregated the contribution of multiple SNPs into genetic risk scores (count GRS and weighted GRS) or predicted risks from logistic regression analyses (PRFLR). In dataset2, the area under the receiver operating characteristic curve (AUC) of the 5 consistently replicated SNPs by PRFLR(SNPs) was 0.615, higher than those of all GRSs (ranging from 0.607 to 0.611, all P>0.05). The AUC of the genetic profile significantly exceeded that of family history of cancer (fmc) alone (AUC=0.535, all P<0.001). The best model in our study was "PRFLR(SNPs+fmc)" (AUC=0.646) in dataset3. Further model assessment analyses provided additional evidence. This study indicates that genetic markers have potential value for risk prediction of glioma.
Keywords: genetic risk score; genome wide association study; glioma; prediction risk from logistic regression analyses; risk prediction
Year: 2016 PMID: 29492197 PMCID: PMC5823595 DOI: 10.18632/oncotarget.10882
Source DB: PubMed Journal: Oncotarget ISSN: 1949-2553
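As a rough illustration of the score construction described in the abstract, the following Python sketch computes a count GRS and a weighted GRS for one individual. The weights (log odds ratios taken from the Dataset2 values of the five-SNP table), the example genotype, and the function names are assumptions made for illustration; the text here does not specify how wGRS1 and wGRS2 were actually weighted.

```python
import math

# Assumed weights: Dataset2 odds ratios of the five replicated SNPs
# (rs2736100, rs1077236, rs2157719, rs498872, rs6010620).
odds_ratios = [1.29, 1.20, 1.32, 1.35, 1.35]

def count_grs(genotype):
    """Count GRS: total number of risk alleles (0, 1 or 2 per SNP)."""
    return sum(genotype)

def weighted_grs(genotype, ors):
    """Weighted GRS: risk-allele counts weighted by the log odds ratio of each SNP."""
    return sum(g * math.log(o) for g, o in zip(genotype, ors))

# Hypothetical individual: number of risk alleles carried at each SNP.
genotype = [1, 2, 0, 1, 1]

print("cGRS:", count_grs(genotype))
print("wGRS:", round(weighted_grs(genotype, odds_ratios), 3))
```

The count score treats every risk allele equally, whereas the weighted score lets SNPs with larger effect sizes contribute more.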
Five consistently replicated SNPs for model development in dataset2
| SNP | CHR | Nearest gene | Region | Locationa | Non-risk allele | Risk allele | Dataset1 risk allele frequency, cases | Dataset1 risk allele frequency, controls | Dataset1 OR (95% CI)b | Dataset1 Pb | Dataset2 risk allele frequency, cases | Dataset2 risk allele frequency, controls | Dataset2 OR (95% CI)b | Dataset2 Pb |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| rs2736100 | 5 | | Intron | 1339516 | T | G | 0.479 | 0.413 | 1.30(1.17-1.46) | 3.96E-06 | 0.482 | 0.418 | 1.29(1.13-1.49) | 2.69E-04 |
| rs1077236 | 8 | | Intergenic | 130709683 | A | C | 0.698 | 0.677 | 1.10(0.95-1.28) | 0.219 | 0.725 | 0.688 | 1.20(1.03-1.39) | 0.021 |
| rs2157719 | 9 | | Intron | 22023366 | T | C | 0.140 | 0.113 | 1.28(1.08-1.51) | 4.19E-03 | 0.141 | 0.111 | 1.32(1.07-1.62) | 9.23E-03 |
| rs498872 | 11 | | UTR-5 | 117982577 | A | G | 0.303 | 0.272 | 1.23(1.08-1.39) | 1.19E-03 | 0.349 | 0.285 | 1.35(1.16-1.56) | 7.81E-05 |
| rs6010620 | 20 | | Intron | 61780283 | T | C | 0.304 | 0.266 | 1.21(1.07-1.37) | 2.39E-03 | 0.330 | 0.267 | 1.35(1.16-1.57) | 9.25E-05 |
a, based on NCBI Build 36; b, Odds ratios (ORs), 95% confidence interval (95%CI) and P values were calculated from univariate logistic regression analyses based on additive model.
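As a quick plausibility check on the table above, an allelic odds ratio can be computed directly from the risk allele frequencies in cases and controls. This is only a sketch: the reported ORs come from univariate logistic regression under an additive model, so the allelic estimate is an approximation and need not match exactly. The function name is hypothetical.

```python
def allelic_odds_ratio(raf_cases, raf_controls):
    """Odds of carrying the risk allele in cases divided by the odds in controls."""
    odds_cases = raf_cases / (1 - raf_cases)
    odds_controls = raf_controls / (1 - raf_controls)
    return odds_cases / odds_controls

# rs2736100: risk allele frequencies from Dataset1 and Dataset2.
print(round(allelic_odds_ratio(0.479, 0.413), 3))  # reported OR: 1.30
print(round(allelic_odds_ratio(0.482, 0.418), 3))  # reported OR: 1.29
```

For rs2736100 the allelic estimates land close to the tabulated values; other SNPs can differ more because the regression estimate is not a simple ratio of allele odds.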
Seven independent SNPs for model development in dataset3
| SNP | CHR | Nearest gene | Region | Locationa | Non-risk allele | Risk allele | Risk allele frequency, cases | Risk allele frequency, controls | OR (95% CI)b | Pb |
|---|---|---|---|---|---|---|---|---|---|---|
| rs2853677 | 5 | | Intron | 1287194 | T | C | 0.449 | 0.375 | 1.36(1.20-1.55) | 2.70E-06 |
| rs2735948 | 5 | | Intergenic | 1299213 | C | T | 0.170 | 0.146 | 1.20(1.01-1.42) | 0.044 |
| rs6589664 | 11 | | Exon | 117910014 | G | A | 0.310 | 0.271 | 1.21(1.05-1.39) | 6.80E-03 |
| rs494560 | 11 | | | 118026759 | A | G | 0.801 | 0.746 | 0.73(1.18-1.60) | 4.04E-05 |
| rs17748 | 11 | | | 118033634 | C | T | 0.327 | 0.263 | 1.36(1.18-1.56) | 1.57E-05 |
| rs3761121 | 20 | | Intron | 62342695 | A | G | 0.269 | 0.202 | 1.45(1.25-1.69) | 9.85E-07 |
| rs1058319 | 20 | | UTR-3 | 62374389 | T | C | 0.354 | 0.256 | 1.59(1.39-1.83) | 4.76E-11 |
a, based on NCBI Build 36; b, ORs, 95%CI and P values were calculated from univariate logistic regression analysis based on additive model.
Figure 1. Frequency distribution of the number of risk alleles in glioma cases and controls in dataset2 and dataset3
Footnotes: Five SNPs included in dataset2: rs2736100 at 5p15.33, rs2157719 at 9p21.3, rs498872 at 11q23.3, rs6010620 at 20q13.33, rs1077236 at 8q24.21. Seven SNPs included in dataset3: rs2853677 and rs2735948 at 5p15.33; rs6589664, rs494560 and rs17748 at 11q23.3; rs3761121 and rs1058319 at 20q13.33.
Association between the cumulative effect of 5 independent SNPs and glioma risk in dataset2
| Risk prediction models | Cases (%) | Controls (%) | OR(95%CI) | P | P for trend |
|---|---|---|---|---|---|
| Total (n) | 743 | 900 | | | |
| Count genetic risk score (cGRS) | | | | | |
| 0-1 | 24(3.23) | 54(6.00) | 1.00(reference) | ||
| 2 | 77(10.36) | 142(15.78) | 1.22(0.70-2.13) | 0.482 | |
| 3 | 146(19.65) | 275(30.56) | 1.20(0.71-2.01) | 0.504 | |
| 4 | 220(29.61) | 215(23.89) | 2.30(1.37-3.86) | 0.002 | |
| 5 | 166(22.34) | 135(15.00) | 2.77(1.63-4.71) | 1.76E-04 | |
| ≥6 | 110(14.80) | 79(8.78) | 3.13(1.79-5.49) | 6.62E-05 | 2.73E-12 |
| Weighted genetic risk score 1 (wGRS1) | |||||
| 0(<Q25) | 120(16.15) | 259(28.78) | 1.00(reference) | ||
| 1(Q25∼Q50) | 107(14.40) | 198(22.00) | 1.17(0.85-1.61) | 0.345 | |
| 2(Q50∼Q75) | 229(30.82) | 219(24.33) | 2.26(1.70-3.00) | 2.14E-08 | |
| 3(≥Q75) | 287(38.63) | 224(24.89) | 2.77(2.10-3.65) | 7.67E-13 | 3.63E-15 |
| Weighted genetic risk score 2 (wGRS2) | |||||
| 0(<Q25) | 120(16.15) | 259(28.78) | 1.00(reference) | ||
| 1(Q25∼Q50) | 107(14.40) | 199(22.11) | 1.16(0.84-1.60) | 0.361 | |
| 2(Q50∼Q75) | 226(30.42) | 219(24.33) | 2.24(1.68-2.98) | 3.12E-08 | |
| 3(≥Q75) | 290(39.03) | 223(24.78) | 2.78(2.10-3.70) | 6.25E-13 | 3.13E-15 |
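The category-level odds ratios in the table above can be approximately reproduced from the case and control counts alone. The sketch below uses the standard cross-product odds ratio with a Woolf (log-based) 95% confidence interval; whether the authors used exactly this calculation is an assumption, and the function name is hypothetical.

```python
import math

def odds_ratio_ci(cases_cat, controls_cat, cases_ref, controls_ref, z=1.96):
    """Unadjusted odds ratio versus the reference category, with a Woolf 95% CI."""
    odds_ratio = (cases_cat * controls_ref) / (controls_cat * cases_ref)
    # Standard error of log(OR) from the four cell counts.
    se = math.sqrt(1 / cases_cat + 1 / controls_cat + 1 / cases_ref + 1 / controls_ref)
    lower = math.exp(math.log(odds_ratio) - z * se)
    upper = math.exp(math.log(odds_ratio) + z * se)
    return odds_ratio, lower, upper

# cGRS >= 6 (110 cases, 79 controls) versus cGRS 0-1 (24 cases, 54 controls).
or_, lo, hi = odds_ratio_ci(110, 79, 24, 54)
print(f"{or_:.2f} ({lo:.2f}-{hi:.2f})")  # table reports 3.13(1.79-5.49)
```

The same function applied to the cGRS 4 row (220 cases, 215 controls) gives an odds ratio near the tabulated 2.30.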
Association between the cumulative effect of the 7 independent SNPs and glioma risk in dataset3
| Risk prediction models | Cases (%) | Controls (%) | OR(95%CI) | P | P for trend |
|---|---|---|---|---|---|
| Total (n) | 934 | 995 | | | |
| Count genetic risk score (cGRS) | | | | | |
| 0-1 | 36 (3.85) | 93(9.35) | 1.00(reference) | ||
| 2 | 114 (12.21) | 182(18.29) | 1.62(1.03-2.54) | 0.036 | |
| 3 | 192 (20.56) | 232(23.32) | 2.14(1.39-3.29) | 0.001 | |
| 4 | 207 (22.16) | 221(22.21) | 2.42(1.58-3.72) | 5.39E-05 | |
| 5 | 198(21.20) | 157(15.78) | 3.26(2.10-5.05) | 1.26E-07 | |
| 6 | 122(13.06) | 77(7.74) | 4.09(2.54-6.61) | 8.06E-09 | |
| ≥7 | 65(6.96) | 33(3.32) | 5.09(2.88-8.99) | 2.07E-08 | 2.58E-12 |
| Weighted genetic risk score 1 (wGRS1) | |||||
| 0(<Q25) | 133(14.24) | 250(25.13) | 1.00(reference) | ||
| 1(Q25∼Q50) | 191(20.45) | 252(25.33) | 1.43(1.07-1.89) | 0.014 | |
| 2(Q50∼Q75) | 267(28.59) | 257(25.83) | 1.95(1.49-2.56) | 1.33E-06 | |
| 3(≥Q75) | 343 (36.72) | 235(23.62) | 2.74(2.09-3.59) | 1.55E-13 | 2.85E-13 |
| Weighted genetic risk score 2 (wGRS2) | |||||
| 0(<Q25) | 130(13.92) | 250(25.13) | 1.00(reference) | ||
| 1(Q25∼Q50) | 192 (20.56) | 249(25.03) | 1.48(1.12-1.97) | 0.006 | |
| 2(Q50∼Q75) | 262(28.05) | 251(25.23) | 2.00(1.53-2.64) | 6.01E-07 | |
| 3(≥Q75) | 350 (37.47) | 244(24.52) | 2.76(2.11-3.61) | 1.08E-13 | 3.08E-13 |
Prediction performance of genetic risk score and family history for glioma risk
| Dataset | Model | No. of subjects | AUC(95%CI)a | H-L testb |
|---|---|---|---|---|
| Dataset2 | fmc | 1453 | 0.535(0.515-0.554) | 1.000 |
| Dataset2 | cGRS | 1643 | 0.607(0.581-0.644) | 0.386 |
| Dataset2 | wGRS1 | 1643 | 0.610(0.583-0.638) | 0.111 |
| Dataset2 | wGRS2 | 1643 | 0.611(0.584-0.639) | 0.049 |
| Dataset2 | PRFLR(SNPs) | 1453 | 0.615(0.586-0.644) | 0.051 |
| Dataset2 | cGRS+fmc | 1453 | 0.620(0.591-0.649) | 0.816 |
| Dataset2 | wGRS1+fmc | 1453 | 0.623(0.595-0.652) | 0.334 |
| Dataset2 | wGRS2+fmc | 1453 | 0.621(0.592-0.650) | 0.117 |
| Dataset2 | PRFLR(SNPs+fmc) | 1453 | 0.625(0.596-0.653) | 0.250 |
| Dataset3 | fmc | 1718 | 0.526(0.508-0.543) | 1.000 |
| Dataset3 | cGRS | 1921 | 0.605(0.580-0.629) | 0.997 |
| Dataset3 | wGRS1 | 1921 | 0.607(0.582-0.632) | 0.880 |
| Dataset3 | wGRS2 | 1921 | 0.608(0.583-0.633) | 0.113 |
| Dataset3 | PRFLR(SNPs) | 1718 | 0.635(0.610-0.660) | 0.927 |
| Dataset3 | cGRS+fmc | 1718 | 0.611(0.585-0.637) | 0.743 |
| Dataset3 | wGRS1+fmc | 1718 | 0.611(0.585-0.638) | 0.154 |
| Dataset3 | wGRS2+fmc | 1718 | 0.609(0.583-0.636) | 0.016 |
| Dataset3 | PRFLR(SNPs+fmc) | 1718 | 0.646(0.619-0.672) | 0.393 |
AUC: area under the receiver operating characteristic curve; fmc: family history of cancer; cGRS: count genetic risk score; wGRS: weighted genetic risk score; PRFLR: predicted risks from logistic regression analysis; a, 2000 bootstrap replicates; b, Hosmer-Lemeshow "goodness-of-fit" test for model calibration.
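To make the AUC column concrete, the sketch below recomputes an AUC for the count GRS in dataset2 from the grouped case and control counts reported in the cGRS table (categories 0-1, 2, 3, 4, 5, ≥6), and attaches a percentile bootstrap interval with 2000 replicates. Because the extreme categories are collapsed, this is an approximation of the published analysis; the resampling scheme and the function names are assumptions.

```python
import random

# Case and control counts per cGRS category (0-1, 2, 3, 4, 5, >=6) in dataset2.
cases = [24, 77, 146, 220, 166, 110]
controls = [54, 142, 275, 215, 135, 79]

def grouped_auc(cases, controls):
    """AUC = P(case score > control score) + 0.5 * P(tie), from grouped counts."""
    wins, controls_below = 0.0, 0
    for n_case, n_ctrl in zip(cases, controls):
        wins += n_case * (controls_below + 0.5 * n_ctrl)
        controls_below += n_ctrl
    return wins / (sum(cases) * sum(controls))

def resample(counts, rng):
    """Draw a bootstrap sample of the same size and return its category counts."""
    draws = rng.choices(range(len(counts)), weights=counts, k=sum(counts))
    return [draws.count(i) for i in range(len(counts))]

def bootstrap_ci(cases, controls, n_boot=2000, seed=0):
    """Percentile 95% interval, resampling cases and controls separately."""
    rng = random.Random(seed)
    stats = sorted(
        grouped_auc(resample(cases, rng), resample(controls, rng))
        for _ in range(n_boot)
    )
    return stats[int(0.025 * n_boot)], stats[int(0.975 * n_boot) - 1]

point = grouped_auc(cases, controls)
low, high = bootstrap_ci(cases, controls)
print(f"AUC = {point:.3f} (bootstrap 95% interval {low:.3f}-{high:.3f})")
```

The point estimate from the grouped counts agrees with the tabulated cGRS value of 0.607 to three decimals; the bootstrap interval depends on the random seed and on the resampling scheme assumed here.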
Figure 2. Receiver operating characteristic curve plots in dataset2 and dataset3
Footnotes: fmc: family history of cancer; cGRS: count genetic risk score; wGRS: weighted genetic risk score; PRFLR: predicted risks from logistic regression analysis.
Comparisons of AUC pairs in dataset2 and 3
| Dataset | Model | cGRS | wGRS1 | wGRS2 | PRFLR(SNPs) | cGRS+fmc | wGRS1+fmc | wGRS2+fmc | PRFLR(SNPs+fmc) |
|---|---|---|---|---|---|---|---|---|---|
| Dataset2 | fmc | 5.30E-06 | 8.59E-06 | 8.85E-06 | 8.63E-06 | 5.34E-09 | 5.77E-10 | 3.22E-10 | 2.44E-10 |
| Dataset2 | cGRS | | 0.766 | 0.766 | 0.738 | 0.421 | 0.191 | 0.392 | 0.150 |
| Dataset2 | wGRS1 | | | 1.000 | 0.738 | 0.421 | 0.191 | 0.392 | 0.150 |
| Dataset2 | wGRS2 | | | | 0.512 | 0.413 | 0.190 | 0.385 | 0.150 |
| Dataset2 | PRFLR(SNPs) | | | | | 0.423 | 0.201 | 0.396 | 0.141 |
| Dataset2 | cGRS+fmc | | | | | | 0.270 | 0.739 | 0.212 |
| Dataset2 | wGRS1+fmc | | | | | | | 0.340 | 0.507 |
| Dataset2 | wGRS2+fmc | | | | | | | | 0.224 |
| Dataset3 | fmc | 4.91E-07 | 3.52E-07 | 2.78E-07 | 2.85E-13 | 3.10E-10 | 2.73E-10 | 4.69E-11 | 2.20E-16 |
| Dataset3 | cGRS | | 0.501 | 0.391 | 3.11E-04 | 0.229 | 0.309 | 0.583 | 9.07E-05 |
| Dataset3 | wGRS1 | | | 0.484 | 1.91E-03 | 0.554 | 0.381 | 0.747 | 4.37E-04 |
| Dataset3 | wGRS2 | | | | 1.90E-03 | 0.656 | 0.497 | 0.84 | 4.20E-04 |
| Dataset3 | PRFLR(SNPs) | | | | | 1.71E-03 | 3.52E-03 | 2.81E-03 | 0.280 |
| Dataset3 | cGRS+fmc | | | | | | 0.863 | 0.769 | 2.33E-04 |
| Dataset3 | wGRS1+fmc | | | | | | | 0.451 | 6.38E-04 |
| Dataset3 | wGRS2+fmc | | | | | | | | 2.90E-04 |
Results are P values for differences between pairs of AUCs; the P value in each cell compares the AUC of the model in the corresponding row with that of the model in the corresponding column. The bootstrap method proposed by DeLong and colleagues was used to calculate P values. AUC: area under the receiver operating characteristic curve; fmc: family history of cancer; cGRS: count genetic risk score; wGRS: weighted genetic risk score; PRFLR: predicted risks from logistic regression analysis.
Comparisons of cNRI and IDI in dataset2 and 3
| Dataset | Model | cGRS: cNRI (95% CI) | cGRS: IDI (95% CI) | wGRS1: cNRI (95% CI) | wGRS1: IDI (95% CI) | PRFLR(SNPs): cNRI (95% CI) | PRFLR(SNPs): IDI (95% CI) |
|---|---|---|---|---|---|---|---|
| Dataset2 | fmc | 0.282(0.180-0.385) | 0.026(0.015-0.036) | 0.291(0.189-0.394) | 0.027(0.016-0.038) | 0.310(0.208-0.413) | 0.028(0.017-0.039) |
| Dataset2 | cGRS | | | 0.046(-0.058-0.150) | 0.002(1e-04-0.003) | 0.096(-0.008-0.200) | 0.003(1e-04-0.006) |
| Dataset2 | wGRS1 | | | | | 0.086(-0.011-0.183) | 0.001(-6e-04-0.003) |
| Dataset2 | PRFLR(SNPs) | | | | | | |
| Dataset2 | cGRS+fmc | | | | | | |
| Dataset2 | wGRS1+fmc | | | | | | |
| Dataset3 | fmc | 0.271(0.176-0.366) | 0.034(0.025-0.044) | 0.332(0.2371-0.427) | 0.034(0.025-0.044) | 0.408(0.314-0.502) | 0.058(0.046-0.070) |
| Dataset3 | cGRS | | | 0.007(-0.089-0.102) | 1e-04(-0.002-0.003) | 0.196(0.102-0.290) | 0.023(0.016-0.031) |
| Dataset3 | wGRS1 | | | | | 0.197(0.103-0.290) | 0.023(0.016-0.030) |
| Dataset3 | PRFLR(SNPs) | | | | | | |
| Dataset3 | cGRS+fmc | | | | | | |
| Dataset3 | wGRS1+fmc | | | | | | |

| Dataset | Model | cGRS+fmc: cNRI (95% CI) | cGRS+fmc: IDI (95% CI) | wGRS1+fmc: cNRI (95% CI) | wGRS1+fmc: IDI (95% CI) | PRFLR(SNPs+fmc): cNRI (95% CI) | PRFLR(SNPs+fmc): IDI (95% CI) |
|---|---|---|---|---|---|---|---|
| Dataset2 | fmc | 0.388(0.287-0.488) | 0.034(0.025-0.044) | 0.342(0.239-0.444) | 0.036(0.026-0.046) | 0.336(0.234-0.438) | 0.037(0.027-0.047) |
| Dataset2 | cGRS | 0.142(0.063-0.221) | 0.009(0.004-0.014) | 0.168(0.077-0.259) | 0.010(0.005-0.016) | 0.154(0.056-0.253) | 0.012(0.006-0.017) |
| Dataset2 | wGRS1 | 0.077(-0.013-0.167) | 0.007(0.002-0.012) | 0.142(0.063-0.221) | 0.009(0.004-0.014) | 0.180(0.083-0.276) | 0.010(0.005-0.015) |
| Dataset2 | PRFLR(SNPs) | -0.095(-0.198-0.008) | -0.006(-0.012--5e-4) | -0.151(-0.233--0.070) | -0.008(-0.013--0.002) | 0.142(0.063-0.221) | 0.009(0.004-0.014) |
| Dataset2 | cGRS+fmc | | | 0.040(-0.064-0.143) | 0.002(-1e-04-0.003) | 0.093(-0.011-0.197) | 0.003(0-0.005) |
| Dataset2 | wGRS1+fmc | | | | | 0.079(-0.016-0.175) | 0.001(-6e-04-0.003) |
| Dataset3 | fmc | 0.310(0.215-0.404) | 0.038(0.029-0.047) | 0.325(0.230-0.419) | 0.038(0.029-0.047) | 0.420(0.325-0.514) | 0.061(0.050-0.073) |
| Dataset3 | cGRS | 0.098(0.027-0.168) | 0.004(7e-04-0.007) | 0.047(-0.048-0.141) | 0.004(-2e-04-0.008) | 0.236(0.143-0.330) | 0.027(0.019-0.035) |
| Dataset3 | wGRS1 | 0.098(0.027-0.168) | 0.004(6e-04-0.007) | 0.098(0.027-0.168) | 0.004(6e-04-0.007) | 0.230(0.136-0.323) | 0.027(0.019-0.035) |
| Dataset3 | PRFLR(SNPs) | 0.161(0.066-0.257) | 0.020(0.012-0.027) | 0.130(0.034-0.226) | 0.020(0.012-0.027) | 0.098(0.027-0.168) | 0.004(6e-04-0.006) |
| Dataset3 | cGRS+fmc | | | 0.004(-0.091-0.010) | 0(-0.003-0.003) | 0.197(0.103-0.290) | 0.023(0.016-0.030) |
| Dataset3 | wGRS1+fmc | | | | | 0.199(0.105-0.293) | 0.023(0.016-0.030) |
AUC: area under the receiver operating characteristic curve; fmc: family history of cancer; cGRS: count genetic risk score; wGRS: weighted genetic risk score; PRFLR: predicted risks from logistic regression analysis; cNRI: continuous net reclassification improvement; IDI: integrated discrimination improvement. The statistics in each cell compare the model in the corresponding row with the model in the corresponding column. Point estimates and 95% CIs were based on 2000 bootstrap replicates.
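For readers unfamiliar with the two reclassification statistics, the sketch below shows their standard definitions for a pair of models that each assign a predicted risk to every subject. The toy probabilities, outcome vector, and function names are hypothetical, and treating the first model as the baseline and the second as the updated model is an assumption about the direction of comparison.

```python
def continuous_nri(p_old, p_new, y):
    """Continuous NRI: net share of cases moved up plus net share of controls moved down."""
    n_cases = sum(y)
    n_controls = len(y) - n_cases
    net_cases = net_controls = 0
    for old, new, outcome in zip(p_old, p_new, y):
        direction = (new > old) - (new < old)  # +1 risk up, -1 risk down, 0 unchanged
        if outcome == 1:
            net_cases += direction      # upward moves are correct for cases
        else:
            net_controls -= direction   # downward moves are correct for controls
    return net_cases / n_cases + net_controls / n_controls

def idi(p_old, p_new, y):
    """IDI: mean change in predicted risk among cases minus that among controls."""
    def mean_change(outcome):
        diffs = [new - old for old, new, yi in zip(p_old, p_new, y) if yi == outcome]
        return sum(diffs) / len(diffs)
    return mean_change(1) - mean_change(0)

# Hypothetical data: 3 cases (1) and 3 controls (0).
y = [1, 1, 1, 0, 0, 0]
p_old = [0.5, 0.5, 0.5, 0.5, 0.5, 0.5]
p_new = [0.7, 0.6, 0.4, 0.3, 0.4, 0.6]

print(round(continuous_nri(p_old, p_new, y), 3))
print(round(idi(p_old, p_new, y), 3))
```

Positive values of either statistic indicate that the updated model moves predicted risks in the correct direction more often, or by a larger amount, than the baseline model.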