Daichi Shigemizu, Testuo Abe, Takashi Morizono, Todd A Johnson, Keith A Boroevich, Yoichiro Hirakawa, Toshiharu Ninomiya, Yutaka Kiyohara, Michiaki Kubo, Yusuke Nakamura, Shiro Maeda, Tatsuhiko Tsunoda.
Abstract
Recent genome-wide association studies (GWAS) have identified several novel single nucleotide polymorphisms (SNPs) associated with type 2 diabetes (T2D). Various models using clinical and/or genetic risk factors have been developed for T2D risk prediction. However, no analysis has jointly considered the algorithms used to detect genetic risk factors, the regression methods used to construct models, and interactions among risk factors. Here, using genotype data from 7,360 Japanese individuals, we investigated risk prediction models, considering the algorithms, regression methods and interactions. The best model identified was based on a Bayes factor approach and the lasso method. Using nine SNPs and clinical factors, this method achieved an area under the receiver operating characteristic curve (AUC) of 0.8057 on an independent test set. With the addition of a pair of interaction factors, the model was further improved (p-value 0.0011, AUC 0.8085). Application of our model to prospective cohort data showed a significantly better outcome in disease-free survival, according to the log-rank trend test comparing Kaplan-Meier survival curves (p-value 2.09 × 10^-11). While the major contribution was from clinical factors rather than genetic factors, consideration of genetic risk factors contributed to an observable, though small, increase in predictive ability. This is the first report to apply risk prediction models constructed from GWAS data to a T2D prospective cohort. Our study shows that our model is effective in prospective prediction and has the potential to contribute to practical clinical use in T2D.
Year: 2014 PMID: 24651836 PMCID: PMC3961382 DOI: 10.1371/journal.pone.0092549
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
Figure 1. Outline of the risk prediction model construction and validation.
Figure 2. Results of whole genome association scan for a training set.
Figure 3. Risk prediction models using 10-fold cross-validation on the training set.
The top AUCs observed in regression methods and the number of SNPs used in risk prediction model construction.
| algorithm | method | #SNPs used | AUC: clinical (95% CI) | AUC: combined (95% CI) |
| GWAS | ridge regression | 5 | 0.7986 (0.7646–0.8326) | 0.8019 (0.7682–0.8356) |
| GWAS | elastic net | 5 | 0.7984 (0.7644–0.8323) | 0.8025 (0.7689–0.8361) |
| GWAS | lasso | 5 | 0.7984 (0.7645–0.8324) | 0.8027 (0.7691–0.8363) |
| GWAS with r-square | ridge regression | 5 | 0.7986 (0.7646–0.8326) | 0.8019 (0.7682–0.8356) |
| GWAS with r-square | elastic net | 5 | 0.7984 (0.7644–0.8323) | 0.8025 (0.7689–0.8361) |
| GWAS with r-square | lasso | 5 | 0.7984 (0.7645–0.8324) | 0.8027 (0.7691–0.8363) |
| SIS | ridge regression | 5 | 0.7986 (0.7646–0.8326) | 0.7989 (0.7651–0.8328) |
| SIS | elastic net | 5 | 0.7984 (0.7644–0.8323) | 0.7994 (0.7656–0.8332) |
| SIS | lasso | 5 | 0.7984 (0.7645–0.8324) | 0.7995 (0.7657–0.8333) |
| SIS with r-square | ridge regression | 5 | 0.7986 (0.7646–0.8326) | 0.7989 (0.7651–0.8328) |
| SIS with r-square | elastic net | 5 | 0.7984 (0.7644–0.8323) | 0.7994 (0.7656–0.8332) |
| SIS with r-square | lasso | 5 | 0.7984 (0.7645–0.8324) | 0.7995 (0.7657–0.8333) |
| ABF | ridge regression | 10 | 0.7986 (0.7646–0.8326) | 0.8050 (0.7715–0.8386) |
| ABF | elastic net | 10 | 0.7986 (0.7646–0.8326) | 0.8054 (0.7719–0.8388) |
| ABF | lasso | 10 | 0.7984 (0.7645–0.8324) | 0.8054 (0.7720–0.8389) |
| ABF with r-square | ridge regression | 10 | 0.7986 (0.7646–0.8326) | 0.8051 (0.7717–0.8385) |
| ABF with r-square | elastic net | 10 | 0.7984 (0.7644–0.8323) | 0.8056 (0.7723–0.8389) |
For the elastic net, alpha was set to 0.5. Alphas from 0.1 to 0.9 at 0.1 intervals were tested; the complete results are in Table S2. Many coefficients of the lasso and elastic net models are set to 0 because these methods perform variable selection.
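The zeroed coefficients described in this note come from the L1 part of the penalty. The following is a minimal sketch of that mechanism, not the study's fitting code. It assumes a glmnet-style parameterization (penalty of lam * (alpha * |b| + (1 - alpha) * b^2 / 2)) and the textbook coordinate-wise update for a single standardized predictor in a linear model, whereas the study's models are risk classifiers. The function names and numeric inputs are hypothetical.

```python
def soft_threshold(z, gamma):
    """Shrink z toward zero by gamma; return exactly 0.0 when |z| <= gamma."""
    if z > gamma:
        return z - gamma
    if z < -gamma:
        return z + gamma
    return 0.0


def penalized_coefficient(z, lam, alpha):
    """Coordinate-wise penalized update for one standardized predictor.

    z     : unpenalized least-squares coefficient (hypothetical input)
    lam   : overall penalty strength
    alpha : mixing parameter; 1.0 is the lasso, 0.0 is ridge regression,
            0.5 is the elastic net setting quoted in the note above
    """
    return soft_threshold(z, lam * alpha) / (1.0 + lam * (1.0 - alpha))


weak_effect = 0.04  # illustrative value, not taken from the study
lam = 0.1           # illustrative penalty strength

ridge = penalized_coefficient(weak_effect, lam, alpha=0.0)  # shrunk, nonzero
enet = penalized_coefficient(weak_effect, lam, alpha=0.5)   # exactly zero
lasso = penalized_coefficient(weak_effect, lam, alpha=1.0)  # exactly zero
```

Under these assumptions, ridge regression only shrinks the weak coefficient, while the lasso and the elastic net remove it from the model entirely, which is the variable selection behaviour the note refers to.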
Figure 4. The ROC curve for our risk prediction model.
Sensitivity and specificity were jointly maximized at a sensitivity of 0.858 and a specificity of 0.623.
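An operating point like the one in this caption can be located by scanning every threshold on the predicted risk score. The sketch below assumes that "maximized" means maximizing the sum of sensitivity and specificity (the Youden index); the source does not state the criterion explicitly. The function names and toy data are illustrative and are not the study's data.

```python
def roc_points(scores, labels):
    """Return (threshold, sensitivity, specificity) for each distinct score.

    A subject is predicted positive when score >= threshold.
    labels: 1 for a case, 0 for a control.
    """
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    points = []
    for t in sorted(set(scores)):
        tp = sum(1 for s, y in zip(scores, labels) if s >= t and y == 1)
        tn = sum(1 for s, y in zip(scores, labels) if s < t and y == 0)
        points.append((t, tp / n_pos, tn / n_neg))
    return points


def auc(scores, labels):
    """AUC as the probability that a case outscores a control (ties count 1/2)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def best_operating_point(scores, labels):
    """Threshold maximizing sensitivity + specificity (assumed criterion)."""
    return max(roc_points(scores, labels), key=lambda p: p[1] + p[2])


# Toy risk scores and disease labels, for illustration only.
toy_scores = [0.1, 0.2, 0.35, 0.4, 0.6, 0.7, 0.8, 0.9]
toy_labels = [0, 0, 1, 0, 0, 1, 1, 1]

toy_auc = auc(toy_scores, toy_labels)
threshold, sensitivity, specificity = best_operating_point(toy_scores, toy_labels)
```

The pairwise AUC used here is equivalent to the area under the empirical ROC curve. It is quadratic in sample size, which is acceptable for a sketch but not for a cohort of thousands.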
A list of significant interaction factors.
| interaction (chromosome #, gene) | | factor | p-value (ANOVA) |
| age | gender | CF-CF | 0.0020 |
| rs1436953 (15, C2CD4A/B | rs11865230 (16, | GF-GF | 0.0425 |
*gene associated with T2D.
Figure 5. Cumulative disease-free survival in a prospective cohort.
Models using (A) only clinical risk factors and (B) both clinical and genetic risk factors.
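Disease-free survival curves of this kind are conventionally estimated with the Kaplan-Meier product-limit method. The sketch below shows only that estimator for a single group; it does not implement the log-rank trend test used for the comparison. The function name and the toy follow-up data are hypothetical.

```python
def kaplan_meier(times, events):
    """Kaplan-Meier estimate of disease-free survival for one group.

    times  : follow-up time for each subject
    events : 1 if the subject developed the disease at that time,
             0 if the subject was censored (left follow-up disease-free)
    Returns a list of (time, survival_probability) at each onset time.
    """
    survival = 1.0
    curve = []
    for t in sorted(set(times)):
        # Subjects still under observation just before time t.
        at_risk = sum(1 for u in times if u >= t)
        # Subjects with disease onset exactly at time t.
        onset = sum(1 for u, e in zip(times, events) if u == t and e == 1)
        if onset:
            survival *= 1.0 - onset / at_risk
            curve.append((t, survival))
    return curve


# Toy follow-up data (years), for illustration only.
toy_times = [2, 3, 3, 5, 8, 8]
toy_events = [1, 1, 0, 1, 0, 0]

toy_curve = kaplan_meier(toy_times, toy_events)
```

To mirror the figure, one would split the cohort into groups by predicted risk, estimate one curve per group, and compare the curves with a trend test.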