Hong-Lian Ruan, Hai-De Qin, Yin Yao Shugart, Jin-Xin Bei, Fu-Tian Luo, Yi-Xin Zeng, Wei-Hua Jia.
Abstract
To date, the only established model for assessing the risk of nasopharyngeal carcinoma (NPC) relies on the sero-status of the Epstein-Barr virus (EBV). By contrast, the risk assessment models proposed here include environmental risk factors, family history of NPC, and information on genetic variants. The models were developed using epidemiological and genetic data from a large case-control study, which included 1,387 subjects with NPC and 1,459 controls of Cantonese origin. The predictive accuracy of the models was then assessed by calculating the area under the receiver-operating characteristic curve (AUC). To compare the discriminatory ability of models with and without genetic information, we estimated the net reclassification improvement (NRI) and the integrated discrimination index (IDI). Well-established environmental risk factors for NPC include consumption of salted fish and preserved vegetables and cigarette smoking (in pack-years). The environmental model alone shows modest discriminatory ability (AUC = 0.68; 95% CI: 0.66, 0.70), which is only slightly increased by the addition of data on family history of NPC (AUC = 0.70; 95% CI: 0.68, 0.72). With the addition of data on genetic variants, however, the AUC rises to 0.74 (95% CI: 0.72, 0.76). The improvements in NRI and IDI also suggest the potential usefulness of considering genetic variants when screening for NPC in endemic areas. If these findings are confirmed in larger cohort and population-based case-control studies, use of the new models to analyse data from NPC-endemic areas could well lead to earlier detection of NPC.
Year: 2013 PMID: 23457511 PMCID: PMC3574061 DOI: 10.1371/journal.pone.0056128
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
Association between risk of nasopharyngeal carcinoma and seven single-nucleotide polymorphisms.
| SNP | Allele (risk/reference) | Risk allele frequency, cases (%) | Risk allele frequency, controls (%) | OR (95% CI), heterozygote | OR (95% CI), homozygote | OR (95% CI) per risk allele | P for trend |
| rs6774494 | A/G | 68.6 | 64.5 | 1.31 (1.02–1.68) | 1.52 (1.19–1.95) | 1.21 (1.08–1.35) | 8.56×10⁻⁴ |
| rs2860580 | G/A | 74.4 | 61.7 | 2.34 (1.76–3.11) | 3.85 (2.89–5.11) | 1.82 (1.62–2.05) | 1.12×10⁻²⁴ |
| rs2894207 | A/G | 89.3 | 82.9 | 1.72 (1.03–2.88) | 2.86 (1.74–4.71) | 1.67 (1.44–1.95) | 9.16×10⁻¹¹ |
| rs28421666 | A/G | 89.5 | 85.4 | 1.14 (0.62–2.09) | 1.72 (0.95–3.11) | 1.46 (1.24–1.71) | 4.36×10⁻⁶ |
| rs1412829 | A/G | 92.1 | 88.7 | 1.91 (0.79–4.63) | 2.80 (1.17–6.69) | 1.50 (1.25–1.79) | 4.37×10⁻⁶ |
| rs1572072 | C/A | 75.6 | 73.0 | 1.15 (0.84–1.58) | 1.32 (0.97–1.80) | 1.15 (1.02–1.29) | 0.020 |
| rs9510787 | G/A | 39.9 | 35.4 | 1.17 (1.00–1.38) | 1.48 (1.19–1.86) | 1.21 (1.09–1.34) | 5.00×10⁻⁴ |
Risk allele/reference allele.
OR = odds ratio; CI = confidence interval. OR (95% CI) for each SNP were estimated separately using a logistic regression adjusted for age, sex, educational level, dialect, and rural or urban household type.
P values for trend (two-sided) were derived from Cochran-Armitage trend tests.
Figure 1. Distribution of genetic risk score.
Distribution of the seven-SNP genetic risk score in 1,387 NPC cases (black bars) and 1,459 controls (grey bars). Individual risk for NPC was calculated by weighting each risk allele by its corresponding risk coefficient, which was derived from logistic regression.
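The weighting described in this caption can be sketched as follows. This is a minimal illustration, not the authors' code: it assumes each weight is the natural log of the per-risk-allele OR listed in the SNP association table above, whereas the coefficients actually used in the study may have come from a different (for example, jointly fitted) logistic model. The function name `genetic_risk_score` and the example genotype are hypothetical.

```python
import math

# Per-risk-allele odds ratios taken from the SNP association table.
# ASSUMPTION: weight = ln(OR per risk allele); the study's exact
# regression coefficients are not given in this text.
PER_ALLELE_OR = {
    "rs6774494": 1.21,
    "rs2860580": 1.82,
    "rs2894207": 1.67,
    "rs28421666": 1.46,
    "rs1412829": 1.50,
    "rs1572072": 1.15,
    "rs9510787": 1.21,
}

WEIGHTS = {snp: math.log(odds_ratio) for snp, odds_ratio in PER_ALLELE_OR.items()}


def genetic_risk_score(risk_allele_counts):
    """Sum of (number of risk alleles) x (log-OR weight) over the seven SNPs.

    risk_allele_counts maps each SNP id to 0, 1, or 2 risk alleles.
    """
    score = 0.0
    for snp, weight in WEIGHTS.items():
        count = risk_allele_counts[snp]
        if count not in (0, 1, 2):
            raise ValueError(f"{snp}: risk allele count must be 0, 1, or 2")
        score += count * weight
    return score


# Hypothetical individual: heterozygous at every SNP.
example_genotype = {snp: 1 for snp in WEIGHTS}
example_score = genetic_risk_score(example_genotype)
print(f"Genetic risk score: {example_score:.3f}")
```

Under these assumed weights, a person carrying no risk alleles scores 0, and each additional risk allele adds the log-OR of its SNP, so SNPs with larger effects (such as rs2860580) contribute more to the score.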
Figure 2. Distribution of risk for NPC by genetic risk score (in quintiles).
Risk of NPC (expressed as OR with 95% CI) was adjusted for age, sex, education level, dialect, residential area, family history of NPC, pack-years smoked, and consumption of salted fish and preserved vegetables. The boundaries for each genetic risk score quintile are shown on the x-axis.
Associations between genetic variants, epidemiological risk factors and risk of nasopharyngeal carcinoma.
| Predictor (code) | Cases (No.) | Controls (No.) | OR (demographic-adjusted) | 95% CI | OR (fully adjusted) | 95% CI |
| Genetic risk score (quintile) | | | | | | |
| 1 (0) = low | 162 | 400 | 1.00 | referent | 1.00 | referent |
| 2 (1) | 249 | 321 | 1.93 | 1.50–2.46 | 1.88 | 1.44–2.44 |
| 3 (2) | 281 | 291 | 2.39 | 1.87–3.06 | 2.47 | 1.90–3.21 |
| 4 (3) | 321 | 251 | 3.17 | 2.47–4.05 | 3.23 | 2.48–4.19 |
| 5 (4) = high | 374 | 196 | 4.74 | 3.68–6.10 | 4.64 | 3.55–6.07 |
| P for trend | | | 8.629×10⁻³⁸ | | 4.112×10⁻²⁶ | |
| Family history of NPC | | | | | | |
| No (0) | 1,154 | 1,382 | 1.00 | referent | 1.00 | referent |
| Yes (1) | 233 | 77 | 3.65 | 2.79–4.78 | 3.53 | 2.64–4.71 |
| Cigarette smoking (pack-years) | | | | | | |
| ≤20 | 955 | 1,110 | 1.00 | referent | 1.00 | referent |
| >20 | 432 | 349 | 1.52 | 1.26–1.83 | 1.41 | 1.15–1.74 |
| Salted fish consumption | | | | | | |
| < monthly | 731 | 1,084 | 1.00 | referent | 1.00 | referent |
| monthly | 236 | 118 | 3.02 | 2.37–3.84 | 2.07 | 1.57–2.73 |
| ≥ weekly | 420 | 257 | 2.45 | 2.04–2.95 | 1.55 | 1.25–1.92 |
| P for trend | | | 1.821×10⁻²⁷ | | 5.767×10⁻⁵ | |
| Preserved vegetables consumption | | | | | | |
| < monthly | 555 | 973 | 1.00 | referent | 1.00 | referent |
| monthly | 204 | 129 | 2.81 | 2.20–3.58 | 2.07 | 1.56–2.74 |
| ≥ weekly | 628 | 357 | 3.27 | 2.75–3.88 | 2.66 | 2.17–3.25 |
| P for trend | | | 1.424×10⁻⁴³ | | 2.542×10⁻¹² | |
OR = odds ratio; CI = confidence interval. The first OR and 95% CI columns were derived from logistic regression adjusted for age, sex, education level, dialect, and household type (rural/urban).
The second OR and 95% CI columns were derived from logistic regression adjusted for age, sex, education level, dialect, household type (rural/urban), and all other variables listed in the table.
P values for trend (two-sided) were derived from Cochran-Armitage trend tests.
Area under the curve (AUC) as a measure of predictive strength for risk-prediction models based on different indicators.
| Model | AUC | 95% CI | Optimism-corrected AUC | Calibration χ² statistic | Calibration P value | P value for AUC comparison with inclusive model |
| Environmental | 0.68 | 0.66–0.70 | 0.67 | 8.89 | 0.352 | <0.001 |
| Family history of NPC | 0.57 | 0.55–0.59 | 0.55 | 3.53 | 0.897 | <0.001 |
| Epidemiological | 0.70 | 0.68–0.72 | 0.69 | 13.01 | 0.112 | <0.001 |
| Genetic risk score | 0.64 | 0.62–0.66 | 0.63 | 4.41 | 0.818 | <0.001 |
| Inclusive model | 0.74 | 0.72–0.76 | 0.73 | 0.73 | 0.999 | reference |
The environmental model is based on consumption of salted fish and preserved vegetables, and cumulative amount of smoking. The family history of NPC model includes family history of NPC only. The epidemiological model combines both environmental and family history of NPC predictors. The genetic risk score model includes a score derived from seven SNPs identified in the Cantonese GWAS. The inclusive model integrates all data on epidemiological and genetic predictors.
The χ² statistic and P value were calculated with the Hosmer–Lemeshow goodness-of-fit test; a model with a χ² statistic <20 (P>0.01) is considered well calibrated.
The AUCs of the models were compared using a nonparametric approach, and each P value was obtained by comparing the inclusive model with one of the other models.
Figure 3. Receiver-operating characteristic (ROC) analysis.
The areas under the ROC curves (AUC) as measures of predictive power for risk-assessment models based on environmental risk factors, family history of NPC, and genetic variants for NPC.
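For readers unfamiliar with the AUC, it can be read as the probability that a randomly chosen case receives a higher predicted risk than a randomly chosen control, with ties counted as one half. The sketch below computes it that way by direct pairwise comparison. It is an illustration only: the function name `auc_by_pairs` and the toy scores are hypothetical and are not data from the study, and the study's own software and confidence-interval method are not specified in this text.

```python
def auc_by_pairs(case_scores, control_scores):
    """AUC as the fraction of case-control pairs ranked correctly.

    A pair counts 1 if the case has the higher predicted risk,
    0.5 if the two scores are tied, and 0 otherwise.
    """
    if not case_scores or not control_scores:
        raise ValueError("need at least one case and one control")
    concordant = 0.0
    for case in case_scores:
        for control in control_scores:
            if case > control:
                concordant += 1.0
            elif case == control:
                concordant += 0.5
    return concordant / (len(case_scores) * len(control_scores))


# Toy predicted risks (hypothetical, for illustration only).
toy_cases = [0.9, 0.8, 0.4]
toy_controls = [0.7, 0.3, 0.4]
toy_auc = auc_by_pairs(toy_cases, toy_controls)
print(f"AUC = {toy_auc:.3f}")
```

An AUC of 0.5 corresponds to ranking no better than chance and 1.0 to perfect separation, which is the scale on which the reported values of 0.68, 0.70, and 0.74 should be read.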
Reclassification of data for use in epidemiological and inclusive models.
| Epidemiological model risk category | Inclusive model [0,0.2) | Inclusive model [0.2,0.3) | Inclusive model [0.3,1] | % reclassified |
| Healthy controls | | | | |
| [0,0.2) | 0 | 0 | 0 | – |
| [0.2,0.3) | 105 | 111 | 102 | 65 |
| [0.3,1] | 108 | 147 | 886 | 22 |
| NPC cases | | | | |
| [0,0.2) | 0 | 0 | 0 | – |
| [0.2,0.3) | 12 | 36 | 74 | 70 |
| [0.3,1] | 28 | 53 | 1184 | 6 |
| Combined data | | | | |
| [0,0.2) | 0 | 0 | 0 | – |
| [0.2,0.3) | 117 | 147 | 176 | 67 |
| [0.3,1] | 136 | 200 | 2070 | 14 |
| NRI [95% CI]: 0.16 [0.13–0.20] | | | | |
| IDI [95% CI]: 0.05 [0.04–0.06] | | | | |
NRI: net reclassification improvement; IDI: integrated discrimination index. Reclassification was calculated for strata of predicted risk of <0.2, 0.2 to 0.3, and ≥0.3.
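The categorical NRI can be recomputed from the reclassification counts above. The sketch below uses the standard definition: the net proportion of cases moved to a higher risk category plus the net proportion of controls moved to a lower one. It assumes that rows index the epidemiological model's categories and columns the inclusive model's categories, both ordered from low to high risk. The function and variable names are hypothetical, and the confidence interval is not reproduced.

```python
def net_moves(table):
    """Return (moved_up, moved_down, total) for a square reclassification table.

    Rows are the old model's risk categories and columns the new model's,
    both ordered from lowest to highest risk.
    """
    moved_up = sum(n for i, row in enumerate(table) for j, n in enumerate(row) if j > i)
    moved_down = sum(n for i, row in enumerate(table) for j, n in enumerate(row) if j < i)
    total = sum(sum(row) for row in table)
    return moved_up, moved_down, total


def categorical_nri(case_table, control_table):
    """Net reclassification improvement from case and control tables."""
    case_up, case_down, n_cases = net_moves(case_table)
    ctrl_up, ctrl_down, n_controls = net_moves(control_table)
    case_component = (case_up - case_down) / n_cases
    control_component = (ctrl_down - ctrl_up) / n_controls
    return case_component + control_component


# Counts copied from the reclassification table
# (rows: epidemiological model; columns: inclusive model).
controls = [
    [0, 0, 0],
    [105, 111, 102],
    [108, 147, 886],
]
cases = [
    [0, 0, 0],
    [12, 36, 74],
    [28, 53, 1184],
]

nri = categorical_nri(cases, controls)
print(f"NRI = {nri:.2f}")  # prints "NRI = 0.16"
```

With these counts the case component is slightly negative (74 cases moved up, 93 moved down, out of 1,387), so the overall NRI of 0.16 is driven by controls being moved to lower risk categories (360 moved down, 102 moved up, out of 1,459).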