Sipeng Shen, Jianling Bai, Yongyue Wei, Guanrong Wang, Qingya Li, Ruyang Zhang, Weiwei Duan, Sheng Yang, Mulong Du, Yang Zhao, David C Christiani, Feng Chen.
Abstract
Head and neck squamous cell carcinoma (HNSCC) is the sixth most common cancer and displays divergent clinical outcomes. Prognostic biomarkers might improve risk stratification and survival prediction. We aimed to identify prognostic genes associated with overall survival. A two-step gene selection method was used to develop a seven-gene prognostic model from a training set collected from The Cancer Genome Atlas (TCGA). The model was then validated in an independent testing set from Gene Expression Omnibus (GEO). The model-based score successfully stratified HNSCC patients into high-risk and low-risk survival groups in the training set (HR, 2.79; 95% CI, 1.98-3.92; P=4.05×10⁻⁹) and the testing set (HR, 2.05; 95% CI, 1.35-3.11; P=7.98×10⁻⁴). In addition, the score significantly predicted 5-year survival in ROC analysis (AUCs: training set, 0.73; testing set, 0.66). Combining risk scores with clinical characteristics improved the AUCs beyond those obtained with clinical characteristics alone (training set, from 0.57 to 0.75; testing set, from 0.63 to 0.72). A subgroup sensitivity analysis by HPV status and tumor site revealed that the risk score was significant in all subgroups except oral cavity tumors of the testing set. Furthermore, HPV-positive status improved survival in oropharyngeal HNSCC but not in non-oropharyngeal HNSCC. In conclusion, the seven-gene prognostic signature is a reliable and practical prognostic tool for HNSCC. This approach can add prognostic value to clinical characteristics and provides a new possibility for individualized treatment.
Year: 2017 PMID: 29130107 PMCID: PMC5783586 DOI: 10.3892/or.2017.6057
Source DB: PubMed Journal: Oncol Rep ISSN: 1021-335X Impact factor: 3.906
Demographic and clinical characteristics of HNSCC patients.
| Characteristics | Training set (n=512) | Testing set (n=270) |
|---|---|---|
| Median follow-up time (years) | 4.35 | 4.95 |
| Censor rate (%) | 70.8 | 67.4 |
| Age, mean ± SD (years) | 60.8±11.9 | 60.1±10.3 |
| Sex, n (%) | ||
| Male | 376 (73.4) | 223 (82.6) |
| Female | 136 (26.6) | 47 (17.4) |
| Smoking status, n (%) | ||
| Never | 115 (22.5) | 48 (17.8) |
| Current/former | 385 (75.2) | 222 (82.2) |
| NA | 12 (2.3) | 0 (0) |
| Tumor site, n (%) | ||
| Oropharynx | 80 (15.6) | 102 (37.8) |
| Larynx | 114 (22.3) | 48 (17.8) |
| Oral cavity | 308 (60.2) | 83 (30.7) |
| Others | 10 (2.0) | 37 (13.7) |
| HPV status, n (%) | ||
| Positive | 35 (6.8) | 73 (27.0) |
| Negative | 241 (47.1) | 196 (72.6) |
| NA | 236 (46.1) | 1 (0.4) |
| T classification, n (%) | ||
| T1 | 48 (9.4) | 35 (13.0) |
| T2 | 130 (25.4) | 80 (29.6) |
| T3 | 99 (19.3) | 58 (21.5) |
| T4 | 172 (33.6) | 97 (35.9) |
| TX or NA | 63 (12.3) | 0 (0) |
| N classification, n (%) | ||
| N0 | 174 (34.0) | 94 (34.8) |
| N1 | 66 (12.9) | 32 (11.9) |
| N2 | 165 (32.2) | 132 (48.9) |
| N3 | 8 (1.6) | 12 (4.4) |
| NX or NA | 99 (19.3) | 0 (0) |
| M classification, n (%) | ||
| M0 | 484 (94.5) | 263 (97.4) |
| M1 | 4 (0.8) | 7 (2.6) |
| MX or NA | 24 (4.7) | 0 (0) |
| TNM stage, n (%) | ||
| I | 20 (3.9) | 18 (6.7) |
| II | 97 (18.9) | 37 (13.7) |
| III | 104 (20.3) | 37 (13.7) |
| IV | 278 (54.3) | 178 (65.9) |
| NA | 13 (2.5) | 0 (0) |
| Grade, n (%) | ||
| 1 | 61 (11.9) | – |
| 2 | 300 (58.6) | – |
| 3 | 122 (23.8) | – |
| 4 | 7 (1.4) | – |
| NA | 22 (4.3) | 270 (100) |
| Neoadjuvant treatment, n (%) | ||
| Yes | 10 (1.9) | – |
| No | 502 (98.1) | – |
| NA | 0 (0) | 270 (100) |
NA, not available.
Oropharynx also includes tonsil and base of tongue; oral cavity also includes oral tongue, buccal mucosa, lip, alveolar ridge, hard palate and floor of mouth.
Figure 1. (A) Workflow of the gene selection steps, with the number of genes remaining after each step. (B) Heatmap of the seven genes in 43 pairs of tumor and matched adjacent normal HNSCC tissue from the training set. The upper half shows expression in normal tissues, and the lower half shows expression in tumor tissues. (C) Volcano plot showing the prognostic significance of the 886 genes selected by the WTT method. The plot shows the hazard ratio on the x-axis and statistical significance on the y-axis, as evaluated by the Cox regression model. The blue dashed line indicates a P-value of 0.05. (D) Hazard ratios with 95% CIs of the seven genes in univariable Cox regression analysis of the training set.
Figure 2. (A and B) Training set. (C and D) Testing set. Upper left panel: risk score distribution of the seven-gene model classifier and patient survival status. All scores are standardized (‘score’ = score − 0.36) so that the low-risk group is negative and the high-risk group is positive. Lower half panel: heatmap showing expression of the seven genes among tumor patients. Right panel: Kaplan-Meier analysis for the patients. The patients are divided into low-risk (red) and high-risk (blue) groups.
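The shift described in the Figure 2 caption (score − 0.36) can be sketched in a few lines. This is a minimal illustration that assumes a raw risk score has already been computed for each patient. Only the offset 0.36 comes from the caption; the function name `split_by_risk`, the sample scores, and the treatment of a score exactly at the cut-off are hypothetical.

```python
CUTOFF = 0.36  # offset stated in the Figure 2 caption


def split_by_risk(raw_scores, cutoff=CUTOFF):
    """Shift raw scores by the cut-off and label each patient.

    Assumption: a shifted score of exactly 0 is treated as low risk;
    the source does not say how such ties are handled.
    """
    results = []
    for raw in raw_scores:
        shifted = raw - cutoff
        group = "high" if shifted > 0 else "low"
        results.append((round(shifted, 2), group))
    return results


# Hypothetical raw scores for three patients (not data from the study).
example = split_by_risk([0.10, 0.36, 0.90])
print(example)
```

After the shift, the sign of the score alone identifies the risk group, which is what lets the score distribution panel separate the two groups at zero.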
Multivariable Cox regression analysis of clinical characteristics and risk score.
| Characteristics | Training set, all cases (n=512): HR (95% CI) | P-value | Training set, cases with HPV status (n=276): HR (95% CI) | P-value | Testing set (n=270): HR (95% CI) | P-value |
|---|---|---|---|---|---|---|
| High risk score | 2.86 (1.99–4.12) | <0.0001 | 3.17 (1.90–5.30) | <0.0001 | 1.94 (1.27–2.96) | 0.002 |
| Age (per year) | 1.02 (1.00–1.04) | 0.039 | 1.01 (0.99–1.04) | 0.141 | 1.03 (1.00–1.06) | 0.020 |
| Gender (Female) | 1.11 (0.76–1.62) | 0.578 | 1.07 (0.64–1.79) | 0.797 | 1.02 (0.57–1.79) | 0.952 |
| Smoking status (current/former smoker) | 1.26 (0.80–1.98) | 0.324 | 1.56 (0.87–2.79) | 0.133 | 0.92 (0.49–1.75) | 0.815 |
| Clinical stage (per stage) | 1.09 (0.91–1.31) | 0.331 | 1.02 (0.82–1.25) | 0.878 | 1.77 (1.21–2.59) | 0.003 |
| HPV status (positive) | – | – | 0.65 (0.29–1.47) | 0.301 | 0.43 (0.24–0.79) | 0.006 |
Confidence intervals and P-values are based on a bootstrap estimate (500 resamples) of the variance-covariance matrix.
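The resampling idea behind the footnote can be sketched as follows. This is only an illustration under stated assumptions: the study bootstraps the variance-covariance matrix of a multivariable Cox model, whereas the sketch below bootstraps the variance of a single toy statistic (the event rate) so that it stays self-contained. The names `bootstrap_variance` and `event_rate` and the records are hypothetical; only the count of 500 resamples comes from the source.

```python
import random
import statistics


def bootstrap_variance(records, estimator, n_resamples=500, seed=0):
    """Estimate the variance of `estimator` by resampling with replacement."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    n = len(records)
    estimates = []
    for _ in range(n_resamples):
        # Draw n records with replacement, then re-apply the estimator.
        resample = [records[rng.randrange(n)] for _ in range(n)]
        estimates.append(estimator(resample))
    return statistics.variance(estimates)


def event_rate(records):
    # Toy stand-in for a fitted Cox coefficient: fraction of patients who died.
    return sum(died for _, died in records) / len(records)


# Hypothetical (risk_group, died) records, not data from the study.
toy_records = (
    [("high", 1)] * 30 + [("low", 1)] * 10
    + [("high", 0)] * 20 + [("low", 0)] * 40
)
var_hat = bootstrap_variance(toy_records, event_rate)
print(var_hat)
```

With a vector-valued estimator, the same loop would yield one coefficient vector per resample, and the sample covariance of those vectors would play the role of the variance-covariance matrix.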
Figure 3. (A and B) Training set. (C and D) Testing set. Upper half panel: time-dependent ROC curves, obtained by the nearest neighbor method, evaluating how well the risk score predicts patient survival at different time points. Lower half panel: ROC curves for the clinical model and the improved model. The clinical model contains age, sex, smoking status and clinical stage. The improved model contains the risk score and the characteristics of the clinical model.
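To make concrete what an AUC measures, the sketch below computes a plain AUC as the fraction of (event, non-event) patient pairs in which the patient with the event has the higher score. This is a simplification and not the estimator named in the caption: it ignores censoring and does not implement the time-dependent nearest neighbor method. The function name `auc` and the toy data are hypothetical.

```python
def auc(scores, events):
    """Plain AUC by pairwise comparison (Mann-Whitney form).

    scores: risk scores; events: 1 if the patient died within the
    chosen horizon, 0 otherwise. Tied scores count as half a pair.
    """
    pos = [s for s, e in zip(scores, events) if e == 1]
    neg = [s for s, e in zip(scores, events) if e == 0]
    if not pos or not neg:
        raise ValueError("need at least one event and one non-event")
    concordant = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                concordant += 1.0
            elif p == n:
                concordant += 0.5
    return concordant / (len(pos) * len(neg))


# Hypothetical scores and 5-year outcomes (not data from the study).
toy_scores = [0.9, 0.7, 0.4, 0.3, 0.1]
toy_events = [1, 0, 1, 0, 0]
toy_auc = auc(toy_scores, toy_events)
print(toy_auc)
```

An AUC of 0.5 corresponds to a score that ranks patients no better than chance, and 1.0 to a score that ranks every patient with an event above every patient without one.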
Figure 4. Subgroup sensitivity analysis by HPV status. (A) Training set (HPV+). (B) Training set (HPV−). (C) Testing set (HPV+). (D) Testing set (HPV−). Cross tables of tumor sites and risk group proportions in each figure were summarized and tested by Fisher's exact test.
Figure 5. Subgroup sensitivity analysis by tumor site. (A and B) Laryngeal tumors. (C and D) Oral cavity tumors. (E and F) Oropharyngeal tumors. (A, C and E) Training set. (B, D and F) Testing set. Cross tables of HPV status and risk group proportions in each figure were summarized and tested by Fisher's exact test.
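For a 2×2 cross table, such as HPV status (positive/negative) against risk group (high/low), Fisher's exact test can be sketched directly from the hypergeometric distribution. This is a minimal illustration that assumes a two-sided test on a 2×2 table; the cross tables of tumor site against risk group have more than two rows and would need the general form of the test. The function name `fisher_exact_2x2` and the counts are hypothetical.

```python
from math import comb


def fisher_exact_2x2(a, b, c, d):
    """Two-sided Fisher's exact P-value for the table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of all tables with the same
    margins that are no more likely than the observed table.
    """
    row1, row2 = a + b, c + d
    col1 = a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Probability that the top-left cell equals x given fixed margins.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo = max(0, col1 - row2)
    hi = min(row1, col1)
    # A small relative tolerance guards against floating-point ties.
    return sum(
        prob(x) for x in range(lo, hi + 1) if prob(x) <= p_obs * (1 + 1e-9)
    )


# Hypothetical counts (not data from the study):
# rows = HPV+ / HPV-, columns = high risk / low risk.
toy_p = fisher_exact_2x2(3, 1, 1, 3)
print(toy_p)
```

Because the test enumerates every table with the observed margins, it stays valid for the small cell counts that arise in subgroups such as HPV-positive patients.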