Gang Liu1, Qin Liu2, Sheng-Rong Sun1. 1. Department of Breast and Thyroid Surgery, Renmin Hospital of Wuhan University, Wuhan, Hubei 430060, China. 2. Department of Breast Surgery, Thyroid Surgery, Huangshi Central Hospital of Edong Healthcare Group, Hubei Polytechnic University, Huangshi, Hubei, 435000, China.
Abstract
Background: The aim of this study was to develop and validate nomograms to predict survival in patients with papillary thyroid cancer (PTC). Patients and methods: Adult patients who were surgically treated for PTC were selected from the Surveillance, Epidemiology and End Results (SEER) program (2004-2013). A multivariate analysis using Cox proportional hazards regression was performed, and nomograms for predicting 10-year overall survival (OS) and cancer-specific survival (CSS) were constructed. Discrimination and calibration plots were used to measure the accuracy of the nomograms. Results: The records of 63,219 patients with PTC were retrospectively analyzed. Nine independent factors (age, race, sex, marital status, tumor size, extrathyroidal extension, radioactive iodine, T stage, and M stage) were assembled into the OS nomogram. A nomogram predicting CSS was constructed based on eight factors (age, sex, marital status, tumor size, extrathyroidal extension, T stage, N stage, and M stage). The nomograms displayed better discrimination than the TNM staging system (6th edition) in both the training and validation sets. The calibration curves for the probability of survival showed agreement between the nomogram predictions and the actual observations. Conclusion: We have successfully developed prognostic nomograms to predict OS and CSS for PTC with excellent discrimination and calibration.
Thyroid cancer (TC) is the most common endocrine malignancy, with an estimated 53,990 new cases in the US in 2018.1 The incidence rate of TC has increased more than 2.5-fold (from 5.57/100,000 to 13.98/100,000) in recent decades.2 This progressive increase was almost entirely attributable to an increase in papillary thyroid carcinoma (PTC). Differentiated thyroid carcinoma is the major subtype of TC and is subdivided into PTC and follicular thyroid carcinoma (FTC). PTC is the most common type of differentiated TC, accounting for approximately 90% of all cases.3 Radical surgical intervention remains the primary treatment for TC. Despite a favorable rate of survival for PTC, the risk of recurrence ranges from 5% to 21%.4,5
The TNM Cancer Staging System of the American Joint Committee on Cancer (AJCC) is the most widely used system to predict survival outcomes.6 In this classification system, patients are stratified according to the extent of the primary tumor (T), regional lymph node involvement (N), and the status of distant metastasis (M). This system is effective for patient populations but is not very useful in predicting individual patient outcomes.7 In addition, it does not account for other variables, such as sex, race, marital status, multifocality, surgery, presence of vascular invasion, margin status, and radioactive iodine, which have been identified as independent prognostic factors in TC.8–10
Nomograms have been accepted as reliable tools to accurately predict an individual’s clinical outcome by utilizing multiple variables. Nomograms provide a visual representation of the predicted probabilities of an outcome as obtained by statistical predictive models. They are created by regression analysis and extend beyond the standard TNM anatomical criteria.11 Nomograms have been widely used in multiple malignancies due to their ability to handle complexity in a systematic and unbiased manner.12–15 Well-designed nomograms have been incorporated into the National Comprehensive Cancer Network (NCCN) guidelines.16,17 Nevertheless, no nomograms based on population-based data are available for individual PTC patients. Therefore, we aimed to develop prognostic nomograms based on the large population of PTC patients in the Surveillance, Epidemiology and End Results (SEER) database, to predict individualized survival in patients with PTC.
Patients and methods
Patients
This study is a retrospective cohort analysis using data from the SEER database, which was designed and is maintained by the National Cancer Institute (NCI). The SEER database collects clinical information on incidence, prevalence, and survival for various cancer types from 17 population-based cancer registries covering approximately 28% of the US population.18 We used the SEER*STAT software (version 8.3.5) to extract data from the SEER database. The cohort for this analysis consisted of adult patients (≥18 years) diagnosed with PTC who underwent thyroid surgery between 2004 and 2013. PTC cases were identified using the site code C73.9 and the International Classification of Diseases for Oncology-3 histology codes 8050, 8260, and 8340–8344. The exclusion criteria were: (1) patients with second primary malignancies, (2) patients diagnosed at autopsy and those lost to follow-up, and (3) patients with incomplete clinical information (marital status, cause of death, survival months, tumor size, and staging information). All patients were randomly assigned to either the training set, used to construct the nomograms, or the validation set. Neither ethical approval nor informed consent was required because the data are publicly available and the database does not hold any identifying patient data.
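The random assignment described above can be sketched as follows. This is a minimal illustration only: the paper does not state how the randomization was carried out, so the function name `split_cohort`, the seed value, and the use of integer patient identifiers are all assumptions.

```python
import random


def split_cohort(patient_ids, seed=2018):
    """Randomly split patient IDs into a training half and a validation half.

    The seed is a hypothetical value chosen only to make the sketch reproducible.
    """
    ids = list(patient_ids)
    rng = random.Random(seed)
    rng.shuffle(ids)
    # With an odd cohort size, the extra patient goes to the training set.
    n_train = (len(ids) + 1) // 2
    return ids[:n_train], ids[n_train:]


# Stand-in identifiers for the 63,219 eligible patients.
train_ids, valid_ids = split_cohort(range(63219))
print(len(train_ids), len(valid_ids))
```

With 63,219 patients this yields set sizes of 31,610 and 31,609, matching the sizes reported in the Results.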
Variables
Several variables, including age, sex, race, marital status, tumor size, extrathyroidal extension, multifocality, surgery, radioactive iodine, T stage, N stage, and M stage, were collected in the training set. Tumor size was categorized as “≤1.0 cm”, “1.1–2.0 cm”, “2.1–4.0 cm”, and “>4.0 cm”. The primary end points were overall survival (OS) and cancer-specific survival (CSS). OS was defined as the time from diagnosis of PTC to death from any cause or censoring, and CSS was defined as the time from diagnosis to death due to PTC or censoring.
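The variable definitions above can be expressed in code roughly as follows. This is a sketch under stated assumptions: the status labels "dead" and "PTC" are hypothetical encodings chosen for illustration, not actual SEER field values, and the category labels use ASCII characters.

```python
def tumor_size_category(size_cm):
    """Map a tumor size in centimetres to the four categories used in the study."""
    if size_cm <= 1.0:
        return "<=1.0 cm"
    if size_cm <= 2.0:
        return "1.1-2.0 cm"
    if size_cm <= 4.0:
        return "2.1-4.0 cm"
    return ">4.0 cm"


def survival_events(vital_status, cause_of_death):
    """Return (os_event, css_event) indicators, where 1 = event and 0 = censored.

    OS counts death from any cause; CSS counts only death due to PTC,
    so a death from another cause is censored for CSS.
    """
    died = vital_status == "dead"  # hypothetical label
    os_event = 1 if died else 0
    css_event = 1 if died and cause_of_death == "PTC" else 0  # hypothetical label
    return os_event, css_event


print(tumor_size_category(1.5), survival_events("dead", "other"))
```

A patient who dies of another cause therefore contributes an event to the OS analysis but is treated as censored in the CSS analysis.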
Statistical analyses
The baseline patient features were compared using the Chi-square test. Survival curves were depicted using the Kaplan-Meier method and compared using the log-rank test. The construction of nomograms was based on the independent prognostic variables determined by multivariate Cox proportional hazards regression analyses. Variables were selected through the backward stepwise selection method with a threshold of P<0.05. The performance of the nomograms was evaluated by discrimination and calibration. Discrimination was assessed using the concordance index (C-index), which is similar to the area under the receiver operating characteristic (ROC) curve (AUC), with values ranging from 0.5 (no discrimination) to 1.0 (perfect discrimination). Calibration was assessed by comparing the observed versus predicted mean survival rate. Significance was achieved at P<0.05 in a two-tailed test. Statistical analyses were conducted using SPSS version 23 (IBM, Armonk, NY, USA), and the nomograms were constructed using R version 3.5.1 (http://www.r-project.org) via the design and survival packages.
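To make the C-index concrete, the following is a minimal sketch of Harrell's concordance index for right-censored data. It is not the routine used in the study (which relied on R packages); the function name and the toy data are assumptions made for illustration.

```python
def concordance_index(times, events, risk_scores):
    """Harrell's C-index for right-censored survival data.

    A pair (i, j) is comparable when patient i has an observed event
    strictly before patient j's follow-up time. The pair is concordant
    when patient i also has the higher predicted risk; tied risk scores
    count as half.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if events[i] != 1:
            continue  # a censored patient cannot anchor a comparable pair
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Toy data: follow-up in months, event indicator (1 = death), predicted risk.
toy_times = [5, 10, 15, 20]
toy_events = [1, 1, 0, 1]
toy_risks = [0.9, 0.6, 0.7, 0.1]
toy_c = concordance_index(toy_times, toy_events, toy_risks)
print(toy_c)
```

In the toy data there are five comparable pairs, four of which are concordant, giving a C-index of 0.8. The quadratic loop is kept for clarity; a cohort of tens of thousands of patients would call for a more efficient implementation.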
Results
Clinicopathological features
In total, 63,219 eligible PTC patients were selected and randomly assigned to a training set (n=31,610) and a validation set (n=31,609). The flow diagram of data selection is presented in Figure 1. In the whole study cohort, 35,337 (55.9%) patients were aged 45 years or older; 49,959 (79.0%) patients were women and 13,260 (21.0%) were men. Tumors ≤1.0 cm in size formed the largest size category (41.1%). Multifocal tumors were observed in 26,546 (42.0%) patients and gross extrathyroidal extension in 10,047 (15.9%) patients. Total thyroidectomy was performed in 83.4% of all patients, and 49.0% received adjuvant radioactive iodine. Most patients (62.5%) were categorized as having T1 stage cancer. Additionally, a minority of patients had lymph node involvement (22.6%) or distant metastasis (0.8%) at diagnosis.
Figure 1
Flow chart of the data selection process.
Abbreviation: PTC, papillary thyroid cancer.
The median follow-up was 68 months (range, 1–143 months). By the end of follow-up, 2,015 of the 63,219 (3.2%) patients had died, including 545 deaths due to PTC and 1,470 due to other causes. The clinicopathologic characteristics of the patients are listed in Table 1.
Table 1
Patient demographics and pathological characteristics

| Variables | All patients (n=63,219), No. | % | Training set (n=31,610), No. | % | Validation set (n=31,609), No. | % | P-value |
|---|---|---|---|---|---|---|---|
| Age | | | | | | | 0.193 |
| <45 | 27,882 | 44.1 | 13,860 | 43.8 | 14,022 | 44.4 | |
| ≥45 | 35,337 | 55.9 | 17,750 | 56.2 | 17,587 | 55.6 | |
| Sex | | | | | | | 0.583 |
| Female | 49,959 | 79.0 | 25,008 | 79.1 | 24,951 | 78.9 | |
| Male | 13,260 | 21.0 | 6,602 | 20.9 | 6,658 | 21.1 | |
| Race | | | | | | | 0.952 |
| White | 52,195 | 82.6 | 26,083 | 82.5 | 26,112 | 82.6 | |
| Black | 3,825 | 6.1 | 1,917 | 6.1 | 1,908 | 6.0 | |
| Other | 7,199 | 11.4 | 3,610 | 11.4 | 3,589 | 11.4 | |
| Marital status | | | | | | | 0.100 |
| Married | 42,469 | 67.2 | 21,332 | 67.5 | 21,137 | 66.9 | |
| Unmarried | 20,750 | 32.8 | 10,278 | 32.5 | 10,472 | 33.1 | |
| Tumor size | | | | | | | 0.658 |
| ≤1.0 cm | 25,984 | 41.1 | 13,012 | 41.2 | 12,972 | 41.0 | |
| 1.1–2.0 cm | 18,778 | 29.7 | 9,437 | 29.9 | 9,341 | 29.6 | |
| 2.1–4.0 cm | 13,786 | 21.8 | 6,835 | 21.6 | 6,951 | 22.0 | |
| >4.0 cm | 4,671 | 7.4 | 2,326 | 7.4 | 2,345 | 7.4 | |
| Extrathyroidal extension | | | | | | | 0.101 |
| Absent | 53,172 | 84.1 | 26,511 | 83.9 | 26,661 | 84.3 | |
| Present | 10,047 | 15.9 | 5,099 | 16.1 | 4,948 | 15.7 | |
| Multifocality | | | | | | | 0.131 |
| Unifocal | 36,673 | 58.0 | 18,243 | 57.7 | 18,430 | 58.3 | |
| Multifocal | 26,546 | 42.0 | 13,367 | 42.3 | 13,179 | 41.7 | |
| Surgery | | | | | | | 0.369 |
| Lobectomy | 10,522 | 16.6 | 5,219 | 16.5 | 5,303 | 16.8 | |
| Total thyroidectomy | 52,697 | 83.4 | 26,391 | 83.5 | 26,306 | 83.2 | |
| Radioactive iodine | | | | | | | 0.997 |
| Yes | 30,950 | 49.0 | 15,475 | 49.0 | 15,475 | 49.0 | |
| No | 32,269 | 51.0 | 16,135 | 51.0 | 16,134 | 51.0 | |
| T stage | | | | | | | 0.352 |
| T1 | 39,520 | 62.5 | 19,771 | 62.5 | 19,749 | 62.5 | |
| T2 | 10,257 | 16.2 | 5,063 | 16.0 | 5,194 | 16.4 | |
| T3 | 11,346 | 17.9 | 5,702 | 18.0 | 5,644 | 17.9 | |
| T4 | 2,096 | 3.3 | 1,074 | 3.4 | 1,022 | 3.2 | |
| N stage | | | | | | | 0.134 |
| N0 | 48,927 | 77.4 | 24,385 | 77.1 | 24,542 | 77.6 | |
| N1 | 14,292 | 22.6 | 7,225 | 22.9 | 7,067 | 22.4 | |
| M stage | | | | | | | 0.320 |
| M0 | 62,727 | 99.2 | 31,353 | 99.2 | 31,374 | 99.3 | |
| M1 | 492 | 0.8 | 257 | 0.8 | 235 | 0.7 | |
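The P-values in Table 1 come from Chi-square tests comparing the training and validation sets. As an illustration, the sketch below applies a Pearson Chi-square test to the sex counts from Table 1. It assumes no continuity correction, which the paper does not specify, and the function name `chi_square_2x2` is introduced here only for the example.

```python
import math


def chi_square_2x2(a, b, c, d):
    """Pearson Chi-square test for the 2x2 table [[a, b], [c, d]].

    Returns (statistic, p_value). No continuity correction is applied
    (an assumption; the software settings are not reported in the paper).
    """
    n = a + b + c + d
    statistic = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # With 1 degree of freedom, the upper-tail probability of the
    # Chi-square distribution equals erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(statistic / 2.0))
    return statistic, p_value


# Rows: training set, validation set. Columns: female, male (counts from Table 1).
sex_stat, sex_p = chi_square_2x2(25008, 6602, 24951, 6658)
print(round(sex_stat, 3), round(sex_p, 3))
```

Under these assumptions the result should land close to the P-value of 0.583 reported for sex, consistent with the two sets being balanced on this variable.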
Construction and validation of nomograms
Data on age at diagnosis, sex, race, marital status, tumor size, extrathyroidal extension, multifocality, surgery, radioactive iodine, T stage, N stage, and M stage were collected and analyzed for patients in both the training and validation sets. Univariate analysis showed that 10 of these variables were significantly associated with OS in the training set (P<0.05). In the multivariate analysis, 9 of the 10 variables (age, sex, race, marital status, tumor size, extrathyroidal extension, radioactive iodine, T stage, and M stage) remained significantly associated with OS (Table 2). Therefore, a nomogram for OS was established with these independent variables in the training set (Figure 2A). As shown in Table 3, 8 variables (age, sex, marital status, tumor size, extrathyroidal extension, T stage, N stage, and M stage) were confirmed to have a significant impact on CSS by both univariate and multivariate analyses in the training set (P<0.05). A nomogram for predicting 10-year CSS was constructed based on these independent variables (Figure 2B).
Table 2
Univariate and multivariate analyses of overall survival in the training set

| Variable | Univariate analysis, P-value | Multivariate analysis, HR (95% CI) | Multivariate analysis, P-value |
|---|---|---|---|
| Age | <0.001 | | |
| <45 | | Reference | |
| ≥45 | | 2.706 (2.339–3.131) | <0.001 |
| Sex | <0.001 | | |
| Female | | Reference | |
| Male | | 1.919 (1.681–2.192) | <0.001 |
| Race | 0.005 | | |
| White | | Reference | |
| Black | | 1.471 (1.185–1.825) | 0.014 |
| Other | | 0.849 (0.687–1.048) | 0.127 |
| Marital status | <0.001 | | |
| Married | | Reference | |
| Unmarried | | 1.939 (1.709–2.199) | <0.001 |
| Tumor size | <0.001 | | |
| ≤1.0 cm | | Reference | |
| 1.1–2.0 cm | | 0.954 (0.799–1.139) | 0.603 |
| 2.1–4.0 cm | | 1.445 (1.080–1.934) | 0.013 |
| >4.0 cm | | 2.787 (2.087–3.721) | <0.001 |
| Extrathyroidal extension | <0.001 | | |
| Absent | | Reference | |
| Present | | 2.297 (1.665–3.170) | <0.001 |
| Multifocality | 0.685 | | |
| Unifocal | | | |
| Multifocal | | | |
| Surgery | 0.521 | | |
| Lobectomy | | | |
| Total thyroidectomy | | | |
| Radioactive iodine | <0.001 | | |
| Yes | | Reference | |
| No | | 1.901 (1.669–2.167) | <0.001 |
| T stage | <0.001 | | |
| T1 | | Reference | |
| T2 | | 0.706 (0.504–0.988) | 0.042 |
| T3 | | 0.563 (0.390–0.814) | 0.002 |
| T4 | | 1.418 (0.910–2.207) | 0.123 |
| N stage | <0.001 | | |
| N0 | | Reference | |
| N1 | | 1.104 (0.945–1.289) | 0.214 |
| M stage | <0.001 | | |
| M0 | | Reference | |
| M1 | | 6.374 (5.054–8.038) | <0.001 |
Figure 2
Nomogram for predicting 10-year OS (A) and CSS (B) of patients with PTC.
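A nomogram translates each Cox regression coefficient into points on a common scale. The sketch below illustrates one simple convention, in which the factor with the largest log hazard ratio receives 100 points and the others are scaled proportionally. It uses a subset of the multivariate hazard ratios for OS from Table 2. The scaling rule and the function name `nomogram_points` are assumptions for illustration; the published nomogram was generated with R packages and its point values may differ.

```python
import math

# Multivariate hazard ratios for OS taken from Table 2 (binary factors only).
hazard_ratios = {
    "Age >=45": 2.706,
    "Male": 1.919,
    "Unmarried": 1.939,
    "Extrathyroidal extension": 2.297,
    "No radioactive iodine": 1.901,
    "M1": 6.374,
}


def nomogram_points(hrs, max_points=100.0):
    """Scale log hazard ratios so that the largest effect maps to max_points.

    This is an illustrative convention, not the authors' exact procedure.
    """
    betas = {name: math.log(hr) for name, hr in hrs.items()}
    largest = max(abs(beta) for beta in betas.values())
    return {name: max_points * beta / largest for name, beta in betas.items()}


points = nomogram_points(hazard_ratios)
for name, value in sorted(points.items(), key=lambda item: -item[1]):
    print(f"{name}: {value:.1f}")
```

Under this convention M1 disease anchors the scale at 100 points, and age ≥45 years contributes a little over half as many. A patient's total points are then mapped to a predicted 10-year survival probability through the baseline survival function, which is not reproduced here.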
Table 3
Univariate and multivariate analyses of cancer-specific survival in the training set

| Variable | Univariate analysis, P-value | Multivariate analysis, HR (95% CI) | Multivariate analysis, P-value |
|---|---|---|---|
| Age | <0.001 | | |
| <45 | | Reference | |
| ≥45 | | 3.702 (2.667–5.139) | <0.001 |
| Sex | <0.001 | | |
| Female | | Reference | |
| Male | | 1.542 (1.2058–1.972) | <0.001 |
| Race | 0.018 | | |
| White | | Reference | |
| Black | | 1.291 (0.790–2.109) | 0.309 |
| Other | | 1.213 (0.881–1.668) | 0.236 |
| Marital status | 0.008 | | |
| Married | | Reference | |
| Unmarried | | 1.423 (1.116–1.816) | 0.004 |
| Tumor size | <0.001 | | |
| ≤1.0 cm | | Reference | |
| 1.1–2.0 cm | | 1.272 (0.773–2.094) | 0.343 |
| 2.1–4.0 cm | | 2.240 (1.295–3.874) | 0.004 |
| >4.0 cm | | 4.351 (2.521–7.507) | <0.001 |
| Extrathyroidal extension | <0.001 | | |
| Absent | | Reference | |
| Present | | 2.017 (1.209–3.365) | 0.007 |
| Multifocality | 0.506 | | |
| Unifocal | | | |
| Multifocal | | | |
| Surgery | 0.191 | | |
| Lobectomy | | | |
| Total thyroidectomy | | | |
| Radioactive iodine | 0.925 | | |
| Yes | | | |
| No | | | |
| T stage | <0.001 | | |
| T1 | | Reference | |
| T2 | | 1.143 (0.582–2.246) | 0.698 |
| T3 | | 1.808 (0.935–3.497) | 0.079 |
| T4 | | 7.009 (3.312–14.834) | <0.001 |
| N stage | <0.001 | | |
| N0 | | Reference | |
| N1 | | 1.643 (1.253–2.153) | <0.001 |
| M stage | <0.001 | | |
| M0 | | Reference | |
| M1 | | 8.462 (6.334–11.307) | <0.001 |
We next validated the nomograms. On internal validation in the training set, the C-indices of the nomograms for predicting OS and CSS were 0.776 (95% CI: 0.770–0.792) and 0.924 (95% CI: 0.907–0.941), respectively. On external validation using the validation set, the C-indices were 0.770 (95% CI: 0.753–0.787) and 0.925 (95% CI: 0.905–0.945) for the OS and CSS nomograms, respectively. The calibration curves showed good agreement between prediction and observation in the probability of 10-year OS and CSS in both the training and validation sets (Figure 3). Furthermore, the nomograms were compared with the TNM 6th staging system in the training set. The nomograms showed better discrimination than the TNM staging system for OS (C-index=0.776, 95% CI: 0.770–0.792 vs 0.317, 95% CI: 0.301–0.333) and CSS (C-index=0.924, 95% CI: 0.907–0.941 vs 0.152, 95% CI: 0.136–0.168). Discrimination was likewise better with the nomograms than with the TNM staging system in the validation set (Table 4).
Figure 3
Calibration plots of the training and validation sets for the OS and CSS associated nomograms.
Notes: (A, B) The calibration plots of the training set in 10-year OS and CSS; (C, D) the calibration plots of the validation set in 10-year OS and CSS. The x-axis represents the nomogram-predicted survival rate, whereas the y-axis represents the actual survival rate.
Abbreviations: OS, overall survival; CSS, cancer-specific survival.
Table 4
C-indexes for the nomograms and other stage systems in patients with PTC
Abbreviations: HR, hazard ratio; CI, confidence interval; CSS, cancer-specific survival; OS, overall survival; PTC, papillary thyroid cancer.
Comparison of AUC values of the nomogram and TNM 6th staging system
The predictive abilities of the nomograms and the TNM 6th staging system were compared by analyzing the AUC values (Figure 4). The AUC values for the nomogram and the TNM 6th staging system predicting the 10-year OS rates were 0.734 and 0.524, respectively, while those for predicting the 10-year CSS rates were 0.894 and 0.569, respectively. Taken together, the OS and CSS nomograms showed superior discriminative capacity compared to the TNM 6th staging system.
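The AUC can be read as the probability that a randomly chosen patient who died is assigned a higher risk than a randomly chosen patient who survived. The sketch below computes it from that definition on toy data. It assumes a simple binary 10-year outcome and ignores censoring; the paper does not describe how censored patients were handled in the ROC analysis, and the function name `auc` and the data are illustrative.

```python
def auc(labels, scores):
    """Area under the ROC curve via pairwise comparison.

    labels: 1 for patients with the event (for example, death within 10 years),
            0 for patients without it.
    scores: predicted risk; higher means worse predicted prognosis.
    Tied scores count as half a win.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("both outcome classes are required")
    wins = 0.0
    for p in positives:
        for q in negatives:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


toy_labels = [1, 1, 0, 0, 0]
toy_scores = [0.9, 0.4, 0.5, 0.3, 0.4]
toy_auc = auc(toy_labels, toy_scores)
print(toy_auc)
```

For the toy data, 4.5 of the 6 positive-negative pairs are ranked correctly, giving an AUC of 0.75. A value near 0.5, such as the 0.524 reported for TNM staging in OS, indicates ranking little better than chance.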
Figure 4
Comparison of the AUCs of the nomogram and TNM staging system in the training set.
Notes: Area under the curves of the two models to predict 10-year OS (A) and CSS (B) in the training set. The blue lines represent nomogram-predicted survival rates, whereas the red lines represent TNM stage-predicted survival rates.
Abbreviations: AUC, area under ROC curve; CSS, cancer-specific survival; OS, overall survival; ROC, receiver operating characteristic.
Discussion
Several scoring systems are used for prognostic purposes. Although these systems are easy to use in the clinic, they provide a stratified population risk assessment rather than an individualized patient risk.6,20–22 Nomograms are useful tools that have been widely used for predicting survival outcomes in individual patients. They address the complexity of balancing different variables through statistical modelling and risk quantification. Their systematic approach also avoids the bias of individual physicians or of individual abnormal clinical variables. Nomograms have been proven to be superior to the traditional staging scoring systems in a variety of tumors.15,23,24 In addition, they may be most valuable when the potential benefits of added therapy are unclear.25,26 They are also very useful for individualized risk stratification and help doctors in the management of clinical care when no firm guidelines are available.
To the best of our knowledge, this is the first study that describes the development and validation of nomograms to predict 10-year OS and CSS in patients with PTC. A total of 63,219 patients from the SEER dataset were analyzed in this study. Our nomograms displayed favorable discrimination and calibration. Furthermore, the ROC curves showed that the nomograms had better predictive ability than the 6th AJCC staging system. Our nomogram models are easy-to-use clinical tools which can help with patient counselling and personalized treatment.
Our nomograms identified several independent factors that could influence the prognosis in PTC patients. The results showed that patients aged 45 years or older had worse OS and CSS than younger patients. Studies have shown that age is a major determinant of thyroid CSS.27 Older age has been identified as an independent risk factor, suggesting that older patients have lower survival rates.28–30 Multiple studies have found that patients with TC who are older than 45 years of age usually have a poor prognosis.8,31 With advancing age, there is a higher risk of an aggressive histological phenotype.32 The previous edition of the AJCC staging system used 45 years as the cut-off value for age, while the recent eighth edition uses 55 years. However, regardless of the cut-off value, age is identified as an important prognostic factor.
The difference in the incidence of TC between the sexes has also been well documented.33 The incidence of TC is higher in women than in men, though the clinical outcomes are worse in men.34 Our results were consistent with those of previous studies. In addition to the above factors, marital status, tumor size, extrathyroidal extension, T stage, N stage, and M stage were also identified as significant predictors of prognosis. However, we found that multifocality, surgery, and radioactive iodine were not independent risk factors for 10-year CSS.
Our study has several limitations. First, the nomograms were constructed from retrospective data. Therefore, the potential risk of selection bias cannot be ruled out. Second, because cancer-specific mortality in PTC is low, the evaluation of the risk of recurrence may be more meaningful than that of death. However, the SEER database did not have data on recurrence, and therefore it could not be evaluated. Third, although patients were randomly assigned to the two sets, female patients greatly outnumbered male patients, which could have resulted in gender bias. Finally, some other critical prognostic factors, such as margin status, calcitonin, extent of surgery, radioiodine dosage, thyrotropin suppression, BRAF point mutation, and TERT promoter point mutation, were unavailable in the SEER database.
In conclusion, we were successful in establishing and validating nomograms to predict the 10-year OS and CSS in individual patients with PTC based on a large study cohort. Our nomograms could be convenient, individualized predictive tools for prognosis, which can help surgeons perform personalized survival evaluation and mortality risk identification in PTC patients.