Zhenyu Pan1,2,3, Qingting Bu4, Haisheng You5, Jin Yang1,2, Qingqing Liu1,2, Jun Lyu1,2. 1. Clinical Research Center, The First Affiliated Hospital of Xi'an Jiaotong University, Xi'an, Shaanxi, People's Republic of China, lujun2006@xjtu.edu.cn. 2. School of Public Health, Xi'an Jiaotong University Health Science Center, Xi'an, Shaanxi, People's Republic of China, lujun2006@xjtu.edu.cn. 3. Department of Pharmacy, The Affiliated Children Hospital of Xi'an Jiaotong University, Xi'an, Shaanxi, People's Republic of China. 4. Department of Genetics, Northwest Women's and Children's Hospital, Xi'an, Shaanxi, People's Republic of China. 5. Department of Pharmacy, The First Affiliated Hospital of Xi'an Jiaotong University, Xi'an, Shaanxi, People's Republic of China.
Abstract
OBJECTIVE: Previous studies showed that the lymph node density (LND) was a predictor of survival in Wilms' tumor (WT). However, the optimal LND cutoff point is controversial due to methodological shortcomings of previous studies, and no studies have shown the effect of LND on survival in children with WT. The purpose of this study was to remedy this situation. METHODS: We identified 376 children with WT. LND cutoff point was determined using the median value, the X-tile program, the survival-tree algorithm, and the time-dependent ROC curve analysis. Survival functions were estimated by the Kaplan-Meier method. We used Cox regression analysis to determine the impact of LND on survival. Smooth curve fitting between relative mortality risk and LND was performed. RESULTS: The LND cutoff point was 0.44, 0.65, 0.65, and 0.64 according to the median value, the X-tile program, the survival-tree algorithm, and the time-dependent ROC curve analysis, respectively. The 5-, 10-, and 20-year overall survival rates were 86.9%, 86.9%, and 84.7%, respectively, in the <0.44 group and 81.3%, 80.3%, and 80.3%, respectively, in the ≥0.44 group. Survival did not differ significantly between the two groups (P=0.185). The 5-, 10-, and 20-year overall survival rates were 87.8%, 87.8%, and 86.0%, respectively, in the < 0.65 or < 0.64 group and 76.5%, 75.1%, and 75.1%, respectively, in the ≥ 0.65 or ≥ 0.64 group. Children with the high LND had a significantly worse survival (P=0.011) if 0.64 or 0.65 was used for the stratification. LND was a significant predictor for overall survival in the multivariate Cox regression analysis (HR =1.797; 95% CI, 1.043-3.097; P=0.035). Smooth curve fitting suggested that the risk of mortality tended to be ascending with the increase in LND in general. 
CONCLUSION: The three methods including the X-tile program, the survival-tree algorithm, and the time-dependent receiver operating characteristic (ROC) curve analysis are equivalent in their ability to stratify patients and clearly better than the median method. The results showed that the optimal LND cutoff point was around 0.65 and the LND was a reliable predictor of overall survival in children with WT.
Wilms’ tumor (WT) is the most common pediatric renal tumor, representing approximately 90% of all pediatric renal tumors1 and 5% of all pediatric tumors.2 Although the overall survival rate exceeds 90% in patients with WT,3 reliable prognostic factors still need to be identified because they can be used to stratify patients in order to further improve the survival rate and to reduce the intensity of chemotherapy or radiotherapy in situations where the prognosis would remain unchanged. Two recent studies, by You et al4 and Saltzman et al,5 showed that the lymph node density (LND), defined as the ratio of the number of positive lymph nodes (LNs) to the number of examined LNs, can be used to stratify the prognosis in WT patients. Nevertheless, the impact of the LND on overall survival in children is unclear since all the previous studies also included adults. Furthermore, the optimal LND cutoff point is unclear. The cutoff point was determined by the classification and regression tree method available in the SPSS software in the study of You et al and as the median LND in the study of Saltzman et al. These studies4,5 did not consider that survival rates constitute time-dependent data. Three methods are commonly used to calculate an optimal cutoff point for time-dependent data: 1) the X-tile program,6–9 2) the survival-tree algorithm, which involves a classification and regression tree that can deal with time-dependent data,10–12 and 3) the time-dependent receiver operating characteristic (ROC) curve analysis.13–17 However, no previous study has compared the effects of these three methods.
In this study, we compared these three methods with the median method in determining the optimal LND cutoff point in children with WT. We also determined the impact of the LND on overall survival in children with WT based on data in the Surveillance, Epidemiology, and End Results (SEER) database.
Methods
Patient selection and data acquisition
The SEER database is a free database covering the period from 1973 to 2014. It includes data from 18 population-based registries and covers 28% of the US population.18 The data of patients with WT are recorded according to the third edition of the International Classification of Diseases for Oncology (ICD-O-3). Patients whose ICD-O-3 histological diagnostic code was 8960 were included in this study, whereas patients were excluded if they were older than 18 years, had no examined LNs, or had an LND of 0. The following information was collected for each patient: age at diagnosis, sex, race, number of examined LNs, SEER stage, tumor laterality, year of diagnosis, LND, follow-up time, and vital status. The year of diagnosis was stratified based on the treatment time defined by the National Wilms’ Tumor Study Group (NWTS)-2, -3, -4, and -5 trial publication times (in 1981, 1989, 1998, and 2001, respectively).19–22 The study was conducted in accordance with the Declaration of Helsinki and was approved by the institutional review board of the First Affiliated Hospital of Xi’an Jiaotong University. All analyses were based on a free database, and thus informed consent was not required for this type of study.
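The selection rules above can be illustrated with a minimal Python sketch. The record layout and field names (`histology_code`, `positive_lns`, `examined_lns`, `age_years`) are hypothetical and do not correspond to actual SEER variable names; only the rules themselves (histology code 8960, age of 18 years or younger, at least one examined LN, LND greater than 0) come from the text.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Patient:
    # Hypothetical record layout, not the SEER field names.
    histology_code: int
    age_years: float
    positive_lns: int
    examined_lns: int


def lymph_node_density(p: Patient) -> Optional[float]:
    """LND = positive LNs / examined LNs; None when no LNs were examined."""
    if p.examined_lns <= 0:
        return None
    return p.positive_lns / p.examined_lns


def is_eligible(p: Patient) -> bool:
    """Apply the inclusion and exclusion rules stated in the text."""
    lnd = lymph_node_density(p)
    return (
        p.histology_code == 8960      # ICD-O-3 code used for inclusion
        and p.age_years <= 18         # exclude patients older than 18 years
        and lnd is not None           # exclude patients with no examined LNs
        and lnd > 0                   # exclude patients with an LND of 0
    )
```

For instance, a 4-year-old with 2 positive nodes out of 5 examined would have an LND of 0.4 and would be retained, whereas the same child with no positive nodes would be excluded.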
Statistical analyses
The LND cutoff point was calculated using the median LND, the X-tile program, the survival-tree algorithm, and the time-dependent ROC curve analysis. The X-tile program was implemented using X-tile software (http://tissuearray.org/). The survival-tree algorithm and the time-dependent ROC curve analysis were implemented using the rpart and survivalROC packages, respectively, on the R platform. The Kaplan–Meier method was used to perform survival analysis. Survival curves were compared using the log-rank test and used to determine the optimal LND cutoff point, which was then used to stratify the patients into low-LND and high-LND groups. The graphical relationship of LND with mortality risk was evaluated with smooth curve fitting based on the restricted cubic spline method. Smooth curve fitting was performed with Empower(R) software (www.empowerstats.com; X&Ysolutions, Inc., Boston, MA, USA).
The characteristics of patients were compared using Pearson’s chi-squared test or Student’s t-test for data that conformed to a normal distribution and using the Mann–Whitney U test for other data. The 5-, 10-, and 20-year survival rates were calculated based on life tables. Cox regression analysis was conducted to assess the prognostic significance of the potential risk factors, with risk factors that were significant in the univariate Cox regression analysis being included in the multivariate Cox regression analysis. A two-sided probability value of P≤0.05 was considered indicative of statistical significance. Statistical analyses were performed using SPSS 24.0 (IBM Corporation, Armonk, NY, USA) and R (version 3.4.3; http://www.r-project.org/).
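To make the log-rank-based cutoff search concrete, the following Python sketch scans candidate LND cutoffs and keeps the one with the smallest two-group log-rank P-value. It is an illustration of the general principle only, not the X-tile or rpart implementation; the function names and the toy data in the usage note are assumptions made for the example.

```python
import math


def logrank_chi2(times, events, in_high):
    """Two-group log-rank chi-square statistic (1 degree of freedom)."""
    obs_minus_exp = 0.0
    variance = 0.0
    # Loop over the distinct times at which at least one death occurred.
    for t in sorted({ti for ti, e in zip(times, events) if e}):
        n = sum(1 for ti in times if ti >= t)                      # at risk, all
        n_high = sum(1 for ti, g in zip(times, in_high) if ti >= t and g)
        d = sum(1 for ti, e in zip(times, events) if ti == t and e)
        d_high = sum(1 for ti, e, g in zip(times, events, in_high)
                     if ti == t and e and g)
        obs_minus_exp += d_high - d * n_high / n                   # O - E
        if n > 1:
            frac = n_high / n
            variance += d * frac * (1 - frac) * (n - d) / (n - 1)  # hypergeometric
    return obs_minus_exp ** 2 / variance if variance > 0 else 0.0


def chi2_p_value_1df(chi2):
    """Upper-tail P-value of a chi-square statistic with 1 degree of freedom."""
    return math.erfc(math.sqrt(chi2 / 2.0))


def min_p_cutoff(lnd, times, events):
    """Return (cutoff, P) minimizing the log-rank P-value; high group is LND >= cutoff."""
    best = None
    # Skip the smallest LND value so that the low group is never empty.
    for c in sorted(set(lnd))[1:]:
        in_high = [x >= c for x in lnd]
        p = chi2_p_value_1df(logrank_chi2(times, events, in_high))
        if best is None or p < best[1]:
            best = (c, p)
    return best
```

With toy data such as `lnd = [0.1, 0.2, 0.3, 0.4, 0.7, 0.8, 0.9, 1.0]`, `times = [10, 12, 11, 13, 2, 3, 4, 5]`, and `events = [0, 0, 0, 0, 1, 1, 1, 1]`, the scan selects 0.7. Because many cutoffs are tested, a minimum P-value found this way is optimistic and is best read as descriptive unless it is corrected or validated.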
Results
Characteristics of the patients
We included 376 patients with WT who met the study inclusion criteria. These patients had a median age of 4 years, and a median of five LNs had been examined. Most of the patients were White (79.8%), had unilateral WT (98.9%), were diagnosed during 2002–2014 (67.6%), were female (56.1%), and had regional tumors (57.4%; the remaining 42.6% had distant tumors). The characteristics of the patients are presented in Table 1.
Table 1
Characteristics of patients
| Characteristic | Total | Low-LND group (LND <0.65) | High-LND group (LND ≥0.65) | P-value |
|---|---|---|---|---|
| Patients, n | 376 | 256 | 120 | |
| Age at diagnosis (years), median (25th–75th percentile) | 4 (3–6) | 4 (3–6) | 4.5 (3–6) | 0.169 |
| LNs examined, median (25th–75th percentile) | 5 (3–10) | 7 (4–13) | 2 (1–4) | <0.001 |
| Race, n (%) | | | | 0.779 |
| White | 300 (79.8) | 207 (80.9) | 93 (77.5) | |
| Black | 56 (14.9) | 36 (14.1) | 20 (16.7) | |
| Others | 15 (4.0) | 10 (3.9) | 5 (4.2) | |
| Sex, n (%) | | | | 0.415 |
| Male | 165 (43.9) | 116 (45.3) | 49 (40.8) | |
| Female | 211 (56.1) | 140 (54.7) | 71 (59.2) | |
| SEER stage, n (%) | | | | <0.001 |
| Regional | 216 (57.4) | 163 (63.7) | 53 (44.2) | |
| Distant | 160 (42.6) | 93 (36.3) | 67 (55.8) | |
| Tumor laterality, n (%) | | | | 0.435 |
| Unilateral | 372 (98.9) | 254 (99.2) | 118 (98.3) | |
| Bilateral | 4 (1.1) | 2 (0.8) | 2 (1.7) | |
| Year of diagnosis, n (%) | | | | 0.278 |
| 1988 | 4 (1.1) | 4 (1.6) | 0 (0) | |
| 1989–1997 | 79 (21.0) | 52 (20.3) | 27 (22.5) | |
| 1998–2001 | 39 (10.4) | 23 (9.0) | 16 (13.3) | |
| 2002–2014 | 254 (67.6) | 177 (69.1) | 77 (64.2) | |
Abbreviations: LND, lymph node density; LNs, lymph nodes; SEER, Surveillance, Epidemiology, and End Results.
Identification of the optimal LND cutoff point
The median LND of the 376 patients was 0.44, which was therefore the cutoff point based on the median method. The X-tile program and the survival-tree algorithm both produced an LND cutoff point of 0.65. The cutoff point according to the time-dependent ROC curve analysis was 0.64. Figure 1 shows the stratification of patients according to the LND cutoff points of 0.44, 0.65, and 0.64.
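For the ROC-based cutoff, a common criterion is the Youden index (sensitivity plus specificity minus 1), maximized over candidate cutoffs. The Python sketch below shows that criterion in a deliberately simplified setting: it assumes a binary outcome at one fixed time horizon (died by the horizon or not) and ignores censoring, which a time-dependent ROC analysis such as that in the survivalROC package handles explicitly. The function name and the toy data in the usage note are illustrative assumptions.

```python
def youden_cutoff(lnd, died):
    """Return (cutoff, J) maximizing J = sensitivity + specificity - 1.

    A patient is predicted to die when LND >= cutoff.
    `died` holds the binary status at a fixed horizon (censoring ignored).
    """
    n_pos = sum(1 for d in died if d)
    n_neg = len(died) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("both outcome classes are needed to compute the Youden index")

    best_cut, best_j = None, -1.0
    for c in sorted(set(lnd)):
        true_pos = sum(1 for x, d in zip(lnd, died) if d and x >= c)
        true_neg = sum(1 for x, d in zip(lnd, died) if not d and x < c)
        j = true_pos / n_pos + true_neg / n_neg - 1.0
        if j > best_j:  # the first (smallest) cutoff wins on ties
            best_cut, best_j = c, j
    return best_cut, best_j
```

With the toy data `lnd = [0.1, 0.2, 0.3, 0.5, 0.6, 0.8]` and `died = [False, False, False, True, False, True]`, the cutoff 0.5 gives a sensitivity of 1 and a specificity of 0.75, so the Youden index is 0.75 and 0.5 is selected.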
Figure 1
Stratification of patients according to the LND cutoff points obtained using the median method, the X-tile program, the survival-tree algorithm, and the time-dependent ROC curve analysis.
Note: (A) 0.44 as the LND cutoff point, (B) 0.65 as the LND cutoff point, and (C) 0.64 as the LND cutoff point.
Survival analysis showed that the 5-, 10-, and 20-year overall survival rates were 86.9%, 86.9%, and 84.7%, respectively, in the <0.44 group and 81.3%, 80.3%, and 80.3%, respectively, in the ≥0.44 group. Survival did not differ significantly between the <0.44 and ≥0.44 groups (χ2=0.176, P=0.185) if the median LND of 0.44 was used for the stratification (Figure 2A). The 5-, 10-, and 20-year overall survival rates were 87.8%, 87.8%, and 86.0%, respectively, in the <0.65 group and 76.5%, 75.1%, and 75.1%, respectively, in the ≥0.65 group. The 5-, 10-, and 20-year overall survival rates in the <0.64 and ≥0.64 groups were the same as those in the <0.65 and ≥0.65 groups, respectively. Patients in the high-LND group had a significantly worse survival (χ2=6.416, P=0.011) if an LND of 0.64 or 0.65 was used for the stratification (Figure 2B and C).
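As a reminder of how a survival probability at a given time point is obtained from censored follow-up data, the Python sketch below implements the Kaplan–Meier product-limit estimator. Note that the rates reported here were calculated from life tables, which is a closely related but not identical estimator, so this is an illustration under that assumption rather than a reproduction of the analysis; the function names and toy data are hypothetical.

```python
def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct death time (product-limit estimator)."""
    surv = 1.0
    curve = []
    for t in sorted({ti for ti, e in zip(times, events) if e}):
        at_risk = sum(1 for ti in times if ti >= t)
        deaths = sum(1 for ti, e in zip(times, events) if ti == t and e)
        surv *= 1.0 - deaths / at_risk
        curve.append((t, surv))
    return curve


def survival_at(curve, horizon):
    """Read the step function: estimated survival probability at `horizon`."""
    s = 1.0
    for t, value in curve:
        if t > horizon:
            break
        s = value
    return s


# Toy follow-up in years; event 1 = death, 0 = censored.
toy_curve = kaplan_meier([1, 2, 3, 4, 5], [1, 0, 1, 0, 1])
```

In the toy data, one of five patients dies at year 1 (survival 0.8) and one of the three still at risk dies at year 3, so `survival_at(toy_curve, 4)` is 0.8 × 2/3.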
Figure 2
Kaplan–Meier curve for evaluating the overall survival of children defined by different LND cutoff points.
Notes: (A) <0.44 (blue line) vs ≥0.44 (green line), (B) <0.65 (yellow line) vs ≥0.65 (purple line), (C) <0.64 (orange line) vs ≥0.64 (red line), and (D) synthetic diagram of all the survival curves.
Abbreviation: LND, lymph node density.
Figure 2D is a synthetic diagram of all the survival curves and allows comparison of the four methods including the median method, the X-tile program, the survival-tree algorithm, and the time-dependent ROC curve analysis for determining the cutoff point. The results indicated that the optimal LND cutoff point was around 0.65. We therefore finally stratified patients into the low-LND and high-LND groups using 0.65 as the LND cutoff point.
Impact of the LND on overall survival in children with WT
Kaplan–Meier survival curves (Figure 2B) showed that the survival outcome differed significantly between the low-LND (LND <0.65) and high-LND (LND ≥0.65) groups (P=0.011). Because the baseline characteristics were not completely balanced between the two groups (Table 1), we used Cox regression analysis to study the impact of the LND on overall survival (Table 2). The analysis showed that overall survival was significantly worse in the high-LND (LND ≥0.65) group than in the low-LND (LND <0.65) group in both univariate analysis (HR =1.965; 95% CI, 1.152–3.352; P=0.013) and multivariate analysis (HR =1.797; 95% CI, 1.043–3.097; P=0.035) (Table 2). In addition, smooth curve fitting after adjusting for race, SEER stage, and tumor laterality showed that the risk of mortality generally tended to rise as the LND increased (Figure 3).
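The reported HRs, CIs, and P-values can be related to one another with a short calculation. The Python sketch below assumes that the 95% CI is a Wald interval, symmetric on the log-HR scale, and that the P-value comes from the corresponding Wald test; this is the usual form of Cox regression output, but it is an assumption here and the function name is illustrative.

```python
import math


def wald_from_hr_ci(hr, lower, upper, z_crit=1.959964):
    """Recover log-HR, its standard error, the Wald z, and the two-sided P.

    Assumes the interval is exp(beta +/- z_crit * SE), with beta = ln(HR).
    """
    beta = math.log(hr)
    se = (math.log(upper) - math.log(lower)) / (2.0 * z_crit)
    z = beta / se
    p = math.erfc(abs(z) / math.sqrt(2.0))  # two-sided normal tail probability
    return beta, se, z, p


# Values quoted in the text for the high-LND vs low-LND comparison.
_, _, z_uni, p_uni = wald_from_hr_ci(1.965, 1.152, 3.352)
_, _, z_multi, p_multi = wald_from_hr_ci(1.797, 1.043, 3.097)
```

Under these assumptions, the multivariate estimate corresponds to a Wald z of about 2.11 and a P-value of about 0.035, and the univariate estimate to a P-value of about 0.013, in line with the values given above.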
Table 2
Univariate and multivariate analyses of overall survival
| Variable | Univariate HR (95% CI) | Univariate P-value | Multivariate HR (95% CI) | Multivariate P-value |
|---|---|---|---|---|
| Age (years) | 1.088 (0.998–1.186) | 0.055 | | |
| LNs examined | 0.964 (0.917–1.014) | 0.152 | | |
| Race | | | | |
| White | Reference | | Reference | |
| Black | 0.963 (0.432–2.147) | 0.926 | 0.940 (0.420–2.104) | 0.88 |
| Others | 4.060 (1.720–9.581) | 0.001 | 4.235 (1.788–10.031) | 0.001 |
| Sex | | | | |
| Male | Reference | | | |
| Female | 0.884 (0.518–1.509) | 0.65 | | |
| SEER stage | | | | |
| Regional | Reference | | Reference | |
| Distant | 1.992 (1.161–3.416) | 0.012 | 1.716 (0.982–2.999) | 0.058 |
| Tumor laterality | | | | |
| Unilateral | Reference | | Reference | |
| Bilateral | 4.335 (1.055–17.815) | 0.042 | 3.743 (0.879–15.930) | 0.074 |
| Year of diagnosis | | | | |
| 1988 | Reference | | | |
| 1989–1997 | 0.443 (0.056–3.496) | 0.44 | | |
| 1998–2001 | 0.658 (0.079–5.512) | 0.7 | | |
| 2002–2014 | 0.854 (0.116–6.277) | 0.877 | | |
| LND stratification | | | | |
| Low-LND (LND <0.65) | Reference | | Reference | |
| High-LND (LND ≥0.65) | 1.965 (1.152–3.352) | 0.013 | 1.797 (1.043–3.097) | 0.035 |
Abbreviations: LND, lymph node density; LNs, lymph nodes; SEER, Surveillance, Epidemiology, and End Results.
Figure 3
Smooth curve fitting of the risk of mortality and LND after adjusting for variables including race, SEER stage, and tumor laterality.
Note: The red line represents the fitting curve, and the blue dotted lines represent the 95% CI.
Abbreviations: LND, lymph node density; SEER, Surveillance, Epidemiology, and End Results.
Discussion
About 90% of the children with WT are expected to survive long-term thanks to the efficacy of current multimodal therapies.3 This has resulted in most of the current research focusing on reducing treatment morbidity while maintaining good clinical outcomes. Effective prognostic factors for stratifying patients need to be identified. Previous studies have suggested that LND can affect the survival of patients with WT.4,5 However, the LND cutoff point has been controversial because the methods used in previous studies to determine it did not consider time-dependent survival data. Moreover, no study has clarified the impact of the LND on survival in children with WT. This study is the first to use the median method, the X-tile program, the survival-tree algorithm, and the time-dependent ROC curve analysis to determine the optimal cutoff point and to use this to investigate the impact of the LND on survival in children.
The first method, based on the median, is a very simple method for determining the cutoff point, but it is crude since it does not apply statistical processing to the available data. The second method uses X-tile, which is free software available from Yale University School of Medicine that can determine a cutoff point for continuous data with a time-dependent outcome.6 Its principle for finding the optimal cutoff point is to select the minimum P-value according to Kaplan–Meier survival curves and the log-rank test. Specifically, when finding the optimal LND cutoff point, the X-tile algorithm proceeds as follows: 1) an LND cutoff point splits patients into two groups; 2) the Kaplan–Meier survival curves of the two groups are compared by the log-rank test; 3) a P-value is obtained from the log-rank test; 4) this process is repeated so that each LND cutoff point corresponds to a P-value; and 5) the LND cutoff point with the minimum P-value is taken as the optimal LND cutoff point because the minimum P-value means that the difference between the Kaplan–Meier survival curves of the two groups is the most significant from a statistical point of view. X-tile is often applied to time-dependent data. For example, Xu et al23 used X-tile to seek the optimal cutoff point for the levels of PEBP1 protein and mRNA, whereas Zhang et al24 used X-tile to discover the optimal cutoff points for the levels of 35 miRNAs. The third method, the survival-tree algorithm, is based on recursive partitioning, in which patients are recursively split into two groups according to many LND cutoff points. The LND cutoff point is optimal when the two groups have the most different Kaplan–Meier survival curves (the minimum P-value for the log-rank test).10,25–27 The survival-tree algorithm and the X-tile algorithm are therefore essentially the same, although the X-tile software is more convenient to use than the survival-tree algorithm on the R platform. The fourth method, based on ROC curves, is popular for displaying the sensitivity and specificity of a continuous prognostic variable for a binary variable.28 Sensitivity is the probability that applying an LND cutoff point predicts the positive event (death) correctly. Specificity is the probability that applying an LND cutoff point predicts the negative event (survival) correctly. The Youden index is calculated as sensitivity plus specificity minus 1. The higher the Youden index is, the more accurate the prediction is. Each LND cutoff point corresponds to a sensitivity and a specificity, and therefore to a Youden index. The optimal LND cutoff point based on the time-dependent ROC curve analysis was determined as the point where the Youden index was maximal.13,14,29 The different principles underlying these approaches explain why the X-tile program and the survival-tree algorithm produce the same cutoff point, whereas that obtained from the time-dependent ROC curve analysis is different. As can be seen from the Kaplan–Meier survival curves in Figure 2D, the two cutoff points (0.64 and 0.65) determined in this study indicate that the three methods (the X-tile program, the survival-tree algorithm, and the time-dependent ROC curve analysis) are equivalent in their ability to stratify patients and clearly better than the median method, which yielded a cutoff point of 0.44.
Regarding patient selection, we excluded children with an LND of 0 in this study. According to the WT stage system based on the NWTS and the International Society of Pediatric Oncology,30–32 patients with LN metastases (ie, an LND other than 0) have a stage higher than stage II. In the SEER database, a localized tumor is defined as one “limited to the organ in which it began, without evidence of spread”, a regional tumor has “spread beyond the primary site to nearby LNs or organs and tissues”, and a distant tumor has “spread from the primary site to distant organs or distant lymph nodes”.33 The 376 patients who remained after excluding the patients with an LND of 0 all had regional or distant tumors; that is, none of them had localized tumors. This indicated that patients with an LND of 0 tended to be at a lower stage, whereas patients whose LND was not 0 tended to be at a higher stage. The cutoff point would therefore be biased toward a considerably smaller value if patients with an LND of 0 were included, resulting in inherently different patients being compared.
Previous studies4,5 found the LND to be associated with the overall survival of patients with WT. However, they did not analyze the subset of children with WT, which is a severe limitation given that WT mainly appears in children, at a median age of 3.5 years.34 Moreover, adult and pediatric patients exhibit significantly different outcomes.35–37 In this study, we therefore aimed to determine the impact of the LND on overall survival in children with WT, and the results from the Kaplan–Meier survival curves and Cox regression analysis showed the LND to be associated with the overall survival of children with WT, which was similar to the findings in previous studies.4,5
Nevertheless, this study inevitably had several shortcomings. First, there is currently no standard mandating the number of LNs to be sampled and no defined template for LN dissection. Thus, the denominator used to calculate the LND differs between patients, and the LND itself could vary widely. Second, the optimal LND cutoff point determined in this study was based on 376 children in the SEER database. This database is non-standardized and covers a long interval, so the reliability and authenticity of the result need to be verified with other, larger data sets. Third, this is a retrospective study. Some potentially important information, such as chemotherapy and radiotherapy regimens, could not be obtained from the database. In this study, we used the NWTS-2, -3, -4, and -5 trial publication times to reflect treatment differences.
Conclusion
This is the first analysis of the SEER database to have applied four methods including the median value, the X-tile program, the survival-tree algorithm, and the time-dependent ROC curve analysis to determine the optimal LND cutoff point and clarify the impact of the LND on overall survival in children with WT. The X-tile program, the survival-tree algorithm, and the time-dependent ROC curve analysis are equivalent in their ability to stratify patients and clearly better than the median method. The optimal LND cutoff point was found to be around 0.65, and LND was significantly associated with overall survival in children with WT. This is helpful for evaluating or predicting the survival of children with WT by LND in future clinical applications.