Amal Saki Malehi1, Fakher Rahim2. 1. Research Center of Thalassemia and Hemoglobinopathy, Health Research Institute, Ahvaz Jundishapur University of Medical Sciences, Ahvaz, Iran; Department of Biostatistics and Epidemiology, School of Public Health, Ahvaz Jundishapur University of Medical Sciences, Ahvaz, Iran. 2. Research Center of Thalassemia and Hemoglobinopathy, Health Research Institute, Ahvaz Jundishapur University of Medical Sciences, Ahvaz, Iran.
Abstract
AIMS: The aim of this study was to determine a prognostic index for separating homogeneous subgroups of colorectal cancer (CRC) patients based on clinicopathological characteristics using survival tree analysis. METHODS: The study was conducted at the Research Center of Gastroenterology and Liver Disease, Shahid Beheshti Medical University in Tehran, between January 2004 and January 2009. A total of 739 patients who had already been diagnosed with CRC based on a pathologic report were enrolled. The data included demographic and clinicopathological characteristics of the patients. Tree-structured survival analysis based on a recursive partitioning algorithm was implemented to evaluate prognostic factors. Survival probability curves were calculated according to the Kaplan-Meier method, and the hazard ratio was estimated as the effect size of interest. RESULTS: Of these patients, 526 (71.2%) were male. The mean survival time (from the time of diagnosis) was 42.46 (± 3.4) months. The survival tree identified three variables as the main prognostic factors, and four prognostic subgroups were constructed from them. The log-rank test showed good separation of the survival curves. Patients with Stage I-IIIA disease who were treated with surgery as the first treatment showed low risk (median = 34 months), whereas patients with Stage IIIB-IV disease who were older than 68 years had the worst survival outcome (median = 9.5 months). CONCLUSION: Constructing a prognostic classification index via a survival tree can help researchers assess interactions between clinical variables and determine the cumulative effect of these variables on survival outcome.
Keywords:
Classification; Iran; colorectal cancer; prognostic index; survival tree
Colorectal cancer (CRC) is the third most common cancer worldwide, with nearly 1.4 million new cases in 2012.[1] It is also one of the most common malignancies in Iran, ranking after breast cancer in females and as the fourth most common cancer in males (8/100,000 in males and females).[2] More than 3641 new cases of CRC are estimated to be diagnosed in Iran each year.[3] Recent epidemiological studies have reported an increasing incidence trend of CRC in Iran.[4,5] CRC causes 2262 deaths annually and is the sixth leading cause of cancer death in Iran.[2,3] The estimated mean survival time of CRC was 105 months (confidence interval: 95.1–115.1), and the overall 5-year survival was 61.0%.[6,7] However, variation in the clinicopathologic characteristics of patients leads to different survival times in subgroups of patients defined by different values of the prognostic factors.[3] Hence, assessing prognostic factors constitutes one of the principal tasks in clinical cancer research. Evaluating prognostic factors can provide prognostic indices (PI). As a clinical tool, a PI aids clinicians in predicting the survival outcome and prognosis of patients with aggressive diseases. A PI should be able to group patients into classes with well-separated survival distributions.
There are many methods for prognostic evaluation in survival analysis. The Cox proportional hazards regression model (introduced in a seminal paper by Cox, 1972) and its extensions are broadly applicable and the most commonly used. However, the Cox model requires several assumptions to be satisfied, chief among them the proportionality of hazards. It also imposes a particular link between the covariates and the response.[8] In the last two decades, tree-based models have been developed as nonparametric alternatives to parametric and semi-parametric models in order to relax these restrictive assumptions.[9,10] Tree-based models handle several clinicopathologic variables by recursively partitioning the covariate space. The survival tree is the most popular tree-based method for survival analysis in biomedical studies.[11,12,13] It makes it possible to determine a PI and to identify prognostic groups among patients in a natural way. Such grouping is important because patients are heterogeneous in terms of disease-free survival outcome, and it allows physicians to make early, prudent decisions regarding adjuvant and combination therapies.[14]
The aim of this study was to characterize the PI and identify prognostic subgroups of Iranian colorectal cancer patients in order to predict survival outcome and time to an event of interest.
Methods
Patients
This study was performed on patients referred to the Cancer Registry Center of the Research Center of Gastroenterology and Liver Disease (RCGLD), Shahid Beheshti Medical University in Tehran, from January 2004 to January 2009. The diagnosis of CRC was confirmed by the pathology report held in the cancer registry. The survival time of patients was measured from the date of diagnosis up to January 2009. These patients had been treated at, and referred to the registry by, 10 collaborating public and private hospitals. After preliminary assessment, a total of 739 patients were included. This study was approved by the Ethics Committee of RCGLD, and all participants signed informed consent prior to enrollment.
Deaths were confirmed through telephone contact with relatives of the patients. Survival time, the primary outcome, was calculated in months. Demographic information such as age at diagnosis, sex, race, education, and marital status was obtained from the hospital records. The clinicopathological characteristics, namely family history of cancer, tumor grade, tumor size, pathologic stage,[15] and histopathology report, were also recorded. The pathologic stage of the tumor was defined based on (T) the primary tumor, its size, and invasiveness, (N) the extent of spread to the lymph nodes, and (M) the presence or absence of distant metastasis, including to lymph nodes that are not regional.
Survival tree
Survival tree analysis is used to model the relationship between survival time and several potential prognostic factors nonparametrically. In this method, the patients are recursively partitioned into homogeneous subgroups based on important prognostic factors. At each step, the survival tree selects as prognostic factor the predictor with the highest power to discriminate between good and poor survival. The result of the analysis is a set of terminal nodes, each characterized by a combination of predictors and their values and associated with a distinct survival curve. Each terminal node therefore defines a class of patients with a clearly separated survival curve.
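The paper does not state which splitting rule or software package produced the tree. As a minimal sketch, assuming the two-sample log-rank statistic is used as the splitting criterion (a common choice for survival trees, but an assumption here), a single partitioning step on one covariate could look like the following. The names `logrank_chi2`, `best_split`, and `min_node` are illustrative and are not taken from the study.

```python
def logrank_chi2(times, events, in_group1):
    """Two-sample log-rank chi-square statistic (1 degree of freedom).

    times     : follow-up time for each patient
    events    : 1 if death was observed, 0 if censored
    in_group1 : True if the patient belongs to the first group
    """
    event_times = sorted({t for t, e in zip(times, events) if e})
    observed_minus_expected = 0.0
    variance = 0.0
    for t in event_times:
        # Patients still at risk just before time t, overall and in group 1.
        n = sum(1 for u in times if u >= t)
        n1 = sum(1 for u, g in zip(times, in_group1) if u >= t and g)
        # Deaths occurring exactly at time t, overall and in group 1.
        d = sum(1 for u, e in zip(times, events) if u == t and e)
        d1 = sum(1 for u, e, g in zip(times, events, in_group1)
                 if u == t and e and g)
        observed_minus_expected += d1 - d * n1 / n
        if n > 1:
            variance += d * (n1 / n) * (1 - n1 / n) * (n - d) / (n - 1)
    return observed_minus_expected ** 2 / variance if variance > 0 else 0.0


def best_split(times, events, covariate, min_node=2):
    """Return (cut point, statistic) maximizing the log-rank statistic.

    Patients with covariate <= cut go to the left child node.
    min_node is a hypothetical minimum child-node size.
    """
    best = (None, 0.0)
    for cut in sorted(set(covariate))[:-1]:
        left = [x <= cut for x in covariate]
        if min(sum(left), len(left) - sum(left)) < min_node:
            continue
        stat = logrank_chi2(times, events, left)
        if stat > best[1]:
            best = (cut, stat)
    return best
```

Applying `best_split` to every candidate covariate, keeping the strongest split, and repeating the procedure within each child node until the nodes become too small or no split is significant gives the recursive partitioning described above.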
Statistical analysis
The survival tree model was fitted for overall survival time, measured from initial diagnosis to death or to the censoring time (end of follow-up). Survival was estimated by the Kaplan–Meier method for each subgroup and summarized as mean (± standard deviation). The log-rank test was used to compare the survival distributions of the PI subgroups, and the hazard ratio (HR) was estimated as the effect size of interest. Data were analyzed using R and SPSS version 19 software (SPSS Inc., Chicago, IL, USA). P < 0.05 was considered significant.
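For readers unfamiliar with the Kaplan–Meier method, the following is a minimal sketch of the product-limit estimator and of the median survival time derived from it. It is not the authors' R or SPSS code; the function names `kaplan_meier` and `median_survival` and the toy data in the usage note are hypothetical.

```python
def kaplan_meier(times, events):
    """Product-limit estimate of S(t).

    times  : follow-up time for each patient (e.g., months)
    events : 1 if death was observed, 0 if censored
    Returns a list of (time, survival probability) at each death time.
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        leaving = 0
        # Collect everyone whose follow-up ends exactly at time t.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            leaving += 1
            i += 1
        if deaths > 0:
            survival *= 1.0 - deaths / n_at_risk
            curve.append((t, survival))
        # Deaths and censored patients both leave the risk set after t.
        n_at_risk -= leaving
    return curve


def median_survival(curve):
    """First time at which S(t) drops to 0.5 or below (None if never)."""
    for t, s in curve:
        if s <= 0.5:
            return t
    return None
```

For example, `kaplan_meier([2, 3, 3, 5, 8, 9], [1, 1, 0, 1, 0, 1])` gives survival estimates at months 2, 3, 5, and 9; censored patients contribute to the risk set until their last follow-up but never cause a step in the curve.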
Results
A total of 739 patients were followed over the study period. The mean age at diagnosis was 59.67 ± 12.85 years (range 20–88); 526 (71.2%) were males and 213 (28.8%) were females.
The estimated mean and median (± standard error) survival times (from the time of diagnosis) were 42.46 (± 3.4) and 22.8 (± 2.27) months, respectively, and the estimated 5-year overall survival rate was 30%. The baseline and clinical characteristics of the patients and the results of the univariate analysis are reported in Table 1. The survival tree model was fitted using the variables that were significant in the univariate tests.
Table 1
Baseline and clinicopathologic characteristics of the study groups
The diagram of the survival tree is shown in Figure 1. The initial split is on tumor, node, metastasis (TNM) stage, which is the principal prognostic factor. The survival tree identified two other variables that play important roles in survival time: age at diagnosis and the first treatment protocol. Finally, the patients were divided into homogeneous subgroups based on these variables [Table 2]. Subgroup IV had the best survival outcome, while subgroup II had the worst survival time of all subgroups. Thus, patients with Stage IIIB-IV disease who were older than 68 years, with a median survival time of 9.5 months, had the poorest survival outcome [Table 3]. The estimated HRs showed a greater risk for every subgroup relative to subgroup IV [Table 3].
Figure 1
Survival tree. Kaplan–Meier curve inside each terminal node
Table 2
Subgroups for prognostic index of survival tree
Table 3
Descriptive statistics and hazard ratio for each subgroup
The cumulative hazard functions are plotted in Figure 2. According to these findings, patients with Stage I-IIIA disease who received surgery and biopsy as the first treatment (subgroup IV) had the lowest hazard rate.
Figure 2
Cumulative hazard rate for four subgroups generated by survival tree
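The paper does not say how the cumulative hazard curves were estimated. One standard option, assumed here purely for illustration, is the Nelson–Aalen estimator, which adds the fraction of at-risk patients dying at each death time. The function name `nelson_aalen` is hypothetical.

```python
def nelson_aalen(times, events):
    """Nelson-Aalen estimate of the cumulative hazard H(t).

    times  : follow-up time for each patient (e.g., months)
    events : 1 if death was observed, 0 if censored
    Returns a list of (time, cumulative hazard) at each death time.
    """
    curve = []
    cumulative = 0.0
    death_times = sorted({t for t, e in zip(times, events) if e})
    for t in death_times:
        # Risk set: patients whose follow-up has not ended before t.
        n_at_risk = sum(1 for u in times if u >= t)
        deaths = sum(1 for u, e in zip(times, events) if u == t and e)
        cumulative += deaths / n_at_risk
        curve.append((t, cumulative))
    return curve
```

Computing this curve separately for the patients in each terminal node of the tree would give one cumulative hazard curve per subgroup, in the spirit of Figure 2.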
The overall log-rank test statistic was 68.64 (P < 0.001), revealing a significant difference between the subgroups. This means that the survival tree classified the patients into groups with highly significant differences in survival outcome. In addition, Table 4 shows pairwise comparisons among the subgroups. According to these findings, subgroup II was at high risk, subgroups I and III were at intermediate risk, and subgroup IV was at low risk.
Table 4
Pairwise comparisons by log-rank test
Discussion
Besides the investigation of etiology and epidemiology, identifying and evaluating prognostic factors is one of the major tasks in clinical cancer research. Many studies have identified prognostic factors and PIs for survival in patients with CRC.[6,7,16,17,18] However, because of geographic disparities in CRC survival[19] and heterogeneity in the biological and clinicopathological characteristics of patients with CRC, survival times differ between subgroups of patients, and it is difficult to use this information to predict an individual patient's prognosis.
In this study, we evaluated prognostic factors in Iranian patients using tree-based models. The basic idea of tree-based models is to construct subgroups, based on prognostic factors, that are internally as homogeneous as possible with regard to the response and externally as separate as possible.[14] Recently, tree-based models have been highlighted for predicting outcomes in cancer patients in several biomedical studies.[20,21,22] Survival tree analysis homogenizes the data by separating them into subgroups on the basis of similarity of survival outcome, and it determines the prognostic factors and the patient subgroups simultaneously.[11] Evaluating the prognostic subgroups constructed by a survival tree helps researchers assess interactions between clinical variables, determine the cumulative effect of these variables on survival, and translate this information into appropriate management.[23] In this study, the survival tree identified TNM stage, age at diagnosis with a cut point of 68 years, and the first treatment protocol as prognostic factors, and these characterized the prognostic classification index.
Based on our results, the hazard for patients who received chemotherapy and radiation was 1.97 times that of patients who received surgery. Some studies have reported conflicting results,[24,25] whereas others reached the same result.[6] Such discrepancies may be related to the molecular characteristics of the tumor.
TNM stage was confirmed as the most important prognostic factor in several studies.[6,26,27,28] However, a few studies have shown inconsistent results.[29]
There are some controversial findings about age at diagnosis;[6,28] however, numerous studies are in agreement with our result.[7,29] The use of different cut points to categorize age may lead to different results in survival studies.
Further examination of the results using the cumulative hazard curves and the log-rank test distinguished high-risk, intermediate-risk, and low-risk subgroups of patients. Patients with Stage I-IIIA disease who received surgery and biopsy as the first treatment were identified as the low-risk group, and patients with Stage IIIB-IV disease who were older than 68 years as the high-risk group.
Conclusion
Because patients are heterogeneous in terms of overall survival outcome, using a survival tree to construct a prognostic classification index helps researchers assess interactions between clinical variables and determine the cumulative effect of these variables on survival outcome.