The Benefits of Decision Tree to Predict Survival in Patients with Glioblastoma Multiforme with the Use of Clinical and Imaging Features.

Mohtaram Nematollahi1,2, Mahdie Jajroudi3, Farshid Arbabi4, Amir Azarhomayoun5,6, Zohreh Azimifar7.   

Abstract

BACKGROUND: Machine learning is a type of artificial intelligence that aims to give machines the ability to extract knowledge from the environment. Glioblastoma multiforme (GBM) is one of the most common and aggressive primary malignant brain tumors in adults. Because the survival rate of patients with these tumors is low, machine learning can help physicians make better decisions. The aim of this paper is to develop a machine learning model for predicting the survival rate of patients with GBM based on clinical features and magnetic resonance imaging (MRI).
MATERIALS AND METHODS: The present investigation is an observational study conducted to predict 12-month survival in patients with GBM. Fifty-five patients who were registered in five Iranian hospitals (Tehran) during 2012-2014 were selected for this study.
RESULTS: This study used Cox and C5.0 decision tree models based on clinical features alone and on clinical features combined with MRI features. Accuracy, sensitivity, and specificity were used to evaluate the models. Using clinical features alone, the Cox and C5.0 models yielded <32.73%, 22.5%, 45.83%> and <72.73%, 67.74%, 79.19%>, respectively; using both feature groups, they yielded <60%, 48.58%, 75%> and <90.91%, 96.77%, 88.33%>, respectively.
CONCLUSION: The C5.0 decision tree model, applied either to the clinical features alone or to the imaging and clinical features together as covariates, showed additional predictive value and better results. Tumor width and Karnofsky performance status score were identified as the most important parameters for survival prediction in these patients.

Keywords:  C5.0 decision tree; Cox; glioblastoma multiforme; survival rate

Year:  2018        PMID: 30283530      PMCID: PMC6159095          DOI: 10.4103/ajns.AJNS_336_16

Source DB:  PubMed          Journal:  Asian J Neurosurg


Introduction

Cancer is the second leading cause of death in the United States[1] and the third in Iran.[2] Because of the high prevalence of cancer, investigations have been performed into its causes, its prevention, the development of effective treatment methods, and the prediction of its outcome.[3] Predicting survival is one of the most challenging tasks physicians face in the cancer treatment process.[4] It is difficult because many different factors are involved, such as environment, genetics, and biology.[5] Physicians are still unable to accurately predict the relationship between health conditions, clinical findings, and survival. Moreover, each treatment modality has some degree of negative side effects, and some patients die from the complications of treatment.[6] Patient and treatment selection are therefore central to treating cancer. More reliable predictions of treatment outcome make it possible to choose better treatments according to the patient's condition. Survival prediction is one way to improve medical services and cancer treatment protocols, because it allows the efficacy of novel treatment strategies to be assessed.[5,7] Designing a system that bridges the gap between the medical system and survival prediction is important and will improve the quality of treatment.[5] The most important purpose of the modeling process is to determine the relationships between variables and outcomes, which helps improve patients’ treatment.[8] Because health-care data are complex, it is important to select a modeling method that suits the existing data.[8,9] Statistical methods such as the Cox and log-logistic models are used in the majority of cancer research.[5] Recently, machine learning methods have been developed and used in many studies.
Machine learning is a type of artificial intelligence that aims to give machines the ability to extract knowledge and/or learn from the environment.[10] Machine learning methods can be used to analyze computer-based oncology data. The decision tree is one of the most applicable and useful of these methods: it gives a graphical representation of the features and is easily interpreted.[10] Glioblastoma multiforme (GBM) is one of the most common and aggressive primary malignant brain tumors in adults. GBM is very difficult to treat for several reasons: the tumor cells are very resistant to conventional therapies, the brain is susceptible to damage from conventional therapy, the brain has a very limited capacity to repair itself, and many drugs cannot cross the blood–brain barrier to act on the tumor.[11] The survival rate of patients with GBM is very low. The 1- and 5-year probabilities of survival for patients with this type of cancer are reported to be about 36% and 5%, respectively.[11] According to the American Brain Tumor Association in 2014, GBM accounts for about 15.4% of all primary brain tumors and approximately 45.6% of all malignant brain tumors.[6,7] By focusing on the influencing factors, and with better knowledge of tumor characteristics and better management, it will be possible to increase the survival rate.[5,6,7,8] Several attempts have been made to identify the factors that affect outcome in this type of cancer.
Various investigations have shown that tumor grade, clinical features and characteristics such as age, Karnofsky performance status (KPS) score, and type of treatment, and some imaging features are correlated with survival.[5] Published investigations in this field are scarce; however, some important studies have been reported.[8] Curran et al.[12] studied the treatment factors and tumor-related variables that influenced survival in 1578 patients with malignant glioma, with a contribution from the American Society of Cancer. Recursive partitioning analysis and the Cox method were used. The results showed that age, KPS score, tumor histopathology, the amount of tumor removed, the radiotherapy dose and fractionation, and neurologic classification are the most important factors influencing survival.[12] Lacroix et al.[13] retrospectively analyzed 416 consecutive patients with histologically proven GBM who underwent tumor resection at the department of neurosurgery of Texas University between 1993 and 1999. The Cox proportional hazards model was used to identify univariate and multivariate predictors of 14-month survival. This study showed that age, tumor functional grade, necrosis grade, mass effect grade, surrounding edema, enhancement grade, the extent of tumor resection, and increased signal intensity surrounding the gadolinium-enhanced region of the tumor independently affect survival.[13] In this field, not only clinical parameters but also radiological findings are of interest. A study by Pope et al., performed in the Neuro-Oncology Department of UCLA, investigated the relationship between 15 imaging variables obtained from contrast-enhanced magnetic resonance imaging (MRI) scans and survival in patients with Grade III (n = 43) and Grade IV (GBM; n = 110) gliomas.
The goal of the study was to predict patients’ survival, measured from the date of diagnosis until death or until June 2004. The Kaplan–Meier method, the Cox method, and recursive partitioning analysis were all used to estimate survival probabilities. The results showed that age and performance status are significantly correlated with survival. In addition, other factors such as signal failure following injection of the contrast agent, multifocal lesions, and edema are associated with survival. Notably, the percentage of tumor removed did not significantly affect survival.[14] Zacharaki et al. (2012)[16] studied the prediction of 18-month survival in 74 patients with high-grade gliomas (18 anaplastic gliomas, WHO Grades III/IV, and 56 GBMs or gliosarcomas, WHO Grades IV/IV) using a J48 classification tree. The average classification accuracy (percentage of correctly classified samples) was 85.1%, and the area under the curve (AUC) of the receiver operating characteristic (ROC) curve was 0.84. Mazurowski et al.[15] predicted 1-year survival in patients with GBM from clinical and imaging features using the Cox method. The participants were 82 patients with glioblastoma whose clinical features and MR imaging examinations were made available by The Cancer Genome Atlas and The Cancer Imaging Archive, with three clinical factors (age, sex, and performance status) and 26 MRI factors. In this investigation, we try to build a better survival prediction model and compare its results with those of a statistical model. We also identify the important variables that affect survival prediction.

Materials and Methods

Patients

The present investigation is an observational study conducted to predict 12-month survival in patients with GBM. Fifty-five patients who were registered in five Iranian hospitals (in Tehran) during 2012–2014 were entered into the study. The inclusion criteria were no history of any type of brain tumor, available primary and preoperative MRIs, having undergone surgery, and a pathology report confirming Grade 4 GBM according to the World Health Organization classification. In addition, radiotherapy or chemotherapy (one or both) was performed along with the surgery. Patients who had undergone a biopsy were excluded from the study. All MRI examinations were acquired at 1.5 T and contained pre- and post-contrast T1-weighted and T2-weighted sequences. Images were viewed with K-PACS from IMAGE Information Systems Ltd.

Selected features

In this study, features were collected based on a review of the literature and under the supervision of an expert neurosurgeon and a radiation oncologist. They were then divided into two groups: (1) clinical features and (2) clinical features plus MRI features. The clinical features recorded were age, sex, perioperative KPS, and treatment modality (chemotherapy and radiotherapy). The tumor variables derived from the MRI scans are presented in Table 1.
Table 1

Imaging features

The cut point was 12 months after diagnosis; deaths were coded 0 and survivors were coded 1. This investigation used two methods to predict survival: a statistical classifier, the Cox model, and a machine learning method, the C5.0 decision tree. Nineteen variables were defined as the input data for both methods.
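The outcome coding described here can be illustrated with a minimal Python sketch. The function name `survival_label` and the choice to count a patient who survived exactly 12 months as a survivor are assumptions made for illustration; the text does not specify how boundary cases or patients lost to follow-up were handled.

```python
# Hypothetical helper illustrating the 12-month outcome coding.
CUT_POINT_MONTHS = 12  # cut point stated in the text


def survival_label(months_survived: float,
                   cut_point: float = CUT_POINT_MONTHS) -> int:
    """Return 1 for a survivor at the cut point and 0 for a death before it.

    Assumption: surviving exactly `cut_point` months counts as survival.
    """
    if months_survived < 0:
        raise ValueError("months_survived must be non-negative")
    return 1 if months_survived >= cut_point else 0


# Example with made-up survival times (in months), not study data.
example_months = [3.0, 9.5, 12.0, 20.0]
example_labels = [survival_label(m) for m in example_months]
print(example_labels)
```

The resulting 0/1 label is the binary target that both classifiers would be asked to predict.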

Classifier models

The Cox model is a statistical technique for exploring the relationship between patient survival and several explanatory variables. A univariate Cox model was applied to each feature group to select important features and predict survival. The hazard ratio relates to the risk of death, so an increased hazard ratio indicates a worse prognosis. Features were chosen based on a significant P value, using a stepwise calculation method. The other method was the decision tree, one of the most powerful tools for classification and prediction. It has a top-down hierarchical structure that produces rules. Each tree is composed of a series of nodes and leaves: nodes represent input features, and leaves show the output class.[10] C5.0, a well-known decision tree algorithm, was applied. The data were split into two groups, training and test, and 10-fold cross-validation was used for evaluation. The evaluation criteria were accuracy, sensitivity, and specificity, which are defined from the confusion matrix; the ROC curve was then calculated. All models were built with IBM SPSS Modeler 14.2 (IBM Inc, New York, USA).
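As an illustration of how the three evaluation criteria follow from the confusion matrix, the sketch below computes them in plain Python. It is not the SPSS Modeler procedure used in the study. The function names are hypothetical, and treating death (code 0) as the positive class is an assumption based on the figure captions, which report the ROC curves "for death status".

```python
# Illustrative only: accuracy, sensitivity, and specificity from a
# confusion matrix. Death (code 0) as the positive class is an assumption.

def confusion_counts(y_true, y_pred, positive=0):
    """Count true/false positives and negatives for a binary outcome."""
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive:
            if truth == positive:
                tp += 1
            else:
                fp += 1
        else:
            if truth == positive:
                fn += 1
            else:
                tn += 1
    return tp, fp, tn, fn


def evaluate(y_true, y_pred, positive=0):
    """Return accuracy, sensitivity, and specificity as fractions."""
    tp, fp, tn, fn = confusion_counts(y_true, y_pred, positive)
    total = tp + fp + tn + fn
    return {
        "accuracy": (tp + tn) / total if total else float("nan"),
        "sensitivity": tp / (tp + fn) if (tp + fn) else float("nan"),
        "specificity": tn / (tn + fp) if (tn + fp) else float("nan"),
    }


# Toy labels (0 = death, 1 = survivor); these are not study data.
toy_true = [0, 0, 0, 1, 1, 1, 1, 0]
toy_pred = [0, 0, 1, 1, 1, 0, 1, 0]
toy_metrics = evaluate(toy_true, toy_pred)
print(toy_metrics)
```

In this toy example there are 3 true positives, 1 false negative, 3 true negatives, and 1 false positive, so each of the three metrics equals 0.75. Under 10-fold cross-validation, the same counts would be accumulated over the held-out folds.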

Results

This study included 55 patients (29 men and 26 women) with GBM Grade 4. The average age of the patients was 54.7 years, with a median of 56. About 43.6% of the selected patients survived more than 1 year. The median survival was 275 days. Figure 1 shows the survival curve of the patients.
Figure 1

Survival curve of patients. Axis X is total months of survival and axis Y shows the cumulative survival rate

Using the Cox method with the clinical features only, the accuracy, sensitivity, and specificity were 32.73%, 22.5%, and 45.83%, respectively. The area under the ROC curve was 0.3656. Table 2 shows the effective features according to the Cox method, which include age, KPS, and radiation therapy effects.
Table 2

Effective clinical feature based on Cox method

When the Cox method was applied to the combined clinical and imaging features, accuracy, sensitivity, and specificity were 60%, 48.58%, and 75%, respectively. In addition, the area under the ROC curve was 0.6089. Table 3 presents the effective features based on the Cox method, including radiotherapy effects, tumor margins, satellite, KPS, age, and tumor width. Figure 2 shows the ROC curves for both feature sets.
Table 3

Effective clinical and imaging feature based on Cox method

Figure 2

Receiver operating characteristic curve for clinical features and combining clinical and imaging features for Cox method. Axis X and Y are specificity and sensitivity, respectively, for death status. Upper receiver operating characteristic curve relates to combining clinical and imaging features and lower curve is receiver operating characteristic of clinical features

Accuracy, sensitivity, and specificity of the C5.0 decision tree model based on the clinical features were 72.73%, 79.2%, and 67.7%, respectively. The AUC was 0.7964. Figure 3 shows a graphical view of the C5.0 decision tree, and Table 3 represents the probability of the clinical effectiveness of different features. According to the presented data, the KPS is the most important feature.
Figure 3

A graphical view of the decision tree for clinical features

Accuracy, sensitivity, and specificity of the C5.0 decision tree model based on the combination of clinical and imaging features were 90.91%, 87.5%, and 93.5%, respectively. Table 4 gives the probability of the clinical effectiveness of different features for C5.0. The AUC was 0.9556. Figure 4 shows a graphical view of the tree. In addition, Table 5 shows the characteristics of this combination. Tumor width is the most effective feature in this setting.
Table 4

The probability of the clinical effectiveness of different features

Figure 4

A graphical view of the decision tree C5.0 for combining clinical and imaging features

Table 5

The probability of the both imaging and clinical effectiveness of different features

Figure 5 shows the ROC curves for the two groups of features (clinical features alone and clinical features combined with imaging features).
Figure 5

Receiver operating characteristic curve for clinical features and combining clinical and imaging features for C5.0. Axis X and Y show specificity and sensitivity, respectively, for death status. Upper receiver operating characteristic curve relates to combining clinical and imaging features and lower curve is receiver operating characteristic of clinical features


Discussion

In this study, 12-month survival in patients with GBM was predicted using the Cox statistical method and the C5.0 decision tree. The models were built on the histological data of 55 local patients with GBM. Comparison of accuracy, sensitivity, and specificity indicated that the C5.0 decision tree model, applied either to the clinical features alone or to the imaging and clinical features together as covariates, gave additional predictive value and better results. In addition, tumor width and KPS score were identified as the most important parameters for survival prediction in these patients. Adding the imaging features to the survival models improved their predictive power. The results of this study showed that modeling accuracy increases when imaging features are added, as previously reported by Mazurowski et al.[15] Furthermore, using machine learning methods helps to achieve a more accurate prediction model. Although the number of patients in this investigation was smaller than in other studies, its results correlate well with those obtained by Zacharaki et al.[16] They used the J48 decision tree and reported an average classification accuracy of 85.1%, while the average classification accuracy of this study is about 92.7%.

Conclusion

This study showed that the C5.0 decision tree could be a suitable candidate for predicting survival in patients with GBM. Furthermore, using imaging features can help to build a more accurate model. Further data sets, the inclusion of genetic characteristics, and the use of other imaging features or other machine learning methods could lead to better and more accurate results.

Financial support and sponsorship

Nil.

Conflicts of interest

There are no conflicts of interest.
  11 in total

1.  Terminal cancer. duration and prediction of survival time.

Authors:  J Llobera; M Esteva; J Rifà; E Benito; J Terrasa; C Rojas; O Pons; G Catalán; A Avellà
Journal:  Eur J Cancer       Date:  2000-10       Impact factor: 9.162

Review 2.  Epidemiology of clinical medicine.

Authors:  P K Whelton; L Gordis
Journal:  Epidemiol Rev       Date:  2000       Impact factor: 6.222

3.  Prediction of survival in thyroid cancer using data mining technique.

Authors:  M Jajroudi; T Baniasadi; L Kamkar; F Arbabi; M Sanei; M Ahmadzade
Journal:  Technol Cancer Res Treat       Date:  2013-11-04

4.  MR imaging correlates of survival in patients with high-grade gliomas.

Authors:  Whitney B Pope; James Sayre; Alla Perlina; J Pablo Villablanca; Paul S Mischel; Timothy F Cloughesy
Journal:  AJNR Am J Neuroradiol       Date:  2005 Nov-Dec       Impact factor: 3.825

Review 5.  Five common cancers in Iran.

Authors:  Shadi Kolahdoozan; Alireza Sadjadi; Amir Reza Radmard; Hooman Khademi
Journal:  Arch Iran Med       Date:  2010-03       Impact factor: 1.354

6.  Survival analysis of patients with high-grade gliomas based on data mining of imaging variables.

Authors:  E I Zacharaki; N Morita; P Bhatt; D M O'Rourke; E R Melhem; C Davatzikos
Journal:  AJNR Am J Neuroradiol       Date:  2012-02-09       Impact factor: 3.825

7.  Recursive partitioning analysis of prognostic factors in three Radiation Therapy Oncology Group malignant glioma trials.

Authors:  W J Curran; C B Scott; J Horton; J S Nelson; A S Weinstein; A J Fischbach; C H Chang; M Rotman; S O Asbell; R E Krisch
Journal:  J Natl Cancer Inst       Date:  1993-05-05       Impact factor: 13.506

8.  Survival prediction in terminally ill cancer patients by clinical estimates, laboratory tests, and self-rated anxiety and depression.

Authors:  Stephan Gripp; Sibylle Moeller; Edwin Bölke; Gerd Schmitt; Christiane Matuschek; Sonja Asgari; Farzin Asgharzadeh; Stephan Roth; Wilfried Budach; Matthias Franz; Reinhardt Willers
Journal:  J Clin Oncol       Date:  2007-08-01       Impact factor: 44.544

9.  Imaging descriptors improve the predictive power of survival models for glioblastoma patients.

Authors:  Maciej Andrzej Mazurowski; Annick Desjardins; Jordan Milton Malof
Journal:  Neuro Oncol       Date:  2013-02-07       Impact factor: 12.300

Review 10.  Machine learning applications in cancer prognosis and prediction.

Authors:  Konstantina Kourou; Themis P Exarchos; Konstantinos P Exarchos; Michalis V Karamouzis; Dimitrios I Fotiadis
Journal:  Comput Struct Biotechnol J       Date:  2014-11-15       Impact factor: 7.271
