Amirarash Kashef, Toktam Khatibi, Azim Mehrvar.
Abstract
BACKGROUND: Acute Lymphoblastic Leukemia (ALL) is the most common blood disease in children and is responsible for the most deaths among children. Owing to major improvements in treatment protocols over the past 50 years, survival has risen dramatically and now stands at about 90 percent. Many investigations into the efficacy of cranial radiotherapy have found that, without it, patient outcomes did not change and in some cases even improved.
Keywords: Acute lymphoblastic leukemia (ALL); Cranial Radiotherapy; Stacked ensemble; childhood blood cancer; prediction
Year: 2020 PMID: 33247677 PMCID: PMC8033115 DOI: 10.31557/APJCP.2020.21.11.3211
Source DB: PubMed Journal: Asian Pac J Cancer Prev ISSN: 1513-7368
Figure 1. Flowchart of the Main Steps Used in the Research Methodology
Figure 2. Sex Distribution of ALL Patients in the Data
Figure 3. Distribution of Age at Diagnosis in Our Data
Evaluation Metrics Achieved from Prediction Algorithms
| Prediction algorithm | AUC (test set) | Threshold (max accuracy) | Max Accuracy (test set) | Threshold (max precision) | Max Precision (test set) | Threshold (max recall) | Max Recall (test set) | AUC |
|---|---|---|---|---|---|---|---|---|
| GBM | 0.8659 | 0.1946 | 87.38% | 0.1946 | 100% | 0.0473 | 100% | 0.89 |
| GLM | 0.8347 | 0.1368 | 88.35% | 0.1819 | 100% | 0.0843 | 100% | 0.785 |
| DRF | 0.8483 | 0.2276 | 85.44% | 0.072 | 45% | 0.0272 | 100% | 0.803 |
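Each "Threshold" column reports the probability cutoff at which the adjacent metric peaks. As a minimal sketch of how such a cutoff can be found, the snippet below sweeps candidate thresholds and keeps the one with the highest accuracy. The scores and labels are hypothetical toy values, not the study data, and the function names are illustrative only.

```python
def accuracy_at(threshold, probs, labels):
    """Fraction of cases classified correctly when predicting 1 for prob >= threshold."""
    correct = sum((p >= threshold) == bool(y) for p, y in zip(probs, labels))
    return correct / len(labels)


def best_threshold(probs, labels):
    """Try every distinct predicted probability as a cutoff; return (threshold, accuracy)."""
    candidates = sorted(set(probs))
    return max(
        ((t, accuracy_at(t, probs, labels)) for t in candidates),
        key=lambda pair: pair[1],
    )


# Hypothetical toy scores and outcomes (assumption: NOT taken from the paper).
toy_probs = [0.05, 0.10, 0.15, 0.20, 0.30, 0.60, 0.80]
toy_labels = [0, 0, 1, 0, 0, 1, 1]

threshold, accuracy = best_threshold(toy_probs, toy_labels)
print(f"best threshold = {threshold}, accuracy = {accuracy:.4f}")
```

The same sweep with precision or recall in place of accuracy would give the other threshold columns, which is why one model can have a different cutoff for each metric.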
Best Suited Hyper Parameters for Each of the Prediction Algorithms
| Gradient Boosting Machine (GBM) | Random Forest (DRF) | Generalized Linear Model (GLM) |
|---|---|---|
| Learn rate = 0.01 | Ntrees = 100 | Alpha = 0.0186 |
| Sample rate = 0.8 | Mtries = 7 | Lambda = 0.963477 |
| Ntrees = 50 | Max depth = 7 | |
| Col sample rate = 1 | Sample rate = 0.2 | |
| Max depth = 3 | | |
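The table does not define Alpha and Lambda for the GLM. Assuming they follow the common elastic-net convention, in which Lambda scales the overall penalty and Alpha mixes the L1 and L2 terms, the reported values can be read with the sketch below. The coefficient vector is hypothetical and serves only to show the arithmetic.

```python
def elastic_net_penalty(coefs, alpha, lam):
    """Elastic-net penalty: lam * (alpha * ||b||_1 + (1 - alpha) / 2 * ||b||_2^2)."""
    l1 = sum(abs(b) for b in coefs)
    l2_squared = sum(b * b for b in coefs)
    return lam * (alpha * l1 + (1.0 - alpha) / 2.0 * l2_squared)


# Values reported in the hyperparameter table for the GLM.
ALPHA = 0.0186
LAMBDA = 0.963477

# Hypothetical coefficients (assumption: for illustration only).
hypothetical_coefs = [1.0, -2.0]

penalty = elastic_net_penalty(hypothetical_coefs, ALPHA, LAMBDA)
print(f"penalty = {penalty:.4f}")
```

Under this assumed parameterization, an Alpha close to 0 means the penalty is dominated by the ridge (L2) term, so coefficients are shrunk rather than driven to exactly zero.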
Evaluation Metrics Achieved from Stacked Ensemble Models
| Prediction algorithms | AUC | Threshold (max accuracy) | Max Accuracy (test set) | Threshold (max precision) | Max Precision (test set) | Threshold (max recall) | Max Recall (test set) | AUC |
|---|---|---|---|---|---|---|---|---|
| GBM & GLM | 0.8142 | 0.4058 | 89.32% | 0.4058 | 80% | 0.00065 | 100% | 0.8571 |
| GBM & DRF | 0.8752 | 0.3095 | 90.29% | 0.6881 | 100% | 0.000931 | 100% | 0.8214 |
| GLM & DRF | 0.8338 | 0.0397 | 85.44% | 0.000259 | 45.45% | 0.000001 | 100% | 0.7857 |
| GBM & GLM & DRF | 0.8732 | 0.4122 | 91.26% | 0.752138 | 100% | 0.000766 | 100% | 0.8214 |
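Which stacked ensemble counts as "best" depends on the metric used for ranking. The short sketch below ranks the four combinations using only the first AUC column and the maximum-accuracy column from the table above; the dictionary layout and key names are illustrative choices.

```python
# Values copied from the stacked-ensemble table (first AUC column, max accuracy in %).
stacked = {
    "GBM & GLM": {"auc": 0.8142, "max_accuracy": 89.32},
    "GBM & DRF": {"auc": 0.8752, "max_accuracy": 90.29},
    "GLM & DRF": {"auc": 0.8338, "max_accuracy": 85.44},
    "GBM & GLM & DRF": {"auc": 0.8732, "max_accuracy": 91.26},
}

# Pick the combination with the highest value under each criterion.
best_by_auc = max(stacked, key=lambda name: stacked[name]["auc"])
best_by_accuracy = max(stacked, key=lambda name: stacked[name]["max_accuracy"])

print("best by AUC:", best_by_auc)
print("best by max accuracy:", best_by_accuracy)
```

By the first AUC column the GBM and DRF pair ranks highest, while the three-model ensemble has the highest maximum accuracy. That may be why the GBM and DRF pair is the one described as the best stacked model, although the record itself does not state the selection criterion.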
Metalearner Parameters of the Best Stacked Model (GBM and DRF)
| Metalearner algorithm | Metalearner fold assignment |
|---|---|
| "gbm" | "Random" |
Confusion Matrix Resulting from the GBM and DRF Stacked Ensemble
| Confusion Matrix | 0 | 1 | Error | Rate |
|---|---|---|---|---|
| 0 | 75 | 14 | 0.1573 | 14/89 |
| 1 | 2 | 12 | 0.1428 | 2/14 |
| Total | 77 | 26 | 0.1553 | 16/103 |
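The error figures in the confusion matrix can be checked directly from the four cell counts. The sketch below assumes that rows are actual classes and columns are predicted classes, which is consistent with the per-row denominators of 89 and 14. The precision and recall values it derives for class 1 are computed here and are not reported in the table.

```python
# Cell counts from the confusion matrix: matrix[actual][predicted].
# Assumption: rows are actual classes, columns are predicted classes.
matrix = {0: {0: 75, 1: 14}, 1: {0: 2, 1: 12}}


def row_error(actual):
    """Return (misclassified, total) for one actual class."""
    row = matrix[actual]
    wrong = sum(n for predicted, n in row.items() if predicted != actual)
    return wrong, sum(row.values())


errors = {cls: row_error(cls) for cls in matrix}
total_wrong = sum(wrong for wrong, _ in errors.values())
total = sum(n for _, n in errors.values())

overall_error = total_wrong / total
accuracy = 1 - overall_error

# Class 1 metrics derived from the same counts.
true_pos = matrix[1][1]
recall = true_pos / sum(matrix[1].values())          # 12 of 14 actual positives
precision = true_pos / (matrix[0][1] + matrix[1][1])  # 12 of 26 predicted positives

for cls, (wrong, n) in errors.items():
    print(f"class {cls}: {wrong}/{n} = {wrong / n:.4f}")
print(f"overall error: {total_wrong}/{total} = {overall_error:.4f}")
print(f"accuracy = {accuracy:.4f}, recall = {recall:.4f}, precision = {precision:.4f}")
```

At this particular threshold the ensemble misses only 2 of the 14 class 1 cases, at the cost of 14 false positives among the 89 class 0 cases.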
Top 10 Variables Derived from GBM Prediction Model
| Rank | Variable |
|---|---|
| 1 | "Relapse" |
| 2 | "Cell type" |
| 3 | "Age at diagnosis" |
| 4 | "Platelets" |
| 5 | "Risk group" |
| 6 | "fever" |
| 7 | "Pneumonia" |
| 8 | "Hemoglobin" |
| 9 | "Immunocompromised condition" |
| 10 | "Red Blood Cells" |