YiMing Chen1, Wei Cao1, XianChao Gao1, HuiShan Ong1, Tong Ji2.
Abstract
BACKGROUND: Head and neck squamous cell carcinoma (HNSCC) has a high incidence in elderly patients. Postoperative complications pose great challenges in treatment and are difficult to predict early.
Year: 2015 PMID: 26054335 PMCID: PMC4459053 DOI: 10.1186/s12911-015-0165-3
Source DB: PubMed Journal: BMC Med Inform Decis Mak ISSN: 1472-6947 Impact factor: 2.796
The information gain (IG) of 44 variables
| Variable | IG | Variable | IG | Variable | IG | Variable | IG |
|---|---|---|---|---|---|---|---|
| Age | 4.427 | GCS | 0.000 | Respiratory function | 2.971 | The chronic diseases | 17.430 |
| Cardiac function | 2.908 | Operation type | 1.863 | Levels of sodium | 1.373 | Smoking | 1.813 |
| Respiratory function | 2.971 | Number of procedures | 2.009 | Potassium | 4.144 | Alcoholism | 1.170 |
| ECG | 3.388 | Operative blood loss | 13.551 | Creatinine | 0.928 | Preoperative radiotherapy | 1.053 |
| Systolic BP | 3.72 | Peritoneal contamination | 1.661 | Hematocrit | 0.980 | Preoperative chemotherapy | 1.465 |
| Pulse rate | 2.981 | Malignancy | 2.477 | Preoperative WBC | 4.002 | Preoperative surgery | 2.099 |
| Hemoglobin levels | 4.511 | Confidential Enquiry into Perioperative Deaths | 0.015 | Operation time | 17.060 | Primary/recurrence | 1.745 |
| WBC count | 4.002 | Rectal temperature | 0.010 | Diameter of the tumor | 10.935 | Surgery classification | 16.009 |
| Levels of blood urea nitrogen | 0.927 | Mean arterial pressure | 2.103 | Amount of blood loss | 13.551 | Clinical grading | 8.2800 |
| Sodium | 3.291 | Blood pH | 2.372 | Hematocrit of POD1 | 1.983 | Pathological grading | 4.998 |
| Potassium | 5.370 | Heart rate | 2.981 | Glucose of POD1 | 9.560 | Preoperative blood urea (g/L) | 0.913 |
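The record does not say how the information gain values were computed or scaled. Several tabulated values are above 1, which is more than the one-bit maximum for a two-class outcome, so they were presumably reported on a different scale. As a minimal sketch of the textbook definition only (entropy of the outcome minus the entropy remaining after splitting on a variable), assuming a categorical variable and a binary complication label, with hypothetical toy data:

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a sequence of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


def information_gain(feature_values, labels):
    """Entropy of the labels minus the weighted entropy within each feature value."""
    total = len(labels)
    groups = {}
    for value, label in zip(feature_values, labels):
        groups.setdefault(value, []).append(label)
    remainder = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - remainder


# Hypothetical toy data, not taken from the study.
complication = [1, 1, 0, 0, 1, 1, 0, 0]
informative = ["yes", "yes", "no", "no", "yes", "yes", "no", "no"]
uninformative = ["a", "b", "a", "b", "a", "b", "a", "b"]

print(information_gain(informative, complication))
print(information_gain(uninformative, complication))
```

A variable that separates the two outcome classes perfectly recovers the full entropy of the outcome, while a variable that is independent of the outcome gains nothing. Continuous variables such as age would need to be binned before this definition applies.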
Clinicopathological characteristics of the patients in the training set
| Characteristic | No.(%) of patients |
|---|---|
| Gender | Total: 513 |
| Male | 266(52.0 %) |
| Female | 247(48.0 %) |
| Primary/recurrence | Total: 513 |
| Primary | 342(66.6 %) |
| Recurrence | 171(33.3 %) |
| Co-morbidities | Total: 173 |
| High blood pressure | 45(26.6 %) |
| Myocardial infarction | 34(20.0 %) |
| COPD | 11(6.4 %) |
| Bronchitis | 13(7.5 %) |
| Diabetes | 31(17.9 %) |
| Hypothyroidism | 10(5.8 %) |
| Hyperthyroidism | 3(1.7 %) |
| Agitation | 5(2.9 %) |
| Delirium | 3(1.7 %) |
| Metabolism | 6(3.5 %) |
| Acid reflux | 6(3.5 %) |
| Renal failure | 6(3.5 %) |
| Pathological grading | Total: 513 |
| I | 148(28.8 %) |
| II | 262(51.0 %) |
| III | 113(22.0 %) |
| Unidentified | 47(9.0 %) |
| Clinical grading | Total: 513 |
| T1 | 148(%) |
| T2 | 262(%) |
| T3 | 47(%) |
| T4 | 56(%) |
| Region | Total: 513 |
| Upper lip | 5(1.0 %) |
| Lower lip | 15(2.9 %) |
| Maxillofacial | 17(3.3 %) |
| Floor of mouth | 33(6.4 %) |
| Tongue | 122(23.4 %) |
| Oral-pharyngeal | 12(2.5 %) |
| Hypopharynx | 12(2.5 %) |
| Neck | 49(9.6 %) |
| Maxilla | 52(10.1 %) |
| Mandible | 123(24.0 %) |
| Buccal mucosa | 65(12.7 %) |
| Parotid gland | 6(1.2 %) |
| Surgical classification | Total: 513 |
| IO | 149(30.9 %) |
| JO | 261(50.1 %) |
| SMO | 103(20.0 %) |
| Postoperative complications | Total: 292 |
| Wound infection | 35(12.0 %) |
| Free flap infection | 23(4.5 %) |
| Edema | 46(15.8 %) |
| Wound dehiscence | 40(13.7 %) |
| Hematoma with re-exploration | 15(5.5 %) |
| Hematoma without re-exploration | 10(3.4 %) |
| Partial flap necrosis | 17(5.8 %) |
| Total flap necrosis | 4(1.4 %) |
| Pneumonia | 19(4.8 %) |
| Pulmonary embolus | 1(0.3 %) |
| Salivary fistula | 16(5.5 %) |
| Abdominal discomfort | 5(1.7 %) |
| Haematemesis | 4(1.4 %) |
| Central nerve system co-morbidity | 8(2.7 %) |
| Deep venous thrombosis (DVT) | 7(2.4 %) |
| Angina | 6(2.1 %) |
| Delirium | 8(2.7 %) |
Clinicopathological characteristics of the patients in the external test set
| Characteristic | No.(%) of patients |
|---|---|
| Gender | Total: 12 |
| Male | 8(66.7 %) |
| Female | 4(33.3 %) |
| Primary/recurrence | Total: 12 |
| Primary | 10(83.3 %) |
| Recurrence | 2(16.7 %) |
| Co-morbidities | Total: 12 |
| High blood pressure | 3(25.0 %) |
| Myocardial infarction | 5(41.7 %) |
| Diabetes | 4(33.3 %) |
| Pathological grading | Total: 12 |
| I | 6(50.0 %) |
| II | 4(33.3 %) |
| III | 2(16.7 %) |
| Unidentified | 0(0 %) |
| Clinical grading | Total: 12 |
| T1 | 6(50.5 %) |
| T2 | 5(41.7 %) |
| T3 | 1(8.3 %) |
| T4 | 0(0 %) |
| Region | Total: 12 |
| Floor of mouth | 4(33.3 %) |
| Tongue | 3(25.0 %) |
| Oral-pharyngeal | 1(8.3 %) |
| Neck | 1(8.3 %) |
| Mandible | 1(8.3 %) |
| Parotid gland | 2(16.7 %) |
| Surgical classification | Total: 12 |
| IO | 6(50.0 %) |
| JO | 2(16.7 %) |
| SMO | 4(33.3 %) |
| Postoperative complications | Total: 6 |
| Wound infection | 3(25.0 %) |
| Partial flap necrosis | 1(8.3 %) |
| Total flap necrosis | 1(8.3 %) |
| Pulmonary embolus | 1(8.3 %) |
The preoperative variables were evaluated by P and F values with the SPSS 17.0 toolkit
| Variable | Postoperative complication group | Non-postoperative complication group | F | P |
|---|---|---|---|---|
| Preoperative heart rate (beats/min) | 75.25 ± 10.42 | 76.55 ± 12.13 | 1.298 | 0.132 |
| Preoperative systolic pressure (mmHg) | 149.14 ± 22.62 | 146.38 ± 23.36 | 1.358 | 0.843 |
| Preoperative white blood cell count (×10⁹/L) | 6.04 ± 1.86 | 6.46 ± 2.14 | 2.289 | |
| Preoperative hemoglobin (g/L) | 127.00 ± 15.44 | 128.03 ± 16.93 | 0.713 | 0.712 |
| Preoperative serum sodium (mmol/L) | 143.17 ± 8.106 | 135.20 ± 6.00 | 1.136 | 0.095 |
| Preoperative blood potassium (mmol/L) | 3.30 ± 0.57 | 3.34 ± 0.62 | 0.628 | 0.410 |
| Preoperative blood sugar (mmol/L) | 5.13 ± 1.01 | 5.23 ± 1.10 | 0.942 | 0.957 |
| Preoperative blood urea (g/L) | 5.01 ± 1.53 | 5.52 ± 8.78 | 0.888 | |
The two variables that were finally selected are highlighted
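The record does not say which SPSS procedure produced the F values. One common reading is a one-way ANOVA comparing the complication and non-complication groups; that is an assumption here, not something the source confirms. Under that assumption, the statistic is the between-group mean square divided by the within-group mean square, sketched below on hypothetical measurements:

```python
from statistics import mean


def one_way_f(*groups):
    """One-way ANOVA F statistic: between-group mean square over within-group mean square."""
    all_values = [x for g in groups for x in g]
    grand_mean = mean(all_values)
    k, n = len(groups), len(all_values)
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    return (ss_between / (k - 1)) / (ss_within / (n - k))


# Hypothetical values for one variable in the two patient groups.
complication_group = [1.0, 2.0, 3.0]
no_complication_group = [2.0, 3.0, 4.0]
print(one_way_f(complication_group, no_complication_group))
```

The P value then comes from the F distribution with (k − 1, n − k) degrees of freedom, which a statistics package would supply.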
The operative variables were evaluated by P and F values through an SPSS 17.0 toolkit
| Variable | Postoperative complication group | Non-postoperative complication group | F | P |
|---|---|---|---|---|
| Diameter of tumor (cm) | 2.65 ± 1.54 | 3.49 ± 1.86 | 5.526 | |
| The amount of blood loss (ml) | 239.73 ± 231.40 | 606.29 ± 356.31 | 13.084 | |
| Operation time (h) | 3.19 ± 2.06 | 5.74 ± 2.69 | 11.996 | |
The three variables that were finally selected are highlighted
The postoperative variables were evaluated by P and F values through SPSS 17.0 toolkit
| Variable | Postoperative complication group | Non-postoperative complication group | F | P |
|---|---|---|---|---|
| Temperature on the first day after operation (°C) | 36.69 ± 0.47 | 36.89 ± 0.57 | 4.376 | 0.062 |
| Heart rate on the first day after operation (beats/min) | 80.19 ± 10.92 | 82.32 ± 10.89 | 2.208 | 0.381 |
| Breathing rate on the first day after operation (breaths/min) | 19.38 ± 2.38 | 19.20 ± 2.33 | 0.868 | 0.412 |
| White blood cell count on the first day after operation (×10⁹/L) | 11.89 ± 4.32 | 13.07 ± 4.09 | 2.916 | 0.075 |
| Serum sodium on the first day after operation (mmol/L) | 135.58 ± 4.59 | 135.56 ± 4.70 | 1.136 | 0.095 |
| Blood potassium on the first day after operation (mmol/L) | 3.36 ± 0.69 | 3.37 ± 0.58 | 0.011 | 0.844 |
| Hematocrit on the first day after operation (L/L) | 0.35 ± 0.04 | 0.33 ± 0.04 | 5.303 | |
| Serum creatinine on the first day after operation (μmol/L) | 72.53 ± 20.32 | 74.97 ± 23.55 | 0.986 | 0.064 |
| Blood sugar on the first day after operation (mmol/L) | 6.83 ± 1.01 | 8.23 ± 1.10 | 0.982 | |
The two variables that were finally selected are highlighted
The other variables are listed and were evaluated by chi-square test and by Fisher's exact test with the SPSS17.0 toolkit
| Variable | Postoperative complication group | Non-postoperative complication group | |
|---|---|---|---|
| Primary | | | |
| Yes | 169 | 165 | |
| No | 84 | 95 | |
| Chronic disease | | | |
| Yes | 29 | 150 | |
| No | 224 | 110 | |
| Smoking | | | |
| Yes | 47 | 64 | |
| No | 206 | 196 | |
| Alcoholism | | | |
| Yes | 32 | 35 | |
| No | 221 | 225 | |
| Preoperative radiotherapy | | | |
| Yes | 14 | 31 | |
| No | 239 | 229 | |
| Preoperative surgery | | | |
| Yes | 75 | 84 | |
| No | 178 | 176 | |
| Preoperative chemotherapy | | | |
| Yes | 14 | 30 | |
| No | 239 | 230 | |
| Surgery classification | | | |
| IO | 119 | 30 | |
| MO | 125 | 136 | |
| JO | 9 | 94 | |
| Clinical Stage (TNM) | | | |
| T1 | 84 | 64 | |
| T2 | 84 | 95 | |
| T3 | 23 | 24 | |
| Pathology Stage | | | |
| I | 84 | 64 | |
| II | 125 | 137 | |
| III | 23 | 24 | |
The 10 variables that were finally selected are highlighted
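For the categorical variables, the chi-square test compares the observed counts with the counts expected if group membership and the variable were independent. The following is a minimal sketch of the Pearson statistic without continuity correction; whether SPSS applied a correction in the study is not stated, so the value it prints need not match the authors' result. The counts are the "Smoking" rows of the table above.

```python
def chi_square(table):
    """Pearson chi-square statistic for an r x c contingency table (no continuity correction)."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence of rows and columns.
            expected = row_totals[i] * col_totals[j] / grand_total
            stat += (observed - expected) ** 2 / expected
    return stat


# Rows: smoking yes / no. Columns: complication / non-complication group.
smoking = [[47, 64], [206, 196]]
print(round(chi_square(smoking), 3))
```

The statistic has (r − 1)(c − 1) degrees of freedom, so 1 for a 2 × 2 table. Fisher's exact test is the usual alternative when expected counts are small.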
To find the accuracy of each predictive model under the different variable systems, 5-fold cross-validation was used
| Variable system (accuracy per algorithm) | Support Vector Machine (SVM) | Random Forest (RF) | Rotation Forest (ROF) | Bayesian Network (BN) | Naïve Bayesian Network (NBN) |
|---|---|---|---|---|---|
| Allᵃ | 83.431 % | 89.084 % | 85.965 % | 82.261 % | 75.634 % |
| Newᵇ | 81.676 % | 87.135 % | 83.041 % | 77.778 % | 73.489 % |
| POSSUMᵇ | 79.337 % | 82.261 % | 79.142 % | 74.074 % | 71.929 % |
| APACHE IIᵇ | 75.439 % | 76.420 % | 76.023 % | 73.294 % | 75.829 % |
The predictive model based on the Random Forest algorithm has the best accuracy of 89.084 % under the “All variables” system
ᵃ All variables included in the study
ᵇ Variables included in the model
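The 5-fold cross-validation behind these accuracies can be sketched as follows: the patients are split into five folds, each fold is held out once while a model is fitted on the other four, and the held-out accuracies are averaged. This is only an illustration. The splitting scheme, the seed, and the majority-class stand-in classifier are assumptions, since the record does not describe the authors' tooling or whether folds were stratified.

```python
import random


def k_fold_indices(n_samples, k=5, seed=0):
    """Shuffle sample indices and deal them into k near-equal folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]


def cross_val_accuracy(fit, predict, X, y, k=5, seed=0):
    """Mean held-out accuracy over k folds."""
    scores = []
    for held_out in k_fold_indices(len(X), k, seed):
        held = set(held_out)
        train = [i for i in range(len(X)) if i not in held]
        model = fit([X[i] for i in train], [y[i] for i in train])
        correct = sum(predict(model, X[i]) == y[i] for i in held_out)
        scores.append(correct / len(held_out))
    return sum(scores) / k


# Hypothetical stand-in classifier: always predicts the training fold's majority class.
def fit_majority(X_train, y_train):
    return max(set(y_train), key=y_train.count)


def predict_majority(model, x):
    return model


# 513 matches the training-set size in the record; features and labels are toy values.
X = list(range(513))
y = [1 if i % 3 else 0 for i in X]
print(cross_val_accuracy(fit_majority, predict_majority, X, y))
```

Swapping `fit_majority` and `predict_majority` for a real learner (for example a random forest from a machine learning library) leaves the evaluation loop unchanged.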
Fig. 1 There are 5 ROC curves that illustrate the reliability of the predictive models based on 5 data mining algorithms under the “All Variable” system. The light blue curve signifies the predictive model based on the Random Forest algorithm, which has the largest AUC value. All AUC values are shown in Table 9. SVM = Support Vector Machine
The AUC of each predictive model under the “All Variable” system. The predictive model based on the random forest algorithm has the largest AUC value, 0.949
| Variable | AUC |
|---|---|
| All Random Forest | 0.949 |
| All Rotation Forest | 0.942 |
| All SVM | 0.930 |
| All Bayesian Network | 0.905 |
| All Naïve Bayesian | 0.865 |
All = “All Variable” system
SVM = “Support Vector Machine” algorithm
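The AUC summarizes a ROC curve as the probability that a randomly chosen patient with a complication receives a higher predicted score than a randomly chosen patient without one. A minimal sketch of that pairwise definition follows, with hypothetical scores and labels. It is quadratic in the number of patients, which is acceptable at this data size but not how large-scale tools compute it.

```python
def auc(scores, labels):
    """AUC as the fraction of positive/negative pairs ranked correctly (ties count half)."""
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in positives for n in negatives)
    return wins / (len(positives) * len(negatives))


# Hypothetical predicted complication scores and true outcomes (1 = complication).
scores = [0.9, 0.3, 0.8, 0.1]
labels = [1, 1, 0, 0]
print(auc(scores, labels))
```

An AUC of 0.5 corresponds to random ranking and 1.0 to perfect separation of the two groups.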
Fig. 2 There are 4 ROC curves that illustrate the reliability of the predictive models based on the Random Forest algorithm under 4 different variable systems. The light blue curve signifies the predictive model based on the Random Forest algorithm under the “All Variable” system, which has the largest AUC value. All AUC values are shown in Table 10. All = “All Variable” system, New = “New Variable” system, Pos = “POSSUM Variable” system, Apa = “APACHE II Variable” system, RF = Random Forest algorithm
The AUC of the predictive models based on the random forest algorithm under 4 different variable systems. The predictive model under the “All Variable” system has the largest AUC value, 0.949
| Variable | AUC |
|---|---|
| All RF | 0.949 |
| New RF | 0.944 |
| POSSUM RF | 0.878 |
| APACHE II RF | 0.794 |
All = “All Variable” system
New = “New Variable” system
POSSUM = “POSSUM Variable” system
APACHE II = “APACHE II Variable” system
RF = random forest algorithm
Comparison between the training variable set and the external variable set
| Variable set (Training) | CA (Accuracy, %) | AUC |
|---|---|---|
| All Random Forest | 89.084 % | 0.949 |
| All Rotation Forest | 85.965 % | 0.942 |
| All SVM | 83.431 % | 0.930 |
| All Bayesian Network | 82.261 % | 0.905 |
| All Naïve Bayesian Network | 75.634 % | 0.865 |
| Variable set (external) | ||
| All Random Forest | 83.333 % | 0.781 |
All = “All Variable” system
SVM = support vector machine