Yi-Ting Lin, Michael Tian-Shyug Lee, Yen-Chun Huang, Chih-Kuang Liu, Yi-Tien Li, Mingchih Chen.
Abstract
Research has failed to resolve the dilemma experienced by localized prostate cancer patients who must choose between radical prostatectomy (RP) and external beam radiotherapy (RT). Because the Charlson Comorbidity Index (CCI) is a measurable factor that affects survival events, this research seeks to validate the potential of the CCI to improve the accuracy of various prediction models. Thus, we employed the Cox proportional hazard model and machine learning methods, including random forest (RF) and support vector machine (SVM), to model the data of medical records in the National Health Insurance Research Database (NHIRD). In total, 8581 individuals were enrolled, of whom 4879 had received RP and 3702 had received RT. Patients in the RT group were older and exhibited higher CCI scores and higher incidences of some CCI items. Moderate-to-severe liver disease, dementia, congestive heart failure, chronic pulmonary disease, and cerebrovascular disease all increase the risk of overall death in the Cox hazard model. The CCI-reinforced SVM and RF models are 85.18% and 81.76% accurate, respectively, whereas the SVM and RF models without the use of the CCI are relatively less accurate, at 75.81% and 74.83%, respectively. Therefore, CCI and some of its items are useful predictors of overall and prostate-cancer-specific survival and could constitute valuable features for machine-learning modeling.
Year: 2019 PMID: 31428684 PMCID: PMC6698054 DOI: 10.1515/med-2019-0067
Source DB: PubMed Journal: Open Med (Wars)
Figure 1. Flow chart of subject selection
This figure shows the whole procedure for establishing our target population. The dataset for the target population was extracted from the outpatient expense file, hospitalization expense file, TCR file, and cause-of-death file of the NHIRD.
Demographic features among treatment groups.
| Variables | Level | OP (RP) | RT | p-value |
|---|---|---|---|---|
| No. (%) of patients (total 8581) | | 4879 | 3702 | |
| Age | | 65.79 | 74.1 | <.0001* |
| T-stage | 1A | 88 (1.80) | 128 (3.46) | <.0001* |
| | 1B | 96 (1.97) | 156 (4.21) | |
| | 1C | 1512 (30.99) | 936 (25.28) | |
| | 2A | 994 (20.37) | 617 (16.67) | |
| | 2B | 580 (11.89) | 420 (11.35) | |
| | 2C | 1609 (32.98) | 1445 (39.03) | |
| Grade | 1 | 481 (9.86) | 539 (14.56) | <.0001* |
| | 2 | 1903 (39.00) | 1361 (36.76) | |
| | 3 | 2495 (51.14) | 1802 (48.68) | |
| Connective tissue disease | | 98 (2.07) | 63 (1.70) | 0.2995 |
| Mild liver disease | | 166 (3.40) | 105 (2.84) | 0.158 |
| Ulcer disease | | 839 (17.20) | 544 (14.69) | 0.0018* |
| Congestive heart failure | | 171 (3.50) | 214 (5.78) | <.0001* |
| Peripheral vascular disease | | 117 (2.40) | 115 (3.11) | 0.0451* |
| Chronic pulmonary disease | | 629 (12.89) | 593 (16.02) | <.0001* |
| Cerebrovascular disease | | 371 (7.60) | 410 (11.08) | <.0001* |
| Diabetes | | 1079 (22.12) | 657 (17.75) | <.0001* |
| Diabetes with end organ damage | | 4548 (93.22) | 3457 (93.38) | 0.7607 |
| Moderate or severe renal disease | | 229 (4.69) | 199 (5.38) | 0.1507 |
| Metastatic solid tumor | | 130 (2.66) | 51 (1.38) | <.0001* |
| Hemiplegia | | 27 (0.55) | 38 (1.03) | 0.0123* |
| Solid tumor without metastasis | | 289 (5.92) | 300 (8.10) | <.0001* |
| Myocardial infarct | | 44 (0.90) | 45 (1.22) | 0.1554 |
| Dementia | | 47 (0.96) | 97 (2.62) | <.0001* |
| Moderate or severe liver disease | | 3 (0.06) | 0 (0) | 0.1313 |
| Any malignancy, including lymphoma and leukemia, except malignant neoplasm of skin | | 256 (5.25) | 270 (7.29) | 0.0003* |
| AIDS/HIV | | 1 (0.02) | 0 (0) | 0.8447 |
| Overall mortality | | 88 (1.80) | 325 (8.78) | <.0001* |
| Follow-up time (yr) | | 4.11 | 4.17 | 0.2517 |
| Prostate cancer specific mortality | | 24 (0.49) | 79 (2.13) | <.0001* |
| Patients with CCI score equal to | 0 | 2073 (42.49) | 1779 (48.06) | <.0001* |
| | 1 | 1346 (27.59) | 746 (20.15) | |
| | 2 | 571 (11.70) | 404 (10.91) | |
| | 3 | 321 (6.58) | 241 (6.51) | |
| | 4 | 203 (4.16) | 191 (5.16) | |
| | 5 | 127 (2.60) | 126 (3.40) | |
| | 6+ | 238 (4.88) | 215 (5.81) | |
| CCI average | | 1.36 | 1.42 | <.0001* |
Grade 1: Gleason score 2~5; Grade 2: Gleason score 6,7; Grade 3: Gleason score 8~10
*: Statistically significant, p<0.05
Figure 2. Cumulative mortality event curves, stratified by initial definitive treatment, grade, stage, and years
Mortality events are significantly more frequent in high-grade disease and in the RT group.
Grade 1: Gleason score 2~5; Grade 2: Gleason score 6,7; Grade 3: Gleason score 8~10
*: Statistically significant, p<0.05
Hazard ratios of features in the Cox proportional hazards model
| Variables | Overall survival, CCI only: HR | 95% CI | p-value | Overall survival, CCI items: HR | 95% CI | p-value | Specific survival, CCI only: HR | 95% CI | p-value | Specific survival, CCI items: HR | 95% CI | p-value |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Cancer characteristic variables | | | | | | | | | | | | |
| Age | 1.066 | 1.05-1.08 | <.0001 | 1.058 | 1.04-1.07 | <.0001 | 1.047 | 1.01-1.07 | 0.0023 | 1.04 | 1.01-1.07 | 0.0037 |
| T-stage (ref. = 1A) | | | | | | | | | | | | |
| T-stage 1B | 1.157 | 0.54-2.45 | 0.2798 | 1.239 | 0.58-2.63 | 0.5775 | 2.922 | 0.35-24.35 | 0.3216 | 1.775 | 0.23-1.24 | 0.2811 |
| T-stage 1C | 0.697 | 0.36-1.34 | 0.3923 | 0.72 | 0.37-1.39 | 0.3296 | 1.829 | 0.24-13.53 | 0.5545 | 2.125 | 0.28-15.96 | 0.5756 |
| T-stage 2A | 0.748 | 0.38-1.45 | 0.6994 | 0.785 | 0.40-1.53 | 0.4786 | 2.157 | 0.28-16.07 | 0.4532 | 2.032 | 0.26-15.56 | 0.4639 |
| T-stage 2B | 0.875 | 0.44-1.77 | 0.4872 | 0.929 | 0.47-1.83 | 0.8314 | 2.118 | 0.27-16.11 | 0.4686 | 1.564 | 0.21-11.55 | 0.4949 |
| T-stage 2C | 0.796 | 0.41-1.51 | 0.2798 | 0.833 | 0.43-1.59 | 0.5803 | 1.597 | 0.21-11.70 | 0.645 | 1.775 | | 0.6614 |
| Grade (ref. = 1) | | | | | | | | | | | | |
| Grade 2 | 4.657 | 1.71-12.67 | 0.0026 | 4.496 | 1.65-12.24 | 0.0033 | 92.. | 0 | 0.9792 | 10.. | 0 | 0.9798 |
| Grade 3 | 6.78 | 2.51-18.28 | 0.0002 | 6.463 | 2.39-17.44 | 0.0002 | 23.. | 0 | 0.9778 | 2... | 0 | 0.9785 |
| Different treatment (ref. = RT) | 2.672 | 2.03-3.50 | <.0001 | 2.693 | 2.04-3.54 | <.0001 | 3.144 | 1.85-5.33 | <.0001 | 3.194 | 1.88-5.41 | <.0001 |
| CCI (ref. = 0) | | | | | | | | | | | | |
| CCI 1 | 1.461 | 1.10-1.94 | 0.0088 | | | | 0.79 | 0.46-1.34 | 0.3872 | | | |
| CCI 2 | 1.816 | 1.32-2.49 | 0.0002 | | | | 1.188 | 0.66-2.13 | 0.5634 | | | |
| CCI 3 | 2.015 | 1.40-2.91 | 0.0002 | | | | 0.868 | 0.38-1.96 | 0.7345 | | | |
| CCI 4 | 2.455 | 1.68-3.59 | <.0001 | | | | 1.368 | 0.63-2.96 | 0.4263 | | | |
| CCI 5 | 2.108 | 1.32-3.37 | 0.0018 | | | | 0.486 | 0.11-2.02 | 0.3216 | | | |
| CCI 6+ | 2.767 | 1.94-3.95 | <.0001 | | | | 1.864 | 0.96-3.60 | 0.0645 | | | |
| Connective tissue disease | | | | 0.76 | 0.358-1.61 | 0.4761 | | | | 0 | 0 | 0.9861 |
| Mild liver disease | | | | 1.488 | 0.96-2.30 | 0.0757 | | | | 1.95 | 0.84-4.49 | 0.1167 |
| Ulcer disease | | | | 1.132 | 0.89-1.43 | 0.3052 | | | | 1.13 | 0.69-1.84 | 0.6207 |
| Congestive heart failure | | | | 2.172 | 1.63-2.88 | <.0001 | | | | 1.69 | 1.88-3.22 | 0.1107 |
| Peripheral vascular disease | | | | 0.941 | 0.56-1.55 | 0.814 | | | | 0.815 | 0.25-2.59 | 0.729 |
| Chronic pulmonary disease | | | | 1.644 | 1.32-2.03 | <.0001 | | | | 1.18 | 0.73-1.89 | 0.4907 |
| Cerebrovascular disease | | | | 1.5 | 1.05-1.73 | 0.019 | | | | 1.197 | 0.69-2.05 | 0.5143 |
| Diabetes | | | | 1.105 | 1.85-1.42 | 0.4354 | | | | 0.774 | 0.44-1.34 | 0.361 |
| Diabetes with end organ damage | | | | 1.092 | 0.75-1.57 | 0.6421 | | | | 0.969 | 0.40-2.29 | 0.9429 |
| Moderate or severe renal disease | | | | 1.35 | 0.96-1.9 | 0.0802 | | | | 2.02 | 1.07-3.8 | 0.0283 |
| Metastatic solid tumor | | | | 1.75 | 0.95-3.22 | 0.0718 | | | | 3.87 | 1.65-9.09 | 0.0019 |
| Hemiplegia | | | | 0.828 | 0.33-2.03 | 0.681 | | | | 0.467 | | |
| Solid tumor without metastasis | | | | 1.792 | 0.88-3.63 | 0.1058 | | | | 0.734 | 0.10-5.33 | 0.7602 |
| Myocardial infarct | | | | 0.805 | 0.35-1.83 | 0.6055 | | | | 0.53 | 0.07-3.90 | 0.533 |
| Dementia | | | | 1.874 | 1.26-2.78 | 0.0018 | | | | 1.716 | 0.40-7.35 | 0.7527 |
| Moderate or severe liver disease | | | | 36.78 | 0.96-1.90 | 0.0004 | | | | 0 | 0 | 0.9988 |
| Any malignancy, including lymphoma and leukemia, except malignant neoplasm of skin | | | | 0.692 | 0.32-1.47 | 0.3408 | | | | 0.89 | 0.10-7.35 | 0.9142 |
| AIDS/HIV | | | | 0 | 0 | 0.9706 | | | | 0 | 0 | 0.9982 |
The RP and RT groups were pooled together. The factors that significantly affect prostate-cancer-specific survival and overall survival were identified with the Cox regression model.
Grade 1: Gleason score 2~5; Grade 2: Gleason score 6,7; Grade 3: Gleason score 8~10
*: Statistically significant, p<0.05
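To make the relation between a hazard ratio and its 95% interval concrete, the following is a minimal sketch. It assumes the usual Wald construction, HR = exp(β) with bounds exp(β ± 1.96·SE); the source does not state how its intervals were computed. The coefficient and standard error are hypothetical values, chosen only so that the result lands near the Age row for overall survival.

```python
import math

def hazard_ratio_with_ci(beta, se, z=1.96):
    """Convert a Cox coefficient and its standard error to (HR, lower, upper).

    Assumes a Wald-type interval, which is symmetric on the log scale.
    """
    hr = math.exp(beta)
    lower = math.exp(beta - z * se)
    upper = math.exp(beta + z * se)
    return hr, lower, upper

# Hypothetical inputs for illustration; these are not values from the study.
hr, lower, upper = hazard_ratio_with_ci(beta=0.064, se=0.007)
print(f"HR = {hr:.3f}, 95% CI = {lower:.2f}-{upper:.2f}")
```

Under this construction, a hazard ratio always lies between its bounds and equals their geometric mean, which gives a quick plausibility check for any row of such a table.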
List of variables used in machine learning modeling for overall survival prediction
| Variables | RF With CCI | RF Without CCI | SVM With CCI | SVM Without CCI |
|---|---|---|---|---|
| Age (yr) | X | X | X | X |
| T-stage | X | X | X | X |
| Grade | X | X | X | X |
| Different treatment | X | X | X | X |
| Duration of follow up (yr) | X | X | X | X |
| CCI | X | | X | |
| Connective tissue disease | X | | X | |
| Mild liver disease | X | | X | |
| Ulcer disease | X | | X | |
| Congestive heart failure | X | | X | |
| Peripheral vascular disease | X | | X | |
| Chronic pulmonary disease | X | | X | |
| Cerebrovascular disease | X | | X | |
| Diabetes | X | | X | |
| Diabetes with end organ damage | X | | X | |
| Moderate or severe renal disease | X | | X | |
| Metastatic solid tumor | X | | X | |
| Hemiplegia | X | | X | |
| Solid tumor without metastasis | X | | X | |
| Myocardial infarct | X | | X | |
| Dementia | X | | X | |
| Moderate or severe liver disease | X | | X | |
| Any malignancy, including lymphoma and leukemia, except malignant neoplasm of skin | X | | X | |
| AIDS/HIV | X | | X | |
| Overall death | Y | Y | Y | Y |
This table lists the features selected during modeling. X marks variables used as inputs to the machine learning model. Y marks the variable used as the model output.
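The input/output layout in the table can be sketched as two feature lists, one shared by all four models and one added only in the CCI-reinforced variants. The column names below are hypothetical identifiers invented for illustration; the source does not give the actual field names used in the NHIRD extract.

```python
# Hypothetical column names; the study's real field names are not given.
BASE_FEATURES = ["age", "t_stage", "grade", "treatment", "follow_up_years"]

CCI_FEATURES = [
    "cci",
    "connective_tissue_disease",
    "mild_liver_disease",
    "ulcer_disease",
    "congestive_heart_failure",
    "peripheral_vascular_disease",
    "chronic_pulmonary_disease",
    "cerebrovascular_disease",
    "diabetes",
    "diabetes_with_end_organ_damage",
    "moderate_or_severe_renal_disease",
    "metastatic_solid_tumor",
    "hemiplegia",
    "solid_tumor_without_metastasis",
    "myocardial_infarct",
    "dementia",
    "moderate_or_severe_liver_disease",
    "any_malignancy",
    "aids_hiv",
]

TARGET = "overall_death"  # the single output variable (Y in the table)

def feature_columns(with_cci):
    """Return the input columns for a model with or without CCI features."""
    if with_cci:
        return BASE_FEATURES + CCI_FEATURES
    return list(BASE_FEATURES)

# One input list per model configuration in the table.
configs = {
    f"{model} {'with' if with_cci else 'without'} CCI": feature_columns(with_cci)
    for model in ("RF", "SVM")
    for with_cci in (True, False)
}

for name, columns in configs.items():
    print(name, len(columns))
```

Keeping the CCI columns in a separate list makes the with/without comparison a one-flag change, so both variants of each model see identical base features.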
Predictive ability of machine learning models (outcome: overall death)
| Feature set | Metric | RF 1:1 | RF 1:2 | RF 1:3 | SVM 1:1 | SVM 1:2 | SVM 1:3 |
|---|---|---|---|---|---|---|---|
| With CCI | Accuracy | 0.8000 | 0.8095 | 0.8262 | 0.8518 | 0.7958 | 0.8116 |
| With CCI | Sensitivity | 0.8500 | 0.6047 | 0.4198 | 0.8182 | 0.5213 | 0.3186 |
| With CCI | Specificity | 0.7500 | 0.9157 | 0.9595 | 0.8133 | 0.9316 | 0.9777 |
| With CCI | Kappa | 0.6000 | 0.5512 | 0.4480 | 0.6315 | 0.4955 | 0.3724 |
| Without CCI | Accuracy | 0.7483 | 0.7324 | 0.8024 | 0.7527 | 0.7621 | 0.7837 |
| Without CCI | Sensitivity | 0.7027 | 0.5571 | 0.3690 | 0.7660 | 0.5312 | 0.1971 |
| Without CCI | Specificity | 0.7945 | 0.8182 | 0.9451 | 0.7391 | 0.8763 | 0.9810 |
| Without CCI | Kappa | 0.4969 | 0.3823 | 0.3721 | 0.5052 | 0.4314 | 0.2369 |
RF and SVM models were built both with CCI features (CCI-reinforced) and without them. We evaluated model performance using accuracy, sensitivity, specificity, and kappa.
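The four metrics can all be derived from a binary confusion matrix, as in the minimal sketch below. The counts are hypothetical (20 deaths and 20 survivors), not the study's data; they were chosen so that the output coincides with the first RF 1:1 column (accuracy 0.8000, sensitivity 0.8500, specificity 0.7500, kappa 0.6000). That the 1:1 column refers to an equal number of events and non-events is an assumption the source does not confirm.

```python
def binary_metrics(tp, fn, tn, fp):
    """Accuracy, sensitivity, specificity and Cohen's kappa from 2x2 counts."""
    n = tp + fn + tn + fp
    accuracy = (tp + tn) / n
    sensitivity = tp / (tp + fn)   # true positive rate
    specificity = tn / (tn + fp)   # true negative rate

    # Agreement expected by chance: product of predicted and actual
    # marginal proportions, summed over both classes.
    p_pos = ((tp + fp) / n) * ((tp + fn) / n)
    p_neg = ((tn + fn) / n) * ((tn + fp) / n)
    expected = p_pos + p_neg
    kappa = (accuracy - expected) / (1 - expected)

    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "kappa": kappa,
    }

# Hypothetical balanced test set: 20 deaths (17 detected), 20 survivors (15 detected).
metrics = binary_metrics(tp=17, fn=3, tn=15, fp=5)
print({name: round(value, 4) for name, value in metrics.items()})
```

Kappa corrects accuracy for chance agreement, which is why it falls in the 1:2 and 1:3 columns even where accuracy stays similar: with more non-events, a model can score well on accuracy largely through specificity while missing most deaths.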