Guanghua Huang, Lei Liu, Luyi Wang, Shanqing Li.
Abstract
Background: Approximately 20% of patients with lung cancer experience postoperative cardiopulmonary complications after anatomic lung resection. Current prediction models for postoperative complications are not suitable for Chinese patients. This study aimed to develop and validate novel prediction models based on machine learning algorithms in a Chinese population.
Keywords: FEV1/FVC; lung cancer; machine learning; postoperative complication; ppoFEV1%; prediction model
Year: 2022 PMID: 36212485 PMCID: PMC9539671 DOI: 10.3389/fonc.2022.1003722
Source DB: PubMed Journal: Front Oncol ISSN: 2234-943X Impact factor: 5.738
Figure 1. The flow chart of patient selection.
Characteristics of the derivation cohort and the validation cohort.
| Terms | Total (N = 1085) | Derivation (N = 760) | Validation (N = 325) | p |
|---|---|---|---|---|
| Male, N (%) | 423 (39.0) | 285 (37.5) | 138 (42.5) | 0.142 |
| Age, mean (SD) | 58.4 (10.4) | 58.0 (10.6) | 59.2 (9.8) | 0.100 |
| Body mass index, mean (SD) | 24.2 (3.1) | 24.2 (3.0) | 24.3 (3.3) | 0.647 |
| Smoker, N (%) | 235 (21.7) | 167 (22.0) | 68 (20.9) | 0.761 |
| Alcohol use, N (%) | 139 (12.8) | 94 (12.4) | 45 (13.8) | 0.570 |
| Hypertension, N (%) | 312 (28.8) | 209 (27.5) | 103 (31.7) | 0.185 |
| Diabetes mellitus, N (%) | 130 (12.0) | 92 (12.1) | 38 (11.7) | 0.928 |
| COPD, N (%) | 41 (3.8) | 31 (4.1) | 10 (3.1) | 0.536 |
| Arrhythmia, N (%) | 23 (2.1) | 16 (2.1) | 7 (2.2) | 1.000 |
| Coronary artery disease, N (%) | 59 (5.4) | 39 (5.1) | 20 (6.2) | 0.593 |
| Cerebrovascular disease, N (%) | 23 (2.1) | 18 (2.4) | 5 (1.5) | 0.523 |
| Chronic kidney disease, N (%) | 5 (0.5) | 4 (0.5) | 1 (0.3) | 1.000 |
| Interstitial lung disease, N (%) | 4 (0.4) | 3 (0.4) | 1 (0.3) | 1.000 |
| CCI, median [IQR] | 0 [0-0] | 0 [0-0] | 0 [0-0] | 0.828 |
| ppoFEV1%, mean (SD) | 76.3 (14.1) | 76.8 (13.8) | 75.2 (14.8) | 0.094 |
| FVC%, mean (SD) | 89.4 (14.0) | 89.8 (13.9) | 88.5 (14.2) | 0.192 |
| FEV1/FVC, mean (SD) | 76.0 (7.9) | 76.0 (7.8) | 75.9 (8.2) | 0.865 |
| Thoracotomy, N (%) | 25 (2.3) | 18 (2.4) | 7 (2.2) | 1.000 |
| Surgical procedures, N (%) | | | | 0.774 |
| Segmentectomy | 209 (19.3) | 146 (19.2) | 63 (19.4) | |
| Lobectomy | 863 (79.5) | 604 (79.5) | 259 (79.7) | |
| Bilobectomy | 11 (1.0) | 9 (1.2) | 2 (0.6) | |
| Pneumonectomy | 2 (0.2) | 1 (0.1) | 1 (0.3) | |
| Extended resection, N (%) | 12 (1.1) | 10 (1.3) | 2 (0.6) | 0.488 |
| Postoperative complications, N (%) | 90 (8.3) | 64 (8.4) | 26 (8.0) | 0.912 |
OR, odds ratio; CI, confidence interval; SD, standard deviation; COPD, chronic obstructive pulmonary disease; CCI, the Charlson Comorbidity Index; IQR, interquartile range; ppoFEV1%, the percentage of predicted postoperative forced expiratory volume in one second; FVC%, the percentage of forced vital capacity; FEV1/FVC, the ratio of forced expiratory volume in one second to forced vital capacity.
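The p values for the categorical rows above can be sanity-checked from the reported counts alone. The following is a minimal sketch, assuming a Pearson chi-square test with Yates continuity correction on a 2x2 table; this section does not state which test the authors used, so that choice and the helper name `yates_chi2_2x2` are assumptions made for illustration.

```python
import math


def yates_chi2_2x2(a, b, c, d):
    """Chi-square test with Yates correction for the 2x2 table [[a, b], [c, d]].

    Returns (chi2, p). With 1 degree of freedom the upper-tail probability
    of the chi-square distribution equals erfc(sqrt(chi2 / 2)).
    """
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    observed = [a, b, c, d]
    expected = [
        row1 * col1 / n,
        row1 * col2 / n,
        row2 * col1 / n,
        row2 * col2 / n,
    ]
    chi2 = sum(
        max(abs(o - e) - 0.5, 0.0) ** 2 / e
        for o, e in zip(observed, expected)
    )
    p = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p


# Sex distribution from the table: 285 of 760 men in the derivation cohort,
# 138 of 325 men in the validation cohort.
chi2_male, p_male = yates_chi2_2x2(285, 760 - 285, 138, 325 - 138)
print(f"chi2 = {chi2_male:.3f}, p = {p_male:.3f}")
```

Under this assumption the result for the "Male" row lands close to the tabulated p of 0.142; rows with very small cell counts (for example chronic kidney disease) would more plausibly call for an exact test, which this sketch does not cover.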
Risk factors and their parameters of the logistic model.
| Variables | Coefficients | OR (95%CI) | p |
|---|---|---|---|
| Intercept | 1.430 | – | – |
| Male | 0.686 | 1.986 (1.142-3.454) | 0.015 |
| Arrhythmia | 1.283 | 3.606 (1.095-11.880) | 0.035 |
| Cerebrovascular disease | 1.689 | 5.415 (1.852-15.832) | 0.002 |
| ppoFEV1% | -1.859 | 0.156 (0.016-1.543) | 0.112 |
| FEV1/FVC | -3.894 | 0.020 (0.001-0.810) | 0.038 |
OR, odds ratio; CI, confidence interval; ppoFEV1%, the percentage of predicted postoperative forced expiratory volume in one second; FEV1/FVC, the ratio of forced expiratory volume in one second to forced vital capacity.
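The relationship between the coefficients and odds ratios in the table above is OR = exp(coefficient), and the same coefficients give a predicted probability through the logistic function. The sketch below illustrates both. It assumes that ppoFEV1% and FEV1/FVC enter the model as fractions (for example 0.76 rather than 76), which is not stated in this section but is the scaling under which the intercept and coefficients yield risks of a plausible size; the function and variable names are hypothetical.

```python
import math

# Coefficients copied from the logistic model table.
INTERCEPT = 1.430
COEFFICIENTS = {
    "male": 0.686,
    "arrhythmia": 1.283,
    "cerebrovascular_disease": 1.689,
    "ppo_fev1": -1.859,   # assumed to be a fraction, e.g. 0.763
    "fev1_fvc": -3.894,   # assumed to be a fraction, e.g. 0.760
}


def odds_ratio(coefficient):
    """Odds ratio for a one-unit increase in a predictor."""
    return math.exp(coefficient)


def predicted_risk(male, arrhythmia, cerebrovascular_disease, ppo_fev1, fev1_fvc):
    """Predicted complication probability from the logistic model.

    Binary predictors are 0 or 1; spirometry inputs are fractions
    (an assumption, see the note above).
    """
    values = {
        "male": male,
        "arrhythmia": arrhythmia,
        "cerebrovascular_disease": cerebrovascular_disease,
        "ppo_fev1": ppo_fev1,
        "fev1_fvc": fev1_fvc,
    }
    logit = INTERCEPT + sum(COEFFICIENTS[k] * v for k, v in values.items())
    return 1.0 / (1.0 + math.exp(-logit))


for name, coef in COEFFICIENTS.items():
    print(f"{name}: OR = {odds_ratio(coef):.3f}")

# Hypothetical patient: woman, no listed comorbidities, cohort-average spirometry.
baseline = predicted_risk(0, 0, 0, ppo_fev1=0.763, fev1_fvc=0.760)
print(f"baseline risk = {baseline:.3f}")
```

The exponentiated coefficients agree with the tabulated odds ratios to within the rounding of the published coefficients. The predicted risk is only as reliable as the scaling assumption, so it should not be read as the authors' nomogram output.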
Model performance of the logistic model, random forest model and XGBoost model.
| Metric | Logistic | Random forest | XGBoost |
|---|---|---|---|
| AUC | 0.728 | 0.721 | 0.767 |
| 95% CI | 0.619-0.836 | 0.614-0.828 | 0.671-0.862 |
| Spiegelhalter z test | 0.656 | 0.628 | 0.368 |
| Sensitivity | 0.769 | 0.692 | 0.692 |
| Specificity | 0.645 | 0.699 | 0.749 |
| Positive predictive value | 0.159 | 0.167 | 0.194 |
| Negative predictive value | 0.970 | 0.963 | 0.966 |
| Accuracy | 0.655 | 0.698 | 0.745 |
AUC, area under the curve; CI, confidence interval.
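The predictive values and accuracy in the table above follow from sensitivity, specificity, and the event rate. As a rough consistency check, the sketch below rebuilds a confusion matrix for each model, assuming the metrics were computed on the validation cohort (325 patients, 26 with complications) at a single classification threshold. That assumption, the reconstructed counts, and the helper name `derived_metrics` are illustrative and are not reported in the source.

```python
N_TOTAL = 325    # validation cohort size (from the characteristics table)
N_EVENTS = 26    # postoperative complications in the validation cohort

# Sensitivity and specificity as reported in the performance table.
REPORTED = {
    "Logistic": (0.769, 0.645),
    "Random forest": (0.692, 0.699),
    "XGBoost": (0.692, 0.749),
}


def derived_metrics(sensitivity, specificity, n_events, n_total):
    """Reconstruct confusion-matrix counts and derive PPV, NPV, accuracy."""
    n_non_events = n_total - n_events
    tp = round(sensitivity * n_events)       # true positives
    tn = round(specificity * n_non_events)   # true negatives
    fn = n_events - tp                       # false negatives
    fp = n_non_events - tn                   # false positives
    return {
        "tp": tp, "fp": fp, "fn": fn, "tn": tn,
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
        "accuracy": (tp + tn) / n_total,
    }


metrics = {
    model: derived_metrics(sens, spec, N_EVENTS, N_TOTAL)
    for model, (sens, spec) in REPORTED.items()
}

for model, m in metrics.items():
    print(
        f"{model}: PPV={m['ppv']:.3f} NPV={m['npv']:.3f} "
        f"accuracy={m['accuracy']:.3f}"
    )
```

With roughly 8% of patients experiencing a complication, even a model with sensitivity and specificity around 0.7 yields a low positive predictive value and a high negative predictive value, which is the pattern seen in all three columns.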
Figure 2. Performance of three models. (A) shows the receiver operating characteristic curves. (B) shows the calibration curves. The blue line indicates the logistic model. The red line indicates the random forest model. The yellow line indicates the XGBoost model.
Figure 3. The nomogram of the logistic model. CVD, cerebrovascular disease; ppoFEV1%, the percentage of predicted postoperative forced expiratory volume in one second; FEV1/FVC, the ratio of forced expiratory volume in one second to forced vital capacity.
Figure 4. The feature importance of (A) the random forest model and (B) the XGBoost model. ppoFEV1%, the percentage of predicted postoperative forced expiratory volume in one second; FEV1/FVC, the ratio of forced expiratory volume in one second to forced vital capacity; FVC%, the percentage of forced vital capacity; CVD, cerebrovascular disease; BMI, body mass index; COPD, chronic obstructive pulmonary disease; CCI, the Charlson Comorbidity Index; CAD, coronary artery disease; ILD, interstitial lung disease; DM, diabetes mellitus; HTN, hypertension; CKD, chronic kidney disease.