Menglin Zhu, Bo Wang, Tiejun Wang, Yilin Chen, Du He.
Abstract
OBJECTIVE: Pulmonary metastasis (PM) is an independent risk factor affecting the prognosis of cervical cancer patients, but no model for predicting it is yet available. This study aimed to develop machine learning-based predictive models for PM.
Keywords: SEER database; cervical cancer; machine learning; predictive model; prognosis; pulmonary metastasis
Year: 2021 PMID: 34853529 PMCID: PMC8628546 DOI: 10.2147/IJGM.S338389
Source DB: PubMed Journal: Int J Gen Med ISSN: 1178-7074
Figure 1. Flow chart for selecting cervical cancer patients with PM.
Baseline Demographic and Clinical Characteristics of Included Patients Diagnosed with and without PM
| Variables | Level | Total (N=22,766) | With PM (N=998) | Without PM (N=21,768) | P-value |
|---|---|---|---|---|---|
| Age, y (median [IQR]) | | 49.00 [39.00, 61.00] | 58.00 [49.00, 68.00] | 49.00 [39.00, 61.00] | <0.001 |
| Race (%) | White | 16,957 (74.5) | 731 (73.2) | 16,226 (74.5) | <0.001 |
| | Black | 3153 (13.8) | 172 (17.2) | 2981 (13.7) | |
| | Other | 2403 (10.6) | 95 (9.5) | 2308 (10.6) | |
| | Unknown | 253 (1.1) | 0 (0.0) | 253 (1.2) | |
| Year (%) | 2010 | 3220 (14.1) | 113 (11.3) | 3107 (14.3) | 0.038 |
| | 2011 | 3193 (14.0) | 126 (12.6) | 3067 (14.1) | |
| | 2012 | 3242 (14.2) | 152 (15.2) | 3090 (14.2) | |
| | 2013 | 3079 (13.5) | 130 (13.0) | 2949 (13.5) | |
| | 2014 | 3309 (14.5) | 150 (15.0) | 3159 (14.5) | |
| | 2015 | 3364 (14.8) | 173 (17.3) | 3191 (14.7) | |
| | 2016 | 3359 (14.8) | 154 (15.4) | 3205 (14.7) | |
| Site* (%) | Cervix uteri | 17,682 (77.7) | 870 (87.2) | 16,812 (77.2) | <0.001 |
| | Endocervix | 4321 (19.0) | 108 (10.8) | 4213 (19.4) | |
| | Exocervix | 411 (1.8) | 10 (1.0) | 401 (1.8) | |
| | OLC | 352 (1.5) | 10 (1.0) | 342 (1.6) | |
| Grade (%) | Grade I | 2536 (11.1) | 18 (1.8) | 2518 (11.6) | <0.001 |
| | Grade II | 7029 (30.9) | 172 (17.2) | 6857 (31.5) | |
| | Grade III | 6427 (28.2) | 415 (41.6) | 6012 (27.6) | |
| | Grade IV | 542 (2.4) | 46 (4.6) | 496 (2.3) | |
| | Unknown | 6232 (27.4) | 347 (34.8) | 5885 (27.0) | |
| Pathology (%) | ADC | 4276 (18.8) | 141 (14.1) | 4135 (19.0) | <0.001 |
| | SCC | 14,552 (63.9) | 573 (57.4) | 13,979 (64.2) | |
| | Others | 3938 (17.3) | 284 (28.5) | 3654 (16.8) | |
| SEER stage£ (%) | Localized | 8788 (38.6) | 0 (0.0) | 8788 (40.4) | <0.001 |
| | Regional | 7421 (32.6) | 0 (0.0) | 7421 (34.1) | |
| | Distant | 2822 (12.4) | 844 (84.6) | 1978 (9.1) | |
| | Unknown | 3735 (16.4) | 154 (15.4) | 3581 (16.5) | |
| T stage₰ (%) | T0 | 13 (0.1) | 2 (0.2) | 11 (0.1) | <0.001 |
| | T1 | 10,462 (46.0) | 90 (9.0) | 10,372 (47.6) | |
| | T2 | 4242 (18.6) | 143 (14.3) | 4099 (18.8) | |
| | T3 | 3045 (13.4) | 323 (32.4) | 2722 (12.5) | |
| | T4 | 724 (3.2) | 112 (11.2) | 612 (2.8) | |
| | TX | 767 (3.4) | 159 (15.9) | 608 (2.8) | |
| | Unknown | 3513 (15.4) | 169 (16.9) | 3344 (15.4) | |
| N stage (%) | N0 | 13,676 (60.1) | 223 (22.3) | 13,453 (61.8) | <0.001 |
| | N1 | 4834 (21.2) | 472 (47.3) | 4362 (20.0) | |
| | NX | 743 (3.3) | 134 (13.4) | 609 (2.8) | |
| | Unknown | 3513 (15.4) | 169 (16.9) | 3344 (15.4) | |
| M stage (%) | M0 | 16,632 (73.1) | 0 (0.0) | 16,632 (76.4) | <0.001 |
| | M1 | 2621 (11.5) | 829 (83.1) | 1792 (8.2) | |
| | Unknown | 3513 (15.4) | 169 (16.9) | 3344 (15.4) | |
| Lym biopsy (%) | <3 | 466 (2.0) | 9 (0.9) | 457 (2.1) | <0.001 |
| | ≥4 | 7264 (31.9) | 27 (2.7) | 7237 (33.2) | |
| | Unknown | 15,036 (66.0) | 962 (96.4) | 14,074 (64.7) | |
| Surgery (%) | Yes | 12,582 (55.3) | 85 (8.5) | 12,497 (57.4) | <0.001 |
| | No | 10,030 (44.1) | 909 (91.1) | 9121 (41.9) | |
| | Unknown | 154 (0.7) | 4 (0.4) | 150 (0.7) | |
| Lym examination (%) | Negative | 14,281 (62.7) | 909 (91.1) | 13,372 (61.4) | <0.001 |
| | Positive | 8161 (35.8) | 60 (6.0) | 8101 (37.2) | |
| | Unknown | 324 (1.4) | 29 (2.9) | 295 (1.4) | |
| Bone met (%) | Yes | 537 (2.4) | 223 (22.3) | 314 (1.4) | <0.001 |
| | No | 22,185 (97.4) | 750 (75.2) | 21,435 (98.5) | |
| | Unknown | 44 (0.2) | 25 (2.5) | 19 (0.1) | |
| Brain met (%) | Yes | 85 (0.4) | 48 (4.8) | 37 (0.2) | <0.001 |
| | No | 22,640 (99.4) | 925 (92.7) | 21,715 (99.8) | |
| | Unknown | 41 (0.2) | 25 (2.5) | 16 (0.1) | |
| Liver met (%) | Yes | 486 (2.1) | 244 (24.4) | 242 (1.1) | <0.001 |
| | No | 22,245 (97.7) | 734 (73.5) | 21,511 (98.8) | |
| | Unknown | 35 (0.2) | 20 (2.0) | 15 (0.1) | |
| Lym distant (%) | Yes | 269 (1.2) | 73 (7.3) | 196 (0.9) | <0.001 |
| | No | 3076 (13.5) | 77 (7.7) | 2999 (13.8) | |
| | Unknown | 19,421 (85.3) | 848 (85.0) | 18,573 (85.3) | |
| Tumor size, cm (%) | <5 | 13,256 (58.2) | 426 (42.7) | 12,830 (58.9) | <0.001 |
| | ≥5 | 455 (2.0) | 1 (0.1) | 454 (2.1) | |
| | Unknown | 9055 (39.8) | 571 (57.2) | 8484 (39.0) | |
| Insurance (%) | Medicaid | 6881 (30.2) | 355 (35.6) | 6526 (30.0) | 0.001 |
| | Insured | 13,794 (60.6) | 553 (55.4) | 13,241 (60.8) | |
| | Uninsured | 1412 (6.2) | 65 (6.5) | 1347 (6.2) | |
| | Unknown | 679 (3.0) | 25 (2.5) | 654 (3.0) | |
| Marital status (%) | Married | 14,345 (63.0) | 641 (64.2) | 13,704 (63.0) | 0.217 |
| | Unmarried | 6984 (30.7) | 307 (30.8) | 6677 (30.7) | |
| | Unknown | 1437 (6.3) | 50 (5.0) | 1387 (6.4) | |
Notes: *According to the primary site labeled. ₰According to the American Joint Committee on Cancer (AJCC), 6th edition. £According to the SEER historic stage (1973–2015).
Abbreviations: IQR, interquartile range; OLC, overlapping lesion of cervix uteri (cervix uteri, endocervix, and exocervix are equivalent to FIGO I; overlapping lesion of cervix uteri is equivalent to FIGO II); ADC, adenocarcinoma; SCC, squamous cell carcinoma; Lym biopsy, lymph node biopsy; Lym examination, lymph node examination; Bone met, bone metastasis; Brain met, brain metastasis; Liver met, liver metastasis; Lym distant, distant lymphatic metastasis.
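As a worked check of one P-value in the baseline table, the sketch below recomputes the group comparison for Pathology from the tabulated counts. It assumes a Pearson chi-square test of independence; the record does not state which test the authors used, so this is an illustration rather than a reproduction of their analysis. The names `observed` and `pearson_chi_square` are hypothetical.

```python
import math

# Observed counts copied from the Pathology rows of the baseline table,
# as (patients with PM, patients without PM).
observed = {
    "ADC": (141, 4135),
    "SCC": (573, 13979),
    "Others": (284, 3654),
}


def pearson_chi_square(table):
    """Return (statistic, degrees of freedom) for an r x c contingency table."""
    rows = list(table.values())
    col_totals = [sum(col) for col in zip(*rows)]
    grand_total = sum(col_totals)
    statistic = 0.0
    for row in rows:
        row_total = sum(row)
        for obs, col_total in zip(row, col_totals):
            # Expected count under independence of pathology and PM status.
            expected = row_total * col_total / grand_total
            statistic += (obs - expected) ** 2 / expected
    dof = (len(rows) - 1) * (len(col_totals) - 1)
    return statistic, dof


statistic, dof = pearson_chi_square(observed)

# With exactly 2 degrees of freedom, the chi-square upper-tail probability
# has the closed form exp(-x / 2), so no statistics library is needed here.
p_value = math.exp(-statistic / 2) if dof == 2 else None

print(f"chi-square = {statistic:.1f}, dof = {dof}, p = {p_value:.2e}")
```

Under this assumption the resulting P-value falls far below 0.001, which agrees with the "<0.001" entry reported for Pathology.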
Figure 2. Generalized linear model. (A) Nomograms conveying the results of the candidate factors for predicting micrometastasis of lymph nodes. (B) Calibration curves for internal validation of the nomogram. (C) Predicted risk histogram comparing predicted risk of the nomogram with the observed frequency.
Figure 3. Random forest model. (A) The candidate factors associated with micrometastasis of lymph nodes were ordered according to the mean decreased Gini index. (B) Relationship of dynamic changes between the prediction error and the number of decision trees. (C) Performance of the prediction model with increasing numbers of features in the ROC curve.
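The Gini-based ranking named in the Figure 3 caption rests on how much a split reduces node impurity. The sketch below computes that reduction for one hypothetical split (bone metastasis "Yes" versus "No or Unknown") using counts from the baseline table. It is a single-split illustration only: a random forest averages such decreases over many trees and bootstrap samples, and the authors' actual forest configuration is not given in this record. The helper `gini` and all variable names are assumptions made for the example.

```python
def gini(pos, neg):
    """Gini impurity of a two-class node holding pos and neg cases."""
    p = pos / (pos + neg)
    return 2 * p * (1 - p)


# Counts from the Bone met rows of the baseline table,
# as (patients with PM, patients without PM).
bone_met_yes = (223, 314)
bone_met_other = (750 + 25, 21435 + 19)  # "No" and "Unknown" rows combined

# The parent node is the full cohort before splitting.
parent = (
    bone_met_yes[0] + bone_met_other[0],
    bone_met_yes[1] + bone_met_other[1],
)
n_parent = sum(parent)

# Impurity after the split: child impurities weighted by child size.
weighted_child = sum(
    sum(child) / n_parent * gini(*child)
    for child in (bone_met_yes, bone_met_other)
)

gini_decrease = gini(*parent) - weighted_child
print(f"parent Gini = {gini(*parent):.4f}, decrease = {gini_decrease:.4f}")
```

A factor that produces larger decreases across the trees of a forest ranks higher in a plot such as panel A.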
Figure 4. Cox proportional hazard model. (A) Nomograms for 1-, 3-, and 5-year overall survival (OS) prediction. (B) Calibration curves for internal validation of the nomogram. (C) Predicted risk histogram comparing predicted risk of the nomogram with the observed frequency.
Figure 5. Competitive risk model. (A) Nomogram predicting cancer-specific survival (CSS) at 1, 3, and 5 years using the competitive risk model. According to the nomogram score, patient No. 31 has a cumulative risk of 0.675, 0.882, and 0.928 at 1, 3, and 5 years, respectively. (B) Calibration curves for internal validation of the nomogram. (C) Predicted risk histogram comparing predicted risk of the nomogram with the observed frequency. *P<0.05; ***P<0.001.