Huan Gao, Zhi-Yi He, Xing-Li Du, Zheng-Gang Wang, Li Xiang.
Abstract
Background: This study aimed to develop an artificial neural network (ANN) model for predicting synchronous organ-specific metastasis in lung cancer (LC) patients.
Keywords: SEER; artificial neural network; lung cancer; machine learning; metastasis
Year: 2022 PMID: 35646679 PMCID: PMC9136456 DOI: 10.3389/fonc.2022.817372
Source DB: PubMed Journal: Front Oncol ISSN: 2234-943X Impact factor: 5.738
Figure 1. Schematic structure of the artificial neural network (ANN) model, comprising one input layer with 13 nodes, nine hidden layers with 100 nodes each, and one output layer with 1 node.
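The layer layout in the Figure 1 caption can be sketched in plain Python. This is a minimal illustration, not the authors' implementation: the ReLU hidden activations, the sigmoid output, the He-style random initialisation, and the names `LAYER_SIZES`, `init_params`, `dense`, and `predict` are all assumptions, since the caption specifies only the layer sizes.

```python
import math
import random

# Layer sizes from the Figure 1 caption: 13 inputs, nine hidden layers of 100 nodes, 1 output.
LAYER_SIZES = [13] + [100] * 9 + [1]


def init_params(layer_sizes, seed=0):
    """Return one (weights, biases) pair per layer; He-style initialisation is an assumption."""
    rng = random.Random(seed)
    params = []
    for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:]):
        scale = math.sqrt(2.0 / n_in)
        weights = [[rng.gauss(0.0, scale) for _ in range(n_in)] for _ in range(n_out)]
        biases = [0.0] * n_out
        params.append((weights, biases))
    return params


def dense(weights, biases, inputs):
    """Affine map: one weighted sum plus bias per output node."""
    return [sum(w * x for w, x in zip(row, inputs)) + b for row, b in zip(weights, biases)]


def predict(params, features):
    """Forward pass: ReLU in hidden layers, sigmoid at the output (both assumed)."""
    hidden = features
    for weights, biases in params[:-1]:
        hidden = [max(0.0, z) for z in dense(weights, biases, hidden)]
    weights, biases = params[-1]
    logit = dense(weights, biases, hidden)[0]
    return 1.0 / (1.0 + math.exp(-logit))


params = init_params(LAYER_SIZES)
n_parameters = sum(len(w) * len(w[0]) + len(b) for w, b in params)

# One hypothetical patient encoded as 13 numeric features (untrained weights, so the
# resulting probability is meaningless; it only demonstrates the data flow).
feature_rng = random.Random(1)
example_features = [feature_rng.gauss(0.0, 1.0) for _ in range(13)]
probability = predict(params, example_features)
```

With these sizes the network has 82,301 trainable parameters: 1,400 in the first hidden layer, 10,100 in each of the remaining eight hidden layers, and 101 in the output layer.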
Baseline demographic and clinical characteristics of patients with lung cancer.
| Characteristics | Total patients | Patients with no metastasis | Patients with metastases | Patients with bone metastasis | Patients with brain metastasis | Patients with liver metastasis |
|---|---|---|---|---|---|---|
| | n=62151 | n=49969 | n=12182 | n=3982 | n=3674 | n=1307 |
| Age | | | | | | |
| Mean±SD | 68±11 | 68±10 | 66±11 | 68±11 | 64±10 | 68±10 |
| Median | 68 | 68 | 66 | 68 | 64 | 68 |
| (IQR 25%-75%) | (61-75) | (61-76) | (59-74) | (60-76) | (57-72) | (61-76) |
| Sex | | | | | | |
| Male | 31736 (51.1%) | 24926 (49.9%) | 6810 (55.9%) | 2333 (58.6%) | 1904 (51.8%) | 724 (55.4%) |
| Female | 30415 (48.9%) | 25043 (50.1%) | 5372 (44.1%) | 1649 (41.4%) | 1770 (48.2%) | 583 (44.6%) |
| Race | | | | | | |
| White | 50589 (81.4%) | 40911 (81.9%) | 9678 (79.4%) | 3134 (78.7%) | 2885 (78.5%) | 1076 (82.3%) |
| Black | 6855 (11%) | 5326 (10.7%) | 1529 (12.6%) | 526 (13.2%) | 491 (13.4%) | 169 (12.9%) |
| American Indian/Alaska Native | 291 (0.5%) | 241 (0.5%) | 50 (0.4%) | 17 (0.4%) | 18 (0.5%) | 4 (0.3%) |
| Asian or Pacific Islander | 4416 (7.1%) | 3491 (7%) | 925 (7.6%) | 305 (7.7%) | 280 (7.6%) | 58 (4.4%) |
| Marital status | | | | | | |
| Single (never married) | 8840 (14.2%) | 6834 (13.7%) | 2006 (16.5%) | 605 (15.2%) | 700 (19.1%) | 199 (15.2%) |
| Married (including common law) | 34269 (55.1%) | 27547 (55.1%) | 6722 (55.2%) | 2235 (56.1%) | 1941 (52.8%) | 683 (52.3%) |
| Separated | 726 (1.2%) | 577 (1.2%) | 149 (1.2%) | 49 (1.2%) | 43 (1.2%) | 21 (1.6%) |
| Divorced | 8267 (13.3%) | 6637 (13.3%) | 1630 (13.4%) | 499 (12.5%) | 525 (14.3%) | 179 (13.7%) |
| Widowed | 10049 (16.2%) | 8374 (16.8%) | 1675 (13.7%) | 594 (14.9%) | 465 (12.7%) | 225 (17.2%) |
| Insurance status | | | | | | |
| Uninsured | 1602 (2.6%) | 1136 (2.3%) | 466 (3.8%) | 105 (2.6%) | 178 (4.8%) | 39 (3%) |
| Insured/Medicaid | 60549 (97.4%) | 48833 (97.7%) | 11716 (96.2%) | 3877 (97.4%) | 3496 (95.2%) | 1268 (97%) |
| Primary site | | | | | | |
| Main bronchus | 2036 (3.3%) | 1388 (2.8%) | 648 (5.3%) | 196 (4.9%) | 154 (4.2%) | 103 (7.9%) |
| Upper lobe | 37284 (60%) | 29918 (59.9%) | 7366 (60.5%) | 2437 (61.2%) | 2324 (63.3%) | 740 (56.6%) |
| Middle lobe | 3136 (5%) | 2600 (5.2%) | 536 (4.4%) | 170 (4.3%) | 158 (4.3%) | 58 (4.4%) |
| Lower lobe | 19008 (30.6%) | 15486 (31%) | 3522 (28.9%) | 1146 (28.8%) | 1010 (27.5%) | 389 (29.8%) |
| Overlapping lesion of lung | 687 (1.1%) | 577 (1.2%) | 110 (0.9%) | 33 (0.8%) | 28 (0.8%) | 17 (1.3%) |
| Histology | | | | | | |
| Squamous cell carcinoma | 17973 (28.9%) | 15782 (31.6%) | 2191 (18%) | 874 (21.9%) | 515 (14%) | 331 (25.3%) |
| Small cell carcinoma | 3236 (5.2%) | 1807 (3.6%) | 1429 (11.7%) | 244 (6.1%) | 339 (9.2%) | 341 (26.1%) |
| Adenocarcinoma | 33036 (53.2%) | 26471 (53%) | 6565 (53.9%) | 2229 (56%) | 2185 (59.5%) | 429 (32.8%) |
| Large cell carcinoma | 1117 (1.8%) | 830 (1.7%) | 287 (2.4%) | 78 (2%) | 101 (2.7%) | 32 (2.4%) |
| Adenosquamous carcinoma | 5244 (8.4%) | 3609 (7.2%) | 1635 (13.4%) | 532 (13.4%) | 518 (14.1%) | 161 (12.3%) |
| Sarcomatoid carcinoma | 183 (0.3%) | 146 (0.3%) | 37 (0.3%) | 15 (0.4%) | 12 (0.3%) | 1 (0.1%) |
| Carcinoid tumor | 1362 (2.2%) | 1324 (2.6%) | 38 (0.3%) | 10 (0.3%) | 4 (0.1%) | 12 (0.9%) |
| Grade | | | | | | |
| Well differentiated | 7619 (12.3%) | 7183 (14.4%) | 436 (3.6%) | 170 (4.3%) | 111 (3%) | 37 (2.8%) |
| Moderately differentiated | 21737 (35%) | 18991 (38%) | 2746 (22.5%) | 1072 (26.9%) | 816 (22.2%) | 199 (15.2%) |
| Poorly differentiated | 29483 (47.4%) | 21774 (43.6%) | 7709 (63.3%) | 2489 (62.5%) | 2406 (65.5%) | 785 (60.1%) |
| Undifferentiated | 3312 (5.3%) | 2021 (4%) | 1291 (10.6%) | 251 (6.3%) | 341 (9.3%) | 286 (21.9%) |
| Tumor size | | | | | | |
| Mean±SD | 42±25 | 39±24 | 52 | 51±25 | 52±25 | 53±26 |
| Median | 35 | 32 | 48 | 46 | 48 | 50 |
| (IQR 25%-75%) | (22-56) | (20-52) | (32-69) | (32-67) | (32-68) | (33-70) |
| Separate tumor nodules (STN) | | | | | | |
| STN0 | 55677 (89.6%) | 47096 (94.3%) | 8581 (70.4%) | 2798 (70.3%) | 2788 (75.9%) | 945 (72.3%) |
| STN1 | 2276 (3.7%) | 901 (1.8%) | 1375 (11.3%) | 445 (11.2%) | 365 (9.9%) | 145 (11.1%) |
| STN2 | 2416 (3.9%) | 1187 (2.4%) | 1229 (10.1%) | 421 (10.6%) | 312 (8.5%) | 117 (9%) |
| STN3 | 1782 (2.9%) | 785 (1.6%) | 997 (8.2%) | 318 (8%) | 209 (5.7%) | 100 (7.7%) |
| Pleural invasion (PL) | | | | | | |
| PL0 | 21565 (34.7%) | 20633 (41.3%) | 932 (7.7%) | 278 (7%) | 338 (9.2%) | 101 (7.7%) |
| PL1 | 1758 (2.8%) | 1715 (3.4%) | 43 (0.4%) | 5 (0.1%) | 26 (0.7%) | 4 (0.3%) |
| PL2 | 1513 (2.4%) | 1455 (2.9%) | 58 (0.5%) | 15 (0.4%) | 30 (0.8%) | 6 (0.5%) |
| PL3 | 686 (1.1%) | 648 (1.3%) | 38 (0.3%) | 18 (0.5%) | 12 (0.3%) | 2 (0.2%) |
| PLX | 36629 (58.9%) | 25518 (51.1%) | 11111 (91.2%) | 3666 (92.1%) | 3268 (88.9%) | 1194 (91.4%) |
| T stage | | | | | | |
| T1a | 11271 (18.1%) | 10696 (21.4%) | 575 (4.7%) | 183 (4.6%) | 214 (5.8%) | 69 (5.3%) |
| T1b | 8238 (13.3%) | 7397 (14.8%) | 841 (6.9%) | 288 (7.2%) | 267 (7.3%) | 86 (6.6%) |
| T2a | 17176 (27.6%) | 14653 (29.3%) | 2523 (20.7%) | 832 (20.9%) | 840 (22.9%) | 264 (20.2%) |
| T2b | 5989 (9.6%) | 4615 (9.2%) | 1374 (11.3%) | 400 (10%) | 485 (13.2%) | 143 (10.9%) |
| T3 | 9616 (15.5%) | 6763 (13.5%) | 2853 (23.4%) | 951 (23.9%) | 869 (23.7%) | 293 (22.4%) |
| T4 | 9861 (15.9%) | 5845 (11.7%) | 4016 (33%) | 1328 (33.4%) | 999 (27.2%) | 452 (34.6%) |
| N stage | | | | | | |
| NX | 626 (1%) | 346 (0.7%) | 280 (2.3%) | 93 (2.3%) | 83 (2.3%) | 32 (2.4%) |
| N0 | 32972 (53.1%) | 30260 (60.6%) | 2712 (22.3%) | 863 (21.7%) | 1066 (29%) | 281 (21.5%) |
| N1 | 6262 (10.1%) | 5116 (10.2%) | 1146 (9.4%) | 386 (9.7%) | 386 (10.5%) | 120 (9.2%) |
| N2 | 17174 (27.6%) | 11319 (22.7%) | 5855 (48.1%) | 1885 (47.3%) | 1641 (44.7%) | 642 (49.1%) |
| N3 | 5117 (8.2%) | 2928 (5.9%) | 2189 (18%) | 755 (19%) | 498 (13.6%) | 232 (17.8%) |
SD, standard deviation; IQR, interquartile range; STN0, no separate tumor nodules noted; STN1, separate tumor nodules in ipsilateral lung, same lobe; STN2, separate tumor nodules in ipsilateral lung, different lobe; STN3, separate tumor nodules, ipsilateral lung, same and different lobe.
Cancer-specific survival and multivariate analysis for patients with lung cancer.
| Site | No. (%) | Cancer-specific survival: median | Cancer-specific survival: mean | Cancer-specific survival: SD | Multivariate analysis: HR (95% CI) | Multivariate analysis: P-value |
|---|---|---|---|---|---|---|
| None | 19139 (65.3) | 10 | 13.4 | 12.761 | 1 | |
| Bone | 3262 (11.1) | 4 | 6.97 | 8.061 | 1.630 (1.568-1.695) | <0.001 |
| Brain | 2974 (10.2) | 4 | 7.22 | 8.4 | 1.698 (1.631-1.768) | <0.001 |
| Liver | 1126 (3.8) | 4 | 6.46 | 7.63 | 1.673 (1.573-1.778) | <0.001 |
| Two or Three | 2795 (9.5) | 3 | 5.48 | 7.075 | 2.025 (1.941-2.112) | <0.001 |
| Total | 29296 | 7 | 11.03 | 11.769 | ||
SD, standard deviation; HR, hazard ratio; CI, confidence interval.
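The hazard ratios above can be sanity-checked with a little arithmetic. The sketch below assumes the 95% intervals are Wald intervals that are symmetric on the log scale (usual for a proportional-hazards model, but not stated in this record). Under that assumption the standard error of log(HR) is the log-width of the interval divided by 2 × 1.96, and the point estimate should sit at the geometric midpoint of the bounds. The names `HAZARD_RATIOS`, `log_hr_standard_error`, and `geometric_midpoint` are illustrative.

```python
import math

# (hazard ratio, lower 95% bound, upper 95% bound), copied from the table above.
HAZARD_RATIOS = {
    "Bone": (1.630, 1.568, 1.695),
    "Brain": (1.698, 1.631, 1.768),
    "Liver": (1.673, 1.573, 1.778),
    "Two or Three": (2.025, 1.941, 2.112),
}

Z_95 = 1.96  # normal quantile for a two-sided 95% interval


def log_hr_standard_error(lower, upper):
    """Standard error of log(HR), assuming a log-symmetric Wald interval."""
    return (math.log(upper) - math.log(lower)) / (2 * Z_95)


def geometric_midpoint(lower, upper):
    """Point estimate implied by a log-symmetric interval."""
    return math.sqrt(lower * upper)


summary = {
    site: {
        "reported_hr": hr,
        "implied_hr": geometric_midpoint(lower, upper),
        "se_log_hr": log_hr_standard_error(lower, upper),
    }
    for site, (hr, lower, upper) in HAZARD_RATIOS.items()
}

for site, row in summary.items():
    print(f"{site}: reported {row['reported_hr']:.3f}, "
          f"implied {row['implied_hr']:.3f}, SE(log HR) {row['se_log_hr']:.4f}")
```

The implied midpoints agree with the reported hazard ratios to within rounding, and the liver group, the smallest of the four, has the widest interval on the log scale.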
Figure 2. Kaplan-Meier analysis of cancer-specific survival for patients with lung cancer stratified by organ-specific metastasis.
Performance of the artificial neural network (ANN) model with an increasing number of hidden layers for predicting distant metastasis.
| Number of hidden layers | AUC | Sensitivity | Specificity | Accuracy | FPR | FNR | LRP | LRN |
|---|---|---|---|---|---|---|---|---|
| 5 | 0.737 | 0.776 | 0.697 | 0.713 | 0.303 | 0.224 | 2.565 | 0.321 |
| 6 | 0.747 | 0.815 | 0.679 | 0.705 | 0.321 | 0.185 | 2.536 | 0.273 |
| 7 | 0.748 | 0.837 | 0.660 | 0.691 | 0.340 | 0.163 | 2.460 | 0.247 |
| 8 | 0.759 | 0.889 | 0.629 | 0.679 | 0.371 | 0.111 | 2.398 | 0.176 |
| 9 | 0.759 | 0.906 | 0.613 | 0.669 | 0.387 | 0.094 | 2.339 | 0.154 |
| 10 | 0.761 | 0.902 | 0.620 | 0.674 | 0.380 | 0.098 | 2.371 | 0.158 |
| 11 | 0.756 | 0.896 | 0.609 | 0.665 | 0.391 | 0.104 | 2.293 | 0.170 |
AUC, area under curve; FPR, false positive rate; FNR, false negative rate; LRP, likelihood ratio positive; LRN, likelihood ratio negative.
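Four of the columns above follow directly from sensitivity and specificity by the standard definitions: FPR = 1 − specificity, FNR = 1 − sensitivity, LRP = sensitivity / FPR, and LRN = FNR / specificity. The sketch below recomputes them from the rounded table values; the names `REPORTED` and `derived_metrics` are illustrative, and small differences in the third decimal place are expected because the published ratios were presumably computed from unrounded inputs.

```python
# hidden layers -> (sensitivity, specificity, reported LRP, reported LRN), from the table above.
REPORTED = {
    5: (0.776, 0.697, 2.565, 0.321),
    6: (0.815, 0.679, 2.536, 0.273),
    7: (0.837, 0.660, 2.460, 0.247),
    8: (0.889, 0.629, 2.398, 0.176),
    9: (0.906, 0.613, 2.339, 0.154),
    10: (0.902, 0.620, 2.371, 0.158),
    11: (0.896, 0.609, 2.293, 0.170),
}


def derived_metrics(sensitivity, specificity):
    """Compute error rates and likelihood ratios from sensitivity and specificity."""
    fpr = 1.0 - specificity  # false positive rate
    fnr = 1.0 - sensitivity  # false negative rate
    return {
        "FPR": fpr,
        "FNR": fnr,
        "LRP": sensitivity / fpr,  # likelihood ratio positive
        "LRN": fnr / specificity,  # likelihood ratio negative
    }


metrics_by_depth = {
    depth: derived_metrics(sens, spec) for depth, (sens, spec, _, _) in REPORTED.items()
}

# Largest gap between recomputed and reported likelihood ratios across all rows.
max_discrepancy = max(
    max(abs(metrics_by_depth[d]["LRP"] - lrp), abs(metrics_by_depth[d]["LRN"] - lrn))
    for d, (_, _, lrp, lrn) in REPORTED.items()
)
```

The table also shows the trade-off behind these ratios: adding hidden layers raises sensitivity and lowers specificity, so LRN improves (falls) while LRP declines.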
Figure 3. Receiver operating characteristic curve of (A) the artificial neural network (ANN) model and (B) the random forest (RF) model.
Figure 4. Receiver operating characteristic curve of the artificial neural network (ANN) model for predicting organ-specific metastasis.
Performance of the artificial neural network (ANN) model for predicting organ-specific metastasis.
| Site of organ-specific metastasis | AUC | Sensitivity | Specificity | Accuracy | FPR | FNR | LRP | LRN |
|---|---|---|---|---|---|---|---|---|
| Bone | 0.688 | 0.913 | 0.443 | 0.539 | 0.557 | 0.087 | 1.638 | 0.197 |
| Brain | 0.686 | 0.906 | 0.449 | 0.525 | 0.551 | 0.094 | 1.646 | 0.209 |
| Liver | 0.664 | 0.925 | 0.403 | 0.453 | 0.597 | 0.075 | 1.548 | 0.187 |
AUC, area under curve; FPR, false positive rate; FNR, false negative rate; LRP, likelihood ratio positive; LRN, likelihood ratio negative.
Figure 5. Variable importance from the artificial neural network (ANN) model for predicting (A) distant metastasis and (B) organ-specific metastasis [(1) bone, (2) brain, and (3) liver].