Yin-Chen Hsu, Yuan-Hsiung Tsai, Hsu-Huei Weng, Li-Sheng Hsu, Ying-Huang Tsai, Yu-Ching Lin, Ming-Szu Hung, Yu-Hung Fang, Chien-Wei Chen.
Abstract
BACKGROUND: This study proposes a prediction model for the automatic assessment of lung cancer risk based on an artificial neural network (ANN) with a data-driven approach to the low-dose computed tomography (LDCT) standardized structure report.
Keywords: Data visualization; Early detection of cancer; Machine learning; Receiver operating characteristic (ROC) curves; Sensitivity and specificity
Year: 2020 PMID: 33092589 PMCID: PMC7579928 DOI: 10.1186/s12885-020-07465-1
Source DB: PubMed Journal: BMC Cancer ISSN: 1471-2407 Impact factor: 4.430
Fig. 1 Flow diagram
Clinical descriptors of the derivation and validation cohorts at the baseline
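| Variable | Derivation cohort (n = 602): Cancer (n = 19) | Derivation cohort (n = 602): Control c (n = 583) | Validation cohort (n = 234): Cancer (n = 8) | Validation cohort (n = 234): Control c (n = 226) | P-value b |
|---|---|---|---|---|---|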
| Sex a | |||||
| Male | 7 (36.84%) | 236 (40.48%) | 2 (25.00%) | 111 (49.12%) | 0.038 |
| Female | 12 (63.16%) | 347 (59.52%) | 6 (75.00%) | 115 (50.88%) | |
| Age (y) a | 64.89 ± 7.53 | 61.87 ± 6.42 | 57.63 ± 8.73 | 61.05 ± 7.88 | 0.053 |
| LDCT parameters | |||||
| Dose (mSv) a | 1.95 ± 0.64 | 1.46 ± 0.24 | 1.78 ± 0.45 | 1.49 ± 0.26 | 0.161 |
| DLP (mGy.cm) a | 75.53 ± 32.54 | 49.17 ± 11.08 | 64.50 ± 24.43 | 50.77 ± 10.72 | 0.206 |
| Pattern of nodules | |||||
| Nodules of interest a | 2.42 (1–7) | 1.11 (0–32) | 1.88 (1–7) | 1.29 (0–8) | 0.330 |
| Number of involved lobes a | 1.68 (1–3) | 0.75 (0–5) | 1.38 (1–4) | 0.99 (0–5) | 0.007 |
| Size of nodules (mm) | |||||
| Solid nodule a | 10.01 (0–136.00) | 1.49 (0–19.80) | 0.63 (0–5.00) | 1.80 (0–36.75) | 1.000 |
| PS nodule a | 3.89 (0–20.40) | 0.38 (0–11.95) | 1.77 (0–4.90) | 0.54 (0–7.30) | 0.498 |
| GGN a | 8.87 (0–31.00) | 0.58 (0–23.30) | 4.56 (0–10.30) | 0.24 (0–9.05) | 0.038 |
| Calcified nodule a | 0.00 (0) | 0.39 (0–19.25) | 0.86 (0–6.90) | 0.55 (0–7.05) | 0.100 |
| Fat-containing nodule a | 0.00 (0) | 0.05 (0–28.15) | 0.00 (0) | 0.00 (0) | 0.506 |
| Intra-pulmonary findings | |||||
| Linear atelectasis a | 10 (52.63%) | 431 (73.93%) | 2 (25.00%) | 108 (47.79%) | < 0.001 |
| Plate-like atelectasis a | 5 (26.32%) | 73 (12.52%) | 0 (0.00%) | 19 (8.41%) | 0.050 |
| Plate-like GGN a | 2 (10.53%) | 143 (24.53%) | 1 (12.50%) | 39 (17.26%) | 0.029 |
| Bronchiectasis a | 0 (0.00%) | 39 (6.69%) | 1 (12.50%) | 8 (3.54%) | 0.143 |
| Emphysema a | 1 (5.26%) | 51 (8.75%) | 2 (25.00%) | 28 (12.39%) | 0.068 |
| Fibrotic change a | 2 (10.53%) | 154 (26.42%) | 0 (0.00%) | 42 (18.58%) | 0.015 |
| Extra-pulmonary findings | |||||
| Mediastinal tumour a | 4 (21.05%) | 30 (5.15%) | 1 (12.50%) | 8 (3.54%) | 0.290 |
| Thyroid nodule a | 1 (5.26%) | 19 (3.26%) | 0 (0.00%) | 2 (0.88%) | 0.045 |
| Adrenal nodule a | 1 (5.26%) | 5 (0.86%) | 0 (0.00%) | 0 (0.00%) | 0.125 |
| Hepatic nodule a | 1 (5.26%) | 67 (11.49%) | 0 (0.00%) | 20 (8.85%) | 0.245 |
| Renal nodule a | 0 (0.00%) | 16 (2.74%) | 0 (0.00%) | 10 (4.42%) | 0.229 |
| Lung-RADS | |||||
| Category 1 | 0 (0.00%) | 323 (55.40%) | 0 (0.00%) | 115 (50.89%) | 0.240 |
| Category 2 | 6 (31.58%) | 222 (38.08%) | 7 (87.50%) | 102 (45.13%) | 0.021 |
| Category 3 | 5 (26.32%) | 31 (5.32%) | 1 (12.50%) | 6 (2.65%) | 0.080 |
| Category 4 | 8 (42.10%) | 7 (1.20%) | 0 (0.00%) | 3 (1.33%) | 0.279 |
a The 22 input features for developing the ANN
b Comparison of the derivation and validation cohorts; P-values less than 0.05 indicate statistical significance
c Participants who did not have confirmed lung cancer prior to the index date were labelled as controls
BMI body mass index; DLP dose length product; GGN ground-glass nodule; PS nodule, part-solid nodule
Values are given as mean ± SD, mean (range), or n (%)
Fig. 2 Structure of an ANN
Contingency table for the Lung-RADS and ANN models (n = 234)
| Scale/model | Lung-RADS: No | Lung-RADS: Yes | Lung-RADS: Sum | ANN: No | ANN: Yes | ANN: Sum |
|---|---|---|---|---|---|---|
| Control a | 217 | 9 | 226 | 192 | 34 | 226 |
| Lung cancer | 7 | 1 | 8 | 2 | 6 | 8 |
| Sum | 224 | 10 | 234 | 194 | 40 | 234 |
a Participants who did not have confirmed lung cancer prior to the index date were labelled as controls
Fig. 3 ROC curves for the Lung-RADS and ANN model
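For an ordinal score such as the Lung-RADS category, the area under the ROC curve can be recovered from category counts alone: it equals the probability that a randomly chosen cancer case has a higher category than a randomly chosen control, with ties counted as one half. The sketch below applies that identity to the validation-cohort counts listed in the baseline table. The function name `auc_from_ordinal_counts` is illustrative, and this is not necessarily how the authors computed their AUC.

```python
# Validation-cohort participants per Lung-RADS category (1-4),
# copied from the baseline table.
cancer = {1: 0, 2: 7, 3: 1, 4: 0}
control = {1: 115, 2: 102, 3: 6, 4: 3}


def auc_from_ordinal_counts(pos, neg):
    """AUC as P(pos score > neg score) + 0.5 * P(tie) over all case-control pairs."""
    concordant = 0.0
    for cat_pos, n_pos in pos.items():
        for cat_neg, n_neg in neg.items():
            if cat_pos > cat_neg:
                concordant += n_pos * n_neg        # case ranked above control
            elif cat_pos == cat_neg:
                concordant += 0.5 * n_pos * n_neg  # tie counts as half
    return concordant / (sum(pos.values()) * sum(neg.values()))


auc = auc_from_ordinal_counts(cancer, control)
print(round(auc, 3))  # 0.764
```

The result agrees with the Lung-RADS AUC of 0.764 reported in the performance analysis. The ANN AUC cannot be checked the same way because the per-participant ANN scores are not listed here.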
Performance analysis for the Lung-RADS and ANN models (n = 234)
| Scale/model | Lung-RADS | ANN |
|---|---|---|
| Cut-off | Category 3 | > 0.012 |
| AUC (95% CI) | 0.764 (0.705, 0.817) | 0.873 (0.823, 0.913) |
| Classification accuracy (%) | 93.16 | 84.62 |
| Sensitivity, % (95% CI) | 12.50 (0.3, 52.7) | 75.00 (34.9, 96.8) |
| Specificity, % (95% CI) | 96.02 (92.6, 98.2) | 84.96 (79.6, 89.4) |
| PPV, % (95% CI) | 10.0 (1.6, 43.7) | 15.0 (9.6, 22.6) |
| NPV, % (95% CI) | 96.9 (96.0, 97.6) | 99.0 (96.7, 99.7) |
| LR+ (95% CI) | 3.14 (0.5, 21.9) | 4.99 (3.0, 8.3) |
| LR- (95% CI) | 0.91 (0.7, 1.2) | 0.29 (0.1, 1.0) |
AUC area under the curve; CI confidence interval; LR+ positive likelihood ratio; LR− negative likelihood ratio; NPV negative predictive value; PPV positive predictive value
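The point estimates in this table follow directly from the contingency table for the validation cohort (n = 234). A minimal sketch using the standard definitions is given below; the helper name `diagnostic_metrics` is illustrative, and the confidence intervals are not reproduced because the interval method is not stated here.

```python
def diagnostic_metrics(tp, fn, fp, tn):
    """Standard 2x2 diagnostic accuracy measures, returned as proportions."""
    sens = tp / (tp + fn)
    spec = tn / (tn + fp)
    return {
        "accuracy": (tp + tn) / (tp + fn + fp + tn),
        "sensitivity": sens,
        "specificity": spec,
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
        "lr_pos": sens / (1 - spec),   # positive likelihood ratio
        "lr_neg": (1 - sens) / spec,   # negative likelihood ratio
    }


# Counts read from the contingency table: "Yes" = test positive.
lung_rads = diagnostic_metrics(tp=1, fn=7, fp=9, tn=217)
ann = diagnostic_metrics(tp=6, fn=2, fp=34, tn=192)

for name, m in (("Lung-RADS", lung_rads), ("ANN", ann)):
    print(name, {k: round(v, 4) for k, v in m.items()})
```

With these counts the ANN misses 2 of 8 cancers at the cost of 34 false positives, whereas Lung-RADS misses 7 of 8 with 9 false positives, which is why Lung-RADS shows the higher classification accuracy despite its much lower sensitivity.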
Fig. 4 Plot of the permutation feature importance scores of the ANN model
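Permutation feature importance, in its generic form, measures how much a model's score drops when the values of one input feature are shuffled across participants. The sketch below illustrates the idea on a toy classifier. Everything in it is an assumption made for illustration: the function `permutation_importance`, the accuracy metric, the number of repeats, and the two-feature toy data are hypothetical and do not reproduce the authors' ANN or their scoring choice.

```python
import random


def accuracy(y_true, y_pred):
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def permutation_importance(predict, X, y, metric, n_repeats=10, seed=0):
    """Mean drop in `metric` after shuffling each feature column in turn."""
    rng = random.Random(seed)
    baseline = metric(y, [predict(row) for row in X])
    importances = []
    for j in range(len(X[0])):
        drops = []
        for _ in range(n_repeats):
            column = [row[j] for row in X]
            rng.shuffle(column)  # break the link between feature j and the outcome
            X_perm = [row[:j] + [v] + row[j + 1:] for row, v in zip(X, column)]
            permuted = metric(y, [predict(row) for row in X_perm])
            drops.append(baseline - permuted)
        importances.append(sum(drops) / n_repeats)
    return importances


# Hypothetical toy model that uses only feature 0 and ignores feature 1.
def toy_predict(row):
    return 1 if row[0] > 0.5 else 0


X = [[i / 20, (i * 7 % 20) / 20] for i in range(20)]
y = [toy_predict(row) for row in X]

imp = permutation_importance(toy_predict, X, y, accuracy)
print(imp)  # feature 0 has a positive score; feature 1 scores exactly 0.0
```

A feature the model ignores gets a score of zero, while a feature the model relies on gets a positive score, which is how a plot like Fig. 4 ranks the input features.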