Yeshwant Reddy Chillakuru1,2, Kyle Kranen1, Vishnu Doppalapudi1, Zhangyuan Xiong1, Letian Fu1, Aarash Heydari1, Aditya Sheth1, Youngho Seo1, Thienkhai Vu1, Jae Ho Sohn3.
Abstract
BACKGROUND: Reidentification of prior nodules for temporal comparison is an important but time-consuming step in lung cancer screening. We develop and evaluate an automated nodule detector that utilizes the axial-slice number of nodules found in radiology reports to generate high precision nodule predictions.
Keywords: Deep learning; Lung cancer; Lung nodule; Machine learning; Nodule detection
Year: 2021 PMID: 33836677 PMCID: PMC8034095 DOI: 10.1186/s12880-021-00594-4
Source DB: PubMed Journal: BMC Med Imaging ISSN: 1471-2342 Impact factor: 1.930
Fig. 1 Cohort selection. Cohort selection for the train/validation data and the test data. Training data and test data were collected from two different sources. “n” refers to the number of CT scans. Each CT scan originates from a different patient
Fig. 2 Algorithm schema. Training and inference pipeline for slice-assisted nodule detection. In training, 2-dimensional axial and coronal maximum intensity projection (MIP) slices from each CT are input into a Retinanet model. Inference adds three additional steps to the Retinanet raw inferences: unsupervised clustering, false positive reduction using clustering metadata (max and mean confidence scores, whether the cluster contains both axial and coronal raw inferences, the number of raw inferences clustered together, and distance from top and bottom of the CT in mm)
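The clustering metadata listed in the caption can be illustrated with a minimal sketch. The `RawInference` record, the `cluster_features` helper, and all field names below are hypothetical; the source does not describe how the features are stored or which classifier consumes them.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class RawInference:
    """One raw detector box (hypothetical record layout)."""
    view: str          # "axial" or "coronal"
    confidence: float  # detector confidence score in [0, 1]
    z_mm: float        # distance of the box centre from the top of the CT, in mm


def cluster_features(cluster, scan_height_mm):
    """Summarize one cluster of raw inferences with the metadata named in the caption."""
    scores = [inf.confidence for inf in cluster]
    views = {inf.view for inf in cluster}
    z_center = mean(inf.z_mm for inf in cluster)
    return {
        "max_confidence": max(scores),
        "mean_confidence": mean(scores),
        "has_axial_and_coronal": {"axial", "coronal"} <= views,
        "n_raw_inferences": len(cluster),
        "dist_from_top_mm": z_center,
        "dist_from_bottom_mm": scan_height_mm - z_center,
    }


# Toy cluster: two axial boxes and one coronal box around the same location.
cluster = [
    RawInference("axial", 0.9, 100.0),
    RawInference("axial", 0.7, 102.0),
    RawInference("coronal", 0.8, 101.0),
]
features = cluster_features(cluster, scan_height_mm=300.0)
```

A feature dictionary of this kind could then be passed to any tabular classifier for false positive reduction; which model the authors used is not stated in this excerpt.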
Fig. 3 Unsupervised clustering of Retinanet raw inferences. Visualization of unsupervised clustering of raw Retinanet inferences using DBSCAN. The diameter of each prediction corresponds to the Retinanet inference confidence score. Higher density clusters containing both axial and coronal predictions with high confidence scores are more likely to be real nodules. The three large clusters in (b) (green, purple, and dark teal) were true nodules
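To make the clustering step concrete, here is a minimal pure-Python sketch of DBSCAN applied to the 3D centres of raw inferences. It is a teaching implementation with brute-force neighbour search, not the authors' code; the `eps` and `min_samples` values and the toy coordinates are assumptions, since the excerpt does not report the parameters used.

```python
import math


def dbscan(points, eps, min_samples):
    """Return one label per point: 0, 1, ... for clusters and -1 for noise."""
    n = len(points)
    # Brute-force neighbourhoods; each point counts as its own neighbour.
    neighbours = [
        [j for j in range(n) if math.dist(points[i], points[j]) <= eps]
        for i in range(n)
    ]
    labels = [None] * n
    cluster_id = -1
    for i in range(n):
        if labels[i] is not None:
            continue
        if len(neighbours[i]) < min_samples:
            labels[i] = -1  # provisional noise; may become a border point later
            continue
        cluster_id += 1
        labels[i] = cluster_id
        queue = list(neighbours[i])
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster_id  # noise reachable from a core point: border
            if labels[j] is not None:
                continue
            labels[j] = cluster_id
            if len(neighbours[j]) >= min_samples:
                queue.extend(neighbours[j])  # j is a core point: keep expanding
    return labels


# Toy (x, y, z) box centres in mm: two dense groups and one isolated detection.
centres = [
    (10, 10, 50), (11, 10, 50), (10, 11, 51),
    (80, 40, 120), (81, 41, 120), (80, 40, 121),
    (200, 200, 10),
]
labels = dbscan(centres, eps=3.0, min_samples=3)
```

In this toy input the isolated detection is labelled as noise, which mirrors the caption's point that dense clusters are the more plausible nodule candidates.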
NLST test set patient demographics
| Variable | Missed/incorrect/unintended nodule predictions^a | Correct nodule predictions^a | p value | Totals |
|---|---|---|---|---|
| Patients | 15 (100%) | 32 (100%) | – | 47 (100%) |
| Age | | | 0.586 | |
| (mean ± SD) | 61.07 ± 4.74 | 61.91 ± 5.12 | | 61.64 ± 4.97 |
| Sex | | | 0.806 | |
| Female | 5 (33%) | 8 (25%) | | 13 (28%) |
| Male | 10 (67%) | 24 (75%) | | 34 (72%) |
| Race | | | 0.180 | |
| White | 12 (80%) | 31 (97%) | | 43 (91%) |
| Black | 1 (7%) | 0 (0%) | | 1 (2%) |
| Asian | 1 (7%) | 1 (3%) | | 2 (4%) |
| > 1 Race | 1 (7%) | 0 (0%) | | 1 (2%) |
| BMI | | | 0.256 | |
| (mean ± SD) | 29.72 ± 6.15 | 27.56 ± 5.35 | | 28.25 ± 5.64 |
| Smoker at start of NLST | | | 0.468 | |
| Yes | 6 (40%) | 18 (56%) | | 24 (51%) |
| No (former smoker) | 9 (60%) | 14 (44%) | | 23 (49%) |
| Cigarettes/day | | | 0.251 | |
| (mean ± SD) | 32.67 ± 20.08 | 26.22 ± 9.00 | | 28.28 ± 13.65 |
| Smoking total years | | | 0.978 | |
| (mean ± SD) | 39.87 ± 8.68 | 39.94 ± 7.03 | | 39.76 ± 7.41 |
| Smoking pack years | | | 0.320 | |
| (mean ± SD) | 63.53 ± 54.91 | 51.62 ± 17.72 | | 56.37 ± 34.33 |
Adjusted nodule performance with the highest recall score at a 20 mm distance threshold was used to split missed/incorrect/unintended and correct nodule predictions
SD: standard deviation
^a For patients with > 1 nodule, the patient was classified as a correct prediction if at least one nodule was correctly identified
NLST test set nodule characteristics
| Nodule variable | Missed/incorrect/unintended nodule predictions | Correct nodule predictions | p value | Totals |
|---|---|---|---|---|
| N | 35 (100%) | 54 (100%) | – | 89 (100%) |
| Location | | | 0.307 | |
| Left lower lobe | 7 (20%) | 10 (19%) | | 17 (19%) |
| Left upper lobe | 6 (17%) | 7 (13%) | | 13 (15%) |
| Lingula | 0 (0%) | 6 (11%) | | 6 (7%) |
| Right lower lobe | 12 (34%) | 12 (22%) | | 24 (27%) |
| Right middle lobe | 4 (11%) | 10 (19%) | | 14 (16%) |
| Right upper lobe | 6 (17%) | 9 (17%) | | 15 (17%) |
| Central versus peripheral | | | 0.332 | |
| Central | 3 (9%) | 1 (2%) | | 4 (4%) |
| Peripheral | 32 (91%) | 53 (98%) | | 85 (96%) |
| Subpleural versus parenchymal | | | 0.807 | |
| Subpleural | 19 (54%) | 32 (59%) | | 51 (57%) |
| Parenchymal | 16 (46%) | 22 (41%) | | 38 (43%) |
| Margins | | | 0.254 | |
| Smooth | 27 (77%) | 46 (85%) | | 73 (82%) |
| Poorly defined | 5 (14%) | 5 (9%) | | 10 (11%) |
| Spiculated | 1 (3%) | 3 (6%) | | 4 (4%) |
| Unable to determine | 2 (6%) | 0 (0%) | | 2 (2%) |
| Diameter (mm) | | | 0.752 | |
| (mean ± SD) | 4.77 ± 2.78 | 4.94 ± 1.75 | | 4.88 ± 2.26 |
| Attenuation | | | 0.028* | |
| Soft tissue | 27 (77%) | 51 (94%) | | 78 (88%) |
| Ground glass | 3 (9%) | 1 (2%) | | 4 (4%) |
| Mixed | 1 (3%) | 2 (4%) | | 3 (3%) |
| Unable to determine | 4 (11%) | 0 (0%) | | 4 (4%) |
Adjusted nodule performance with the highest recall score at a 20 mm distance threshold was used to split missed/incorrect/unintended and correct nodule predictions
SD: standard deviation
* p < 0.05. The chi-square test was used for categorical variables and Welch’s t-test for continuous variables to test for a difference between correct and missed/incorrect nodule predictions
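As a worked illustration of Welch’s t-test, the sketch below computes the test statistic and the Welch-Satterthwaite degrees of freedom from summary statistics, using the diameter row of the nodule table (4.77 ± 2.78 mm, n = 35 versus 4.94 ± 1.75 mm, n = 54). The `welch_t` helper is hypothetical, and the sketch stops short of the p value, which requires the Student t distribution from a statistics library.

```python
import math


def welch_t(mean1, sd1, n1, mean2, sd2, n2):
    """Welch's t statistic and degrees of freedom from group summaries."""
    v1 = sd1 ** 2 / n1  # squared standard error of group 1
    v2 = sd2 ** 2 / n2  # squared standard error of group 2
    t = (mean1 - mean2) / math.sqrt(v1 + v2)
    # Welch-Satterthwaite approximation for unequal variances.
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t, df


# Diameter (mm): missed/incorrect/unintended group versus correct group.
t_stat, df = welch_t(4.77, 2.78, 35, 4.94, 1.75, 54)
print(f"t = {t_stat:.3f}, df = {df:.1f}")
```

A t statistic this close to zero is in line with the non-significant p value reported for diameter.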
Nodule detector performance
| Performance metric | 10 mm distance threshold | 20 mm distance threshold |
|---|---|---|
| Nodule confidence score ≥ 0.50 | | |
| Precision | 0.962 | 0.931 |
| Recall | 0.573 | 0.607 |
| FPs/scan | 0.040 | 0.080 |
| Adjusted precision^a | 0.943 | 0.862 |
| Adjusted recall^a | 0.561 | 0.562 |
| Adjusted FPs/scan^a | 0.060 | 0.160 |
| Nodule confidence score ≥ 0.20 | | |
| Precision | 0.889 | 0.870 |
| Recall | 0.629 | 0.674 |
| FPs/scan | 0.140 | 0.180 |
| Adjusted precision^a | 0.841 | 0.783 |
| Adjusted recall^a | 0.596 | 0.607 |
| Adjusted FPs/scan^a | 0.200 | 0.300 |
^a Adjusted precision, recall, and FPs/scan count only predictions on the intended nodule as true positives (e.g., if a calcified nodule was predicted but the ground truth NLST label specified a ground glass nodule, this was recorded as an incorrect prediction)
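The three unadjusted metrics in the table can be computed as in the following minimal sketch. It assumes that the distance threshold is the maximum allowed distance between a predicted location and a labelled nodule for the prediction to count as a true positive; this excerpt does not define the threshold, so treat that reading, the `evaluate` helper, and the toy coordinates as assumptions.

```python
import math


def evaluate(predictions, truths, n_scans, max_dist_mm):
    """predictions and truths are lists of (scan_id, x_mm, y_mm, z_mm) tuples."""
    matched = set()  # indices of ground truth nodules already claimed
    tp = fp = 0
    for pred in predictions:
        hit = None
        for idx, truth in enumerate(truths):
            same_scan = truth[0] == pred[0]
            if same_scan and idx not in matched and math.dist(pred[1:], truth[1:]) <= max_dist_mm:
                hit = idx
                break
        if hit is None:
            fp += 1
        else:
            matched.add(hit)  # each nodule can be matched at most once
            tp += 1
    return {
        "precision": tp / (tp + fp) if tp + fp else 0.0,
        "recall": tp / len(truths) if truths else 0.0,
        "fps_per_scan": fp / n_scans,
    }


# Toy data: two scans, three labelled nodules, three predictions.
truths = [("s1", 50, 50, 100), ("s1", 120, 80, 200), ("s2", 60, 60, 150)]
predictions = [("s1", 52, 51, 104), ("s1", 120, 80, 215), ("s2", 10, 10, 10)]
at_10mm = evaluate(predictions, truths, n_scans=2, max_dist_mm=10)
at_20mm = evaluate(predictions, truths, n_scans=2, max_dist_mm=20)
```

The adjusted variants described in the footnote would additionally require a check that the matched nodule is the intended one, which depends on label information not modelled here.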
Fig. 4 Free-response receiver operating characteristic at 10 mm and 20 mm thresholds. Low false positive (FP) rates were observed with axial-slice assisted selection of nodules. * Adjusted to count only predictions on the intended nodule as true positives (e.g., if a calcified nodule was predicted but the ground truth NLST label specified a ground glass nodule, this was recorded as an incorrect prediction)
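A free-response ROC curve plots recall against false positives per scan as the confidence threshold is swept. The sketch below shows one way to produce such points; the `froc_points` helper and the toy predictions are hypothetical, and it assumes each true positive corresponds to a distinct labelled nodule.

```python
def froc_points(scored_predictions, n_truths, n_scans, thresholds):
    """scored_predictions is a list of (confidence, is_true_positive) pairs.

    Returns one (FPs per scan, recall) point per confidence threshold.
    """
    points = []
    for thr in thresholds:
        kept = [is_tp for conf, is_tp in scored_predictions if conf >= thr]
        tp = sum(kept)        # True counts as 1
        fp = len(kept) - tp
        points.append((fp / n_scans, tp / n_truths))
    return points


# Toy predictions over 2 scans containing 5 labelled nodules in total.
preds = [
    (0.95, True), (0.80, True), (0.60, False),
    (0.45, True), (0.30, False), (0.25, True),
]
curve = froc_points(preds, n_truths=5, n_scans=2, thresholds=[0.5, 0.2])
```

Lowering the threshold from 0.5 to 0.2 moves the operating point up and to the right: more nodules are recovered at the cost of more false positives per scan, the same trade-off visible between the two confidence thresholds in the performance table.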
Fig. 5 Prediction examples. The top row (a–d) shows correctly identified nodules outlined in green: a ground glass nodule (a) and several soft tissue nodules (b–d). The bottom row (e–h) shows missed nodules and false positives, with yellow crosshairs marking the correct nodules. e FP prediction outlined in red. f Calcified nodule identified (green box) instead of the ground glass nodule (yellow crosshair). g Small soft tissue nodule missed. h Small subpleural soft tissue nodule missed
Fig. 6 Clinical integration schema. Integration of the high precision nodule detector into clinical workflow. When reviewing a previously annotated lung cancer screening CT, the nodule detector will automatically highlight nodules using axial slice numbers derived from the radiology report
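The slice-assisted selection in this workflow can be sketched as two small steps: pull axial slice numbers out of the report text, then keep only detector candidates near those slices. The regular expression, the report phrasing, the `tolerance` of two slices, and both helper names are assumptions for illustration; the excerpt does not describe how slice numbers are extracted or matched.

```python
import re

# Hypothetical pattern for phrases such as "image 45" or "slice 112".
SLICE_PATTERN = re.compile(r"\b(?:image|slice|img)\s*#?\s*(\d+)", re.IGNORECASE)


def extract_axial_slices(report_text):
    """Return the slice numbers mentioned in a free-text radiology report."""
    return [int(number) for number in SLICE_PATTERN.findall(report_text)]


def select_candidates(candidates, slice_numbers, tolerance=2):
    """Keep (axial_slice, confidence) candidates within `tolerance` slices of a reported slice."""
    return [
        cand for cand in candidates
        if any(abs(cand[0] - s) <= tolerance for s in slice_numbers)
    ]


report = (
    "4 mm nodule in the right upper lobe (series 3, image 45). "
    "Stable 6 mm nodule, slice 112."
)
slices = extract_axial_slices(report)
candidates = [(44, 0.91), (80, 0.97), (113, 0.55), (200, 0.40)]
selected = select_candidates(candidates, slices)
```

Note that the high-confidence candidate on slice 80 is dropped because no reported nodule lies near it, which is how restricting predictions to reported slices can keep the false positive rate low.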