Sun Wook Cho1, Jin Young Kwak2, Inyoung Youn3, Eunjung Lee4, Jung Hyun Yoon5, Hye Sun Lee6, Mi-Ri Kwon3, Juhee Moon3, Sunyoung Kang7, Seul Ki Kwon7, Kyong Yeun Jung8, Young Joo Park7, Do Joon Park7.
Abstract
This study compared the diagnostic performance of physicians and a deep convolutional neural network (CNN) in predicting malignancy from ultrasonography (US) images of thyroid nodules with atypia of undetermined significance (AUS)/follicular lesion of undetermined significance (FLUS) results on fine-needle aspiration (FNA). The study included 202 patients with 202 nodules ≥ 1 cm that were diagnosed as AUS/FLUS on FNA and who underwent surgery at one of 3 institutions. Diagnostic performance was compared between 8 physicians (4 radiologists, 4 endocrinologists) with varying experience levels and the CNN, and the AUS and FLUS subgroups were analyzed separately. Interobserver variability was assessed among the 8 physicians. Of the 202 nodules, 158 were AUS and 44 were FLUS; 86 were benign and 116 were malignant. The areas under the curve (AUCs) of the 8 physicians and the CNN were 0.680–0.722 and 0.666, respectively, without significant differences (P > 0.05). In the subgroup analysis, the AUCs of the 8 physicians and the CNN were 0.657–0.768 and 0.652 for AUS, and 0.469–0.674 and 0.622 for FLUS. Interobserver agreement was moderate (κ = 0.543) among the 8 physicians, substantial (κ = 0.652) among the 4 radiologists, and moderate (κ = 0.455) among the 4 endocrinologists. For thyroid nodules with AUS/FLUS cytology, the diagnostic performance of the CNN in differentiating malignancy on US images was comparable to that of physicians with variable experience levels.
Year: 2021 PMID: 34625636 PMCID: PMC8501016 DOI: 10.1038/s41598-021-99622-0
Source DB: PubMed Journal: Sci Rep ISSN: 2045-2322 Impact factor: 4.379
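The agreement labels in the abstract (moderate, substantial) match the widely used Landis and Koch bands for kappa. The source does not state which kappa statistic was computed. The sketch below assumes Fleiss' kappa, a common choice for more than two raters, and runs on small made-up rating counts. The names `fleiss_kappa` and `landis_koch_label` are illustrative only.

```python
from typing import Sequence


def fleiss_kappa(counts: Sequence[Sequence[int]]) -> float:
    """Fleiss' kappa for a subjects x categories table of rating counts.

    counts[i][j] = number of raters who assigned subject i to category j.
    Every subject must be rated by the same number of raters.
    """
    n_subjects = len(counts)
    n_raters = sum(counts[0])
    if any(sum(row) != n_raters for row in counts):
        raise ValueError("each subject needs the same number of ratings")

    # Mean observed pairwise agreement across subjects.
    p_bar = sum(
        (sum(c * c for c in row) - n_raters) / (n_raters * (n_raters - 1))
        for row in counts
    ) / n_subjects

    # Chance agreement from the overall category proportions.
    n_categories = len(counts[0])
    p_e = sum(
        (sum(row[j] for row in counts) / (n_subjects * n_raters)) ** 2
        for j in range(n_categories)
    )
    return (p_bar - p_e) / (1 - p_e)


def landis_koch_label(kappa: float) -> str:
    """Map a kappa value to the Landis and Koch (1977) agreement bands."""
    if kappa < 0:
        return "poor"
    bands = [(0.20, "slight"), (0.40, "fair"), (0.60, "moderate"),
             (0.80, "substantial"), (1.00, "almost perfect")]
    for upper, label in bands:
        if kappa <= upper:
            return label
    raise ValueError("kappa cannot exceed 1")


# Hypothetical data: 8 raters, 4 nodules, categories (benign, malignant).
toy_counts = [[8, 0], [6, 2], [1, 7], [0, 8]]
toy_kappa = fleiss_kappa(toy_counts)
print(round(toy_kappa, 3), landis_koch_label(toy_kappa))

# The reported values fall in the bands named in the abstract.
for reported in (0.543, 0.652, 0.455):
    print(reported, landis_koch_label(reported))
```

The band lookup only reproduces the verbal labels. It says nothing about how the study's kappa values were estimated.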
Summary of the demographic features.
| Characteristic | Total | Benign | Malignancy | P value |
|---|---|---|---|---|
| Numbers of nodules | 202 | 86 (42.6%) | 116 (57.4%) | |
| Sex | | | | 0.416 |
| Male | 48 | 18 (20.9%) | 30 (25.9%) | |
| Female | 154 | 68 (79.1%) | 86 (74.1%) | |
| Mean age (years)ᵃ | | 47.9 ± 13.3 | 47.0 ± 14.8 | 0.669 |
| FNA cytology | | | | < 0.001 |
| AUS | 158 | 50 (58.1%) | 108 (93.1%) | |
| Institution Aᵇ | 78 | 23 | 55 | |
| Institution Bᵇ | 43 | 14 | 29 | |
| Institution Cᵇ | 37 | 13 | 24 | |
| FLUS | 44 | 36 (41.9%) | 8 (6.9%) | |
| Institution Aᵇ | 34 | 29 | 5 | |
| Institution Bᵇ | 1 | 1 | 0 | |
| Institution Cᵇ | 9 | 6 | 3 | |
| Median size (IQR, mm)ᶜ | | 19.5 (13–32) | 13.5 (11–23) | 0.009 |
| Median cancer probability calculated by CNN (IQR, %)ᶜ | | 36.5 (18.7–69.5) | 67.7 (30.2–89.9) | < 0.001 |
AUS atypia of undetermined significance, FLUS follicular lesion of undetermined significance, IQR interquartile range, CNN deep convolutional neural network.
ᵃ Independent two-sample t-test.
ᵇ Consecutive patients were collected from three institutions; the number of patients recruited from each hospital is listed under Institution A, B, and C.
ᶜ Mann–Whitney U test.
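The demographic table does not say which test produced the P values for the male/female comparison or for the AUS/FLUS split. As a plausibility check, the sketch below assumes a Pearson chi-square test on the 2 × 2 counts without continuity correction. That choice is an assumption, not something the source confirms. It uses only the standard library, and `chi2_2x2` is a hypothetical helper name.

```python
import math
from typing import Tuple


def chi2_2x2(a: int, b: int, c: int, d: int) -> Tuple[float, float]:
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]].

    Returns (statistic, p_value) for 1 degree of freedom.
    """
    n = a + b + c + d
    stat = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # With 1 degree of freedom the chi-square survival function is erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(stat / 2))
    return stat, p_value


# Rows: male, female; columns: benign, malignant (counts from the table).
sex_stat, sex_p = chi2_2x2(18, 30, 68, 86)

# Rows: AUS, FLUS; columns: benign, malignant (counts from the table).
cyto_stat, cyto_p = chi2_2x2(50, 108, 36, 8)

print(f"sex: chi2 = {sex_stat:.3f}, P = {sex_p:.3f}")
print(f"cytology: chi2 = {cyto_stat:.1f}, P = {cyto_p:.2e}")
```

Under this assumption the sex comparison gives a P value of about 0.42, close to the reported 0.416, and the cytology comparison falls far below 0.001.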
Pathologic results after surgery.
| Pathologic result | AUS | FLUS | Total |
|---|---|---|---|
| Adenomatous hyperplasia | 22 (44.0) | 12 (33.3) | 34 (39.5) |
| Follicular adenoma | 19 (38.0) | 20 (55.6) | 39 (45.3) |
| Hürthle cell adenoma | 3 (6.0) | 3 (8.3) | 6 (7.0) |
| Noninvasive follicular thyroid neoplasm with papillary-like nuclear features | 3 (6.0) | 1 (2.8) | 4 (4.7) |
| Hyaline trabecular tumor | 1 (2.0) | – | 1 (1.2) |
| Localized fibrosis | 1 (2.0) | – | 1 (1.2) |
| Lymphocytic thyroiditis | 1 (2.0) | – | 1 (1.2) |
| Total benign | 50 | 36 | 86 |
| Papillary thyroid carcinoma | 99 (91.7) | 4 (50.0) | 103 (88.8) |
| Follicular carcinoma | 8 (7.4) | 3 (37.5) | 11 (9.5) |
| Poorly differentiated carcinoma | 1 (0.9) | 1 (12.5) | 2 (1.7) |
| Total malignant | 108 | 8 | 116 |
Data in parentheses are percentages.
AUS atypia of undetermined significance, FLUS follicular lesion of undetermined significance.
Diagnostic performances of the 8 physicians and deep convolutional neural network.
| Reader | Sensitivity (95% CI) | P valueᵃ | Specificity (95% CI) | P valueᵃ | AUC (95% CI) | P valueᵇ |
|---|---|---|---|---|---|---|
| AUS/FLUS (n = 202) | | | | | | |
| R1 | 37.9% (29.1–46.8%) | < 0.001 | 96.5% (92.6–100%) | < 0.001 | 0.709 (0.643–0.776) | 0.279 |
| R2 | 44.8% (35.8–53.9%) | 0.008 | 95.3% (90.9–99.8%) | < 0.001 | 0.717 (0.649–0.784) | 0.187 |
| R3 | 47.4% (38.3–56.5%) | 0.020 | 89.5% (83.1–96.0%) | < 0.001 | 0.688 (0.620–0.757) | 0.568 |
| R4 | 50.0% (40.9–59.1%) | 0.082 | 90.7% (84.6–96.8%) | < 0.001 | 0.722 (0.654–0.789) | 0.145 |
| E1 | 50.9% (41.8–60.0%) | 0.137 | 81.4% (73.3–89.6%) | 0.015 | 0.680 (0.612–0.749) | 0.742 |
| E2 | 39.7% (30.8–48.6%) | 0.001 | 89.5% (83.1–96.0%) | 0.001 | 0.695 (0.629–0.760) | 0.500 |
| E3 | 24.1% (16.4–31.9%) | < 0.001 | 98.8% (96.6–100%) | < 0.001 | 0.709 (0.642–0.775) | 0.305 |
| E4 | 42.2% (33.3–51.2%) | 0.001 | 87.2% (80.2–94.3%) | 0.002 | 0.692 (0.624–0.761) | 0.494 |
| CNN | 59.5% (50.5–68.4%) | | 69.8% (60.1–79.5%) | | 0.666 (0.592–0.740) | |
| AUS (n = 158) | | | | | | |
| R1 | 39.8% (30.6–49.0%) | < 0.001 | 96.0% (90.6–100%) | < 0.001 | 0.732 (0.658–0.806) | 0.111 |
| R2 | 47.2% (37.8–56.6%) | 0.011 | 98.0% (94.2–100%) | < 0.001 | 0.768 (0.699–0.837) | 0.011 |
| R3 | 50.0% (40.6–59.4%) | 0.029 | 86.0% (76.4–95.6%) | 0.008 | 0.698 (0.618–0.778) | 0.336 |
| R4 | 52.8% (43.4–62.2%) | 0.110 | 84.0% (73.8–94.2%) | 0.008 | 0.705 (0.624–0.786) | 0.253 |
| E1 | 52.8% (43.4–62.2%) | 0.128 | 76.0% (64.2–87.8%) | 0.123 | 0.657 (0.574–0.741) | 0.913 |
| E2 | 42.6% (33.3–51.9%) | 0.001 | 86.0% (76.4–95.6%) | 0.008 | 0.685 (0.605–0.765) | 0.525 |
| E3 | 25.0% (16.8–33.2%) | < 0.001 | 98.0% (94.2–100%) | < 0.001 | 0.730 (0.654–0.806) | 0.110 |
| E4 | 44.4% (35.1–53.8%) | 0.002 | 82.0% (71.4–92.6%) | 0.037 | 0.675 (0.590–0.759) | 0.628 |
| CNN | 62.0% (52.9–71.2%) | | 66.0% (52.9–79.1%) | | 0.652 (0.563–0.741) | |
| FLUS (n = 44) | | | | | | |
| R1 | 12.5% (0–35.4%) | 0.046 | 97.2% (91.9–100%) | 0.011 | 0.469 (0.234–0.703) | 0.435 |
| R2 | 12.5% (0–35.4%) | 0.046 | 91.7% (82.6–100%) | 0.119 | 0.634 (0.372–0.895) | 0.902 |
| R3 | 12.5% (0–35.4%) | 0.046 | 94.4% (87.0–100%) | 0.046 | 0.535 (0.313–0.757) | 0.493 |
| R4 | 12.5% (0–35.4%) | 0.046 | 100% (100–100%) | 0.001 | 0.535 (0.290–0.780) | 0.699 |
| E1 | 25.0% (0–55.0%) | 0.128 | 88.9% (78.6–99.2%) | 0.239 | 0.587 (0.371–0.803) | 0.857 |
| E2 | 0% (0–0%) | 0.001 | 94.4% (87.0–100%) | 0.046 | 0.509 (0.320–0.697) | 0.528 |
| E3 | 12.5% (0–35.4%) | 0.046 | 100% (100–100%) | 0.001 | 0.674 (0.465–0.882) | 0.803 |
| E4 | 12.5% (0–35.4%) | 0.046 | 94.4% (87.0–100%) | 0.046 | 0.615 (0.420–0.809) | 0.970 |
| CNN | 62.5% (29.0–96.0%) | | 77.8% (64.2–91.4%) | 0.808 | 0.622 (0.355–0.888) | |
R radiologist, E endocrinologist, CNN deep convolutional neural network, AUS atypia of undetermined significance, FLUS follicular lesion of undetermined significance.
ᵃ Compared with the results of the convolutional neural network (CNN) using a generalized estimating equation.
ᵇ Compared with the results of the CNN using DeLong’s test.
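The intervals in parentheses for sensitivity and specificity are consistent with Wald (normal-approximation) 95% confidence intervals clipped to the 0 to 100% range. The source does not name the interval method, so this is an inference. The counts used below (for example, 44 of 116 malignant nodules for R1) are back-calculated from the reported percentages rather than taken from the source, and `wald_ci` and `as_percent` are illustrative names.

```python
import math
from typing import Tuple


def wald_ci(successes: int, total: int, z: float = 1.96) -> Tuple[float, float, float]:
    """Proportion with a Wald interval, clipped to [0, 1].

    Returns (estimate, lower, upper) as fractions; z = 1.96 gives about 95%.
    """
    p = successes / total
    half_width = z * math.sqrt(p * (1 - p) / total)
    return p, max(0.0, p - half_width), min(1.0, p + half_width)


def as_percent(values: Tuple[float, float, float]) -> Tuple[float, ...]:
    """Round an (estimate, lower, upper) triple to one decimal place in percent."""
    return tuple(round(100 * v, 1) for v in values)


# Back-calculated counts (assumptions, not stated in the source):
# R1 called 44 of 116 malignant nodules malignant and 83 of 86 benign nodules benign;
# the CNN called 69 of 116 malignant nodules malignant.
r1_sensitivity = as_percent(wald_ci(44, 116))
r1_specificity = as_percent(wald_ci(83, 86))
cnn_sensitivity = as_percent(wald_ci(69, 116))

print(r1_sensitivity)
print(r1_specificity)
print(cnn_sensitivity)
```

With these assumed counts the sketch returns 37.9 (29.1 to 46.8), 96.5 (92.6 to 100.0), and 59.5 (50.5 to 68.4), matching the corresponding entries for all nodules. The clipping also explains entries such as 0% (0–0%), where the Wald half-width is zero.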
Figure 1. Comparison of diagnostic performance between the 8 physicians and the CNN using receiver operating characteristic analysis for the combined atypia of undetermined significance (AUS)/follicular lesion of undetermined significance (FLUS) group (A), the AUS-only group (B), and the FLUS-only group (C). Data in parentheses are the AUC results of each physician or the CNN. CNN deep convolutional neural network, AUS atypia of undetermined significance, FLUS follicular lesion of undetermined significance, R radiologist, E endocrinologist.
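The AUC reported for each reader can be read as the probability that a randomly chosen malignant nodule receives a higher score than a randomly chosen benign one, with ties counted as one half. The following is a minimal sketch of that rank-based definition on made-up scores. It is not the study's analysis code, and `auc_from_scores` is a hypothetical name.

```python
from typing import Sequence


def auc_from_scores(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the Mann-Whitney probability P(score_pos > score_neg), ties = 0.5."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative case")

    wins = 0.0
    for pos in positives:
        for neg in negatives:
            if pos > neg:
                wins += 1.0   # malignant case ranked above benign case
            elif pos == neg:
                wins += 0.5   # tie counts as half a win
    return wins / (len(positives) * len(negatives))


# Made-up example: 1 = malignant, 0 = benign; scores are predicted probabilities.
toy_labels = [0, 0, 1, 1]
toy_scores = [0.10, 0.40, 0.35, 0.80]
toy_auc = auc_from_scores(toy_labels, toy_scores)
print(toy_auc)  # 3 of the 4 malignant/benign pairs are ranked correctly -> 0.75
```

The pairwise loop is quadratic in the number of cases, which is acceptable for a cohort of this size; a rank-sum formulation gives the same value more efficiently.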
Figure 2. Diagram of the study group, which included patients from 3 different hospitals. FNA fine-needle aspiration, AUS atypia of undetermined significance, FLUS follicular lesion of undetermined significance.
Figure 3. Deep convolutional neural network (CNN) processing of ultrasonography (US) images of malignant thyroid nodules with atypia of undetermined significance (AUS, A) or follicular lesion of undetermined significance (FLUS, B) results on fine-needle aspiration (FNA). (A) A captured thyroid US image with a yellow square region of interest covering the whole thyroid nodule in a 71-year-old man. The 10 mm thyroid nodule was diagnosed as AUS on US-guided FNA. The cancer probability calculated by the CNN was 90.9%. The patient underwent surgery, and pathology confirmed papillary carcinoma. (B) A captured thyroid US image with a yellow square region of interest covering the whole nodule in a 57-year-old woman. The 12 mm thyroid nodule was diagnosed as FLUS on US-guided FNA. The cancer probability calculated by the CNN was 88.1%. The patient underwent surgery, and pathology confirmed encapsulated angioinvasive follicular carcinoma.