Eunjung Lee, Heonkyu Ha, Hye Jung Kim, Hee Jung Moon, Jung Hee Byon, Sun Huh, Jinwoo Son, Jiyoung Yoon, Kyunghwa Han, Jin Young Kwak.
Abstract
Thyroid nodules are a common clinical problem. Ultrasonography (US) is the main tool used to sensitively diagnose thyroid cancer. Although US is non-invasive and can accurately differentiate benign from malignant thyroid nodules, it is subjective and its results inevitably lack reproducibility. Therefore, to provide objective and reliable information for US assessment, we developed a computer-aided diagnosis (CADx) system that combines convolutional neural networks (CNNs) with machine learning classifiers. The diagnostic performances of 6 radiologists were compared with 3 representative results obtained from the proposed CADx system.
Year: 2019 PMID: 31882683 PMCID: PMC6934479 DOI: 10.1038/s41598-019-56395-x
Source DB: PubMed Journal: Sci Rep ISSN: 2045-2322 Impact factor: 4.379
Diagnostic performances of radiologists and CNNs.
| Reader | TP | FN | FP | TN | Accuracy (%) | Specificity (%) | Sensitivity (%) |
|---|---|---|---|---|---|---|---|
| Faculty 1 | 91 | 9 | 24 | 26 | 78 (70.64, 83.93) | 52 (38.63, 65.08) | 91 (83.58, 95.26) |
| Faculty 2 | 76 | 24 | 2 | 48 | 82.67 (75.8, 87.89) | 96 (85.32, 99) | 76 (66.75, 83.32) |
| Fellow 1 | 63 | 37 | 3 | 47 | 73.33 (65.81, 79.71) | 94 (82.92, 98.06) | 63 (53.19, 71.84) |
| Fellow 2 | 65 | 35 | 4 | 46 | 74 (66.59, 80.25) | 92 (80.81, 96.91) | 65 (55.23, 73.65) |
| Resident 1 | 49 | 51 | 6 | 44 | 62 (53.97, 69.42) | 88 (75.99, 94.44) | 49 (39.37, 58.71) |
| Resident 2 | 63 | 37 | 14 | 36 | 66 (58.37, 72.88) | 72 (57.53, 83) | 63 (53.36, 71.71) |
| CNN 1 | 96 | 4 | 5 | 45 | 94 (88.83, 96.86) | 90 (78.03, 95.8) | 96 (89.82, 98.49) |
| CNN 2 | 94 | 6 | 3 | 47 | 94 (88.83, 96.86) | 94 (82.92, 98.06) | 94 (87.27, 97.28) |
| CNN 3 | 98 | 2 | 7 | 43 | 94 (88.88, 96.85) | 86 (73.28, 93.23) | 98 (92.45, 99.49) |
Note. - Data in parentheses are 95% confidence intervals.
TP = true positive; FN = false negative; FP = false positive; TN = true negative; CNN = deep convolutional neural network; AUC = area under the curve.
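The accuracy, specificity, and sensitivity columns follow directly from the TP/FN/FP/TN counts. The sketch below recomputes the point estimates for two rows of the table; the function name `diagnostic_metrics` is illustrative only, and the confidence intervals are not reproduced because the source does not state how they were calculated.

```python
def diagnostic_metrics(tp, fn, fp, tn):
    """Return accuracy, specificity and sensitivity as percentages."""
    total = tp + fn + fp + tn
    return {
        "accuracy": 100.0 * (tp + tn) / total,
        "specificity": 100.0 * tn / (tn + fp),
        "sensitivity": 100.0 * tp / (tp + fn),
    }


# Counts copied from the Faculty 1 and CNN 2 rows of the table above.
faculty_1 = diagnostic_metrics(tp=91, fn=9, fp=24, tn=26)
cnn_2 = diagnostic_metrics(tp=94, fn=6, fp=3, tn=47)

print(faculty_1)
print(cnn_2)
```

With 100 malignant and 50 benign nodules per reader, sensitivity equals the TP count and specificity is twice the TN count, which is an easy consistency check on every row.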
Performances of fine-tuned CNNs.
| Net | AlexNet | OverFeat | VGG | VGG-verydeep | ResNet | Inception |
|---|---|---|---|---|---|---|
| Acc | 86.7 | 85.3 | 86 | 85.3 | 84 | 86.7 |
| Spe | 88 | 86 | 84 | 74 | 86 | 78 |
| Sen | 86 | 85 | 87 | 91 | 83 | 91 |
| AUC | 90.3 | 88.4 | 89.3 | 90.6 | 90.5 | 88.3 |
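The AUC row can be read as the probability that a randomly chosen malignant nodule receives a higher score than a randomly chosen benign one. The following is a minimal sketch of that rank-based definition on made-up scores; the function name and the toy data are assumptions for illustration, and the source does not say which software computed its AUC values.

```python
def auc_from_scores(scores, labels):
    """AUC as the fraction of (malignant, benign) pairs ranked correctly.

    labels use 1 for malignant and 0 for benign; tied scores count as half.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one case of each class")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical classifier scores, not data from the study.
toy_scores = [0.9, 0.8, 0.4, 0.35, 0.2]
toy_labels = [1, 1, 0, 1, 0]
toy_auc = auc_from_scores(toy_scores, toy_labels)
print(round(100 * toy_auc, 1))
```

Unlike accuracy, specificity, and sensitivity, this quantity does not depend on a decision threshold, which is why two networks with the same accuracy in the table can still differ in AUC.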
Extended features from a single CNN with/without fine-tuning and classification using SVM/RF: ‘Name’ follows the form ‘extracted layer-classifier’ and # denotes the number of features.
| Net | Name | # | Without fine-tuning | | | | With fine-tuning | | | |
|---|---|---|---|---|---|---|---|---|---|---|
| | | | Acc | Spe | Sen | AUC | Acc | Spe | Sen | AUC |
| AlexNet | fc1-SVM | 4096 | 80.0 | 80.0 | 80.0 | 89.2 | 87.3 | 86.0 | 88.0 | 91.2 |
| | fc1-RF | 4096 | 85.3 | 82.0 | 87.0 | 88.4 | 86.0 | 82.0 | 88.0 | 88.7 |
| | fc2-SVM | 4096 | 81.3 | 80.0 | 82.0 | 88.3 | 84.7 | 82.0 | 86.0 | 90.0 |
| | fc2-RF | 4096 | 84.0 | 78.0 | 87.0 | 86.9 | 87.3 | 82.0 | 90.0 | 88.3 |
| | fc1fc2-SVM | 8192 | 82.0 | 80.0 | 83.0 | 89.0 | 85.3 | 84.0 | 86.0 | 90.7 |
| | fc1fc2-RF | 8192 | 86.0 | 82.0 | 88.0 | 86.6 | 87.3 | 84.0 | 89.0 | 88.8 |
| OverFeat | fc1-SVM | 4096 | 78.7 | 74.0 | 81.0 | 86.7 | 84.7 | 82.0 | 86.0 | 90.6 |
| | fc1-RF | 4096 | 81.3 | 78.0 | 83.0 | 84.3 | 87.3 | 84.0 | 89.0 | 89.9 |
| | fc2-SVM | 4096 | 81.3 | 80.0 | 82.0 | 86.8 | 85.3 | 84.0 | 86.0 | 89.6 |
| | fc2-RF | 4096 | 81.3 | 74.0 | 85.0 | 84.8 | 88.0 | 84.0 | 90.0 | 88.4 |
| | fc1fc2-SVM | 8192 | 81.3 | 76.0 | 84.0 | 86.6 | 85.3 | 84.0 | 86.0 | 90.2 |
| | fc1fc2-RF | 8192 | 82.0 | 72.0 | 87.0 | 85.1 | 88.0 | 86.0 | 89.0 | 89.5 |
| VGG | fc1-SVM | 4096 | 79.3 | 82.0 | 78.0 | 86.5 | 84.7 | 80.0 | 87.0 | 90.7 |
| | fc1-RF | 4096 | 84.7 | 80.0 | 87.0 | 86.4 | 89.3 | 86.0 | 91.0 | 90.7 |
| | fc2-SVM | 4096 | 80.7 | 84.0 | 79.0 | 86.1 | 86.0 | 82.0 | 88.0 | 90.6 |
| | fc2-RF | 4096 | 85.3 | 80.0 | 88.0 | 86.8 | 88.0 | 84.0 | 90.0 | 90.8 |
| | fc1fc2-SVM | 8192 | 79.3 | 82.0 | 78.0 | 86.2 | 86.7 | 82.0 | 89.0 | 91.0 |
| | fc1fc2-RF | 8192 | 82.7 | 80.0 | 84.0 | 83.4 | 88.7 | 84.0 | 91.0 | 90.8 |
| VGG-verydeep | fc1-SVM | 4096 | 84.7 | 88.0 | 83.0 | 91.4 | 78.0 | 76.0 | 79.0 | 85.8 |
| | fc1-RF | 4096 | 84.0 | 88.0 | 82.0 | 91.1 | 74.0 | 76.0 | 73.0 | 80.4 |
| | fc2-SVM | 4096 | 84.0 | 88.0 | 82.0 | 91.0 | 72.0 | 76.0 | 70.0 | 81.2 |
| | fc2-RF | 4096 | 85.3 | 90.0 | 83.0 | 89.9 | 69.3 | 74.0 | 67.0 | 75.6 |
| | fc1fc2-SVM | 8192 | 84.7 | 88.0 | 83.0 | 91.1 | 76.0 | 74.0 | 77.0 | 85.9 |
| | fc1fc2-RF | 8192 | 85.3 | 92.0 | 82.0 | 90.6 | 71.3 | 74.0 | 70.0 | 77.9 |
| ResNet | avg-SVM | 2048 | 84.0 | 82.0 | 85.0 | 89.8 | 74.7 | 82.0 | 71.0 | 84.7 |
| | avg-RF | 2048 | 85.3 | 86.0 | 85.0 | 90.9 | 76.7 | 80.0 | 75.0 | 85.6 |
| Inception | avg-SVM | 2048 | 85.3 | 82.0 | 87.0 | 88.3 | 75.3 | 70.0 | 78.0 | 82.9 |
| | avg-RF | 2048 | 84.7 | 72.0 | 91.0 | 87.4 | 76.0 | 68.0 | 80.0 | 78.2 |
Selected CNN features: AlexNet-fc2 with fine-tuning [A], OverFeat-fc2 with fine-tuning [O], VGG-fc1 with fine-tuning [V], VGG-verydeep-fc2 without fine-tuning [Vv], ResNet-avg without fine-tuning [R], Inception-avg without fine-tuning [I].
| Name | SVM classifier | | | | RF classifier | | | |
|---|---|---|---|---|---|---|---|---|
| | Acc | Spe | Sen | AUC | Acc | Spe | Sen | AUC |
| [A] | 84.7 | 82.0 | 86.0 | 90.0 | 87.3 | 82.0 | 90.0 | 88.3 |
| [O] | 85.3 | 84.0 | 86.0 | 89.6 | 88.0 | 84.0 | 90.0 | 88.4 |
| [V] | 84.7 | 80.0 | 87.0 | 90.7 | 89.3 | 86.0 | 91.0 | 90.7 |
| [Vv] | 84.0 | 88.0 | 82.0 | 91.0 | 85.3 | 90.0 | 83.0 | 89.9 |
| [R] | 84.0 | 82.0 | 85.0 | 89.8 | 85.3 | 86.0 | 85.0 | 90.9 |
| [I] | 85.3 | 82.0 | 87.0 | 88.3 | 84.7 | 72.0 | 91.0 | 87.4 |
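Feature concatenation (2 or 3 CNNs) results: a bracketed name denotes feature concatenation using the features from the CNNs listed inside the brackets (for example, [AO] concatenates the [A] and [O] features). An asterisk denotes that the concatenation result is worse than the individual result.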
| Name | SVM classifier | | | | RF classifier | | | |
|---|---|---|---|---|---|---|---|---|
| | Acc | Spe | Sen | AUC | Acc | Spe | Sen | AUC |
| [AO] | 86.7 | *84.0 | 88.0 | 90.6 | 87.3 | 82.0 | 90.0 | 89.7 |
| [AV] | 88.0 | *76.0 | 94.0 | 94.5 | 90.7 | 90.0 | 91.0 | 95.0 |
| [AVv] | 93.3 | 90.0 | 95.0 | 94.1 | 92.7 | 90.0 | 94.0 | 93.9 |
| [AR] | 92.0 | 90.0 | 93.0 | 94.1 | 88.7 | 88.0 | 89.0 | 91.3 |
| [AI] | 88.0 | 86.0 | 89.0 | 92.1 | 90.7 | 82.0 | 95.0 | 91.7 |
| [OV] | 90.0 | 82.0 | 94.0 | 94.8 | 92.0 | 90.0 | 93.0 | 94.8 |
| [OVv] | 90.7 | 90.0 | 91.0 | 94.1 | 92.0 | 92.0 | 92.0 | 94.6 |
| [OR] | 92.0 | 90.0 | 93.0 | 94.3 | 89.3 | 88.0 | 90.0 | 93.2 |
| [OI] | 87.3 | 84.0 | 89.0 | 91.3 | 90.7 | 86.0 | 93.0 | 91.5 |
| [VVv] | 90.7 | 88.0 | 92.0 | 95.9 | 93.3 | 90.0 | 95.0 | 97.5 |
| [VR] | 92.0 | 86.0 | 95.0 | 95.4 | 90.0 | 86.0 | 92.0 | 93.3 |
| [VI] | 87.3 | 80.0 | 91.0 | 93.4 | *88.7 | 78.0 | 94.0 | 92.9 |
| [VvR] | 88.7 | 90.0 | 88.0 | 92.1 | *84.7 | *88.0 | *83.0 | 91.5 |
| [VvI] | *84.0 | *80.0 | *86.0 | 89.4 | 86.7 | *82.0 | *89.0 | 90.4 |
| [RI] | 85.3 | 82.0 | 87.0 | 89.4 | 85.3 | 78.0 | 89.0 | 87.0 |
| [AOV] | 90.7 | 82.0 | 95.0 | 95.0 | 91.3 | 92.0 | 91.0 | 95.1 |
| [AOVv] | 91.3 | 88.0 | 93.0 | 93.9 | 93.3 | 92.0 | 94.0 | 94.1 |
| [AOR] | 92.0 | 90.0 | 93.0 | 94.1 | *86.7 | *82.0 | *89.0 | 91.5 |
| [AOI] | 88.7 | 86.0 | 90.0 | 92.4 | 93.3 | 84.0 | 98.0 | 91.7 |
| [AVVv] | 93.3 | 88.0 | 96.0 | 97.2 | 93.3 | 90.0 | 95.0 | 96.8 |
| [AVR] | 93.3 | 84.0 | 98.0 | 96.8 | 92.0 | 88.0 | 94.0 | 95.8 |
| [AVI] | 92.0 | 86.0 | 95.0 | 94.3 | 92.7 | 80.0 | 99.0 | 93.5 |
| [AVvR] | 93.3 | 90.0 | 95.0 | 94.2 | 92.7 | 90.0 | 94.0 | 93.8 |
| [AVvI] | 90.0 | 88.0 | 91.0 | 93.1 | 91.3 | 86.0 | 94.0 | 93.5 |
| [ARI] | 88.7 | 86.0 | 90.0 | 92.9 | 90.7 | 84.0 | 94.0 | 92.2 |
| [OVVv] | 92.0 | 88.0 | 94.0 | 97.3 | 93.3 | 90.0 | 95.0 | 97.7 |
| [OVR] | 94.7 | 88.0 | 98.0 | 97.4 | 93.3 | 94.0 | 93.0 | 96.4 |
| [OVI] | 90.7 | 82.0 | 95.0 | 94.4 | 91.3 | 82.0 | 96.0 | 94.7 |
| [OVvR] | 90.7 | 90.0 | 91.0 | 94.2 | 91.3 | 90.0 | 92.0 | 95.3 |
| [OVvI] | 88.7 | *84.0 | 91.0 | 92.5 | 91.3 | 86.0 | 94.0 | 93.0 |
| [ORI] | 88.0 | 84.0 | 90.0 | 92.2 | 88.7 | *80.0 | 93.0 | 90.4 |
| [VVvR] | 92.0 | 88.0 | 94.0 | 96.4 | 93.3 | 90.0 | 95.0 | 97.1 |
| [VVvI] | 90.0 | *84.0 | 93.0 | 94.8 | 92.7 | *86.0 | 96.0 | 95.8 |
| [VRI] | 88.0 | *82.0 | 91.0 | 94.2 | 90.0 | *78.0 | 96.0 | 91.3 |
| [VvRI] | 86.7 | *84.0 | 88.0 | 90.2 | 87.3 | *82.0 | *90.0 | 91.0 |
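Feature concatenation joins the feature vectors that several CNNs produce for the same image into one longer vector, which is then passed to the SVM or RF classifier. The sketch below illustrates only the joining step. The vector lengths are taken from the "#" column for the selected layers ([A], [O], [V], [Vv]: 4096; [R], [I]: 2048), while `fake_features` is a hypothetical stand-in that returns random numbers instead of running a real network.

```python
import random

# Feature lengths of the selected layers, as listed in the "#" column.
FEATURE_SIZES = {"A": 4096, "O": 4096, "V": 4096, "Vv": 4096, "R": 2048, "I": 2048}


def fake_features(name, seed=0):
    """Hypothetical placeholder for a CNN forward pass on one image.

    Returns a reproducible random vector of the right length.
    """
    rng = random.Random(f"{name}-{seed}")
    return [rng.random() for _ in range(FEATURE_SIZES[name])]


def concatenate(names, seed=0):
    """Join the per-CNN feature vectors end to end, in the order given."""
    combined = []
    for name in names:
        combined.extend(fake_features(name, seed))
    return combined


aov = concatenate(["A", "O", "V"])
ovr = concatenate(["O", "V", "R"])
print(len(aov), len(ovr))
```

Under these sizes, [AOV] yields a 12288-dimensional vector and [OVR] a 10240-dimensional one, so the classifier input grows with every CNN added.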
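Feature concatenation (4 or more CNNs) results: a bracketed name denotes feature concatenation using the features from the CNNs listed inside the brackets (for example, [AOVVv] concatenates the [A], [O], [V], and [Vv] features). An asterisk denotes that the concatenation result is worse than the individual result.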
| Name | SVM classifier | | | | RF classifier | | | |
|---|---|---|---|---|---|---|---|---|
| | Acc | Spe | Sen | AUC | Acc | Spe | Sen | AUC |
| [AOVVv] | 93.3 | 88.0 | 96.0 | 96.8 | 94.0 | 92.0 | 95.0 | 97.1 |
| [AOVR] | 93.3 | 84.0 | 98.0 | 96.9 | 92.7 | 92.0 | 93.0 | 95.6 |
| [AOVI] | 92.7 | 84.0 | 97.0 | 94.7 | 92.7 | 84.0 | 97.0 | 94.5 |
| [AOVvR] | 92.0 | 90.0 | 93.0 | 94.1 | 92.7 | 90.0 | 94.0 | 94.1 |
| [AOVvI] | 91.3 | 86.0 | 94.0 | 92.9 | 91.3 | 86.0 | 94.0 | 92.8 |
| [AORI] | 89.3 | 86.0 | 91.0 | 92.9 | 91.3 | *84.0 | 95.0 | 91.4 |
| [AVVvR] | 94.0 | 90.0 | 96.0 | 97.3 | 92.7 | *88.0 | 95.0 | 97.4 |
| [AVVvI] | 91.3 | 88.0 | 93.0 | 95.8 | 92.0 | *88.0 | 94.0 | 94.6 |
| [AVRI] | 93.3 | 86.0 | 97.0 | 95.0 | 92.0 | *84.0 | 96.0 | 93.8 |
| [AVvRI] | 90.0 | 88.0 | 91.0 | 93.3 | 92.0 | *86.0 | 95.0 | 93.4 |
| [OVVvR] | 93.3 | 90.0 | 95.0 | 97.4 | 94.0 | 94.0 | 94.0 | 98.5 |
| [OVVvI] | 90.0 | *84.0 | 93.0 | 95.2 | 91.3 | *86.0 | 94.0 | 96.2 |
| [OVRI] | 92.0 | 84.0 | 96.0 | 95.0 | 92.0 | *78.0 | 99.0 | 94.4 |
| [OVvRI] | 90.0 | 88.0 | 91.0 | 93.1 | 92.0 | *88.0 | 94.0 | 93.4 |
| [VVvRI] | 90.0 | *84.0 | 93.0 | 95.4 | 90.0 | *84.0 | 93.0 | 94.4 |
| [AOVVvR] | 94.0 | 90.0 | 96.0 | 96.9 | 93.3 | 90.0 | 95.0 | 97.0 |
| [AOVVvI] | 93.3 | 90.0 | 95.0 | 95.7 | 92.7 | *88.0 | 95.0 | 95.5 |
| [AOVRI] | 93.3 | 86.0 | 97.0 | 95.2 | 93.3 | 86.0 | 97.0 | 94.2 |
| [AOVvRI] | 92.0 | 88.0 | 94.0 | 93.2 | 92.7 | *88.0 | 95.0 | 94.6 |
| [AVVvRI] | 91.3 | 88.0 | 93.0 | 95.9 | 92.0 | *88.0 | 94.0 | 94.1 |
| [OVVvRI] | 92.0 | 88.0 | 94.0 | 95.8 | 90.7 | *86.0 | 93.0 | 95.5 |
| [AOVVvRI] | 93.3 | 90.0 | 95.0 | 95.7 | 90.7 | *86.0 | 93.0 | 95.6 |
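The row names in the concatenation tables are the subsets of the six selected feature sets, written in a fixed order. A short sketch using `itertools.combinations` reproduces those names and counts; the helper name `concatenation_names` is illustrative only.

```python
from itertools import combinations

# Labels of the six selected CNN feature sets, in the order used by the tables.
CNN_LABELS = ["A", "O", "V", "Vv", "R", "I"]


def concatenation_names(k):
    """Bracketed names of all concatenations that use exactly k CNNs."""
    return ["[" + "".join(combo) + "]" for combo in combinations(CNN_LABELS, k)]


counts = {k: len(concatenation_names(k)) for k in range(1, 7)}
print(counts)
print(concatenation_names(4)[:3])
```

The counts are 15 pairs and 20 triples for the first concatenation table, and 15 + 6 + 1 = 22 rows for combinations of four or more CNNs.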
Classification ensemble results: each entry is the ensemble result of SVM and RF using the same CNN-based features. An asterisk denotes that the classification ensemble result is worse than the individual result.
| Name | Acc | Spe | Sen | AUC | Name | Acc | Spe | Sen | AUC | Name | Acc | Spe | Sen | AUC |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| | *86.0 | 82.0 | 88.0 | 88.5 | | 90.7 | 90.0 | 91.0 | 94.2 | | 93.3 | 90.0 | 95.0 | 96.6 |
| | *86.0 | 84.0 | *87.0 | 88.8 | | 91.3 | 90.0 | 92.0 | 94.3 | | 93.3 | 88.0 | 96.0 | 96.6 |
| | *87.3 | *82.0 | *90.0 | 90.6 | | 89.3 | 90.0 | *89.0 | 93.5 | | 93.3 | 88.0 | 96.0 | 94.7 |
| | *83.3 | *88.0 | *81.0 | 90.7 | | 89.3 | 86.0 | 91.0 | 91.4 | | 92.0 | *88.0 | 94.0 | 95.0 |
| | *84.7 | 90.0 | *82.0 | 90.9 | | 92.0 | *86.0 | 95.0 | 97.6 | | 93.3 | *88.0 | 96.0 | 94.0 |
| | *84.0 | *70.0 | 91.0 | 87.3 | | 94.7 | 90.0 | 97.0 | 97.6 | | 92.0 | 86.0 | 95.0 | 93.5 |
| | *84.7 | *80.0 | *87.0 | 89.0 | | 92.7 | *84.0 | 97.0 | 95.5 | | 93.3 | *88.0 | 96.0 | 97.7 |
| | 90.7 | *80.0 | 96.0 | 95.2 | | 90.7 | 90.0 | 91.0 | 95.0 | | 92.0 | *84.0 | 96.0 | 96.6 |
| | 92.0 | *88.0 | 94.0 | 94.9 | | 90.7 | *86.0 | 93.0 | 94.1 | | 94.0 | 88.0 | 97.0 | 96.5 |
| | 93.3 | 88.0 | 96.0 | 94.1 | | 92.7 | 86.0 | 96.0 | 93.4 | | 90.0 | *88.0 | 91.0 | 94.3 |
| | 90.7 | 82.0 | 95.0 | 91.5 | | 93.3 | *88.0 | 96.0 | 97.5 | | 92.0 | 90.0 | 93.0 | 97.7 |
| | 91.3 | 86.0 | 94.0 | 95.2 | | 92.7 | 88.0 | 95.0 | 97.5 | | 92.7 | *86.0 | 96.0 | 96.8 |
| | 90.7 | *88.0 | 92.0 | 95.1 | | 94.0 | 86.0 | 98.0 | 95.6 | | 92.0 | 86.0 | 95.0 | 96.6 |
| | 90.7 | 88.0 | 92.0 | 94.6 | | 90.0 | 90.0 | 90.0 | 95.0 | | 89.3 | 88.0 | 90.0 | 94.4 |
| | 92.7 | 84.0 | 97.0 | 92.1 | | 89.3 | *86.0 | 91.0 | 94.4 | | 88.7 | *86.0 | 90.0 | 96.1 |
| | 90.0 | *88.0 | 91.0 | 96.8 | | 91.3 | 86.0 | 94.0 | 93.7 | | 94.0 | 90.0 | 96.0 | 97.2 |
| | 92.0 | 86.0 | 95.0 | 97.1 | | 89.3 | 88.0 | 90.0 | 97.2 | | 94.0 | 90.0 | 96.0 | 96.2 |
| | 89.3 | *76.0 | 96.0 | 94.2 | | 88.7 | *82.0 | 92.0 | 96.0 | | 94.0 | 88.0 | 97.0 | 96.1 |
| | 88.7 | 92.0 | 87.0 | 91.5 | | 92.0 | *82.0 | 97.0 | 96.0 | | 91.3 | *88.0 | 93.0 | 94.6 |
| | 86.7 | *86.0 | 87.0 | 91.4 | | 87.3 | *86.0 | 88.0 | 91.6 | | 93.3 | *88.0 | 96.0 | 97.0 |
| | 86.0 | *84.0 | *87.0 | 90.9 | | 90.7 | *88.0 | 92.0 | 96.9 | | | | | |
| | 92.0 | *86.0 | 95.0 | 96.6 | | | | | | | | | | |
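A classification ensemble combines the outputs of the SVM and RF classifiers that were trained on the same features. The source does not state the combination rule, so the sketch below assumes the simplest one, averaging the two malignancy scores and thresholding at 0.5. The function names, the threshold, and the toy scores are all assumptions for illustration, not the authors' method.

```python
def ensemble_scores(svm_scores, rf_scores):
    """Average two classifiers' malignancy scores case by case (assumed rule)."""
    if len(svm_scores) != len(rf_scores):
        raise ValueError("both classifiers must score the same cases")
    return [(s + r) / 2.0 for s, r in zip(svm_scores, rf_scores)]


def predict(scores, threshold=0.5):
    """Label a case malignant (1) when its score reaches the threshold."""
    return [1 if s >= threshold else 0 for s in scores]


# Hypothetical scores for four nodules, not data from the study.
svm = [0.9, 0.4, 0.6, 0.2]
rf = [0.7, 0.7, 0.3, 0.1]

combined = ensemble_scores(svm, rf)
labels = predict(combined)
print(labels)
```

In this toy case the two classifiers disagree on the second and third nodules, and the averaged score decides each one in favor of the more confident classifier.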
Results for when both feature concatenation and classifier ensemble were performed.
| Name | Acc | Spe | Sen | AUC | Name | Acc | Spe | Sen | AUC |
|---|---|---|---|---|---|---|---|---|---|
| [AOVVv] | 93.3 | 88.0 | 96.0 | 97.0 | [OVVvI] | 91.3 | 86.0 | 94.0 | 95.9 |
| [AOVR] | 93.3 | 82.0 | 99.0 | 96.6 | [OVRI] | 92.0 | 78.0 | 99.0 | 95.0 |
| [AOVI] | 93.3 | 84.0 | 98.0 | 94.8 | [OVvRI] | 92.0 | 88.0 | 94.0 | 93.3 |
| [AOVvR] | 92.0 | 90.0 | 93.0 | 94.3 | [VVvRI] | 90.0 | 84.0 | 93.0 | 95.2 |
| [AOVvI] | 92.0 | 86.0 | 95.0 | 93.1 | [AOVVvR] | 93.3 | 88.0 | 96.0 | 97.1 |
| [AORI] | 91.3 | 84.0 | 95.0 | 92.3 | [AOVVvI] | 93.3 | 88.0 | 96.0 | 95.9 |
| [AVVvR] | 94.0 | 90.0 | 96.0 | 97.5 | [AOVRI] | 92.7 | 84.0 | 97.0 | 94.9 |
| [AVVvI] | 92.0 | 88.0 | 94.0 | 95.5 | [AOVvRI] | 92.7 | 88.0 | 95.0 | 93.8 |
| [AVRI] | 93.3 | 84.0 | 98.0 | 94.4 | [AVVvRI] | 92.0 | 88.0 | 94.0 | 95.3 |
| [AVvRI] | 92.0 | 86.0 | 95.0 | 93.5 | [OVVvRI] | 90.7 | 86.0 | 93.0 | 95.7 |
| [OVVvR] | 93.3 | 90.0 | 95.0 | 98.0 | [AOVVvRI] | 90.7 | 86.0 | 93.0 | 96.0 |
Comparisons of diagnostic performances between experienced radiologists and CNNs for thyroid malignancy.
| Comparison | Accuracy | Specificity | Sensitivity |
|---|---|---|---|
| Faculty1 vs Faculty2 | 0.309 | <.001 | 0.006 |
| Faculty1 vs CNN1 | <.001 | <.001 | 0.163 |
| Faculty1 vs CNN2 | <.001 | <.001 | 0.424 |
| Faculty1 vs CNN3 | <.001 | <.001 | 0.046 |
| Faculty2 vs CNN1 | 0.004 | 0.257 | <.001 |
| Faculty2 vs CNN2 | 0.004 | 0.649 | <.001 |
| Faculty2 vs CNN3 | 0.003 | 0.102 | <.001 |
Interobserver variability for the prediction of thyroid malignancy among 6 radiologists and between 2 radiologists with similar levels of experience.
| Radiologist | Kappa (95% CI) |
|---|---|
| All | 0.465 (0.388, 0.535) |
| Faculties | 0.387 (0.226, 0.511) |
| Fellows | 0.663 (0.540, 0.784) |
| Residents | 0.418 (0.286, 0.557) |
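Kappa measures agreement between readers after correcting for the agreement expected by chance. The sketch below computes Cohen's kappa for two readers on made-up benign/malignant calls. The source does not state which kappa variant was used, and the "All" row with 6 radiologists would presumably need a multi-rater statistic such as Fleiss' kappa, so this illustrates only the pairwise case.

```python
def cohen_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters labelling the same cases."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("raters must label the same non-empty set of cases")
    n = len(rater_a)
    categories = set(rater_a) | set(rater_b)
    # Observed agreement: fraction of cases with identical labels.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement: product of each rater's marginal label frequencies.
    expected = sum(
        (rater_a.count(c) / n) * (rater_b.count(c) / n) for c in categories
    )
    if expected == 1.0:
        return 1.0
    return (observed - expected) / (1.0 - expected)


# Hypothetical calls (1 = malignant, 0 = benign), not data from the study.
reader_1 = [1, 1, 1, 0, 0, 0, 1, 0, 1, 0]
reader_2 = [1, 1, 0, 0, 0, 1, 1, 0, 1, 0]
kappa = cohen_kappa(reader_1, reader_2)
print(round(kappa, 3))
```

Here the readers agree on 8 of 10 cases, chance agreement is 0.5, and kappa is 0.6, which falls between the values reported above for the residents and the fellows.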
Figure 1. An ultrasonography (US) image of a 50-year-old woman with a thyroid nodule detected incidentally on screening examination, showing a 1.2-cm hypoechoic solid nodule with eggshell calcifications (arrows). All 6 radiologists interpreted the nodule as benign. In contrast, the 3 CNN combinations interpreted it as cancer. The nodule was diagnosed as papillary thyroid cancer at surgery.
Figure 2. An ultrasonography (US) image of a left thyroid nodule in a 77-year-old woman with confirmed cancer in the right thyroid gland. A 1-cm isoechoic nodule with internal echogenic spots is seen (arrows). Four radiologists (1 faculty, 1 fellow, and 2 residents) interpreted the nodule as cancer. In contrast, the 3 CNN combinations interpreted it as benign. The nodule was diagnosed as adenomatous hyperplasia.
Figure 3. Two feature extraction strategies using a pre-trained CNN: feature extraction without fine-tuning (a) or with fine-tuning (b).
Figure 4. Example of feature concatenation: features extracted from three different CNNs are concatenated.
Figure 5. Example of classification ensemble: two CNNs were used as feature extractors, and classification ensembles were applied to the SVM and RF of CNN-A (a) and CNN-B (b). For a more objective result, the classification ensemble was applied again to these ensemble results (c).