Tetsu Hayashida1, Erina Odani1, Masayuki Kikuchi1, Aiko Nagayama1, Tomoko Seki1, Maiko Takahashi1, Noriyuki Futatsugi2, Akiko Matsumoto3, Takeshi Murata4, Rurina Watanuki5, Takamichi Yokoe5, Ayako Nakashoji6, Hinako Maeda7, Tatsuya Onishi5, Sota Asaga8, Takashi Hojo9, Hiromitsu Jinno3, Keiichi Sotome7, Akira Matsui6, Akihiko Suto4, Shigeru Imoto8, Yuko Kitagawa1.
Abstract
Although the categorization of ultrasound using the Breast Imaging Reporting and Data System (BI-RADS) has become widespread worldwide, the problem of inter-observer variability remains. To maintain uniformity in diagnostic accuracy, we have developed a system in which artificial intelligence (AI) can distinguish whether a static image obtained using a breast ultrasound represents BI-RADS3 or lower or BI-RADS4a or higher to determine the medical management that should be performed on a patient whose breast ultrasound shows abnormalities. To establish and validate the AI system, a training dataset consisting of 4028 images containing 5014 lesions and a test dataset consisting of 3166 images containing 3656 lesions were collected and annotated. We selected a setting that maximized the area under the curve (AUC) and minimized the difference in sensitivity and specificity by adjusting the internal parameters of the AI system, achieving an AUC, sensitivity, and specificity of 0.95, 91.2%, and 90.7%, respectively. Furthermore, based on 30 images extracted from the test data, the diagnostic accuracy of 20 clinicians and the AI system was compared, and the AI system was found to be significantly superior to the clinicians (McNemar test, p < 0.001). Although deep-learning methods to categorize benign and malignant tumors using breast ultrasound have been extensively reported, our work represents the first attempt to establish an AI system to classify BI-RADS3 or lower and BI-RADS4a or higher successfully, providing important implications for clinical actions. These results suggest that the AI diagnostic system is sufficient to proceed to the next stage of clinical application.
Keywords: AI diagnosis; BI-RADS; artificial intelligence; breast ultrasound; deep learning
Year: 2022 PMID: 35880248 PMCID: PMC9530860 DOI: 10.1111/cas.15511
Source DB: PubMed Journal: Cancer Sci ISSN: 1347-9032 Impact factor: 6.518
Number of images and lesions in the training and test datasets obtained using ultrasound devices by different manufacturers
| | Training data: GE Healthcare Systems | Training data: FUJIFILM Healthcare (HITACHI Aloka) | Training data: Canon Medical Systems | Training data: total | Test data: GE Healthcare Systems | Test data: FUJIFILM Healthcare (HITACHI Aloka) | Test data: Canon Medical Systems | Test data: total | Total |
|---|---|---|---|---|---|---|---|---|---|
| No. of images | 747 | 1104 | 2177 | 4028 | 650 | 531 | 1985 | 3166 | 7194 |
| No. of lesions | 849 | 1720 | 2445 | 5014 | 695 | 871 | 2090 | 3656 | 8670 |
Properties of the ultrasound image dataset. The table shows the number of lesions determined to be in each Breast Imaging Reporting and Data System (BI-RADS) category and the number of benign or malignant tumors included in these categories.
| BI-RADS | Training data: malignant | Training data: benign | Training data: total | Test data: malignant | Test data: benign | Test data: total | Total No. of lesions | % malignant |
|---|---|---|---|---|---|---|---|---|
| 1 | 0 | 0 | 0 | 0 | 1470 | 1470 | 1470 | 0 |
| 2 | 0 | 437 | 437 | 0 | 176 | 176 | 613 | 0 |
| 3 | 0 | 579 | 579 | 0 | 278 | 278 | 857 | 0 |
| 4a | 44 | 701 | 745 | 16 | 317 | 333 | 1078 | 5.57 |
| 4b | 291 | 653 | 944 | 148 | 251 | 399 | 1343 | 32.7 |
| 4c | 978 | 127 | 1105 | 420 | 48 | 468 | 1573 | 88.9 |
| 5 | 1189 | 15 | 1204 | 524 | 8 | 532 | 1736 | 98.7 |
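The percentages in the last column are consistent with pooling the malignant lesions from the training and test data and dividing by the total number of lesions in each category. The following is a minimal Python sketch of that arithmetic; the counts are copied from the table, while the names `COUNTS`, `malignancy_rate`, and `summarize` are illustrative and do not come from the source.

```python
# Lesion counts per BI-RADS category, copied from the table above:
# (training malignant, training benign, test malignant, test benign)
COUNTS = {
    "1": (0, 0, 0, 1470),
    "2": (0, 437, 0, 176),
    "3": (0, 579, 0, 278),
    "4a": (44, 701, 16, 317),
    "4b": (291, 653, 148, 251),
    "4c": (978, 127, 420, 48),
    "5": (1189, 15, 524, 8),
}


def malignancy_rate(train_mal, train_ben, test_mal, test_ben):
    """Return (total lesions, percent malignant), pooled over both datasets."""
    total = train_mal + train_ben + test_mal + test_ben
    pct = 100.0 * (train_mal + test_mal) / total if total else 0.0
    return total, pct


def summarize(counts=COUNTS):
    """Map each BI-RADS category to its (total, percent malignant) pair."""
    return {cat: malignancy_rate(*row) for cat, row in counts.items()}


if __name__ == "__main__":
    for cat, (total, pct) in summarize().items():
        print(f"BI-RADS {cat}: {total} lesions, {pct:.1f}% malignant")
```

Summing the per-category totals gives 8670 lesions, which matches the total reported in the manufacturer table.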
FIGURE 1. Receiver-operating characteristic (ROC) curve across possible thresholds of the confidence score for detecting Breast Imaging Reporting and Data System (BI-RADS) 4a or higher in each image. A, ROC curve with an area under the curve (AUC) of 0.95. B, Sensitivity and specificity as the threshold of the confidence score varies
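The threshold trade-off described in this caption can be illustrated with a short Python sketch. It covers only one part of what the abstract describes: sweeping a confidence-score threshold and picking the value where sensitivity and specificity are closest. The scores and labels are hypothetical toy data, and the function names are illustrative; the source does not describe the AI system's internal parameters or how the authors searched over them.

```python
def sensitivity_specificity(scores, labels, threshold):
    """Treat score >= threshold as positive (BI-RADS 4a or higher)."""
    tp = sum(1 for s, y in zip(scores, labels) if y == 1 and s >= threshold)
    fn = sum(1 for s, y in zip(scores, labels) if y == 1 and s < threshold)
    tn = sum(1 for s, y in zip(scores, labels) if y == 0 and s < threshold)
    fp = sum(1 for s, y in zip(scores, labels) if y == 0 and s >= threshold)
    return tp / (tp + fn), tn / (tn + fp)


def balanced_threshold(scores, labels):
    """Return the observed score with the smallest |sensitivity - specificity|."""
    def gap(t):
        sens, spec = sensitivity_specificity(scores, labels, t)
        return abs(sens - spec)
    return min(sorted(set(scores)), key=gap)


def auc(scores, labels):
    """AUC as the fraction of positive/negative pairs ranked correctly (ties count 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical toy data (NOT from the study): 1 = BI-RADS 4a or higher, 0 = BI-RADS 3 or lower
SCORES = [0.05, 0.10, 0.20, 0.35, 0.60, 0.40, 0.70, 0.80, 0.90, 0.95]
LABELS = [0, 0, 0, 0, 0, 1, 1, 1, 1, 1]

if __name__ == "__main__":
    t = balanced_threshold(SCORES, LABELS)
    print("threshold:", t)
    print("sensitivity, specificity:", sensitivity_specificity(SCORES, LABELS, t))
    print("AUC:", auc(SCORES, LABELS))
```

Note that the AUC does not depend on the chosen threshold; only the sensitivity/specificity pair moves as the threshold changes, which is what panel B plots.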
FIGURE 2. Sensitivity and specificity of diagnosis by artificial intelligence (AI) and 20 clinicians for 30 images. X: diagnosis by each clinician, ▲: AI diagnosis, ●: average of clinicians
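The abstract reports the comparison between the AI system and the clinicians with a McNemar test, which works on paired decisions and uses only the discordant pairs. The sketch below computes the two-sided exact (binomial) form of the test in Python. It is an illustration only: the source does not say whether the exact or chi-square form was used or how the pairs were formed across the 20 clinicians, and the counts in the usage example are hypothetical.

```python
from math import comb


def mcnemar_exact(b, c):
    """Two-sided exact McNemar p-value from the discordant pair counts.

    b: pairs where reader A is correct and reader B is wrong
    c: pairs where reader A is wrong and reader B is correct
    Under the null hypothesis, each discordant pair falls on either
    side with probability 0.5, so min(b, c) follows Binomial(b + c, 0.5).
    """
    n = b + c
    if n == 0:
        return 1.0  # no discordant pairs: no evidence of a difference
    k = min(b, c)
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)


if __name__ == "__main__":
    # Hypothetical counts (NOT from the study)
    print(mcnemar_exact(10, 0))
    print(mcnemar_exact(5, 5))
```

Concordant pairs, where both readers are right or both are wrong, do not enter the calculation, which is why only `b` and `c` are needed.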