Tomoyuki Noguchi (1,2,3,4), Fumiya Uchiyama (1), Yusuke Kawata (1), Akihiro Machitori (1), Yoshitaka Shida (1), Takashi Okafuji (1), Kota Yokoyama (1), Yosuke Inaba (5), Tsuyoshi Tajima (1).
1. Department of Radiology, Center Hospital, National Center for Global Health and Medicine.
2. Education and Training Office, Department of Clinical Research, Center for Clinical Sciences, National Center for Global Health and Medicine.
3. Department of Radiology, National Hospital Organization Kyushu Medical Center.
4. Department of Clinical Research, National Hospital Organization Kyushu Medical Center.
5. Biostatistics Section, Department of Data Science, Center for Clinical Sciences, National Center for Global Health and Medicine.
Abstract
PURPOSE: The growing use of deep convolutional neural networks (DCNNs) in medical imaging diagnosis calls for rigorous evaluation of their diagnostic performance. We carried out a fundamental investigation of DCNN diagnostic performance using brain metastasis detection as the task.
METHODS: We retrospectively evaluated AlexNet and GoogLeNet on 3117 positive and 37961 negative MRI images (with and without metastasis, respectively) with regard to (1) diagnostic bias, (2) the optimal number of folds K in K-fold cross-validation (K-CV), (3) the optimal ratio of positive to negative images, (4) accuracy improvement curves, (5) prediction of the accuracy range by the bootstrap method, and (6) metastatic lesion detection by regions with CNNs (R-CNNs).
RESULTS: For AlexNet and GoogLeNet, respectively, (1) the maximal mean ± 95% confidence interval (95% CI) was 50 ± 4.6% and 50 ± 4.9% when measured with equal-sized negative-versus-negative and positive-versus-positive image datasets; (2) K-CV results fell within the respective maximum biases of 4.6% and 4.9% when K was at least 10 and at least 4; (3) the highest accuracy was 74% with a dataset of equal positive and negative image counts and 91% with a dataset containing four times as many negative as positive images; (4) accuracy rose from 69% to 74% and from 73% to 88% as the number of positive-negative training image pairs increased from 500 to 2495; (5) at least nine and at least six of the 10-CV result sets were needed to predict the accuracy range by the bootstrap method; and (6) metastatic lesion detection accuracy by R-CNNs was 50% and 45%.
CONCLUSIONS: This study presents methodological fundamentals for evaluating the diagnostic characteristics of visual recognition by DCNNs. Our series of investigations should help in conducting accuracy studies of computer diagnosis in medical imaging.
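As an illustration of item (5), predicting an accuracy range from cross-validation results by the bootstrap method, the following is a minimal sketch only. It assumes a percentile bootstrap over the mean of per-fold accuracies; the abstract does not state which bootstrap variant was used. The function name `bootstrap_accuracy_range`, its parameters, and the example fold accuracies are hypothetical and are not taken from the study.

```python
import random
import statistics


def bootstrap_accuracy_range(fold_accuracies, n_resamples=2000, alpha=0.05, seed=0):
    """Percentile-bootstrap interval for the mean of per-fold accuracies.

    Assumption: a plain percentile bootstrap; the study's exact procedure
    is not described in the abstract.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    n = len(fold_accuracies)
    means = []
    for _ in range(n_resamples):
        # Resample the fold results with replacement, same size as the input.
        sample = [fold_accuracies[rng.randrange(n)] for _ in range(n)]
        means.append(statistics.fmean(sample))
    means.sort()
    # Take the alpha/2 and 1 - alpha/2 percentiles of the resampled means.
    lo_idx = int((alpha / 2) * n_resamples)
    hi_idx = int((1 - alpha / 2) * n_resamples) - 1
    return means[lo_idx], means[hi_idx]


# Hypothetical per-fold accuracies from a 10-fold CV (made-up values).
example_folds = [0.72, 0.75, 0.74, 0.71, 0.76, 0.73, 0.74, 0.75, 0.72, 0.74]
low, high = bootstrap_accuracy_range(example_folds)
print(f"bootstrap 95% range of mean accuracy: {low:.3f} to {high:.3f}")
```

Calling the function on progressively fewer fold results (for example, the first six or nine entries of `example_folds`) gives a rough sense of how the predicted range changes with the number of result sets available.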