Tomomi Nobashi1, Claudia Zacharias2, Jason K Ellis3, Valentina Ferri1, Mary Ellen Koran1, Benjamin L Franc1, Andrei Iagaru1, Guido A Davidzon4. 1. Division of Nuclear Medicine and Molecular Imaging, Department of Radiology, Stanford University, 300 Pasteur Drive, Office H2228, Stanford, CA, 94305, USA. 2. Clinic for Nuclear Medicine, University Hospital Essen, Essen, Germany. 3. DimensionalMechanics Inc.®, 2821 Northup Way, Suite #200, Bellevue, WA, USA. 4. Division of Nuclear Medicine and Molecular Imaging, Department of Radiology, Stanford University, 300 Pasteur Drive, Office H2228, Stanford, CA, 94305, USA. gdavidzon@stanford.edu.
Abstract
The high background glucose metabolism of normal gray matter on [18F]-fluoro-2-D-deoxyglucose (FDG) positron emission tomography (PET) of the brain results in a low signal-to-background ratio, potentially increasing the possibility of missing important findings in patients with intracranial malignancies. To explore the strategy of using a deep learning classifier to aid in distinguishing normal versus abnormal findings on PET brain images, this study evaluated the performance of a two-dimensional convolutional neural network (2D-CNN) in classifying FDG PET brain scans as normal (N) or abnormal (A). METHODS: Two hundred eighty-nine brain FDG-PET scans (N: n = 150; A: n = 139), yielding a total of 68,260 images, were included. Nine individual 2D-CNN models, covering three window settings for each of the axial, coronal, and sagittal axes, were trained and validated. The performance of these individual and ensemble models was evaluated and compared on a test dataset. Odds ratio, Akaike's information criterion (AIC), area under the receiver-operating-characteristic curve (AUC), accuracy, and standard deviation (SD) were calculated. RESULTS: The optimal window setting for classifying normal and abnormal scans differed for each axis of the individual models. An ensemble model combining different axes at an optimized window setting (window-triad) performed better than ensemble models combining the same axis across different window settings (axis-triad). An increase in odds ratio and a decrease in SD relative to the individual models were observed for both axis-triad and window-triad models, whereas improvements in AUC and AIC were seen only in window-triad models. An overall model averaging the probabilities of all nine individual models achieved the best accuracy, 82.0%. CONCLUSIONS: Ensembling across different window settings and axes was effective in improving 2D-CNN performance for the classification of brain FDG-PET scans.
If prospectively validated with a larger cohort of patients, similar models could provide decision support in a clinical setting.
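The ensembling strategy the abstract describes reduces to averaging per-scan abnormality probabilities across subsets of the nine models (axis-triads, window-triads, or all nine) and thresholding the mean. The sketch below illustrates that averaging step only; the model names, probability values, and 0.5 threshold are illustrative assumptions, not values reported in the study.

```python
import numpy as np

# Hypothetical per-scan abnormality probabilities from nine 2D-CNN models:
# three axes x three window settings (labels w1-w3 are placeholders).
model_probs = {
    ("axial", "w1"): 0.91, ("axial", "w2"): 0.85, ("axial", "w3"): 0.78,
    ("coronal", "w1"): 0.60, ("coronal", "w2"): 0.72, ("coronal", "w3"): 0.66,
    ("sagittal", "w1"): 0.55, ("sagittal", "w2"): 0.81, ("sagittal", "w3"): 0.70,
}

def ensemble(probs, threshold=0.5):
    """Average the probabilities and threshold the mean to get a label."""
    mean_p = float(np.mean(list(probs)))
    return mean_p, ("abnormal" if mean_p >= threshold else "normal")

# Axis-triad: one axis, all three window settings.
axial_triad = [p for (axis, _), p in model_probs.items() if axis == "axial"]
# Window-triad: one window setting, all three axes.
w2_triad = [p for (_, w), p in model_probs.items() if w == "w2"]
# Overall model: average across all nine individual models.
overall = list(model_probs.values())

print(ensemble(axial_triad))
print(ensemble(w2_triad))
print(ensemble(overall))
```

Simple probability averaging like this needs no retraining of the individual networks, which is one reason ensembles of this kind are a common post hoc way to stabilize CNN classifiers.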
Keywords:
2D-CNN; Brain; Cancer; Deep learning; Ensemble; FDG-PET