Stephanie T Jünger1,2, Ulrike Cornelia Isabel Hoyer3, Diana Schaufler4, Kai Roman Laukamp3, Lukas Goertz3, Frank Thiele3,5, Jan-Peter Grunz6, Marc Schlamann3, Michael Perkuhn3,5, Christoph Kabbasch3, Thorsten Persigehl3, Stefan Grau1,2, Jan Borggrefe3,7, Matthias Scheffler4, Rahil Shahzad3,5, Lenhard Pennig3.
1. Department of General Neurosurgery, Center for Neurosurgery, Faculty of Medicine and University Hospital Cologne, University of Cologne, Cologne, Germany.
2. Centre for Integrated Oncology, Faculty of Medicine and University Hospital Cologne, University of Cologne, Cologne, Germany.
3. Institute for Diagnostic and Interventional Radiology, Faculty of Medicine and University Hospital Cologne, University of Cologne, Cologne, Germany.
4. Department of Internal Medicine, Center for Integrated Oncology Aachen Bonn Cologne Duesseldorf, Network Genomic Medicine, Lung Cancer Group Cologne, Faculty of Medicine and University Hospital of Cologne, University of Cologne, Cologne, Germany.
5. Philips GmbH Innovative Technologies, Aachen, Germany.
6. Department of Diagnostic and Interventional Radiology, University Hospital Würzburg, Würzburg, Germany.
7. Department of Radiology, Neuroradiology and Nuclear Medicine, Johannes Wesling University Hospital, Ruhr University Bochum, Bochum, Germany.
Abstract
BACKGROUND: Non-small cell lung cancer (NSCLC) is the most common tumor entity spreading to the brain, and up to 50% of patients develop brain metastases (BMs). Detection of BMs on MRI is challenging, with an inherent risk of missed diagnosis.
PURPOSE: To train and evaluate a deep learning model (DLM) for fully automated detection and 3D segmentation of BMs in NSCLC on clinical routine MRI.
STUDY TYPE: Retrospective.
POPULATION: Ninety-eight NSCLC patients with 315 BMs on pretreatment MRI, divided into training (66 patients, 248 BMs), independent test (17 patients, 67 BMs), and control (15 patients, 0 BMs) cohorts.
FIELD STRENGTH/SEQUENCE: T1-/T2-weighted, T1-weighted contrast-enhanced (T1 CE; gradient-echo and spin-echo sequences), and FLAIR at 1.0, 1.5, and 3.0 T from various vendors and study centers.
ASSESSMENT: A 3D convolutional neural network (DeepMedic) was trained on the training cohort using 5-fold cross-validation and evaluated on the independent test and control sets. Three-dimensional voxel-wise manual segmentations of BMs by a neurosurgeon and a radiologist on T1 CE served as the reference standard.
STATISTICAL TESTS: Sensitivity (recall) and false positive (FP) findings per scan; Dice similarity coefficient (DSC) to compare the spatial overlap between manual and automated segmentations; Pearson's correlation coefficient (r) to evaluate the relationship between quantitative volumetric measurements of segmentations; and Wilcoxon rank-sum test to compare the volumes of BMs. A P value <0.05 was considered statistically significant.
RESULTS: In the test set, the DLM detected 57 of the 67 BMs (mean volume: 0.99 ± 4.24 cm³), resulting in a sensitivity of 85.1%, with 1.5 FP findings per scan. Missed BMs had a significantly smaller volume (0.05 ± 0.04 cm³) than detected BMs (0.96 ± 2.4 cm³). Compared with the reference standard, automated segmentations achieved a median DSC of 0.72 and a good volumetric correlation (r = 0.95). In the control set, 1.8 FPs/scan were observed.
DATA CONCLUSION: Deep learning provided a high detection sensitivity and good segmentation performance for BMs in NSCLC on heterogeneous scanner data while yielding a low number of FP findings.
Level of Evidence: 3. Technical Efficacy: Stage 2.
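The two headline metrics named under STATISTICAL TESTS can be illustrated with a minimal sketch. It is not the study's evaluation code: the function names `dice_similarity` and `sensitivity` and the toy masks are hypothetical, the masks are assumed to be flattened binary voxel arrays, and the only figures taken from the abstract are the 57 detected and 67 total BMs of the test set.

```python
from typing import Sequence


def dice_similarity(reference: Sequence[int], prediction: Sequence[int]) -> float:
    """Dice similarity coefficient between two flattened binary masks."""
    if len(reference) != len(prediction):
        raise ValueError("masks must have the same number of voxels")
    # Voxels labeled as metastasis in both masks.
    overlap = sum(1 for r, p in zip(reference, prediction) if r and p)
    total = sum(1 for r in reference if r) + sum(1 for p in prediction if p)
    # Assumed convention: two empty masks count as perfect agreement.
    return 1.0 if total == 0 else 2.0 * overlap / total


def sensitivity(detected: int, total_lesions: int) -> float:
    """Lesion-wise sensitivity (recall): detected lesions / reference lesions."""
    if total_lesions <= 0:
        raise ValueError("total_lesions must be positive")
    return detected / total_lesions


# Toy flattened masks (hypothetical data, not from the study).
reference_mask = [0, 1, 1, 1, 0, 0, 1, 0]
predicted_mask = [0, 1, 1, 0, 0, 1, 1, 0]
toy_dsc = dice_similarity(reference_mask, predicted_mask)

# Counts reported in the abstract: 57 of 67 BMs detected in the test set.
test_sensitivity = sensitivity(57, 67)
print(f"toy DSC = {toy_dsc:.2f}, test-set sensitivity = {test_sensitivity:.1%}")
```

With the toy masks, three of the four reference voxels overlap with the prediction, giving a DSC of 2·3/(4+4) = 0.75; the abstract's counts give 57/67 ≈ 85.1%, matching the reported sensitivity.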