Hyunsuk Yoo1, Sang Hyup Lee1, Chiara Daniela Arru2,3, Ruhani Doda Khera2,3, Ramandeep Singh2,3, Sean Siebert2,3, Dohoon Kim4, Yuna Lee4, Ju Hyun Park5, Hye Joung Eom6, Subba R Digumarthy2,3, Mannudeep K Kalra7,8. 1. Lunit, Seoul, Korea. 2. Division of Thoracic Imaging, Department of Radiology, Massachusetts General Hospital, 75 Blossom Court, Boston, MA, 02114, USA. 3. Harvard Medical School, Boston, MA, USA. 4. Department of Radiology, Seoul National University College of Medicine, Seoul, Korea. 5. Suwon Total Healthcare Center, Kangbuk Samsung Hospital, Sungkyunkwan University School of Medicine, Youngin-si, Gyeongi-do, 16954, Korea. 6. Cheju Halla General Hospital, 65 Doryeong-ro, Yeon-dong, Jeju-si, Jeju-do, Korea. 7. Division of Thoracic Imaging, Department of Radiology, Massachusetts General Hospital, 75 Blossom Court, Boston, MA, 02114, USA. mkalra@mgh.harvard.edu. 8. Harvard Medical School, Boston, MA, USA. mkalra@mgh.harvard.edu.
Abstract
OBJECTIVE: To assess whether a deep learning-based artificial intelligence (AI) algorithm improves reader performance for lung cancer detection on chest X-rays (CXRs).

METHODS: This reader study included 173 images from cancer-positive patients (n = 98) and 346 images from cancer-negative patients (n = 196) selected from the National Lung Screening Trial (NLST). Eight readers, comprising three radiology residents and five board-certified radiologists, participated in the observer performance test. The AI algorithm provided an image-level probability of pulmonary nodule or mass on CXRs and a heatmap of detected lesions. Reader performance with and without AI was compared using AUC, sensitivity, specificity, false-positives per image (FPPI), and rates of chest CT recommendations.

RESULTS: With AI, the average sensitivity of readers for the detection of visible lung cancer increased for residents but was similar for radiologists compared to that without AI (0.61 [95% CI, 0.55-0.67] vs. 0.72 [95% CI, 0.66-0.77], p = 0.016 for residents, and 0.76 [95% CI, 0.72-0.81] vs. 0.76 [95% CI, 0.72-0.81], p = 1.00 for radiologists), while the number of false-positive findings per image (FPPI) was similar for residents but decreased for radiologists (0.15 [95% CI, 0.11-0.18] vs. 0.12 [95% CI, 0.09-0.16], p = 0.13 for residents, and 0.24 [95% CI, 0.20-0.29] vs. 0.17 [95% CI, 0.13-0.20], p < 0.001 for radiologists). With AI, the average rate of chest CT recommendation in patients positive for visible cancer increased for residents but was similar for radiologists (54.7% [95% CI, 48.2-61.2%] vs. 70.2% [95% CI, 64.2-76.2%], p < 0.001 for residents, and 72.5% [95% CI, 68.0-77.1%] vs. 73.9% [95% CI, 69.4-78.3%], p = 0.68 for radiologists), while the rate in cancer-negative patients was similar for residents but decreased for radiologists (11.2% [95% CI, 9.6-13.1%] vs. 9.8% [95% CI, 8.0-11.6%], p = 0.32 for residents, and 16.4% [95% CI, 14.7-18.2%] vs. 11.7% [95% CI, 10.2-13.3%], p < 0.001 for radiologists).
CONCLUSIONS: The AI algorithm can enhance the performance of readers for the detection of lung cancers on chest radiographs when used as a second reader.

KEY POINTS:
• A reader study in the NLST dataset shows that the AI algorithm provided a sensitivity benefit for residents and a specificity benefit for radiologists in the detection of visible lung cancer.
• With AI, radiology residents recommended chest CT examinations for a larger proportion of patients with visible lung cancer (54.7% vs. 70.2%, p < 0.001).
• With AI, radiologists recommended a significantly smaller proportion of unnecessary chest CT examinations (16.4% vs. 11.7%, p < 0.001) in cancer-negative patients.
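The reader-level metrics named in the METHODS can be illustrated with a minimal sketch. The abstract does not give the study's exact scoring rules, so the definitions below are assumptions: sensitivity is the fraction of cancer-positive images the reader flagged, specificity is the fraction of cancer-negative images the reader did not flag, and FPPI is the total number of false-positive marks divided by the number of images read. The `Reading` class, the `summarize` function, and the toy data are hypothetical and are not taken from the study.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Reading:
    """One reader's interpretation of one image (hypothetical structure)."""
    has_visible_cancer: bool   # ground truth for the image
    reader_flagged: bool       # reader called the image positive
    false_positive_marks: int  # marks that do not correspond to a cancer


def summarize(readings: List[Reading]) -> Dict[str, float]:
    """Compute sensitivity, specificity, and FPPI under the assumed definitions."""
    positives = [r for r in readings if r.has_visible_cancer]
    negatives = [r for r in readings if not r.has_visible_cancer]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative image")

    sensitivity = sum(r.reader_flagged for r in positives) / len(positives)
    specificity = sum(not r.reader_flagged for r in negatives) / len(negatives)
    # FPPI averages false-positive marks over all images read.
    fppi = sum(r.false_positive_marks for r in readings) / len(readings)
    return {"sensitivity": sensitivity, "specificity": specificity, "fppi": fppi}


# Toy data for illustration only; these are not values from the study.
toy = [
    Reading(True, True, 0),
    Reading(True, True, 0),
    Reading(True, True, 1),
    Reading(True, False, 0),
    Reading(False, False, 0),
    Reading(False, False, 0),
    Reading(False, False, 0),
    Reading(False, True, 1),
]
result = summarize(toy)
print(result)
```

Running the same function once on a reader's unaided readings and once on the AI-assisted readings would give the paired values that a comparison like the one in the RESULTS is built on; the confidence intervals and p-values reported there require a statistical model that this sketch does not attempt.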