Xiang Zhang1, Yi Yang1, Yi-Wei Shen1, Ke-Rui Zhang1, Ze-Kun Jiang2, Li-Tai Ma1, Chen Ding1, Bei-Yu Wang1, Yang Meng1, Hao Liu3. 1. Department of Orthopedics, Orthopedic Research Institute, West China Hospital, Sichuan University, No. 37 Guo Xue Rd, Chengdu, 610041, China. 2. West China Biomedical Big Data Center, West China Hospital, Sichuan University, Chengdu, 610000, China. 3. Department of Orthopedics, Orthopedic Research Institute, West China Hospital, Sichuan University, No. 37 Guo Xue Rd, Chengdu, 610041, China. liuhao6304@126.com.
Abstract
OBJECTIVES: To systematically quantify the diagnostic accuracy and identify potential covariates affecting the performance of artificial intelligence (AI) in diagnosing orthopedic fractures. METHODS: PubMed, Embase, Web of Science, and Cochrane Library were systematically searched for studies on AI applications in diagnosing orthopedic fractures from inception to September 29, 2021. Pooled sensitivity and specificity and the area under the receiver operating characteristic curves (AUC) were obtained. This study was registered in the PROSPERO database prior to initiation (CRD 42021254618). RESULTS: Thirty-nine were eligible for quantitative analysis. The overall pooled AUC, sensitivity, and specificity were 0.96 (95% CI 0.94-0.98), 90% (95% CI 87-92%), and 92% (95% CI 90-94%), respectively. In subgroup analyses, multicenter designed studies yielded higher sensitivity (92% vs. 88%) and specificity (94% vs. 91%) than single-center studies. AI demonstrated higher sensitivity with transfer learning (with vs. without: 92% vs. 87%) or data augmentation (with vs. without: 92% vs. 87%), compared to those without. Utilizing plain X-rays as input images for AI achieved results comparable to CT (AUC 0.96 vs. 0.96). Moreover, AI achieved comparable results to humans (AUC 0.97 vs. 0.97) and better results than non-expert human readers (AUC 0.98 vs. 0.96; sensitivity 95% vs. 88%). CONCLUSIONS: AI demonstrated high accuracy in diagnosing orthopedic fractures from medical images. Larger-scale studies with higher design quality are needed to validate our findings. KEY POINTS: • Multicenter study design, application of transfer learning, and data augmentation are closely related to improving the performance of artificial intelligence models in diagnosing orthopedic fractures. • Utilizing plain X-rays as input images for AI to diagnose fractures achieved results comparable to CT (AUC 0.96 vs. 0.96). • AI achieved comparable results to humans (AUC 0.97 vs. 0.97) but was superior to non-expert human readers (AUC 0.98 vs. 0.96, sensitivity 95% vs. 88%) in diagnosing fractures.
OBJECTIVES: To systematically quantify the diagnostic accuracy and identify potential covariates affecting the performance of artificial intelligence (AI) in diagnosing orthopedic fractures. METHODS: PubMed, Embase, Web of Science, and Cochrane Library were systematically searched for studies on AI applications in diagnosing orthopedic fractures from inception to September 29, 2021. Pooled sensitivity and specificity and the area under the receiver operating characteristic curves (AUC) were obtained. This study was registered in the PROSPERO database prior to initiation (CRD 42021254618). RESULTS: Thirty-nine were eligible for quantitative analysis. The overall pooled AUC, sensitivity, and specificity were 0.96 (95% CI 0.94-0.98), 90% (95% CI 87-92%), and 92% (95% CI 90-94%), respectively. In subgroup analyses, multicenter designed studies yielded higher sensitivity (92% vs. 88%) and specificity (94% vs. 91%) than single-center studies. AI demonstrated higher sensitivity with transfer learning (with vs. without: 92% vs. 87%) or data augmentation (with vs. without: 92% vs. 87%), compared to those without. Utilizing plain X-rays as input images for AI achieved results comparable to CT (AUC 0.96 vs. 0.96). Moreover, AI achieved comparable results to humans (AUC 0.97 vs. 0.97) and better results than non-expert human readers (AUC 0.98 vs. 0.96; sensitivity 95% vs. 88%). CONCLUSIONS: AI demonstrated high accuracy in diagnosing orthopedic fractures from medical images. Larger-scale studies with higher design quality are needed to validate our findings. KEY POINTS: • Multicenter study design, application of transfer learning, and data augmentation are closely related to improving the performance of artificial intelligence models in diagnosing orthopedic fractures. • Utilizing plain X-rays as input images for AI to diagnose fractures achieved results comparable to CT (AUC 0.96 vs. 0.96). • AI achieved comparable results to humans (AUC 0.97 vs. 0.97) but was superior to non-expert human readers (AUC 0.98 vs. 0.96, sensitivity 95% vs. 88%) in diagnosing fractures.
Authors: Alejandro Rodriguez-Ruiz; Kristina Lång; Albert Gubern-Merida; Mireille Broeders; Gisella Gennaro; Paola Clauser; Thomas H Helbich; Margarita Chevalier; Tao Tan; Thomas Mertelmeier; Matthew G Wallis; Ingvar Andersson; Sophia Zackrisson; Ritse M Mann; Ioannis Sechopoulos Journal: J Natl Cancer Inst Date: 2019-09-01 Impact factor: 13.506
Authors: Andre Esteva; Brett Kuprel; Roberto A Novoa; Justin Ko; Susan M Swetter; Helen M Blau; Sebastian Thrun Journal: Nature Date: 2017-01-25 Impact factor: 49.962
Authors: Alejandro Rodríguez-Ruiz; Elizabeth Krupinski; Jan-Jurre Mordang; Kathy Schilling; Sylvia H Heywang-Köbrunner; Ioannis Sechopoulos; Ritse M Mann Journal: Radiology Date: 2018-11-20 Impact factor: 11.105
Authors: Scott Mayer McKinney; Marcin Sieniek; Varun Godbole; Jonathan Godwin; Natasha Antropova; Hutan Ashrafian; Trevor Back; Mary Chesus; Greg S Corrado; Ara Darzi; Mozziyar Etemadi; Florencia Garcia-Vicente; Fiona J Gilbert; Mark Halling-Brown; Demis Hassabis; Sunny Jansen; Alan Karthikesalingam; Christopher J Kelly; Dominic King; Joseph R Ledsam; David Melnick; Hormuz Mostofi; Lily Peng; Joshua Jay Reicher; Bernardino Romera-Paredes; Richard Sidebottom; Mustafa Suleyman; Daniel Tse; Kenneth C Young; Jeffrey De Fauw; Shravya Shetty Journal: Nature Date: 2020-01-01 Impact factor: 49.962