Xue-Lian Zhou (1), Er-Gang Wang (2,3), Qiang Lin (4), Guan-Ping Dong (1), Wei Wu (1), Ke Huang (1), Can Lai (5), Gang Yu (6), Hai-Chun Zhou (5), Xiao-Hui Ma (5), Xuan Jia (5), Lei Shi (4), Yong-Sheng Zheng (4), Lan-Xuan Liu (4), Da Ha (4), Hao Ni (4), Jun Yang (4), Jun-Fen Fu (1).
1. The Children's Hospital, Zhejiang University School of Medicine, Division of Endocrinology, National Clinical Research Center for Child Health, Hangzhou 310052, China.
2. Center for Genomics and Computational Biology, Duke University, Durham, NC, USA.
3. Department of Biomedical Engineering, Duke University, Durham, NC, USA.
4. Hangzhou YITU Healthcare Technology Co., Ltd, Hangzhou 310012, China.
5. The Children's Hospital, Zhejiang University School of Medicine, Division of Radiology, National Clinical Research Center for Child Health, Hangzhou 310052, China.
6. The Children's Hospital, Zhejiang University School of Medicine, Division of Information Science, National Clinical Research Center for Child Health, Hangzhou 310052, China.
Abstract
BACKGROUND: Bone age can reflect the true growth and development status of a child; thus, it plays a critical role in evaluating growth and endocrine disorders. This study established and validated an optimized Tanner-Whitehouse 3 artificial intelligence (TW3-AI) bone age assessment (BAA) system based on a convolutional neural network (CNN). METHODS: A data set of 9,059 clinical radiographs of the left hand was obtained from the picture archiving and communication system (PACS) between January 2012 and December 2016. Of these, 8,005/9,059 (88%) samples were used as the training set for model implementation, 804/9,059 (9%) samples as the validation set for parameter optimization, and the remaining 250/9,059 (3%) samples were used to verify the accuracy and reliability of the model against those of 4 experienced endocrinologists and 2 experienced radiologists. The overall variation of the TW3-metacarpophalangeal, radius, ulna and short bones (RUS) and TW3-Carpal bone scores, as well as of each bone (13 RUS + 7 Carpal), between the reviewers and the AI was compared using Bland-Altman (BA) plots and the Kappa test, respectively. Furthermore, the time consumption of the model and of the reviewers was also compared. RESULTS: The performance of the TW3-AI model was highly consistent with the reviewers' overall estimation, and the root mean square (RMS) was 0.50 years. The accuracy of the BAA of the TW3-AI model was better than the estimate of the reviewers. Further analysis revealed that human interpretations of the male capitate, hamate, the first distal and fifth middle phalanx, and of the female capitate, the trapezoid, and the third and fifth middle phalanx, were the most inconsistent. The average image processing time was 1.5±0.2 s in the TW3-AI model, which was significantly shorter than manual interpretation. CONCLUSIONS: The diagnostic performance of CNN-based TW3 BAA was accurate and time-saving, and showed better stability than diagnoses made by experienced experts.
2020 Quantitative Imaging in Medicine and Surgery. All rights reserved.
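The agreement statistics named in the METHODS (Bland-Altman analysis for overall bone age, a Kappa test for per-bone stage ratings, and an RMS summary) can be illustrated with a minimal sketch. This is not the authors' code: the function names, the 1.96 × SD limits of agreement, and the use of unweighted Cohen's kappa are assumptions made here for illustration, since the abstract does not state which variants were used.

```python
import math
from statistics import mean, stdev


def bland_altman(ai_ages, reviewer_ages):
    """Return (bias, lower limit, upper limit) of agreement in years.

    Assumes the conventional bias +/- 1.96 * SD limits; the abstract
    does not say which limits the study used.
    """
    diffs = [a - r for a, r in zip(ai_ages, reviewer_ages)]
    bias = mean(diffs)
    sd = stdev(diffs)  # sample standard deviation of the differences
    return bias, bias - 1.96 * sd, bias + 1.96 * sd


def rms_difference(ai_ages, reviewer_ages):
    """Root mean square of the paired differences, in years."""
    return math.sqrt(mean((a - r) ** 2 for a, r in zip(ai_ages, reviewer_ages)))


def cohen_kappa(ratings_1, ratings_2):
    """Unweighted Cohen's kappa for two raters' categorical stage labels.

    Assumption: the abstract only says "Kappa test", so the unweighted
    two-rater form is used here purely as an example.
    """
    n = len(ratings_1)
    labels = set(ratings_1) | set(ratings_2)
    observed = sum(x == y for x, y in zip(ratings_1, ratings_2)) / n
    expected = sum(
        (ratings_1.count(label) / n) * (ratings_2.count(label) / n)
        for label in labels
    )
    if expected == 1.0:  # both raters used a single identical label
        return 1.0
    return (observed - expected) / (1.0 - expected)


if __name__ == "__main__":
    # Hypothetical toy values, not data from the study.
    ai = [10.0, 11.0, 12.0]
    reviewers = [10.5, 10.5, 12.0]
    print(bland_altman(ai, reviewers))
    print(rms_difference(ai, reviewers))
    print(cohen_kappa(["A", "B", "B", "C"], ["A", "B", "C", "C"]))
```

In such a setup, the Bland-Altman limits describe how far the AI and reviewer bone ages can be expected to differ for an individual child, while kappa summarizes agreement on the categorical maturity stage of a single bone beyond what chance would produce.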