Chun-Jung Juan (1,2,3,4,5), Shao-Chieh Lin (2,6), Ya-Hui Li (2,7), Chia-Ching Chang (2,8), Yi-Hung Jeng (2,5), Hsu-Hsia Peng (5), Teng-Yi Huang (9), Hsiao-Wen Chung (7,10), Wu-Chung Shen (2,3), Chon-Haw Tsai (11), Ruey-Feng Chang (12,13), Yi-Jui Liu (14).
1. Department of Computer Science and Information Engineering, National Taiwan University, Taipei, Taiwan, Republic of China.
2. Department of Medical Imaging, China Medical University Hsinchu Hospital, Hsinchu, Taiwan, Republic of China.
3. Department of Radiology, School of Medicine, College of Medicine, China Medical University, Taichung, Taiwan, Republic of China.
4. Department of Medical Imaging, China Medical University Hospital, Taichung, Taiwan, Republic of China.
5. Department of Biomedical Engineering and Environmental Sciences, National Tsing Hua University, Hsinchu, Taiwan, Republic of China.
6. Ph.D. Program in Electrical and Communication Engineering, Feng Chia University, Taichung, Taiwan, Republic of China.
7. Graduate Institute of Biomedical Electronics and Bioinformatics, National Taiwan University, Taipei, Taiwan, Republic of China.
8. Department of Management Science, National Chiao-Tung University, Hsinchu, Taiwan, Republic of China.
9. Department of Electrical Engineering, National Taiwan University of Science and Technology, Taipei, Taiwan, Republic of China.
10. Department of Electrical Engineering, National Taiwan University, Taipei, Taiwan, Republic of China.
11. Department of Neurology, China Medical University Hospital, Taichung, Taiwan, Republic of China.
12. Department of Computer Science and Information Engineering, National Taiwan University, Taipei, Taiwan, Republic of China. rfchang@csie.ntu.edu.tw.
13. Graduate Institute of Biomedical Electronics and Bioinformatics, National Taiwan University, Taipei, Taiwan, Republic of China. rfchang@csie.ntu.edu.tw.
14. Department of Automatic Control Engineering, Feng Chia University, No. 100 Wenhwa Rd., Seatwen, 40724, Taichung, Taiwan, Republic of China. ericliu6@ms35.hinet.net.
Abstract
OBJECTIVES: To examine the role of the ADC threshold in agreement across observers and deep learning models (DLMs), and in the segmentation performance of DLMs, for acute ischemic stroke (AIS).
METHODS: Twelve DLMs were trained on the DWI-ADC-ADC combination from 76 patients with AIS, using 6 different ADC thresholds and ground truth (GT) manually contoured by 2 observers. The models were tested on an additional 67 patients from the same hospital and 78 patients from another hospital. Agreement between observers and between DLMs was evaluated by Bland-Altman plot and intraclass correlation coefficient (ICC). The similarity between the GT defined by the observers, and between the automatic segmentations performed by the DLMs, was evaluated by the Dice similarity coefficient (DSC). Group comparison was performed using the Mann-Whitney U test. The relationship between the DSC and the ADC threshold, as well as AIS lesion size, was evaluated by linear regression analysis. A p < .05 was considered statistically significant.
RESULTS: Excellent interobserver agreement and intraobserver repeatability were achieved in the manual segmentation (all ICC > 0.98, p < .001). The 95% limit of agreement was reduced from 11.23 cm² for GT on DWI to 0.59 cm² for prediction at an ADC threshold of 0.6 × 10⁻³ mm²/s combined with DWI. The segmentation performance of the DLMs improved, with the overall DSC rising from 0.738 ± 0.214 on DWI to 0.971 ± 0.021 at an ADC threshold of 0.6 × 10⁻³ mm²/s combined with DWI.
CONCLUSIONS: Combining an ADC threshold of 0.6 × 10⁻³ mm²/s with DWI reduces interobserver and inter-DLM differences and achieves the best segmentation performance for AIS lesions using DLMs.
KEY POINTS:
• A higher Dice similarity coefficient (DSC) in predicting acute ischemic stroke lesions was achieved by ADC thresholds combined with DWI than by DWI alone (all p < .05).
• The DSC had a negative association with the ADC threshold in most sizes, both hospitals, and both observers (most p < .05), and a positive association with stroke size in all ADC thresholds, both hospitals, and both observers (all p < .001).
• An ADC threshold of 0.6 × 10⁻³ mm²/s eliminated the difference in DSC at any stroke size between observers or between hospitals (p = .07 to > .99).
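The two agreement measures named in the abstract, the DSC and the Bland-Altman 95% limits of agreement, can be illustrated with a minimal Python sketch. It is not the authors' pipeline. The abstract does not describe how the ADC threshold is combined with DWI, so the sketch assumes that contoured voxels are kept only when their ADC is below the cutoff. The function names, the flat list representation of the masks, and the toy values are all hypothetical.

```python
from statistics import mean, stdev


def dice(mask_a, mask_b):
    """Dice similarity coefficient between two flat binary masks (0/1 sequences)."""
    if len(mask_a) != len(mask_b):
        raise ValueError("masks must have the same length")
    overlap = sum(1 for a, b in zip(mask_a, mask_b) if a and b)
    total = sum(1 for a in mask_a if a) + sum(1 for b in mask_b if b)
    # Convention chosen here (assumption): two empty masks agree perfectly.
    return 1.0 if total == 0 else 2.0 * overlap / total


def apply_adc_threshold(mask, adc, threshold=0.6e-3):
    """Keep contoured voxels whose ADC (in mm^2/s) is below the threshold.

    Assumed combination rule; the abstract does not state the exact procedure.
    """
    return [1 if m and v < threshold else 0 for m, v in zip(mask, adc)]


def limits_of_agreement(x, y):
    """Bland-Altman bias and 95% limits: bias +/- 1.96 * SD of paired differences."""
    diffs = [a - b for a, b in zip(x, y)]
    bias = mean(diffs)
    half_width = 1.96 * stdev(diffs)
    return bias, bias - half_width, bias + half_width


# Toy example: six voxels, two hypothetical observers contouring on DWI.
adc = [0.45e-3, 0.55e-3, 0.70e-3, 0.90e-3, 0.50e-3, 1.10e-3]
observer_1 = [1, 1, 1, 0, 1, 0]
observer_2 = [1, 1, 0, 1, 1, 0]

dsc_dwi = dice(observer_1, observer_2)  # 0.75 on the raw contours
dsc_adc = dice(
    apply_adc_threshold(observer_1, adc),
    apply_adc_threshold(observer_2, adc),
)  # 1.0 once voxels with ADC >= 0.6e-3 are dropped
```

In this toy case the two observers disagree only on voxels whose ADC lies above the cutoff, so thresholding removes the disagreement entirely. Real data would not behave so cleanly; the example only shows the mechanism by which a shared ADC cutoff can narrow observer differences.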