Wei Shan1,2,3, Yunyun Duan1, Yu Zheng2, Zhenzhou Wu2, Shang Wei Chan2, Qun Wang1,2,3, Peiyi Gao2, Yaou Liu2, Kunlun He4,5, Yongjun Wang1,2.
Abstract
Objective: Reliable quantification of white matter hyperintensities (WMHs) resulting from cerebral small vessel diseases (CSVD) is essential for understanding their clinical impact. We aim to develop and clinically validate a deep learning system for automatic segmentation of CSVD-WMH from fluid-attenuated inversion recovery (FLAIR) imaging using large multicenter data. Method: A FLAIR imaging dataset of 1,156 patients diagnosed with CSVD-associated WMH (median age, 54 years; 653 males), obtained between September 2018 and September 2019 at Beijing Tiantan Hospital, was retrospectively analyzed. Locations of CSVD-WMH on the FLAIR scans were manually marked by two experienced neurologists. Using the manually labeled data of 996 patients (development set), a novel U-shaped 2D convolutional neural network (CNN) architecture was trained for automatic segmentation of CSVD-WMH. The segmentation performance of the network was evaluated with per-pixel and lesion-level Dice scores on an independent internal test set (n = 160) and a multicenter external test set (n = 90, three medical centers). The clinical suitability of the segmentation results, classified as acceptable, acceptable with minor revision, acceptable with major revision, or not acceptable, was assessed by three independent neuroradiologists. Agreement among the neuroradiologists was assessed with the Kendall W test.
Keywords: clinical evaluation; deep learning; masking white matter hyperintensities; neural network; segmentation
Year: 2021 PMID: 34901045 PMCID: PMC8656685 DOI: 10.3389/fmed.2021.681183
Source DB: PubMed Journal: Front Med (Lausanne) ISSN: 2296-858X
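The abstract reports per-pixel Dice scores as the main segmentation metric. As a minimal sketch of that metric (not the authors' evaluation code), the function below computes the Dice coefficient between two binary masks. The function name `dice_score`, the nested-list mask format, and the convention for two empty masks are assumptions made for illustration; the lesion-level variant is not defined in this record and is not reproduced here.

```python
def dice_score(pred, truth):
    """Dice coefficient between two binary masks given as nested lists of 0/1."""
    pred_flat = [bool(v) for row in pred for v in row]
    truth_flat = [bool(v) for row in truth for v in row]
    if len(pred_flat) != len(truth_flat):
        raise ValueError("masks must have the same number of pixels")
    # Pixels marked as lesion in both masks.
    overlap = sum(p and t for p, t in zip(pred_flat, truth_flat))
    total = sum(pred_flat) + sum(truth_flat)
    # Assumed convention: two empty masks count as perfect agreement.
    return 1.0 if total == 0 else 2.0 * overlap / total


# Hypothetical 2 x 3 masks: 3 lesion pixels each, 2 of them shared.
truth = [[0, 1, 1],
         [0, 1, 0]]
pred = [[0, 1, 0],
        [0, 1, 1]]
score = dice_score(pred, truth)  # 2 * 2 / (3 + 3)
```

Because the score is normalized by the combined lesion area, a small absolute error costs far more on a small lesion than on a large one.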
Data distribution in the manuscript.
| Stage | Data set | Patients |
|---|---|---|
| DLS development | Training set | 870 patients |
| | Validation set | 126 patients |
| | Inner test set | 160 patients |
| | Summary | 1,156 patients |
| Clinical evaluation | Test data set | 90 patients |
| Summary | | 1,246 patients |
The inner test set was used for code optimization only.
The test data set was used for the clinical evaluation only.
Validation Test 1.
| Lesion size | Size range | Correctly labeled, n (%) | |
|---|---|---|---|
| Small | <20 (plex*spacing) | 462 (64.71%) | |
| Medium | 20 ~ 150 (plex*spacing) | (80.09%) | |
| Large | >150 (plex*spacing) | 39 (96.12%) | 0.722 |
| | | | |
| Small | <20 (plex*spacing) | 601 (68.37%) | |
| Medium | 20 ~ 150 (plex*spacing) | 909 (82.86%) | |
| Large | >150 (plex*spacing) | 325 (96.73%) | 0.776 |
| | | | |
| Small | <20 (plex*spacing) | 361 (50.14%) | |
| Medium | 20 ~ 150 (plex*spacing) | 425 (68.77%) | |
| Large | >150 (plex*spacing) | 234 (92.49%) | 0.722 |
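The table groups lesions into small (<20), medium (20 to 150), and large (>150) size classes. The sketch below shows one way such a grouping could be computed from a binary mask: it finds connected lesions and bins each by pixel count multiplied by a spacing factor. The use of 4-connectivity, the meaning of `spacing`, and the names `connected_lesions` and `size_category` are assumptions for illustration, since this record does not describe how lesions were delineated or measured.

```python
from collections import deque


def connected_lesions(mask):
    """Return lesions as lists of (row, col) pixels, using 4-connectivity."""
    rows, cols = len(mask), len(mask[0])
    seen = [[False] * cols for _ in range(rows)]
    lesions = []
    for r in range(rows):
        for c in range(cols):
            if not mask[r][c] or seen[r][c]:
                continue
            # Breadth-first flood fill from an unvisited lesion pixel.
            queue, pixels = deque([(r, c)]), []
            seen[r][c] = True
            while queue:
                y, x = queue.popleft()
                pixels.append((y, x))
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < rows and 0 <= nx < cols
                            and mask[ny][nx] and not seen[ny][nx]):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            lesions.append(pixels)
    return lesions


def size_category(n_pixels, spacing=1.0):
    """Bin a lesion using the thresholds listed in the table (<20, 20-150, >150)."""
    size = n_pixels * spacing
    if size < 20:
        return "small"
    if size <= 150:
        return "medium"
    return "large"


# Hypothetical mask with a 3-pixel lesion and a 1-pixel lesion.
mask = [[1, 1, 0, 0],
        [1, 0, 0, 1],
        [0, 0, 0, 0]]
lesions = connected_lesions(mask)
categories = [size_category(len(p), spacing=10.0) for p in lesions]
```

Reporting results per size class, as the table does, shows that the correctly labeled ratio rises with lesion size in all three blocks.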
Figure 1. Flowchart of the distribution of patients in the training and clinical evaluation steps. The distribution and classification of all samples in each step were used for the model training and clinical evaluation steps.
Figure 2. (A) Example cases of white matter hyperintensities (WMHs) labeled manually and by the DLS system. (B) WMH lesion distribution in the training and validation step. (C) Data distribution in the model development for training and validation.
Figure 3. Network architecture of the proposed two-dimensional (2D) convolutional neural network (CNN). The network has 19 layers integrating nine convolution blocks. Bilinear interpolation arrows indicate upsampling operations used to make predictions for the segmentation task. The pool arrow indicates the downsampling operation, which gradually increases the receptive field for the segmentation task. Concatenate connections are used to fuse multi-scale features in the network. Batch normalization is a linear transformation of the features performed to reduce the covariate shift, thus speeding up the training procedure. Convolution bars indicate the convolution operation, which computes the features. The numbers 16, 32, 64, 128, and 256 indicate the number of channels in that layer, and 3 × 3 denotes the size of the 2D CNN kernels.
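To make the caption's description concrete, the sketch below traces feature-map shapes through a hypothetical U-shaped layout with the channel widths named in the caption (16, 32, 64, 128, 256). The split into five encoder and four decoder convolution blocks, the 2 × 2 pooling factor, and the function name `trace_unet_shapes` are assumptions chosen so that the total is nine convolution blocks; the exact layer ordering of the published network is not given in this record.

```python
def trace_unet_shapes(height, width, channels=(16, 32, 64, 128, 256)):
    """List (stage, channels, height, width) for a hypothetical U-shaped layout."""
    trace, skips = [], []
    h, w = height, width
    # Encoder: one convolution block per level, pooling halves the resolution.
    for level, ch in enumerate(channels):
        trace.append((f"enc{level + 1}", ch, h, w))
        if level < len(channels) - 1:
            skips.append((ch, h, w))  # kept for the concatenate connection
            h, w = h // 2, w // 2
    # Decoder: upsample to the skip's resolution, concatenate, then convolve.
    ch = channels[-1]
    for level in range(len(channels) - 1, 0, -1):
        skip_ch, h, w = skips.pop()
        trace.append((f"cat{level}", ch + skip_ch, h, w))
        ch = skip_ch
        trace.append((f"dec{level}", ch, h, w))
    return trace


# Hypothetical 256 x 256 input slice.
shapes = trace_unet_shapes(256, 256)
conv_blocks = [s for s in shapes if s[0].startswith(("enc", "dec"))]
```

In this layout each concatenation stacks the upsampled deep features on the encoder features of the same resolution, which is how multi-scale information reaches the final prediction.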
Figure 4. Model performance in terms of the training loss, validation score, training accuracy, and validation accuracy.
Head-to-head model analysis (Data set 2): correctly labeled ratio.
| Model | Small | Medium | Large |
|---|---|---|---|
| U-Resnet | 551, 62.68% | 863, 78.66% | 321, 95.54% |
| 3D-unet | 365, 41.52% | 778, 70.92% | 328, 97.62% |
| Our model | 601, 68.37% | 909, 82.86% | 325, 96.73% |
Head-to-head model analysis.
| Metric | | | | |
|---|---|---|---|---|
| ACC. | 0.97 | 0.906 | 0.97 | 0.93 |
| Sensitivity | 0.7244 | 0.5706 | 0.6024 | 0.6499 |
| Specificity | 0.9989 | 0.9998 | 0.9998 | 0.9997 |
| AUC | 0.9959 | 0.9944 | 0.9958 | 0.9896 |
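The accuracy, sensitivity, and specificity rows follow the standard confusion-matrix definitions. The sketch below computes them from pixel-level counts; the function name `binary_metrics` and the example counts are hypothetical and are not taken from the study.

```python
def binary_metrics(tp, fp, tn, fn):
    """Accuracy, sensitivity, and specificity from confusion-matrix counts."""
    total = tp + fp + tn + fn
    if total == 0 or tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one positive and one negative pixel")
    return {
        "accuracy": (tp + tn) / total,
        "sensitivity": tp / (tp + fn),  # share of lesion pixels found
        "specificity": tn / (tn + fp),  # share of background pixels kept clean
    }


# Hypothetical counts for an image where lesions cover 1% of the pixels.
metrics = binary_metrics(tp=60, fp=10, tn=9890, fn=40)
```

With these made-up counts, accuracy and specificity are both above 0.99 while sensitivity is only 0.6. Background pixels dominate when lesions are sparse, which may explain why the table pairs very high specificity with more moderate sensitivity.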
Figure 5. (A) Overall framework for the testing stage. (B) Clinical evaluation of the testing data set, with ROC curve and AUC score analysis of the segmentation model. Three neuroradiologists took part in the evaluation. ***P < 0.001.
Clinical evaluation 1.
| Rating | Neuroradiologist 1 | Neuroradiologist 2 | Neuroradiologist 3 |
|---|---|---|---|
| Perfect (score 3) | 34 | 33 | 45 |
| Minor revision (score 2) | 54 | 53 | 35 |
| Major revision (score 1) | 1 | 3 | 8 |
| Not acceptable (score 0) | 1 | 1 | 2 |
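The abstract states that agreement among the raters was assessed with Kendall's W. That statistic needs each rater's score for every case, which the aggregated counts above do not provide, so the sketch below uses a small hypothetical score matrix. It implements the tie-corrected form of W, an assumption that seems reasonable for a four-level score with many ties; the function names and the example data are illustrative only.

```python
def average_ranks(scores):
    """Rank scores from 1..n, giving tied values the mean of their positions."""
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    ranks = [0.0] * len(scores)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and scores[order[j + 1]] == scores[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # positions are 0-based, ranks are 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def kendalls_w(ratings):
    """Tie-corrected Kendall's W; ratings[r][s] is rater r's score for case s."""
    m, n = len(ratings), len(ratings[0])
    ranked = [average_ranks(r) for r in ratings]
    totals = [sum(ranked[r][s] for r in range(m)) for s in range(n)]
    mean_total = m * (n + 1) / 2
    s_stat = sum((t - mean_total) ** 2 for t in totals)
    # Tie correction: sum of (t^3 - t) over each group of tied scores per rater.
    ties = sum(r.count(v) ** 3 - r.count(v) for r in ratings for v in set(r))
    denom = m ** 2 * (n ** 3 - n) - m * ties
    if denom == 0:
        raise ValueError("W is undefined when every rater gives a single score")
    return 12 * s_stat / denom


# Hypothetical scores (0-3) from three raters for four cases.
example = [[3, 2, 2, 0],
           [3, 3, 1, 0],
           [2, 3, 2, 1]]
w = kendalls_w(example)
```

W ranges from 0 (no agreement in the rankings) to 1 (identical rankings), so values close to 1 indicate that the raters ordered the cases consistently.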