Ali Arab, Betty Chinda, George Medvedev, William Siu, Hui Guo, Tao Gu, Sylvain Moreno, Ghassan Hamarneh, Martin Ester, Xiaowei Song.
Abstract
This project aimed to develop and evaluate a fast, fully automated deep-learning method applying convolutional neural networks with deep supervision (CNN-DS) for accurate hematoma segmentation and volume quantification in computed tomography (CT) scans. Non-contrast whole-head CT scans of 55 patients with hemorrhagic stroke were used. Each scan was standardized to 64 axial slices of 128 × 128 voxels. Each voxel was annotated independently by experienced raters, generating a binary label of hematoma versus normal brain tissue based on majority voting. The dataset was split randomly into training (n = 45) and testing (n = 10) subsets. A CNN-DS model was built on the training data and evaluated on the testing data. Performance of the CNN-DS solution was compared with three previously established methods. The CNN-DS achieved a Dice coefficient of 0.84 ± 0.06 and recall of 0.83 ± 0.07, higher than patch-wise U-Net (< 0.76). The CNN-DS average running time of 0.74 ± 0.07 s was faster than PItcHPERFeCT (> 1412 s) and slice-based U-Net (> 12 s). Comparable interrater agreement rates were observed between "method-human" and "human-human" pairs (Cohen's kappa coefficients > 0.82). The fully automated CNN-DS approach demonstrated expert-level accuracy in fast segmentation and quantification of hematoma, substantially improving over previous methods. Further research is warranted to test the CNN-DS solution as a software tool in clinical settings for effective stroke management.

Entities:
Year: 2020 PMID: 33168895 PMCID: PMC7652921 DOI: 10.1038/s41598-020-76459-7
Source DB: PubMed Journal: Sci Rep ISSN: 2045-2322 Impact factor: 4.379
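The abstract's headline numbers (Dice 0.84 ± 0.06, recall 0.83 ± 0.07) are voxel-overlap metrics computed between a predicted binary mask and the majority-vote labels. A minimal illustrative sketch of both metrics on binary masks (NumPy, not the authors' code):

```python
import numpy as np

def dice_coefficient(pred, truth):
    """Dice similarity: 2|A∩B| / (|A| + |B|) for binary masks."""
    pred, truth = pred.astype(bool), truth.astype(bool)
    intersection = np.logical_and(pred, truth).sum()
    total = pred.sum() + truth.sum()
    return 2.0 * intersection / total if total > 0 else 1.0

def recall(pred, truth):
    """Sensitivity: fraction of true hematoma voxels the prediction recovers."""
    pred, truth = pred.astype(bool), truth.astype(bool)
    tp = np.logical_and(pred, truth).sum()
    positives = truth.sum()
    return tp / positives if positives > 0 else 1.0

# Toy 4x4 "slice": 3 of 4 true voxels recovered, plus 1 false positive
truth = np.zeros((4, 4), dtype=int)
truth[1:3, 1:3] = 1           # 4 hematoma voxels
pred = truth.copy()
pred[1, 1] = 0                # one missed voxel
pred[0, 0] = 1                # one false positive
# dice_coefficient(pred, truth) -> 0.75, recall(pred, truth) -> 0.75
```

In practice the same functions apply unchanged to the full 64 × 128 × 128 volumes, since NumPy's boolean reductions are shape-agnostic.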
Patient demographics.
| | Training set | Testing set |
|---|---|---|
| N | 45 | 10 |
| Age (years, mean ± std) | 62.6 ± 15.7 | 70.7 ± 8.4 |
| Sex (male, %) | 51 | 60 |
| Deep-seated intracerebral hemorrhage (ICH, %) | 40 | 30 |
| Hemorrhage reaching cortical surface (%) | 77.8 | 100 |
| Intraventricular hemorrhage (%) | 40 | 30 |
| Oral anticoagulants (%) | 24.4 | 40 |
| Antiplatelet (%) | 6.7 | 0 |
| Hematoma evacuation (%) | 11.1 | 0 |
Agreement measures for each pair of raters on the testing data set.
| Raters | Cohen’s kappa | β (p-value) | α (p-value) |
|---|---|---|---|
| E1, E2 | 0.88 ± 0.05 | 1.13 (p = 7.99e−10*) | − 2.30 (p = 0.141) |
| E1, E3 | 0.87 ± 0.07 | 1.45 (p = 1.11e−06*) | − 8.26 (p = 0.107) |
| E2, E3 | 0.89 ± 0.07 | 1.29 (p = 1.15e−07*) | − 5.60 (p = 0.126) |
| E1, M | 0.84 ± 0.06 | 1.05 (p = 6.12e−08*) | − 1.24 (p = 0.600) |
| E2, M | 0.82 ± 0.08 | 0.93 (p = 4.97e−08*) | 1.01 (p = 0.648) |
| E3, M | 0.82 ± 0.08 | 0.69 (p = 1.31e−05*) | 6.36 (p = 0.142) |
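The kappa values above quantify agreement between rater pairs beyond chance. A hedged sketch of Cohen's kappa for two binary label vectors, κ = (p_o − p_e) / (1 − p_e), using marginal positive rates for the chance term (illustrative code, not the study's statistical pipeline):

```python
import numpy as np

def cohens_kappa(a, b):
    """Cohen's kappa for two binary label vectors."""
    a, b = np.asarray(a), np.asarray(b)
    p_o = np.mean(a == b)                       # observed agreement
    pa, pb = a.mean(), b.mean()                 # each rater's positive rate
    p_e = pa * pb + (1 - pa) * (1 - pb)         # chance agreement
    return (p_o - p_e) / (1 - p_e)

# Toy example: two raters labelling 8 voxels
rater1 = [1, 1, 0, 0, 1, 0, 0, 0]
rater2 = [1, 0, 0, 0, 1, 0, 0, 1]
kappa = cohens_kappa(rater1, rater2)   # agreement 6/8, kappa ≈ 0.467
```

The "method-human" rows (E1–E3 vs. M) are computed the same way, treating the CNN-DS output as a fourth rater.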
Figure 1 Disagreement percentages between each pair of raters. E1, E2, and E3 represent experts 1, 2, and 3, respectively, while M indicates the CNN-DS method. Disagreement rates are displayed as gray-scale blocks; the darker the block, the higher the disagreement rate. Figure 1 was created using Matlab R2017b (https://www.mathworks.com).
Figure 2 Examples showing the segmentation outcomes using the CNN-DS method. In each panel, the left, middle, and right images are the original CT slice, the 'ground truth' labels, and the CNN-DS predicted segmentation, respectively. Arrows indicate the errors. (A) Represents a case where the CNN-DS method demonstrates expert-level performance. (B) Shows a false positive instance where a calcified structure is labelled as a hemorrhagic area because its Hounsfield Unit values are higher than those of the surrounding tissues. (C) Shows a false negative example in which the CNN-DS method identified part of the hemorrhage but missed some blood close to the bone. (D) Illustrates a more complicated case of complex hemorrhage where the discrepancies between the 'ground truth' and the predicted segmentation cannot necessarily be attributed to erroneous prediction.
Segmentation quantitative performance.
| Method | Dice score | Precision | Recall | F1 score | Processing time (s) |
|---|---|---|---|---|---|
| Patch-wise U-Net | 0.74 ± 0.09 | 0.73 ± 0.17 | 0.76 ± 0.09 | 0.74 | 9.4 ± 0.2 |
| Slice-based U-Net | 0.80 ± 0.07 | 0.78 ± 0.10 | 0.84 ± 0.08 | 0.80 | 12.3 ± 3.6 |
| PItcHPERFeCT | 0.76 ± 0.11 | 0.63 ± 0.15 | 0.77 | | 1412.34 ± 20.05 |
| CNN-DS (present study) | 0.84 ± 0.06 | | 0.83 ± 0.07 | | 0.74 ± 0.07 |
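The F1 column appears to be the harmonic mean of the reported mean precision and recall; for instance, the patch-wise U-Net row's 0.74 is recoverable from its precision (0.73) and recall (0.76). A quick sanity check (illustrative, not the authors' evaluation code):

```python
def f1_from_precision_recall(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)

# Reproduce the patch-wise U-Net F1 from its table entries
f1 = f1_from_precision_recall(0.73, 0.76)   # rounds to 0.74
```

The same relation approximately holds for the slice-based U-Net row (0.78, 0.84 → ≈ 0.81, reported as 0.80), consistent with rounding of the underlying per-scan values.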
Figure 3 The training and validation loss for the U-Net model with and without deep supervision. The x-axis indicates the number of epochs, i.e., the number of times the deep learning model has passed through the entire training data during the training phase. The y-axis represents the loss value, which indicates how well the model behaves after each epoch; the lower the loss, the better the model. The dashed lines show the validation losses while the solid lines show the training losses. For the model with deep supervision (blue lines), the training loss converges at a considerably faster rate, and the converged loss value is lower than that of the model without deep supervision (green lines).
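Deep supervision attaches auxiliary loss heads to intermediate decoder levels, so gradients reach early layers directly and training converges faster, as Figure 3 shows. A minimal sketch of the weighted-sum idea, assuming binary cross-entropy per level and illustrative weights (neither the loss function nor the weights are specified in this excerpt):

```python
import numpy as np

def binary_cross_entropy(pred, target, eps=1e-7):
    """Mean BCE between predicted probabilities and binary targets."""
    pred = np.clip(pred, eps, 1 - eps)
    return -np.mean(target * np.log(pred) + (1 - target) * np.log(1 - pred))

def deep_supervision_loss(side_outputs, target, weights):
    """Weighted sum of per-level losses; side_outputs[-1] is the final mask.
    Auxiliary outputs are assumed already upsampled to the target resolution."""
    assert len(side_outputs) == len(weights)
    return sum(w * binary_cross_entropy(p, target)
               for w, p in zip(weights, side_outputs))

# Toy example: final prediction plus two auxiliary decoder heads
target = np.array([1.0, 0.0, 1.0, 1.0])
outputs = [np.array([0.7, 0.3, 0.6, 0.8]),    # shallow auxiliary head
           np.array([0.8, 0.2, 0.7, 0.9]),    # deeper auxiliary head
           np.array([0.9, 0.1, 0.8, 0.95])]   # final prediction
loss = deep_supervision_loss(outputs, target, weights=[0.25, 0.25, 0.5])
```

Without the auxiliary terms, only the final head's loss drives the update, which is the "without DS" configuration compared in the figure.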
Segmentation quantitative performance for the two models with and without deep supervision (DS).
| Method/evaluation | Dice score | Precision | Recall | F1 score | Processing time (s) |
|---|---|---|---|---|---|
| Without DS | 0.79 ± 0.11 | 0.81 ± 0.09 | 0.77 ± 0.09 | 0.74 | |
| With DS | 0.84 ± 0.06 | | 0.83 ± 0.07 | | 0.74 ± 0.07 |
Figure 4 The architecture of the CNN-DS neural network model. The dashed lines show the skip connections while the solid lines show the normal ones. The neural network learns features of the image hierarchically, starting with simple features such as edges and shapes and progressing to more complex, high-level features in the deeper levels. The contracting path extracts the features while the expansive path reconstructs the final labelling. Google Slides was used to produce this figure (https://docs.google.com/presentation).
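With the network depth of 4 from the hyper-parameter table and 128 × 128 input slices, the contracting path's feature-map sizes follow the standard U-Net pattern: each level halves the spatial dimensions and (conventionally) doubles the channel count. A sketch of that arithmetic; the base channel count is an illustrative assumption, not reported in this excerpt:

```python
def unet_shapes(size=128, depth=4, base_channels=16):
    """Feature-map shapes (H, W, C) along a U-Net contracting path.
    Channel counts are illustrative; the paper does not report them here."""
    encoder = []
    s, c = size, base_channels
    for _ in range(depth):
        encoder.append((s, s, c))   # feature map before pooling
        s //= 2                     # 2x2 max-pool halves each spatial dim
        c *= 2                      # channels conventionally double
    bottleneck = (s, s, c)
    # the expansive path mirrors the encoder; skip connections concatenate
    decoder = list(reversed(encoder))
    return encoder, bottleneck, decoder

enc, bottleneck, dec = unet_shapes()
# enc: (128,128,16) -> (64,64,32) -> (32,32,64) -> (16,16,128); bottleneck (8,8,256)
```

The skip connections (dashed lines in the figure) concatenate each encoder map with the decoder map of the same spatial size, which is why the two paths must mirror each other.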
The model hyper-parameters.
| Hyper-parameter | Value |
|---|---|
| Network depth | 4 |
| Initial learning rate | 0.005 |
| Dropout rate | 0.3 |
| Batch size | 1 |
| First moment estimate | 0.9 |
| Second moment estimate | 0.999 |
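The "first moment estimate" (0.9) and "second moment estimate" (0.999) rows match the conventional β₁/β₂ decay rates of the Adam optimizer, which the table presumably refers to (an assumption; the optimizer is not named in this excerpt). A single Adam update step using the table's values:

```python
import numpy as np

def adam_step(param, grad, m, v, t, lr=0.005, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam update with the table's hyper-parameters.
    m, v: running first/second moment estimates of the gradient; t: step count."""
    m = beta1 * m + (1 - beta1) * grad          # first moment (mean of grads)
    v = beta2 * v + (1 - beta2) * grad ** 2     # second moment (uncentred variance)
    m_hat = m / (1 - beta1 ** t)                # bias correction for zero init
    v_hat = v / (1 - beta2 ** t)
    param = param - lr * m_hat / (np.sqrt(v_hat) + eps)
    return param, m, v

# One step from zero-initialised moments: the effective step is ~lr
p, m, v = adam_step(param=np.array(1.0), grad=np.array(0.2), m=0.0, v=0.0, t=1)
```

With zero-initialised moments and t = 1, the bias-corrected ratio m̂/√v̂ is ≈ 1 regardless of gradient scale, so the first update moves the parameter by almost exactly the learning rate (here 0.005).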