Zhengyin Zhou, Zhihui Fu, Juncheng Jia, Jun Lv.
Abstract
Rib fractures are common injuries caused by chest trauma and can have serious consequences, so accurate diagnosis is essential. Low-dose thoracic computed tomography (CT) is commonly used for rib fracture diagnosis, and convolutional neural network (CNN)-based methods have assisted doctors in this task in recent years. However, because rib fracture data are scarce and rib fractures have irregular, varied shapes, it is difficult for CNN-based methods to extract rib fracture features; as a result, they cannot achieve satisfactory accuracy and sensitivity in detecting rib fractures. Inspired by the attention mechanism, we propose the CFSG U-Net for rib fracture detection. The CFSG U-Net uses the U-Net architecture and is enhanced by a dual-attention module comprising a channel-wise fusion attention module (CFAM) and a spatial-wise group attention module (SGAM). CFAM uses the channel attention mechanism to reweight the feature map along the channel dimension and refine the U-Net's skip connections. SGAM uses a grouping technique to generate spatial attention that adjusts feature maps in the spatial dimension, which allows the spatial attention module to capture more fine-grained semantic information. To evaluate the effectiveness of the proposed method, we established a rib fracture dataset. The experimental results on our dataset show that the maximum sensitivity of our proposed method is 89.58% and the average FROC score is 81.28%, outperforming existing rib fracture detection methods and attention modules.
Year: 2022 PMID: 36035283 PMCID: PMC9410867 DOI: 10.1155/2022/8945423
Source DB: PubMed Journal: Comput Math Methods Med ISSN: 1748-670X Impact factor: 2.809
Review of deep learning applications in rib fracture detection.
| Reference | Dataset | Method used | Evaluation metrics | Research challenges |
|---|---|---|---|---|
| [ | In-house dataset; no further information about the dataset is given in the paper | Rib region extraction method and a spatial-coherence convolutional neural network | Accuracy, recall, and speed | Comparative experiments were limited, and the potential of CNNs was not fully explored |
| [ | 1,079 patients and 25,054 2D annotations from 3 different hospitals; slice thicknesses range from 1 to 5 mm | Faster R-CNN | Precision and recall | This work used only a 2D CNN, with no 3D information incorporated; its precision and recall were not particularly high |
| [ | A total of 7,473 annotated traumatic rib fractures from 900 patients at a single center; slice thicknesses range from 1 to 1.25 mm | Sliding window mechanism and a modified U-Net called FracNet | Free-response receiver operating characteristic (FROC) analysis | This was a single-center study, and the landscape of deep neural networks was not fully explored |
| [ | 8,529 chest CT images and 33,828 annotations; slice thickness of the CT images was 0.625 mm | Rib fracture detection pipeline consisting of five stages: rib segmentation, vertebra detection, rib labeling, rib fracture detection, and rib fracture classification; VRB-Net for rib fracture detection | Recall and precision | The ground truth for detection and classification may include incorrect cases caused by incorrect annotation |
Figure 1. An example of rib fracture annotation (see the red mask). (a) Axial view. (b) Coronal view. (c) Sagittal view.
Figure 2. The preprocessing procedure: (a) the original CT image; (b) the binary bone-region mask after thresholding at 180 HU; (c) the mask after removing small connected components; (d) the mask after morphological dilation; (e) the mask after extracting the largest connected component; (f) the bone regions extracted by applying the mask; (g) the bone regions after data normalization; (h) a 3D view of the preprocessing output.
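The preprocessing steps described in the Figure 2 caption can be sketched roughly as follows. The 180 HU bone threshold comes from the caption; the minimum component size, the dilation setting, and the min-max normalization scheme are illustrative assumptions, not the paper's exact parameters.

```python
# Rough sketch of the Figure 2 preprocessing pipeline (assumed parameters).
import numpy as np
from scipy import ndimage

def preprocess(ct_hu, bone_threshold=180, min_component_size=4, dilation_iters=1):
    """Extract bone regions from a CT volume given in Hounsfield units (HU)."""
    # (b) Threshold at 180 HU to obtain a binary bone mask.
    mask = ct_hu > bone_threshold
    # (c) Remove small connected components (likely noise).
    labels, n = ndimage.label(mask)
    sizes = ndimage.sum(mask, labels, range(1, n + 1))
    keep = np.zeros(n + 1, dtype=bool)
    keep[1:] = sizes >= min_component_size
    mask = keep[labels]
    # (d) Morphological dilation to close small gaps in the mask.
    mask = ndimage.binary_dilation(mask, iterations=dilation_iters)
    # (e) Keep only the largest connected component (the skeleton).
    labels, n = ndimage.label(mask)
    if n > 0:
        sizes = ndimage.sum(mask, labels, range(1, n + 1))
        mask = labels == (np.argmax(sizes) + 1)
    # (f) Apply the mask to extract the bone regions.
    bone = np.where(mask, ct_hu, 0).astype(np.float32)
    # (g) Min-max normalization (one common choice; the paper's scheme may differ).
    lo, hi = bone.min(), bone.max()
    bone = (bone - lo) / (hi - lo + 1e-8)
    return bone, mask
```

The same function works on a 3D volume or a single 2D slice, since `scipy.ndimage` operations are dimension-agnostic.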
Figure 3. Our proposed CFSG U-Net. The model uses a U-Net structure consisting of four encoder blocks (E1–E4) and three decoder blocks (D1–D3). The number next to each block indicates its input size. In the decoder block, F_e denotes the feature map from the encoder path and F_d the feature map from the decoder path.
Figure 4. (a) The proposed channel-wise fusion attention module (CFAM). F_e denotes the feature map from the encoder path and F_d the feature map from the decoder path; GAP denotes global average pooling. The symbol next to each feature indicates its size. (b) The proposed spatial-wise group attention module (SGAM). F_d denotes the feature map from the decoder block, and the symbol in each block gives its number of channels.
Figure 5. Three true-positive segmentation results of the CFSG U-Net. The ground truth is labelled by the yellow line, and the segmentation result is labelled by the red line.
Comparison of our proposed method with several cutting-edge rib fracture detection methods and commonly used U-Net-style deep neural networks. The best results are marked in bold. FPs/scan denotes false positives per scan.
| Methods | 0.5 FPs | 1 FP | 2 FPs | 4 FPs | 8 FPs | Avg. |
|---|---|---|---|---|---|---|
| CFSG U-Net | **67.15** | **76.92** | – | – | **89.58** | **81.28** |
| FracNet | 60.10 | 70.03 | 79.01 | 82.21 | 85.90 | 75.45 |
| VRBNet | – | – | 82.21 | 85.25 | 85.25 | 80.32 |
| 3D U-Net | 59.46 | 69.23 | 77.72 | 81.73 | 85.26 | 74.68 |
| MultiResUNet | 61.70 | 71.79 | 80.93 | 83.81 | 87.18 | 77.08 |
| Attention U-Net | 61.54 | 72.12 | 81.09 | 84.29 | 87.34 | 77.28 |
| ResUNet | 60.90 | 70.99 | 80.13 | 83.33 | 86.70 | 76.41 |
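The Avg. column in these FROC tables appears to be the arithmetic mean of the sensitivities at the five FPs/scan operating points (0.5, 1, 2, 4, 8); as a sanity check, the FracNet row reproduces exactly:

```python
# Average FROC score as the mean sensitivity over the five operating points,
# checked against the FracNet row of the comparison table.
sens = [60.10, 70.03, 79.01, 82.21, 85.90]
avg_froc = sum(sens) / len(sens)
print(round(avg_froc, 2))  # 75.45, matching the table's Avg. column
```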
Comparison experiment results of different attention methods. The best results are marked in bold. ResUNet denotes the backbone proposed in [41].
| Methods | 0.5 FPs | 1 FP | 2 FPs | 4 FPs | 8 FPs | Avg. |
|---|---|---|---|---|---|---|
| CFSG U-Net | – | – | – | – | **89.58** | **81.28** |
| ResUNet+CBAM | 65.22 | 75.48 | 83.49 | 86.70 | 88.94 | 79.97 |
| ResUNet+ECA | 62.66 | 73.08 | 81.41 | 84.78 | 87.98 | 77.98 |
| ResUNet+SE | 63.14 | 73.72 | 81.73 | 85.10 | 87.98 | 78.33 |
Ablation experiment results for our proposed method. The best results are marked in bold. w/o denotes “without,” w/ denotes “with.” G denotes the number of groups of SGAM.
| Methods | 0.5 FPs | 1 FP | 2 FPs | 4 FPs | 8 FPs | Avg. |
|---|---|---|---|---|---|---|
| CFSG U-Net | – | – | – | – | **89.58** | **81.28** |
| w/o CFAM | 62.18 | 72.76 | 81.41 | 84.46 | 87.50 | 77.66 |
| w/o SGAM | 64.10 | 74.52 | 82.53 | 85.74 | 88.46 | 79.07 |
| w/ SGAM | 66.35 | 76.28 | 84.13 | 87.50 | 89.26 | 80.71 |
Figure 6. Three false-positive cases: (a) a false positive caused by uneven bone mineral density; (b) a false positive around the vertebra; (c) a false positive around the costochondral joint. The ground truth is labelled by the yellow line, and the segmentation result by the red line.