Chuanbo Wang, D M Anisuzzaman, Victor Williamson, Mrinal Kanti Dhar, Behrouz Rostami, Jeffrey Niezgoda, Sandeep Gopalakrishnan, Zeyun Yu.
Abstract
Acute and chronic wounds have varying etiologies and are an economic burden to healthcare systems around the world. The advanced wound care market is expected to exceed $22 billion by 2024. Wound care professionals rely heavily on images and image documentation for proper diagnosis and treatment. Unfortunately, a lack of expertise can lead to improper diagnosis of wound etiology and inaccurate wound management and documentation. Fully automatic segmentation of wound areas in natural images is an important part of the diagnosis and care protocol, since it is crucial to measure the wound area and provide quantitative parameters for treatment. Various deep learning models have achieved success in image analysis, including semantic segmentation. This manuscript proposes a novel convolutional framework based on MobileNetV2 and connected component labelling to segment wound regions from natural images. The advantage of this model is its lightweight, less compute-intensive architecture; performance is not compromised and is comparable to deeper neural networks. We build an annotated wound image dataset consisting of 1109 foot ulcer images from 889 patients to train and test the deep learning models. We demonstrate the effectiveness and mobility of our method through comprehensive experiments and analyses on various segmentation neural networks. The full implementation is available at https://github.com/uwm-bigdata/wound-segmentation.
Year: 2020 PMID: 33318503 PMCID: PMC7736585 DOI: 10.1038/s41598-020-78799-w
Source DB: PubMed Journal: Sci Rep ISSN: 2045-2322 Impact factor: 4.379
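The framework described in the abstract pairs a MobileNetV2 encoder with a decoder for pixel-wise mask prediction. Below is a minimal sketch of such a model in tf.keras; the skip-connection choices, decoder filter counts, and training settings are illustrative assumptions rather than the authors' exact configuration (see the linked repository for the real implementation).

```python
# A minimal sketch (not the authors' exact architecture) of a MobileNetV2-based
# encoder-decoder for binary wound segmentation.
import tensorflow as tf
from tensorflow.keras import layers

def build_wound_segmenter(input_shape=(224, 224, 3)):
    # MobileNetV2 backbone as the encoder. weights=None here; pretrained
    # ImageNet weights are another common choice.
    backbone = tf.keras.applications.MobileNetV2(
        input_shape=input_shape, include_top=False, weights=None)
    # Skip connections from intermediate activations (layer names as in the
    # stock Keras MobileNetV2 implementation).
    skip_names = ["block_1_expand_relu",   # 112x112
                  "block_3_expand_relu",   # 56x56
                  "block_6_expand_relu",   # 28x28
                  "block_13_expand_relu"]  # 14x14
    skips = [backbone.get_layer(n).output for n in skip_names]
    x = backbone.get_layer("out_relu").output  # 7x7 bottleneck

    # Decoder: upsample, concatenate the matching skip, refine.
    for skip, filters in zip(reversed(skips), [512, 256, 128, 64]):
        x = layers.Conv2DTranspose(filters, 3, strides=2, padding="same")(x)
        x = layers.Concatenate()([x, skip])
        x = layers.Conv2D(filters, 3, padding="same", activation="relu")(x)

    # Final upsample back to input resolution and a 1-channel sigmoid mask.
    x = layers.Conv2DTranspose(32, 3, strides=2, padding="same")(x)
    mask = layers.Conv2D(1, 1, activation="sigmoid")(x)
    return tf.keras.Model(backbone.input, mask)

model = build_wound_segmenter()
model.compile(optimizer="adam", loss="binary_crossentropy")
```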
Figure 1. An illustration of images in our dataset. The first row contains the raw images collected. The second row shows the segmentation mask annotations created with the AZH Wound and Vascular Center.
Figure 2. The encoder–decoder architecture of MobileNetV2 [19].
Figure 3. (a) A depth-wise separable convolution block. The block contains a 3 × 3 depth-wise convolutional layer and a 1 × 1 point-wise convolutional layer, each followed by batch normalization and ReLU6 activation. (b) An example of a convolutional layer with a 3 × 3 × 3 kernel. (c) An example of a depth-wise separable convolution layer equivalent to (b).
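A small sketch of the block in Figure 3a, assuming the stock Keras layers; the toy input shape and channel counts are arbitrary. Comparing parameter counts against the plain 3 × 3 convolution of Figure 3b illustrates why the architecture is lightweight.

```python
# The depth-wise separable block of Figure 3a, with a parameter count
# comparing it to the standard 3x3 convolution it replaces.
import tensorflow as tf
from tensorflow.keras import layers

def separable_block(x, out_channels):
    # 3x3 depth-wise convolution: one filter per input channel.
    x = layers.DepthwiseConv2D(3, padding="same")(x)
    x = layers.BatchNormalization()(x)
    x = layers.ReLU(max_value=6.0)(x)   # ReLU6, as in MobileNetV2
    # 1x1 point-wise convolution: mixes information across channels.
    x = layers.Conv2D(out_channels, 1)(x)
    x = layers.BatchNormalization()(x)
    return layers.ReLU(max_value=6.0)(x)

inp = layers.Input((32, 32, 64))
sep = tf.keras.Model(inp, separable_block(inp, 128))
std = tf.keras.Model(inp, layers.Conv2D(128, 3, padding="same")(inp))
print(sep.count_params(), "vs", std.count_params())
# Roughly 3*3*64 + 64*128 kernel weights vs 3*3*64*128:
# close to an order of magnitude fewer.
```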
Figure 4. An illustration of the segmentation results and the post-processing method. The first row shows images from the testing dataset. The second row shows the segmentation results predicted by our model without any post-processing; holes are marked with red boxes and noise regions with yellow boxes. The third row shows the final segmentation masks generated by the post-processing method.
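A hedged sketch of the post-processing step Figure 4 describes, using SciPy's connected component labelling: holes inside the predicted mask are filled, and small noisy components are discarded. The probability threshold and minimum-area cutoff below are assumptions; the paper's exact values may differ.

```python
# Post-processing sketch: hole filling + connected component labelling (CCL).
import numpy as np
from scipy import ndimage

def postprocess(prob_map, threshold=0.5, min_area=400):
    mask = prob_map > threshold
    mask = ndimage.binary_fill_holes(mask)            # close holes (red boxes)
    labels, n = ndimage.label(mask)                   # label connected components
    areas = ndimage.sum(mask, labels, range(1, n + 1))
    keep_ids = np.where(areas >= min_area)[0] + 1     # labels start at 1
    return np.isin(labels, keep_ids).astype(np.uint8) # drop noise (yellow boxes)
```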
The precision, recall, and Dice scores evaluated using various models on our dataset.
| Model | VGG16 (%) | SegNet (%) | U-Net (%) | Mask-RCNN (%) | MobileNetV2 (%) | MobileNetV2 + CCL (%) |
|---|---|---|---|---|---|---|
| Precision | 83.91 | 83.66 | 89.04 | 90.86 | **91.01** | |
| Recall | 78.35 | 86.49 | 86.40 | 89.76 | **89.97** | |
| Dice | 81.03 | 85.05 | 90.15 | 90.20 | **90.30** | |

Bold values indicate the best performance among the models shown.
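For reference, the three metrics in these tables can be computed from binary masks as follows; this is the standard formulation, though the authors' evaluation script may differ in detail.

```python
# Precision, recall, and Dice score for a predicted vs. ground-truth mask.
import numpy as np

def precision_recall_dice(pred, gt, eps=1e-7):
    pred, gt = pred.astype(bool), gt.astype(bool)
    tp = np.logical_and(pred, gt).sum()     # true-positive pixels
    precision = tp / (pred.sum() + eps)     # fraction of predicted pixels correct
    recall = tp / (gt.sum() + eps)          # fraction of wound pixels recovered
    dice = 2 * tp / (pred.sum() + gt.sum() + eps)
    return precision, recall, dice
```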
The precision, recall, and Dice scores evaluated using various models on the Medetec dataset.
| Model | VGG16 (%) | SegNet (%) | U-Net (%) | Mask-RCNN (%) | MobileNetV2 (%) | MobileNetV2 + CCL (%) |
|---|---|---|---|---|---|---|
| Precision | 77.84 | 72.03 | 86.84 | 93.69 | **93.84** | |
| Recall | 80.69 | 73.87 | 81.33 | 88.60 | **94.06** | |
| Dice | 79.24 | 72.94 | 84.01 | 93.20 | **93.88** | |

Bold values indicate the best performance among the models shown.
Comparison of the total number of trainable parameters.
| Model name | FCN-VGG16 | SegNet | U-Net | Mask-RCNN | MobileNetV2 |
|---|---|---|---|---|---|
| Number of parameters | 134,264,641 | 902,561 | 4,834,839 | 63,621,918 | 2,141,505 |
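Counts like those in the table can be reproduced in Keras by summing the sizes of a model's trainable weights. The snippet below does this for the stock MobileNetV2 backbone only; the paper's full segmentation model adds a decoder, so its count will differ.

```python
# Counting trainable parameters of the MobileNetV2 backbone.
import tensorflow as tf

backbone = tf.keras.applications.MobileNetV2(
    input_shape=(224, 224, 3), include_top=False, weights=None)
trainable = sum(int(tf.size(w)) for w in backbone.trainable_weights)
print(f"{trainable:,} trainable parameters")
```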
Figure 5. An illustration of the idea of a multi-stream model that processes pixel intensities and shape information separately. To predict segmentation masks, the output tensors from both streams can be fused using carefully designed concatenation layers or multi-scale pooling layers.
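A hypothetical sketch of the two-stream idea in Figure 5: one stream sees raw pixel intensities, the other a single-channel shape/edge map, and their feature tensors are fused by concatenation before a mask head. All stream depths and filter counts here are illustrative assumptions, not a design from the paper.

```python
# Two-stream fusion sketch: intensity stream + shape stream, fused by concat.
import tensorflow as tf
from tensorflow.keras import layers

def stream(inp, filters):
    # A small downsampling feature extractor; each call builds fresh layers.
    x = inp
    for f in (filters, filters * 2):
        x = layers.Conv2D(f, 3, padding="same", activation="relu")(x)
        x = layers.MaxPooling2D()(x)
    return x                                   # 4x downsampled feature map

rgb = layers.Input((224, 224, 3), name="intensity_stream")
edge = layers.Input((224, 224, 1), name="shape_stream")
fused = layers.Concatenate()([stream(rgb, 32), stream(edge, 16)])
x = layers.Conv2D(64, 3, padding="same", activation="relu")(fused)
x = layers.UpSampling2D(4)(x)                  # back to input resolution
mask = layers.Conv2D(1, 1, activation="sigmoid")(x)
model = tf.keras.Model([rgb, edge], mask)
```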