Fradi Marwa (1,2), El-Hadi Zahzah (2), Kais Bouallegue (3), Mohsen Machhout (1).
Abstract
Deep-learning techniques have driven technological progress in medical image segmentation, especially in the ultrasound domain. The goal of this study is to optimize a deep-learning neural network architecture for fast automatic segmentation of Ultrasonic Computed Tomography (USCT) bone images. The proposed method is based on an end-to-end neural network architecture. First, the novelty lies in an improved Variable Structure Model of Neuron (VSMN), which is trained both for USCT noise removal and for dataset augmentation. Second, a VGG-SegNet neural network architecture is trained for automatic bone segmentation and tested on USCT images never seen before. In addition, we offer a free USCT dataset. The proposed model is implemented on both the CPU and the GPU, surpassing previous works with 97.38% training and 96% validation accuracy, and achieves high segmentation accuracy at test time with a small error of 0.006, in a short processing time. The suggested method demonstrates its ability to augment USCT data and then to automatically segment USCT bone structures with excellent accuracy, outperforming the state of the art.
Keywords: GPU; Segmentation; Time process; USCT; VSMN-VGG-SegNet
Year: 2022 PMID: 35194385 PMCID: PMC8853291 DOI: 10.1007/s11042-022-12322-3
Source DB: PubMed Journal: Multimed Tools Appl ISSN: 1380-7501 Impact factor: 2.577
Fig. 1 Ultrasonic Computed Tomography device
Fig. 2 Synoptic flow of the proposed method
Fig. 3 Neuron model
Fig. 4 VSMN architecture (seven layers L)
Mathematical analysis of the VSMN architecture in Fig. 4
| Layer | Parameters (n, p, q, k) | g(x) |
|---|---|---|
| L1 | n = 0, p = q, k = 1 | g(x) = exp((−x + p)^2) |
| L2 | n = 1, p = q, k = 1 | g(x) = (−x + q)^1 · exp((−x + p)^2) |
| L3 | n = 2, p = q, k = 1 | g(x) = (−x + q)^2 · exp((−x + p)^2) |
| L4 | n = 3, p = q, k = 1 | g(x) = (−x + q)^3 · exp((−x + p)^2) |
| L5 | n = 4, p = q, k = 1 | g(x) = (−x + q)^4 · exp((−x + p)^2) |
| L6 | n = 5, p = q, k = 1 | g(x) = (−x + q)^5 · exp((−x + p)^2) |
| L7 | n = 6, p = q, k = 1 | g(x) = (−x + q)^6 · exp((−x + p)^2) |
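Read with explicit powers, the rows above are all instances of one activation family, g(x) = (−x + q)^n · exp((−x + p)^2). A minimal plain-Python sketch of that family (parameter names follow the table; this is an illustration, not the authors' code):

```python
import math

def g(x, n, p, q):
    # VSMN layer activation from the table: g(x) = (-x + q)^n * exp((-x + p)^2)
    # Layer L(n+1) uses polynomial order n, with p = q and k = 1.
    return (-x + q) ** n * math.exp((-x + p) ** 2)
```

For n = 0 (layer L1) the polynomial factor disappears and g reduces to exp((−x + p)^2); for n ≥ 1 the factor (−x + q)^n forces a zero at x = q.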
Mathematical analysis via the internal architecture of layer L1
| Layer | Parameters (n, p, q, k) | g(x) | Y = output |
|---|---|---|---|
| L1 | n = 0, p = q, k = 1 | g(x) = exp((−x + p)^2) | Y0 = exp((−x + p)^2); Y1 = exp((−Y0 + 1)^2); Y2 = exp((−Y1 + 1)^2); Y3 = exp((−Y2 + 1)^2) |
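Iterating the L1 cascade above (with p = q = 1, each stage applies Y ↦ exp((−Y + 1)^2)) suggests why the n = 0, k = 1 configuration is reported as stable: Y = 1 is a fixed point of that map, and the map's slope there is zero. A plain-Python sketch, not the authors' code:

```python
import math

def layer1_cascade(x, p=1.0, stages=3):
    # Y0 = exp((-x + p)^2), then Y_{k+1} = exp((-Y_k + 1)^2), as in the table
    y = math.exp((-x + p) ** 2)
    outputs = [y]
    for _ in range(stages):
        y = math.exp((-y + 1.0) ** 2)
        outputs.append(y)
    return outputs
```

For inputs near p, the outputs Y0..Y3 contract rapidly toward 1; for inputs far from p, the squared exponent grows and the cascade can diverge instead.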
Fig. 5 VSMN with stable behavior for n = 0 and k = 1
Fig. 6 USCT output through layer 0 of the VSMN with stable behavior for n = 0 and k = 1, (a): adult patella bone imaged by USCT, (b): USCT result via layer 0
Fig. 7 VSMN behavior, n = 1, (a): VSMN curve, (b): output USCT image
Fig. 8 VSMN behavior, n = 2, (a): VSMN curve, (b): output USCT image
Fig. 9 VSMN behavior, (a): n = 3, p = q = 0.5, (b): n = 3, p = q = 0.75
Fig. 10 Internal architecture of VGG-SegNet
VGG-SegNet architecture
| Block | Encoder layers | Encoder size (in → out) | Encoder filters | Decoder layers | Decoder size (in → out) | Decoder filters |
|---|---|---|---|---|---|---|
| 1 | Conv1 + ReLU, Conv2 + ReLU, MaxPooling 2D (2×2) | 256×256 → 224×224 | 2 × (64, (3,3)) | Upsampling, Zero padding, B.N. | 14×14 → 14×14 | (512, (3,3)) |
| 2 | Conv1 + ReLU, Conv2 + ReLU, MaxPooling 2D (2×2) | 224×224 → 112×112 | 2 × (128, (3,3)) | Upsampling, Zero padding, B.N. | 28×28 → 28×28 | (512, (3,3)) |
| 3 | Conv1 + ReLU, Conv2 + ReLU, Conv3 + ReLU, MaxPooling 2D (2×2) | 112×112 → 56×56 | 3 × (256, (3,3)) | Upsampling, Zero padding, Conv2D, B.N. | 56×56 → 56×56 | (256, (3,3)) |
| 4 | Conv1 + ReLU, Conv2 + ReLU, Conv3 + ReLU, MaxPooling 2D (2×2) | 56×56 → 28×28 | 3 × (512, (3,3)) | Upsampling, Zero padding, Conv2D, B.N. | 112×112 → 112×112 | (128, (3,3)) |
| 5 | Conv1 + ReLU, Conv2 + ReLU, Conv3 + ReLU, MaxPooling 2D (2×2) | 28×28 → 14×14 | 3 × (512, (3,3)) | Upsampling, Zero padding, Conv2D, B.N. | 224×224 → 224×224 | (64, (3,3)) |
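The size columns follow standard VGG bookkeeping: 3×3 'same' convolutions preserve height and width, each 2×2 max-pooling halves them, and each decoder upsampling doubles them back. A quick sketch of that arithmetic for the 224×224 working resolution (plain Python; treating the 256×256 → 224×224 entry of block 1 as an input crop/resize is an assumption):

```python
def encoder_sizes(size=224, pools=4):
    # 3x3 'same' convs keep the spatial size; each 2x2 max-pool halves it
    sizes = [size]
    for _ in range(pools):
        size //= 2
        sizes.append(size)
    return sizes

def decoder_sizes(size=14, ups=4):
    # each decoder upsampling stage doubles the spatial size back
    sizes = [size]
    for _ in range(ups):
        size *= 2
        sizes.append(size)
    return sizes
```

This reproduces the 224 → 112 → 56 → 28 → 14 encoder path of blocks 2–5 and the mirrored 14 → 28 → 56 → 112 → 224 decoder path.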
Fig. 11 Results of the VSMN implementation for g(x) = (−x + 1)^3 · exp((−x + 1)^2), (a): layer 4, (b): layer 5, (c): layer 6, (d): layer 7
SNR results of subsamples of USCT images
| Images | Number of images | Mean SNR |
|---|---|---|
| USCT images used for training | 200 | 15.87 |
| USCT images used for validation | 100 | 15.42 |
| USCT images used for testing | 100 | 14.36 |
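The mean SNR values above can be computed with the usual power-ratio definition; a hedged sketch (plain Python over flat pixel lists; the exact estimator used in the paper is not stated):

```python
import math

def snr_db(signal, noise):
    # SNR in dB: 10 * log10(mean signal power / mean noise power)
    p_signal = sum(v * v for v in signal) / len(signal)
    p_noise = sum(v * v for v in noise) / len(noise)
    return 10.0 * math.log10(p_signal / p_noise)

def mean_snr(pairs):
    # average SNR over a subsample of (signal, noise) image pairs
    return sum(snr_db(s, n) for s, n in pairs) / len(pairs)
```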
Fig. 12 USCT dataset augmentation
Fig. 13 USCT image labeling, (a): USCT bone image, (b): annotated USCT image, (c): USCT image mask
Accuracy results during training and validation processes
| Epochs | Train accuracy | Train loss | Validation accuracy | Validation loss |
|---|---|---|---|---|
| 1 | 82.64% | 0.519 | 82.66% | 0.4764 |
| 2 | 83.43% | 0.4488 | 83.56% | 0.4603 |
| 3 | 84.10% | 0.4181 | 79.86% | 0.506 |
| 4 | 84.95% | 0.3927 | 83.91% | 0.393 |
| 5 | 87.28% | 0.3203 | 86.11% | 0.2008 |
| 6 | 92.67% | 0.1919 | 89.95% | 0.1474 |
| 7 | 95.33% | 0.1288 | 94.17% | 0.1274 |
| 8 | 96.43% | 0.1014 | 95.12% | 0.1215 |
| 9 | 96.99% | 0.0888 | 94.53% | 0.1450 |
| 10 | 97.38% | 0.079 | 95.82% | 0.1115 |
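The accuracy and loss columns correspond to standard pixel-wise metrics for binary masks; a small sketch of both (plain Python; the paper does not specify its loss, so binary cross-entropy is assumed here):

```python
import math

def pixel_accuracy(pred, mask, thr=0.5):
    # fraction of pixels whose thresholded prediction matches the mask
    hits = sum((p > thr) == (m > thr) for p, m in zip(pred, mask))
    return hits / len(mask)

def bce_loss(pred, mask, eps=1e-7):
    # mean binary cross-entropy between predicted probabilities and a 0/1 mask
    total = 0.0
    for p, m in zip(pred, mask):
        p = min(max(p, eps), 1.0 - eps)  # clip to avoid log(0)
        total += -(m * math.log(p) + (1.0 - m) * math.log(1.0 - p))
    return total / len(mask)
```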
Fig. 14 Model accuracy during the training and validation processes
Fig. 15 Model loss during the training and validation processes
Fig. 16 Dataset of USCT bone images for validation
Fig. 17 Segmented validation results
Fig. 18 Comparison of segmented validation results with the ground truth, (a): input USCT images, (b): ground truth, (c): segmented USCT images
Fig. 19 USCT bone images used for testing
Fig. 20 Segmented USCT bone images used for testing
PSNR, MSE, and IoU for subsamples of USCT bone images used for testing
| Parameters | USCT used for the prediction process |
|---|---|
| Mean PSNR | 10.44 |
| Mean MSE | 0.0061 |
| IOU | 0.96 |
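The three test metrics above have standard definitions; a sketch (plain Python; the pixel range is assumed normalized to [0, 1], so PSNR uses MAX = 1):

```python
import math

def mse(pred, target):
    # mean squared error over pixels
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(target)

def psnr(pred, target, max_val=1.0):
    # peak signal-to-noise ratio in dB
    return 10.0 * math.log10(max_val ** 2 / mse(pred, target))

def iou(pred, target, thr=0.5):
    # intersection over union of the two thresholded binary masks
    a = [p > thr for p in pred]
    b = [t > thr for t in target]
    inter = sum(x and y for x, y in zip(a, b))
    union = sum(x or y for x, y in zip(a, b))
    return inter / union if union else 1.0
```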
Implementation results on GPU and CPU
| Network | GPU | GPU inference memory | GPU runtime in training | GPU runtime in testing | CPU runtime | Energy consumption |
|---|---|---|---|---|---|---|
| VGG-SegNet | 10 MB | 12,194 MB | 1 s/step | 0.15 s/step (Appendix 3) | 3 s/step | 17 W / 200 W |
| VGG-Unet | 10 MB | 12,194 MB | + | 0.15 s/step (Appendix 3) | – | – |
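Per-step runtimes like those in the table are typically wall-clock averages; a sketch of that measurement (plain Python; `step_fn` is a stand-in for one training or inference step, not a function from the paper):

```python
import time

def seconds_per_step(step_fn, steps=10):
    # average wall-clock seconds per call, as in the s/step columns above
    start = time.perf_counter()
    for _ in range(steps):
        step_fn()
    return (time.perf_counter() - start) / steps
```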
Fig. 21 Real-scene images
Fig. 22 Segmentation results of real-scene images during the testing process on GPU
Comparative accuracy study with the state of the art
| Neural network model | Train accuracy | Validation accuracy | Dataset |
|---|---|---|---|
| VSMN-VGG-SegNet (proposed) | 97.38% | 96% | Bone USCT images |
| VSMN-VGG-Unet | 96% | – | Bone USCT images |
| SegNet [25] | 91.47% | – | MRI brain images |
| CNN [26] | 92% | – | CT bone images |
| CNN-UNet | 92% | – | CT scans bone images |
| CNN [ | 85% | – | MRI vertebral bone |
| Fully-Automated deep learning based CNN [ | 94%(1 year) 90%(1 year) | – | Human bones |
| SegNet [ | – | 95% | CT lung images |
| UNet [ | – | 91% | CT lung images |
| VGG-SegNet [ | 95.86% | – | Lung CT Parenchyma images |
| SegNet [ | 63.89% | – | Gastric cancer images |