Tian-Yu Jiang, Feng-Lan Ju, Ya-Xun Dai, Jie Li, Yi-Fan Li, Yun-Jie Bai, Ze-Qian Cui, Zheng-Han Xu, Zun-Qian Zhang.
Abstract
To reveal the dissolution behavior of iron tailings in blast furnace slag, SiO2, the main component of iron tailings, was studied. To address information loss and inaccurate extraction when tracking molten SiO2 particles at high temperature, a method based on an improved DeepLab v3+ network is proposed to track, segment, and extract small object particles in real time. First, the decoding layer of the DeepLab v3+ network is improved: dense ASPP (atrous spatial pyramid pooling) modules with different dilation rates are constructed to optimize feature extraction, and shallow convolution features of the backbone network are added and merged into the upper convolution decoding part to capture more detail. Second, the lightweight network MobileNet v3 is integrated to reduce network parameters, speed up image detection, and reduce memory usage, so that real-time image segmentation can run on low-end hardware. Finally, the loss function for the binary small-object classification model is improved by combining the strengths of Dice Loss for binary segmentation and of Focal Loss for balancing positive and negative samples, which addresses the dataset imbalance caused by the small proportion of positive samples. Experimental results show that the MIoU (mean intersection over union) of the proposed model for small object segmentation is 6% higher than that of the original model, the overall MIoU is 3% higher, and the execution time and memory consumption are roughly half those of the original model, so the model is well suited to real-time tracking and segmentation of small particles.
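The combination of Dice Loss and Focal Loss described in the abstract can be illustrated with a minimal NumPy sketch. The abstract does not give the exact expression, so the weighted sum below, the function names, and the values of `alpha`, `gamma`, `eps`, and `weight` are all assumptions chosen for illustration rather than the authors' formulation.

```python
import numpy as np

def dice_loss(probs, targets, eps=1e-6):
    """1 - Dice coefficient; eps keeps the ratio defined when both masks are empty."""
    probs = np.asarray(probs, dtype=float).ravel()
    targets = np.asarray(targets, dtype=float).ravel()
    intersection = np.sum(probs * targets)
    return 1.0 - (2.0 * intersection + eps) / (np.sum(probs) + np.sum(targets) + eps)

def focal_loss(probs, targets, alpha=0.25, gamma=2.0, eps=1e-6):
    """Binary focal loss; alpha and gamma are illustrative defaults, not the paper's values."""
    probs = np.clip(np.asarray(probs, dtype=float), eps, 1.0 - eps)
    targets = np.asarray(targets)
    p_t = np.where(targets == 1, probs, 1.0 - probs)        # probability of the true class
    alpha_t = np.where(targets == 1, alpha, 1.0 - alpha)    # class-balancing weight
    return float(np.mean(-alpha_t * (1.0 - p_t) ** gamma * np.log(p_t)))

def combined_loss(probs, targets, weight=0.5):
    """Hypothetical weighted sum of the two losses (the weighting is an assumption)."""
    return weight * dice_loss(probs, targets) + (1.0 - weight) * focal_loss(probs, targets)

# A mask with a small positive region, mimicking a small particle on a large background.
targets = np.array([[0, 0, 1], [0, 0, 0]])
good = targets.astype(float)   # confident, correct prediction
bad = 1.0 - good               # confident, wrong prediction
print(combined_loss(good, targets), combined_loss(bad, targets))
```

The Dice term depends on the overlap with the positive region rather than on the pixel count of the background, while the focal term down-weights easy background pixels, which is why the pairing is attractive when positive samples are rare.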
Year: 2022 PMID: 35401724 PMCID: PMC8986418 DOI: 10.1155/2022/2309317
Source DB: PubMed Journal: Comput Intell Neurosci
Figure 1: Atrous spatial pyramid pooling.
Figure 2: (a) DeepLabv3 model diagram; (b) encoding and decoding methods; (c) improved DeepLab v3+ model diagram influenced by decoding ideas; (d) the model structure realized in this paper.
Figure 3: Model loss and accuracy training graph.
Figure 4: Overall research flow chart.
Comparison of test results of ASPP module improvement schemes.
| Group | Dilation rate | HFS | DSAConv | MIoU/% | Training time/h | |
|---|---|---|---|---|---|---|
| 1 | [ | | | 74.52 | 23.85 | 275.3 |
| 2 | [ | | | 74.98 | 25.62 | 310.8 |
| 3 | [ | √ | | 75.39 | 27.37 | 322.4 |
| 4 | [ | √ | | 75.82 | 30.44 | 372.0 |
| 5 | [ | √ | √ | 75.36 | 21.45 | 253.2 |
| 6 | [ | √ | √ | 75.62 | 25.60 | 312.4 |
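How the dilation rate widens what an ASPP branch sees can be sketched with a one-dimensional atrous convolution. This is a simplified NumPy illustration, not the paper's module: the helper names are hypothetical, and the rates 6, 12, and 18 in the usage lines are the common DeepLab defaults, used here only because the rates in the table above did not survive extraction.

```python
import numpy as np

def effective_kernel_size(kernel_size, dilation):
    """Span covered by a kernel whose taps are spaced `dilation` samples apart."""
    return kernel_size + (kernel_size - 1) * (dilation - 1)

def atrous_conv1d(signal, kernel, dilation):
    """'Valid' 1-D atrous convolution (cross-correlation form, kernel not flipped)."""
    signal = np.asarray(signal, dtype=float)
    kernel = np.asarray(kernel, dtype=float)
    span = effective_kernel_size(len(kernel), dilation)
    out_len = len(signal) - span + 1
    if out_len <= 0:
        raise ValueError("signal is shorter than the dilated kernel span")
    taps = np.arange(len(kernel)) * dilation   # sample offsets used by each kernel weight
    return np.array([np.dot(signal[i + taps], kernel) for i in range(out_len)])

# Illustrative rates only: a 3-tap kernel covers 13, 25, and 37 samples.
for rate in (6, 12, 18):
    print(rate, effective_kernel_size(3, rate))

# With dilation 2, the kernel [1, 0, -1] compares samples four positions apart.
print(atrous_conv1d(np.arange(10), [1, 0, -1], dilation=2))
```

The parameter count stays at three weights in every case; only the spacing of the taps changes, which is what lets parallel branches with different rates capture context at several scales.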
Figure 5: (a) Accuracy fluctuation of the model in this paper. (b) Loss fluctuation of the model in this paper. (c) Loss fluctuation of the original model.
Performance comparison of different network architectures.
| Network | Top 1/% | Params (M) | MAdds (M) | CPU | Advantage |
|---|---|---|---|---|---|
| MobileNet v1 | 70.6 | 4.2 | 575 | 113 ms | Proposed depthwise separable convolution |
| MobileNet v2 | 72.0 | 3.4 | 300 | 75 ms | Proposed inverted residuals and linear bottlenecks |
| ShuffleNet(×2) | 73.7 | 5.4 | 524 | - | Combined grouped convolution and channel shuffle |
| NasNet-A | 74.0 | 5.3 | 564 | 183 ms | Designed NasNet search space |
| Proxyless | 74.6 | 4 | 320 | 156 ms | A new path pruning method was proposed, which reduced memory consumption |
| MobileNet v3 | 75.2 | 5.4 | 219 | 69 ms | Combined complementary search technology and introduced the h-swish activation function |
Performance comparison before and after model improvement.
| Algorithm name | Pre-MIoU/% | Mid-MIoU/% | Last-MIoU/% | Time/ms | RAM/MB |
|---|---|---|---|---|---|
| DeepLabv3+ basic model | 91.4 | 88.6 | 80.3 | 424 | 52 |
| Improved DeepLabv3+ model | 92.6 | 90.1 | 86.2 | 215 | 23 |
| FastFCN | 91.6 | 89.1 | 82.2 | 220 | 31 |
| VisTR | 92.4 | 90.1 | 85.2 | 230 | 41 |
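The headline figures in the abstract can be checked against the first two rows of this table with a few lines of arithmetic. The sketch assumes that the Last-MIoU column corresponds to the small-object stage and that the "overall" MIoU is the unweighted mean of the three stages; neither is stated in this record, so both readings are assumptions.

```python
# Values copied from the table rows for the basic and improved DeepLabv3+ models.
baseline = {"pre": 91.4, "mid": 88.6, "last": 80.3, "time_ms": 424, "ram_mb": 52}
improved = {"pre": 92.6, "mid": 90.1, "last": 86.2, "time_ms": 215, "ram_mb": 23}
stages = ("pre", "mid", "last")

def mean_miou(row):
    """Unweighted mean over the three melting stages (an assumed definition of 'overall')."""
    return sum(row[s] for s in stages) / len(stages)

last_gain = improved["last"] - baseline["last"]              # gain in the last stage, in points
mean_gain = mean_miou(improved) - mean_miou(baseline)        # gain in the stage mean, in points
time_ratio = improved["time_ms"] / baseline["time_ms"]       # fraction of the original time
ram_ratio = improved["ram_mb"] / baseline["ram_mb"]          # fraction of the original memory

print(round(last_gain, 2), round(mean_gain, 2), round(time_ratio, 3), round(ram_ratio, 3))
```

Under these assumptions the gains come out near 5.9 and 2.9 percentage points, and the time and memory ratios near 0.51 and 0.44, which is consistent with the rounded claims of 6%, 3%, and about half.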
Figure 6: Comparison of SiO2 at different melting times.
Explanation of special symbols in the text.
| Symbol | Explanation | Page |
|---|---|---|
| | Definition of dice coefficient | 6 |
| | Pixels of the whole image | 6 |
| | Intersection between | 6 |
| | The number of elements in | 6 |
| | Definition of Focal Loss function | 8 |
| | Tunable hyperparameters | 8 |
| | Model target predicted value | 8 |
| | Preventing nonexistence of the loss function from occurring | 8 |
| | The modified loss function definition | 8 |
| | k is the number of categories (except for empty categories) | 8 |
| | Mean intersection over union | 11 |
| | Pixels correctly segmented into SiO2 regions | 11 |
| | Pixels in the SiO2 region that were incorrectly marked as background | 11 |
| | Wrongly segmented into background pixels | 11 |
| | All pixels correctly segmented into SiO2 regions | 11 |
| | All pixels in SiO2 regions that were incorrectly marked as background | 11 |
| | All are wrongly segmented into background pixels | 11 |
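The mean intersection over union listed in the table can be computed from per-class counts of correctly and incorrectly segmented pixels. The sketch below uses the standard definition, IoU = TP / (TP + FP + FN) averaged over classes; the function name and the two-class setup (background = 0, SiO2 = 1) are assumptions for illustration, since the paper's own symbols are not recoverable from this record.

```python
import numpy as np

def mean_iou(pred, target, num_classes=2):
    """Average per-class IoU over the classes that appear in pred or target."""
    pred = np.asarray(pred).ravel()
    target = np.asarray(target).ravel()
    ious = []
    for c in range(num_classes):
        tp = np.sum((pred == c) & (target == c))   # pixels correctly assigned to class c
        fp = np.sum((pred == c) & (target != c))   # pixels wrongly assigned to class c
        fn = np.sum((pred != c) & (target == c))   # class-c pixels assigned elsewhere
        denom = tp + fp + fn
        if denom > 0:                              # skip classes absent from both masks
            ious.append(tp / denom)
    return float(np.mean(ious)) if ious else 0.0

# Assumed labels: 0 = background, 1 = SiO2 particle.
target = np.array([[0, 0, 1, 1], [0, 0, 1, 1]])
pred = np.array([[0, 0, 1, 0], [0, 0, 1, 1]])     # one particle pixel missed
print(mean_iou(pred, target))
```

In this example the particle class scores 3/4 and the background class 4/5, so a single missed pixel costs the small class more than the large one, which is why MIoU is a stricter measure than pixel accuracy for small objects.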