Sijia Li, Furkat Sultonov, Qingshan Ye, Yong Bai, Jun-Hyun Park, Chilsig Yang, Minseok Song, Sungwoo Koo, Jae-Mo Kang.
Abstract
Road segmentation has been one of the leading research areas in autonomous driving because of the benefits self-driving vehicles can offer: significant reductions in crashes, greater independence for people with disabilities, and less traffic congestion. Given the importance of self-driving cars, it is vital to develop models that can accurately segment the drivable regions of roads. Recent advances in deep learning have produced effective methods for road segmentation; however, the results of most of them are not yet satisfactory for practical deployment. To address this issue, we propose a novel model, dubbed TA-Unet, that produces high-quality segmentation maps of drivable road regions. The proposed model incorporates a triplet attention module into the encoding stage of the U-Net network to compute attention weights through a triplet branch structure. Additionally, to overcome the class-imbalance problem, we experiment with different loss functions and confirm that a mixed loss function boosts performance. To validate the performance and efficiency of the proposed method, we adopt the publicly available UAS dataset and compare our results to the dataset's reference framework as well as to four state-of-the-art segmentation models. Extensive experiments demonstrate that the proposed TA-Unet outperforms the baseline methods in both pixel accuracy and mIoU, reaching 98.74% and 97.41%, respectively. Finally, the proposed method yields clearer segmentation maps than the baselines across the different sample sets.
Keywords: TA-Unet; U-Net; road feasible domain segmentation; triplet attention module
Year: 2022 PMID: 35746220 PMCID: PMC9231296 DOI: 10.3390/s22124438
Source DB: PubMed Journal: Sensors (Basel) ISSN: 1424-8220 Impact factor: 3.847
Figure 1. Detailed architecture of the triplet attention mechanism, which calculates attention weights through a three-branch structure to capture cross-dimensional interactions. The first branch (green) captures the interaction between channel dimension C and spatial dimension W, the second branch (yellow) captures the interaction between channel dimension C and spatial dimension H, and the third branch (blue) captures the spatial dependencies between H and W. The final output is the average of the three branches' feature maps.
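For readers who want the mechanism in code, below is a minimal PyTorch sketch of the three-branch structure described in the caption. It follows the original triplet attention design (Z-pool followed by a 7 × 7 convolution in each branch); the module names, the `ZPool` helper, and the kernel size are taken from that design and are assumptions rather than this paper's exact configuration.

```python
import torch
import torch.nn as nn


class ZPool(nn.Module):
    """Concatenate max- and mean-pooling along the first (rotated) dimension."""
    def forward(self, x):
        return torch.cat(
            (x.max(dim=1, keepdim=True)[0], x.mean(dim=1, keepdim=True)), dim=1
        )


class AttentionGate(nn.Module):
    """Z-pool -> 7x7 conv -> sigmoid gate, applied to one rotation of the input."""
    def __init__(self):
        super().__init__()
        self.compress = ZPool()
        self.conv = nn.Sequential(
            nn.Conv2d(2, 1, kernel_size=7, padding=3, bias=False),
            nn.BatchNorm2d(1),
        )

    def forward(self, x):
        return x * torch.sigmoid(self.conv(self.compress(x)))


class TripletAttention(nn.Module):
    """Three-branch attention over the (C,W), (C,H) and (H,W) interactions."""
    def __init__(self):
        super().__init__()
        self.cw = AttentionGate()  # green branch: C-W interaction
        self.hc = AttentionGate()  # yellow branch: C-H interaction
        self.hw = AttentionGate()  # blue branch: H-W interaction

    def forward(self, x):  # x: (B, C, H, W)
        x_cw = self.cw(x.permute(0, 2, 1, 3).contiguous()).permute(0, 2, 1, 3)
        x_hc = self.hc(x.permute(0, 3, 2, 1).contiguous()).permute(0, 3, 2, 1)
        x_hw = self.hw(x)
        return (x_cw + x_hc + x_hw) / 3.0  # average the three branch outputs
```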
Figure 2. Illustration of the proposed TA-Unet. The model takes a 640 × 368 pixel image as input. Each blue arrow represents a convolution with a 3 × 3 kernel followed by ReLU nonlinearity and batch normalization, the orange arrows represent triplet attention, and the red and green arrows stand for max-pooling and upsampling operations, respectively. The gray arrows connect the output of each encoder layer with the input of the corresponding decoder layer. The purple box in the decoder is the model's final segmentation map.
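As a rough illustration of how these arrows compose, here is a hypothetical sketch of one encoder stage of TA-Unet, reusing the `TripletAttention` module above. The channel widths and the exact placement of the attention module relative to pooling are assumptions, not details taken from the paper.

```python
class EncoderBlock(nn.Module):
    """One TA-Unet encoder stage: two 3x3 convs with ReLU + BN (blue arrows),
    triplet attention (orange arrow), then 2x2 max-pooling (red arrow)."""
    def __init__(self, in_ch, out_ch):
        super().__init__()
        self.convs = nn.Sequential(
            nn.Conv2d(in_ch, out_ch, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
            nn.BatchNorm2d(out_ch),
            nn.Conv2d(out_ch, out_ch, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
            nn.BatchNorm2d(out_ch),
        )
        self.attn = TripletAttention()  # sketched above
        self.pool = nn.MaxPool2d(2)

    def forward(self, x):
        skip = self.attn(self.convs(x))  # gray arrow: fed to the matching decoder stage
        return self.pool(skip), skip     # red arrow: downsampled features continue down
```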
The mIoU scores (%) of the proposed TA-Unet and the SGSN framework on the UAS dataset (a sketch of the metric computation follows the table).
| Dataset | SGSN | TA-Unet |
|---|---|---|
| Dusk set | 98.04 | 98.18 |
| Night set | 94.01 | 94.39 |
| Rain set | 97.04 | 98.03 |
| Sun set | 97.58 | 97.85 |
| UAS | 96.40 | 97.41 |
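For reference, the two metrics reported throughout (pixel accuracy and mIoU) can be computed from a confusion matrix as in the sketch below. The function names are ours and the definitions are the standard ones; this is not code from the paper.

```python
import numpy as np


def confusion_matrix(pred, gt, num_classes):
    """Accumulate a num_classes x num_classes confusion matrix from label maps."""
    mask = (gt >= 0) & (gt < num_classes)
    idx = num_classes * gt[mask].astype(int) + pred[mask].astype(int)
    return np.bincount(idx, minlength=num_classes ** 2).reshape(num_classes, num_classes)


def pixel_accuracy(cm):
    """Fraction of correctly classified pixels (diagonal mass over total)."""
    return np.diag(cm).sum() / cm.sum()


def mean_iou(cm):
    """Per-class IoU = TP / (TP + FP + FN), averaged over classes."""
    tp = np.diag(cm).astype(float)
    union = cm.sum(axis=0) + cm.sum(axis=1) - tp
    return np.mean(tp / np.maximum(union, 1))
```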
Figure 3. Pixel accuracy and mIoU of the different networks on the validation set. The y-axis represents pixel accuracy (PA) and mean IoU in subfigures (a) and (b), respectively, while the x-axis shows the number of training iterations in both subfigures.
Quantitative results of the compared methods on the UAS dataset (pixel accuracy and mIoU in %, parameters in millions).
| Method | Pixel Accuracy (%) | mIoU (%) | Parameters |
|---|---|---|---|
| FCN | 98.32 | 96.50 | 97.25 M |
| U-Net | 97.46 | 95.97 | 13.40 M |
| DANet | 98.68 | 97.20 | 47.51 M |
| Attention U-Net | 98.01 | 96.04 | 34.89 M |
| TA-Unet | 98.74 | 97.40 | 31.05 M |
Performance of TA-Unet when trained with different loss functions (a sketch of the mixed loss follows the table).
| Metric | Cross-Entropy Loss | Lovász-Softmax Loss | Mixed Loss |
|---|---|---|---|
| Pixel accuracy (%) | 98.66 | 98.68 | 98.74 |
| mIoU (%) | 97.29 | 97.30 | 97.41 |
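The mixed loss in the table combines cross-entropy with the Lovász-Softmax loss. Below is a minimal sketch of such a combination, using the standard Lovász extension of the Jaccard loss (Berman et al., 2018); the equal weighting `alpha=0.5` is an assumption, as the paper's exact mixing weight is not reproduced here.

```python
import torch
import torch.nn.functional as F


def lovasz_grad(gt_sorted):
    """Gradient of the Lovász extension of the Jaccard loss."""
    gts = gt_sorted.sum()
    intersection = gts - gt_sorted.cumsum(0)
    union = gts + (1.0 - gt_sorted).cumsum(0)
    jaccard = 1.0 - intersection / union
    if len(gt_sorted) > 1:
        jaccard[1:] = jaccard[1:] - jaccard[:-1]
    return jaccard


def lovasz_softmax(probs, labels):
    """Lovász-Softmax over flattened per-pixel class probabilities."""
    losses = []
    for c in range(probs.shape[1]):
        fg = (labels == c).float()                    # foreground mask for class c
        errors = (fg - probs[:, c]).abs()
        errors_sorted, perm = torch.sort(errors, dim=0, descending=True)
        losses.append(torch.dot(errors_sorted, lovasz_grad(fg[perm])))
    return torch.stack(losses).mean()


def mixed_loss(logits, labels, alpha=0.5):
    """alpha * cross-entropy + (1 - alpha) * Lovász-Softmax (alpha is assumed)."""
    ce = F.cross_entropy(logits, labels)
    probs = F.softmax(logits, dim=1)                  # (B, C, H, W)
    b, c, h, w = probs.shape
    probs_flat = probs.permute(0, 2, 3, 1).reshape(-1, c)
    labels_flat = labels.reshape(-1)
    return alpha * ce + (1.0 - alpha) * lovasz_softmax(probs_flat, labels_flat)
```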
Figure 4. Road segmentation results of the different methods under different conditions.