Junjie Fu, Xiaomei Yi, Guoying Wang, Lufeng Mo, Peng Wu, Kasanda Ernest Kapula.
Abstract
Ground-object classification using high-resolution remote-sensing images is widely used in land planning, ecological monitoring, and resource protection. Traditional image-segmentation techniques perform poorly on the complex scenes found in high-resolution remote-sensing images. In deep learning, several deep neural networks have been applied to high-resolution remote-sensing image segmentation. DeeplabV3+ is a deep neural network with an encoder-decoder architecture that is commonly used to segment images with high precision; however, its segmentation accuracy on high-resolution remote-sensing images is poor, its parameter count is large, and its training cost is high. This paper therefore improves the DeeplabV3+ network. First, MobileNetV2 is used as the backbone feature-extraction network; an attention-mechanism module is then added after the feature-extraction module and after the ASPP module, and the focal loss function is introduced to balance the classes. The design has the following advantages: it enhances the network's ability to extract image features, it reduces training cost, and it achieves better semantic-segmentation accuracy. Experiments on high-resolution remote-sensing datasets show that the proposed method reaches an mIoU of 64.76% on the WHDLD dataset, 4.24 percentage points higher than that of the traditional DeeplabV3+, and 64.58% on the CCF BDCI dataset, 5.35 percentage points higher than that of the traditional DeeplabV3+, outperforming the traditional DeeplabV3+, U-Net, PSP-Net, and MACU-Net networks.
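The class balancing mentioned in the abstract refers to the focal loss. As an illustration only (the paper's exact multi-class implementation is not given here), a minimal scalar sketch of the standard binary form:

```python
import math

def focal_loss(p, y, gamma=2.0, alpha=0.25):
    """Binary focal loss FL(p_t) = -alpha_t * (1 - p_t)**gamma * log(p_t).

    p is the predicted foreground probability, y the 0/1 label; gamma
    down-weights easy examples, alpha balances the two classes.
    """
    p_t = p if y == 1 else 1.0 - p
    alpha_t = alpha if y == 1 else 1.0 - alpha
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)

# An easy example (prediction close to its label) contributes far less
# loss than a hard one, which is the balancing effect being exploited.
easy = focal_loss(0.9, 1)
hard = focal_loss(0.1, 1)
```

With gamma = 0 the modulating factor disappears and the loss reduces to alpha-weighted cross-entropy, which is why focal loss is usually described as a generalization of it.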
Keywords: high-resolution remote-sensing images; object classification; semantic segmentation
Year: 2022 PMID: 36236574 PMCID: PMC9571339 DOI: 10.3390/s22197477
Source DB: PubMed Journal: Sensors (Basel) ISSN: 1424-8220 Impact factor: 3.847
Figure 1 Structure diagram of DeeplabV3+.
Figure 2 Structure diagram of the Modified Aligned Xception.
Figure 3 Structure diagram of the improved DeeplabV3+.
MobileNetV2 network structure.
| Input | Operator | Channel Expansion Factor | Output Channels | Repetitions | Stride |
|---|---|---|---|---|---|
| 256 × 256 × 3 | Conv2d | - | 32 | 1 | 2 |
| 128 × 128 × 32 | Bottleneck | 1 | 16 | 1 | 1 |
| 128 × 128 × 16 | Bottleneck | 6 | 24 | 2 | 2 |
| 64 × 64 × 24 | Bottleneck | 6 | 32 | 3 | 2 |
| 32 × 32 × 32 | Bottleneck | 6 | 64 | 4 | 2 |
| 16 × 16 × 64 | Bottleneck | 6 | 96 | 3 | 1 |
| 16 × 16 × 96 | Bottleneck | 6 | 160 | 3 | 2 |
| 8 × 8 × 160 | Bottleneck | 6 | 320 | 1 | 1 |
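The downsampling implied by the stride column can be checked arithmetically: only stride-2 stages halve the spatial size (applied by the first block of a repeated stage), so a 256 × 256 input ends at 8 × 8 × 320:

```python
# Stage list mirrors the MobileNetV2 table above: (operator, stride,
# output channels). Only the first block of a repeated stage applies
# its stride, so the repetition count does not affect spatial size.
STAGES = [
    ("Conv2d",     2,  32),
    ("Bottleneck", 1,  16),
    ("Bottleneck", 2,  24),
    ("Bottleneck", 2,  32),
    ("Bottleneck", 2,  64),
    ("Bottleneck", 1,  96),
    ("Bottleneck", 2, 160),
    ("Bottleneck", 1, 320),
]

def trace_shapes(size=256):
    """Return the (H, W, C) output shape after each stage."""
    shapes = []
    for _name, stride, channels in STAGES:
        size //= stride            # a stride of 2 halves H and W
        shapes.append((size, size, channels))
    return shapes

# 256 x 256 x 3 input -> 8 x 8 x 320 output: 32x total downsampling.
final = trace_shapes()[-1]         # (8, 8, 320)
```

Five stride-2 stages give 2^5 = 32× downsampling, consistent with the input column of each row being the output of the previous one.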
Results of the ablation experiment on WHDLD.
| Method | mPA (%) | mRecall (%) | mIoU (%) |
|---|---|---|---|
| Traditional DeeplabV3+ | 72.52 | 71.74 | 60.52 |
| Scheme 1 | 74.34 | 73.78 | 62.67 |
| Scheme 2 | 76.14 | 75.45 | 64.22 |
| Scheme 3 | 75.72 | 75.07 | 63.84 |
| Scheme 4 | 76.75 | 76.12 | 64.76 |
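The mIoU figures above average the per-class intersection-over-union. A minimal sketch of the metric over flat per-pixel labels (assuming the common convention that classes absent from both prediction and ground truth are skipped):

```python
def mean_iou(pred, target, num_classes):
    """Mean intersection-over-union over flat per-pixel label lists."""
    ious = []
    for c in range(num_classes):
        inter = sum(1 for p, t in zip(pred, target) if p == c and t == c)
        union = sum(1 for p, t in zip(pred, target) if p == c or t == c)
        if union > 0:              # skip classes absent from both
            ious.append(inter / union)
    return sum(ious) / len(ious)

# Tiny example: class 0 has IoU 1/2, class 1 has IoU 2/3.
score = mean_iou([0, 0, 1, 1], [0, 1, 1, 1], num_classes=2)  # 7/12
```

Because every class contributes equally regardless of its pixel count, mIoU is more sensitive to small or rare classes than overall pixel accuracy, which is why it is the headline metric here.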
Results of the ablation experiment on CCF BDCI.
| Method | mPA (%) | mRecall (%) | mIoU (%) |
|---|---|---|---|
| Traditional DeeplabV3+ | 71.24 | 70.44 | 59.23 |
| Scheme 1 | 74.09 | 73.58 | 62.47 |
| Scheme 2 | 75.91 | 75.31 | 64.07 |
| Scheme 3 | 75.46 | 74.87 | 63.64 |
| Scheme 4 | 76.57 | 75.95 | 64.58 |
Figure 4 Comparison of segmentation results on WHDLD.
Figure 5 Comparison of segmentation results on CCF BDCI.
Comparison of segmentation methods on WHDLD.
| Method | mPA (%) | mRecall (%) | mIoU (%) |
|---|---|---|---|
| Traditional DeeplabV3+ | 72.52 | 71.74 | 60.52 |
| U-Net | 70.12 | 69.62 | 58.32 |
| PSP-Net | 67.23 | 66.56 | 55.46 |
| MACU-Net | 74.28 | 73.67 | 62.37 |
| Paper Method | 76.75 | 76.12 | 64.76 |
Comparison of segmentation methods on CCF BDCI.
| Method | mPA (%) | mRecall (%) | mIoU (%) |
|---|---|---|---|
| Traditional DeeplabV3+ | 71.24 | 70.44 | 59.23 |
| U-Net | 70.01 | 69.41 | 58.11 |
| PSP-Net | 66.90 | 66.32 | 55.23 |
| MACU-Net | 73.95 | 73.36 | 62.07 |
| Paper Method | 76.57 | 75.95 | 64.58 |
Comparison of training time and parameter count on WHDLD.
| Method | Training Time per Epoch (s) | Parameters (M) |
|---|---|---|
| Traditional DeeplabV3+ | 331 | 209.71 |
| U-Net | 245 | 10.86 |
| PSP-Net | 211 | 9.31 |
| MACU-Net | 317 | 5.12 |
| Paper method | 265 | 22.51 |
Figure 6 Comparison of segmentation results on WHDLD and CCF BDCI.
Comparison between this paper and existing ground-object classification methods.
| Literature | Dataset | Resolution | Data Type | Classification Method | Feature Extraction |
|---|---|---|---|---|---|
| Literature [ | WorldView-2 high-resolution satellite remote-sensing image data | 0.5 m and 1.8 m | Paper | Support vector machine | Manual extraction |
| Literature [ | Self-made dataset | 1 m, 0.5 m, and 0.2 m | Paper | ReliefF algorithm, genetic algorithm, and support vector machine | Manual extraction |
| Literature [ | WorldView-2 & QuickBird | 0.5 m and 0.6 m | Paper | Tabu search algorithm, genetic algorithm, and support vector machine | Manual extraction |
| Literature [ | WHDLD | 2 m | Image | Adaptive neuro-fuzzy inference system | Automatic extraction |
| This paper | WHDLD and CCF BDCI | 2 m | Image | Deep neural network | Automatic extraction |