Shiran Song1, Jianhua Liu1,2, Yuan Liu1, Guoqiang Feng1, Hui Han1, Yuan Yao1, Mingyi Du1,2.
Abstract
High spatial resolution remote sensing image (HSRRSI) data provide rich texture, geometric structure, and spatial distribution information for surface water bodies. This detail better represents the internal components of each object category and better reflects the relationships between adjacent objects. In this context, recognition methods such as geographic object-based image analysis (GEOBIA) have improved significantly. However, these methods focus mainly on bottom-up classification from visual features to semantic categories and ignore top-down feedback, which can optimize recognition results. In recent years, deep learning has been applied in the field of remote sensing because of its powerful feature extraction ability. An integrated framework that combines convolutional neural network (CNN) based region proposal generation with object detection has greatly improved object detection performance for HSRRSI, which provides a new method for water body recognition based on remote sensing data. This study uses the "self-learning ability" of deep learning to construct a modified Mask R-CNN structure that integrates bottom-up and top-down processes for water recognition. Compared with traditional methods, our method is completely data-driven and requires no prior knowledge, and it can be regarded as a novel technical procedure for water body recognition in practical engineering applications. Experimental results indicate that the method produces accurate recognition results for multi-source and multi-temporal water bodies and can effectively avoid confusion with shadows and other ground features.
Keywords: deep learning; high spatial resolution remotely sensed imagery; multi-source and multi-temporal; object recognition; water body
Year: 2020 PMID: 31936791 PMCID: PMC7014233 DOI: 10.3390/s20020397
Source DB: PubMed Journal: Sensors (Basel) ISSN: 1424-8220 Impact factor: 3.576
Figure 1. Study area.
Figure 2. (a) Experiment data, (b) WorldView-3 HSRRSI data of September 2017, (c) WorldView-3 HSRRSI data of October 2014, (d) GF-2 HSRRSI data of April 2016 and (e) GF-2 HSRRSI data of September 2018.
Figure 3. Experiment data: (A) overview of the study area; (B) samples of land-use classes: (a) residential, (b) commercial, (c) infrastructure, (d) industrial, (e) playground, (f) water, (g) farmland, (i) breeding, (j) unused land, (k) woodland; and (C) Tongzhou New Town image fused by principal component substitution (PCS).
Water Body Morphological Characteristics and Remote Sensing Image Examples.
| Type | Features | WV-2014 | WV-2017 | GF-2016 | GF-2018 |
|---|---|---|---|---|---|
| Lake | Irregular roundness | | | | |
| Rivers | Banded, trunk distinct | | | | |
| Paddy Field | Clustered, regular, block | | | | |
| Small Rivers | Slender strip | | | | |
Figure 4. Mask R-CNN network architecture.
Figure 5. Residual learning: a building block.
Figure 6. Shortcut connections.
Figure 7. ResNet basic network structure.
Figure 8. A top-down architecture with lateral connections.
Figure 9. Overall flow chart of the region proposal network (RPN).
Figure 10. General workflow of object recognition of an urban water body based on deep learning.
Layers and details of ResNet structure of different depths.
| Layer Name | ResNet 50 | ResNet 65-C3 | ResNet 80-C3 | ResNet 65-C4 | ResNet 80-C4 | ResNet 110 |
|---|---|---|---|---|---|---|
| Conv1 | 7 | | | | | |
| Conv2_x | 3 | | | | | |
| Conv3_x | | | | | | |
| Conv4_x | | | | | | |
| Conv5_x | | | | | | |
| Average pool, 1000-d fc, SoftMax | | | | | | |
Performance comparison of different depth networks.
| Category | Index | ResNet 50 | ResNet 65-C3 | ResNet 80-C3 | ResNet 65-C4 | ResNet 80-C4 | ResNet 110 |
|---|---|---|---|---|---|---|---|
| Water | Actual | 118 | 118 | 118 | 118 | 118 | 118 |
| | Prediction | 95 | 102 | 112 | 105 | 116 | 135 |
| | Match | 72 | 83 | 94 | 87 | 98 | 106 |
| | Precision | 0.7579 | 0.8137 | 0.8393 | 0.8286 | 0.8448 | 0.7852 |
| | Recall | 0.6102 | 0.7034 | 0.7966 | 0.7373 | 0.8305 | 0.8983 |
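The Precision and Recall rows in the table above are consistent with precision = Match / Prediction and recall = Match / Actual. The following is a minimal Python sketch under that reading of the counts; the function name `precision_recall` and the dictionary layout are illustrative and do not come from the paper.

```python
from typing import Dict, Tuple


def precision_recall(actual: int, predicted: int, matched: int) -> Tuple[float, float]:
    """Object-level precision and recall from object counts.

    Assumes precision = matched / predicted and recall = matched / actual,
    which reproduces the values reported in the table.
    """
    if actual <= 0 or predicted <= 0:
        raise ValueError("actual and predicted counts must be positive")
    return matched / predicted, matched / actual


# Counts copied from the table: backbone -> (Actual, Prediction, Match).
BACKBONE_COUNTS: Dict[str, Tuple[int, int, int]] = {
    "ResNet 50": (118, 95, 72),
    "ResNet 65-C3": (118, 102, 83),
    "ResNet 80-C3": (118, 112, 94),
    "ResNet 65-C4": (118, 105, 87),
    "ResNet 80-C4": (118, 116, 98),
    "ResNet 110": (118, 135, 106),
}

for name, (actual, predicted, matched) in BACKBONE_COUNTS.items():
    precision, recall = precision_recall(actual, predicted, matched)
    print(f"{name}: precision={precision:.4f}, recall={recall:.4f}")
```

Read this way, the deepest backbone in the table finds the most water bodies (highest recall) but also predicts more objects than actually exist, which lowers its precision.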
The network structure of ResNet-116.
| Conv1 | Conv2_x | Conv3_x | Conv4_x | Conv5_x |
|---|---|---|---|---|
| 7 × 7, 64, stride 2 | 3 × 3 max pool, stride 2 | | | |
The strategy of training/testing division for different datasets.
| Dataset | Training Samples | Testing Samples |
|---|---|---|
| WV-2014 | 500 | 100 |
| WV-2017 | 500 | 100 |
| GF-2016 | 500 | 100 |
| GF-2018 | 500 | 100 |
Figure 11. The technical process of the water sample construction method. The original image was manually labeled to generate the label, mask and position information of water bodies.
Figure 12. Examples of the WV-2014 (a), WV-2017 (b), GF-2016 (c) and GF-2018 (d) sub-datasets. From top to bottom: image, label. From left to right: water bodies with different morphological characteristics.
Main parameter information of the model.
| Parameter | Values | Parameter | Values |
|---|---|---|---|
| GPU_COUNT | 1 | TRAIN_ROIS_PER_IMAGE | 200 |
| IMAGES_PER_GPU | 1 | MAX_GT_INSTANCES | 200 |
| BACKBONE | ResNet | DETECTION_MAX_INSTANCES | 200 |
| BACKBONE_STRIDES | (4, 8, 16, 32, 64) | BATCH_SIZE | 1 |
| NUM_CLASSES | 2 | EPOCHS | 30 |
| RPN_ANCHOR_SCALES | (32, 64, 128, 256, 512) | LEARNING_RATE | 0.0001 |
| RPN_ANCHOR_RATIOS | (0.5, 1, 2) | LEARNING_MOMENTUM | 0.9 |
| RPN_NMS_THRESHOLD | 0.7 | WEIGHT_DECAY | 0.0001 |
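The parameter table can be captured as a small configuration object. The sketch below uses a plain Python dataclass, not any particular Mask R-CNN library; the class name `WaterConfig`, the helper `anchor_shapes`, and the `batch_size` property are hypothetical. Two assumptions are made that the table does not state: that `NUM_CLASSES = 2` means background plus water, and that anchor ratios are width/height with an area of scale² (a common convention).

```python
from dataclasses import dataclass
from math import sqrt
from typing import List, Tuple


@dataclass(frozen=True)
class WaterConfig:
    """Hypothetical container mirroring the parameter table."""
    GPU_COUNT: int = 1
    IMAGES_PER_GPU: int = 1
    BACKBONE: str = "ResNet"
    BACKBONE_STRIDES: Tuple[int, ...] = (4, 8, 16, 32, 64)
    NUM_CLASSES: int = 2  # assumption: background + water
    RPN_ANCHOR_SCALES: Tuple[int, ...] = (32, 64, 128, 256, 512)
    RPN_ANCHOR_RATIOS: Tuple[float, ...] = (0.5, 1, 2)
    RPN_NMS_THRESHOLD: float = 0.7
    TRAIN_ROIS_PER_IMAGE: int = 200
    MAX_GT_INSTANCES: int = 200
    DETECTION_MAX_INSTANCES: int = 200
    EPOCHS: int = 30
    LEARNING_RATE: float = 0.0001
    LEARNING_MOMENTUM: float = 0.9
    WEIGHT_DECAY: float = 0.0001

    @property
    def batch_size(self) -> int:
        # Effective batch size: one image on one GPU gives 1, as in the table.
        return self.GPU_COUNT * self.IMAGES_PER_GPU


def anchor_shapes(config: WaterConfig) -> List[Tuple[float, float]]:
    """(height, width) of every anchor, one per scale/ratio pair.

    Assumes ratio = width / height and area = scale ** 2.
    """
    return [
        (scale / sqrt(ratio), scale * sqrt(ratio))
        for scale in config.RPN_ANCHOR_SCALES
        for ratio in config.RPN_ANCHOR_RATIOS
    ]


config = WaterConfig()
print("batch size:", config.batch_size)
for height, width in anchor_shapes(config):
    print(f"anchor {height:.1f} x {width:.1f}")
```

With five scales and three ratios, this yields fifteen anchor shapes, one scale per pyramid level if the scales are paired with the five backbone strides.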
Figure 13. Examples of water recognition results of the Mask R-CNN model trained on the WV-2014 (a), WV-2017 (b), GF-2016 (c) and GF-2018 (d) datasets. From top to bottom: image, label, water recognition results. From left to right: water body recognition results for different morphological characteristics in the corresponding datasets.
Water recognition accuracy of the Mask R-CNN model trained with four different datasets at the object level in four different test areas.
| Training | Test | Actual | Prediction | Match | Precision | Recall |
|---|---|---|---|---|---|---|
| WV-2014 | WV-2014 | 121 | 125 | 118 | 0.9440 | 0.9752 |
| | WV-2017 | 121 | 123 | 115 | 0.9350 | 0.9504 |
| | GF-2016 | 150 | 157 | 131 | 0.8344 | 0.8733 |
| | GF-2018 | 150 | 156 | 130 | 0.8333 | 0.8667 |
| WV-2017 | WV-2014 | 121 | 139 | 116 | 0.8345 | 0.9587 |
| | WV-2017 | 121 | 138 | 117 | 0.8478 | 0.9669 |
| | GF-2016 | 150 | 169 | 133 | 0.7870 | 0.8867 |
| | GF-2018 | 150 | 171 | 131 | 0.7661 | 0.8733 |
| GF-2016 | WV-2014 | 121 | 136 | 104 | 0.7647 | 0.8595 |
| | WV-2017 | 121 | 137 | 105 | 0.7664 | 0.8678 |
| | GF-2016 | 150 | 179 | 145 | 0.8101 | 0.9667 |
| | GF-2018 | 150 | 180 | 143 | 0.7944 | 0.9533 |
| GF-2018 | WV-2014 | 121 | 138 | 104 | 0.7536 | 0.8595 |
| | WV-2017 | 121 | 139 | 105 | 0.7554 | 0.8678 |
| | GF-2016 | 150 | 182 | 140 | 0.7692 | 0.9333 |
| | GF-2018 | 150 | 181 | 143 | 0.7901 | 0.9533 |
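One way to read this cross-dataset table is to compare test areas from the same sensor as the training data (WV to WV, GF to GF) with test areas from the other sensor. The Python sketch below does that for recall, using the counts from the table. The grouping by sensor prefix and the helper names `sensor` and `mean_recall` are illustrative choices, not part of the paper's method.

```python
from typing import List, Tuple

Row = Tuple[str, str, int, int, int]

# (training set, test set, Actual, Prediction, Match), copied from the table.
# Dataset names are written with a hyphen so the sensor prefix can be split off.
ROWS: List[Row] = [
    ("WV-2014", "WV-2014", 121, 125, 118),
    ("WV-2014", "WV-2017", 121, 123, 115),
    ("WV-2014", "GF-2016", 150, 157, 131),
    ("WV-2014", "GF-2018", 150, 156, 130),
    ("WV-2017", "WV-2014", 121, 139, 116),
    ("WV-2017", "WV-2017", 121, 138, 117),
    ("WV-2017", "GF-2016", 150, 169, 133),
    ("WV-2017", "GF-2018", 150, 171, 131),
    ("GF-2016", "WV-2014", 121, 136, 104),
    ("GF-2016", "WV-2017", 121, 137, 105),
    ("GF-2016", "GF-2016", 150, 179, 145),
    ("GF-2016", "GF-2018", 150, 180, 143),
    ("GF-2018", "WV-2014", 121, 138, 104),
    ("GF-2018", "WV-2017", 121, 139, 105),
    ("GF-2018", "GF-2016", 150, 182, 140),
    ("GF-2018", "GF-2018", 150, 181, 143),
]


def sensor(name: str) -> str:
    """Sensor prefix of a dataset name, e.g. 'WV-2014' -> 'WV'."""
    return name.split("-")[0]


def mean_recall(rows: List[Row]) -> float:
    """Unweighted mean of per-row recall (Match / Actual)."""
    recalls = [match / actual for _, _, actual, _, match in rows]
    return sum(recalls) / len(recalls)


same_sensor = [row for row in ROWS if sensor(row[0]) == sensor(row[1])]
cross_sensor = [row for row in ROWS if sensor(row[0]) != sensor(row[1])]

print(f"same-sensor mean recall:  {mean_recall(same_sensor):.4f}")
print(f"cross-sensor mean recall: {mean_recall(cross_sensor):.4f}")
```

The same grouping can be applied to precision by replacing `match / actual` with `match / prediction`.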
Figure 14. Examples of water recognition results of different methods. From left to right: original image, masks, water recognition results of the Mask R-CNN, Cart, SVM, KNN and Random Trees. From top to bottom: water recognition results of different datasets: (a) WV-2014, (b) WV-2017, (c) GF-2016 and (d) GF-2018.
Water recognition accuracy of different methods in four different test areas.
| Datasets | Index | Cart | SVM | KNN | Random Trees | Mask R-CNN |
|---|---|---|---|---|---|---|
| WV-2014 | Actual | 121 | 121 | 121 | 121 | 121 |
| | Prediction | 127 | 127 | 126 | 125 | 125 |
| | Match | 113 | 114 | 116 | 117 | 118 |
| | Precision | 0.8898 | 0.8976 | 0.9206 | 0.9360 | 0.9440 |
| | Recall | 0.9339 | 0.9421 | 0.9587 | 0.9669 | 0.9752 |
| WV-2017 | Actual | 121 | 121 | 121 | 121 | 121 |
| | Prediction | 135 | 136 | 137 | 137 | 138 |
| | Match | 112 | 113 | 115 | 116 | 117 |
| | Precision | 0.8296 | 0.8309 | 0.8394 | 0.8467 | 0.8478 |
| | Recall | 0.9256 | 0.9339 | 0.9504 | 0.9587 | 0.9669 |
| GF-2016 | Actual | 150 | 150 | 150 | 150 | 150 |
| | Prediction | 177 | 179 | 180 | 179 | 179 |
| | Match | 140 | 142 | 143 | 146 | 145 |
| | Precision | 0.7910 | 0.7933 | 0.7944 | 0.8156 | 0.8101 |
| | Recall | 0.9333 | 0.9467 | 0.9533 | 0.9733 | 0.9667 |
| GF-2018 | Actual | 150 | 150 | 150 | 150 | 150 |
| | Prediction | 178 | 179 | 180 | 183 | 181 |
| | Match | 137 | 139 | 140 | 142 | 143 |
| | Precision | 0.7697 | 0.7765 | 0.7778 | 0.7760 | 0.7901 |
| | Recall | 0.9133 | 0.9267 | 0.9333 | 0.9467 | 0.9533 |
| Average Accuracy | Precision | 0.8200 | 0.8246 | 0.8331 | 0.8436 | 0.8480 |
| | Recall | 0.9265 | 0.9373 | 0.9489 | 0.9614 | 0.9655 |
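The average rows appear to be unweighted means of the four per-dataset precision and recall values for each method. The Python sketch below checks that reading against the counts in the table. The averaging rule is an inference from the numbers rather than something the source states, and the name `macro_average` is illustrative.

```python
from statistics import mean
from typing import Dict, List, Tuple

Counts = Tuple[int, int, int]  # (Actual, Prediction, Match)

# Per-method counts in dataset order: WV-2014, WV-2017, GF-2016, GF-2018.
METHOD_COUNTS: Dict[str, List[Counts]] = {
    "Cart": [(121, 127, 113), (121, 135, 112), (150, 177, 140), (150, 178, 137)],
    "SVM": [(121, 127, 114), (121, 136, 113), (150, 179, 142), (150, 179, 139)],
    "KNN": [(121, 126, 116), (121, 137, 115), (150, 180, 143), (150, 180, 140)],
    "Random Trees": [(121, 125, 117), (121, 137, 116), (150, 179, 146), (150, 183, 142)],
    "Mask R-CNN": [(121, 125, 118), (121, 138, 117), (150, 179, 145), (150, 181, 143)],
}


def macro_average(rows: List[Counts]) -> Tuple[float, float]:
    """Unweighted mean of per-dataset precision and recall.

    Assumes precision = Match / Prediction and recall = Match / Actual.
    """
    precision = mean(match / predicted for _, predicted, match in rows)
    recall = mean(match / actual for actual, _, match in rows)
    return precision, recall


for method, rows in METHOD_COUNTS.items():
    precision, recall = macro_average(rows)
    print(f"{method}: avg precision={precision:.4f}, avg recall={recall:.4f}")
```

Averaging per dataset gives each test area equal weight. Pooling all counts before dividing would instead weight the GF test areas more heavily, because they contain more water bodies.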