Wuttichai Boonpook, Yumin Tan, Yinghua Ye, Peerapong Torteeka, Kritanai Torsri, Shengxian Dong.
Abstract
Buildings along riverbanks are likely to be affected by rising water levels, so acquiring accurate building information is important not only for riverbank environmental protection but also for handling emergencies such as flooding. Compared with satellite images, UAV-based photographs are flexible and cloud-free and can provide very high-resolution imagery down to the centimeter level, yet quickly and accurately detecting and extracting buildings from UAV images remains challenging because such images usually contain many details and distortions. In this paper, a deep learning (DL)-based approach is proposed for more accurately extracting building information: the SegNet architecture is used for semantic segmentation after being trained on a fully labeled UAV image dataset covering multi-dimension urban settlement appearances along a riverbank area in Chongqing. The experimental results show excellent performance in detecting buildings at untrained locations, with an average overall accuracy of more than 90%. To verify the generality and advantage of the proposed method, the procedure is further evaluated by training and testing on two additional open standard datasets containing a variety of building patterns and styles, and the final overall accuracies of building extraction exceed 93% and 95%, respectively.
Keywords: UAV dataset; building extraction; deep learning; river bank monitoring
Year: 2018 PMID: 30441771 PMCID: PMC6264059 DOI: 10.3390/s18113921
Source DB: PubMed Journal: Sensors (Basel) ISSN: 1424-8220 Impact factor: 3.576
Figure 1. The study area: the large red box at the top left outlines the entire study area; the enlarged cyan boxes mark the training-set area and the yellow boxes mark the two testing areas.
Figure 2. Buildings with different perspectives in the study area. (a) building architectures; (b) high and medium-size building rooftops; (c) small-size building rooftops; (d) small building rooftops; (e) dense and tall buildings; (f) different sides of a building; (g) greening on building roofs; (h) playgrounds on building roofs.
Figure 3. UAV images and their corresponding annotation images.
Figure 4. SegNet architecture for semantic labeling.
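As context for Figure 4, the following is a minimal sketch of a SegNet-style encoder-decoder, not the authors' exact configuration (the published SegNet uses a VGG16-based encoder with five stages). It shows the defining idea: convolution blocks with max-pooling in the encoder, max-unpooling with the stored pooling indices in the decoder, and per-pixel class scores for building vs. non-building. The class count, block depths, and 256-px patch size are illustrative assumptions.

```python
# Minimal SegNet-like encoder-decoder sketch (assumed configuration, not the paper's).
import torch
import torch.nn as nn


def conv_block(in_ch, out_ch, n_convs):
    """n_convs x (Conv -> BatchNorm -> ReLU), keeping spatial size."""
    layers = []
    for i in range(n_convs):
        layers += [
            nn.Conv2d(in_ch if i == 0 else out_ch, out_ch, 3, padding=1),
            nn.BatchNorm2d(out_ch),
            nn.ReLU(inplace=True),
        ]
    return nn.Sequential(*layers)


class MiniSegNet(nn.Module):
    """Two-stage SegNet-like network; the original SegNet follows VGG16 (five stages)."""

    def __init__(self, n_classes=2):
        super().__init__()
        self.enc1 = conv_block(3, 64, 2)
        self.enc2 = conv_block(64, 128, 2)
        self.pool = nn.MaxPool2d(2, stride=2, return_indices=True)
        self.unpool = nn.MaxUnpool2d(2, stride=2)
        self.dec2 = conv_block(128, 64, 2)
        self.dec1 = conv_block(64, 64, 2)
        self.classifier = nn.Conv2d(64, n_classes, 3, padding=1)

    def forward(self, x):
        x1 = self.enc1(x)
        x, idx1 = self.pool(x1)
        x2 = self.enc2(x)
        x, idx2 = self.pool(x2)
        # Decoder: upsample with the encoder's pooling indices (SegNet's key idea),
        # which restores boundary detail without learned upsampling weights.
        x = self.unpool(x, idx2, output_size=x2.size())
        x = self.dec2(x)
        x = self.unpool(x, idx1, output_size=x1.size())
        x = self.dec1(x)
        return self.classifier(x)  # raw per-pixel class scores (logits)


if __name__ == "__main__":
    logits = MiniSegNet()(torch.randn(1, 3, 256, 256))
    print(logits.shape)  # torch.Size([1, 2, 256, 256])
```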
The number of samples for training, validating, and testing from our UAV dataset.
| Dataset | Training | Validating | Testing (Area_1) | Testing (Area_2) |
|---|---|---|---|---|
| UAV dataset | 1600 | 400 | 120 | 100 |
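The table above counts labeled patches rather than whole UAV scenes. The following is a hypothetical illustration of how a large UAV orthomosaic and its building-label mask could be cut into fixed-size patches and partitioned into training and validating subsets; the 256-px patch size and the 80/20 split ratio are assumptions, not values taken from the paper.

```python
# Hypothetical tiling-and-splitting sketch for a UAV image and its annotation mask.
import numpy as np


def tile(image: np.ndarray, mask: np.ndarray, size: int = 256):
    """Yield aligned (image_patch, mask_patch) pairs on a non-overlapping grid."""
    h, w = mask.shape[:2]
    for y in range(0, h - size + 1, size):
        for x in range(0, w - size + 1, size):
            yield image[y:y + size, x:x + size], mask[y:y + size, x:x + size]


def split(pairs, val_fraction: float = 0.2, seed: int = 0):
    """Shuffle patch pairs and split them into training and validating subsets."""
    rng = np.random.default_rng(seed)
    pairs = list(pairs)
    rng.shuffle(pairs)
    n_val = int(len(pairs) * val_fraction)
    return pairs[n_val:], pairs[:n_val]   # (training, validating)


if __name__ == "__main__":
    img = np.zeros((1024, 1024, 3), dtype=np.uint8)   # stand-in for a UAV orthomosaic
    lbl = np.zeros((1024, 1024), dtype=np.uint8)      # stand-in for the annotation image
    train, val = split(tile(img, lbl))
    print(len(train), len(val))   # 13 3  (16 patches, 80/20 split)
```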
Figure 5. Two open standard datasets used: (a) Inria Aerial Image Labeling Dataset; (b) ISPRS Potsdam semantic labeling dataset.
The number of samples for training, validating, and testing from the two standard datasets.
| Dataset | Training | Validating | Testing |
|---|---|---|---|
| Inria Aerial Image Labeling Dataset | 23,100 | 3850 | 770 |
| ISPRS Potsdam semantic labeling dataset | 320 | 80 | 80 |
Classification accuracy results of UAV dataset (%).
| Dataset | Building | Non-Building | mIoU | Overall Acc. |
|---|---|---|---|---|
| *Numerical evaluation on the validating set* | | | | |
| UAV dataset | 92.01 | 94.67 | 84.39 | 92.47 |
| *Numerical evaluation on the two testing sets* | | | | |
| Area_1 | 84.12 | 93.59 | 81.27 | 92.59 |
| Area_2 | 90.59 | 88.35 | 80.97 | 89.50 |
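The accuracy tables report per-class scores, mIoU, and overall accuracy. Below is a hedged sketch, not the authors' evaluation code, of how mIoU and overall accuracy can be derived from a pixel-wise confusion matrix over the two classes; since the entry does not specify whether the per-class columns are accuracies, F1 scores, or IoUs, the sketch reports per-class IoU alongside the two summary metrics.

```python
# Sketch of pixel-wise segmentation metrics from a 2x2 confusion matrix.
import numpy as np


def segmentation_scores(pred: np.ndarray, ref: np.ndarray, n_classes: int = 2):
    """pred/ref: integer label maps of the same shape (0 = non-building, 1 = building)."""
    cm = np.zeros((n_classes, n_classes), dtype=np.int64)
    np.add.at(cm, (ref.ravel(), pred.ravel()), 1)        # rows: reference, cols: prediction
    tp = np.diag(cm).astype(float)
    iou = tp / (cm.sum(axis=0) + cm.sum(axis=1) - tp)    # per-class intersection over union
    return {
        "per_class_iou": iou,
        "mIoU": iou.mean(),
        "overall_accuracy": tp.sum() / cm.sum(),
    }


if __name__ == "__main__":
    ref = np.array([[0, 0, 1, 1], [0, 1, 1, 1]])
    pred = np.array([[0, 1, 1, 1], [0, 1, 1, 0]])
    print(segmentation_scores(pred, ref))   # mIoU ~0.58, overall accuracy 0.75
```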
Figure 6. Visual segmentation results on the two testing sets, consisting of the input RGB image (top), reference data (middle), and building extraction result (bottom). (a,b) are from testing Area_1; (c,d) are from testing Area_2.
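Figure 6 shows predictions stitched over whole testing areas. One plausible (assumed, not stated in the entry) way to obtain such full-scene masks from a patch-trained network is to slide a fixed-size window over the scene, predict each patch, and reassemble the argmax class maps; the `model` argument, window size, and normalization below are placeholders (e.g., the MiniSegNet sketch above).

```python
# Hedged sliding-window inference sketch for applying a patch-based model to a full scene.
import numpy as np
import torch


@torch.no_grad()
def predict_scene(model, scene: np.ndarray, size: int = 256) -> np.ndarray:
    """scene: HxWx3 uint8 image whose sides are multiples of `size`."""
    model.eval()
    h, w, _ = scene.shape
    mask = np.zeros((h, w), dtype=np.uint8)
    for y in range(0, h, size):
        for x in range(0, w, size):
            patch = scene[y:y + size, x:x + size]
            t = torch.from_numpy(patch).permute(2, 0, 1).float().unsqueeze(0) / 255.0
            logits = model(t)                                   # (1, n_classes, size, size)
            mask[y:y + size, x:x + size] = logits.argmax(1).squeeze(0).numpy()
    return mask   # 1 = building, 0 = non-building
```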
Classification accuracy results of two standard datasets (%).
| Dataset | Building | Non-Building | mIoU | Overall Acc. |
|---|---|---|---|---|
| Inria aerial image labeling dataset | 91.40 | 94.84 | 85.32 | 93.42 |
| ISPRS Potsdam dataset | 92.12 | 96.65 | 87.80 | 95.79 |
Figure 7. Visual segmentation results on the Inria aerial dataset (left) and the ISPRS Potsdam dataset (right), consisting of the input color image (top), reference segmentation data (middle), and building extraction result (bottom).