André Luiz C Ottoni1,2, Raphael M de Amorim2, Marcela S Novo3, Dayana B Costa4.
Abstract
Deep Learning methods have important applications in building construction image classification. One challenge in this application is the adoption of Convolutional Neural Networks with small datasets. This paper proposes a rigorous methodology for tuning Data Augmentation hyperparameters in Deep Learning for building construction image classification, especially for vegetation recognition on facades and roof structure analysis. To this end, Logistic Regression models were used to analyze the performance of Convolutional Neural Networks trained with 128 combinations of image transformations. Experiments were carried out with three Deep Learning architectures from the literature using the Keras library. The results show that the recommended configuration (Height Shift Range = 0.2; Width Shift Range = 0.2; Zoom Range = 0.2) reached an accuracy of 95.6% in the test step of the first case study. In addition, the hyperparameters recommended by the proposed method also achieved the best test results for the second case study: 93.3%.
Keywords: Building construction image classification; Convolutional neural networks; Data augmentation; Deep learning; Hyperparameter tuning
Year: 2022 PMID: 35432624 PMCID: PMC9005628 DOI: 10.1007/s13042-022-01555-1
Source DB: PubMed Journal: Int J Mach Learn Cybern ISSN: 1868-8071 Impact factor: 4.012
Fig. 1 Examples of images of class 0 (without vegetation on the facade)
Fig. 2 Examples of images of class 1 (with vegetation on the facade)
Fig. 3 Examples of images generated by Keras data augmentation: a original image; b–d rotation range; e horizontal flip; f vertical flip; g and h height shift range; i shear range; j–l width shift range; m–p zoom range
Fig. 4 Examples of images of class 0 (roofs with clean gutters) in the second small dataset
Fig. 5 Examples of images of class 1 (roofs with dirty gutters) in the second small dataset
Results of Data Augmentation hyperparameter tuning with logistic regression in stage 1.
| Hyperparameter | Coefficient | p-value | Selected value | Coded level | OR |
|---|---|---|---|---|---|
| Rotation R. (R) | -0.005 | 0.86 | 0 | 0 | 0.995 |
| Hor. Flip (H) | -0.118 | 0.00 | False | 0 | 0.888 |
| Vertical Flip (V) | -0.130 | 0.00 | False | 0 | 0.878 |
| Height S. R. (He) | 0.179 | 0.00 | 0.2 | 1 | |
| Shear Range (S) | -0.008 | 0.79 | 0 | 0 | 0.992 |
| Width S. R. (W) | 0.114 | 0.00 | 0.2 | 1 | |
| Zoom Range (Z) | 0.111 | 0.00 | 0.2 | 1 | |
Bold values indicate the hyperparameters with OR > 1 and p < 0.05
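The selection rule in the note above (OR > 1 and p < 0.05) can be sketched in a few lines. This is a minimal illustration, not the authors' code: it reads the first two numeric columns of the table as the logistic regression coefficient and its p-value, and it assumes the usual relation OR = exp(coefficient), which matches the listed OR values to within rounding. The names `stage1`, `odds_ratio`, and `select` are hypothetical.

```python
import math

# (coefficient, p-value) per hyperparameter, copied from the stage 1 table.
stage1 = {
    "R": (-0.005, 0.86),
    "H": (-0.118, 0.00),
    "V": (-0.130, 0.00),
    "He": (0.179, 0.00),
    "S": (-0.008, 0.79),
    "W": (0.114, 0.00),
    "Z": (0.111, 0.00),
}


def odds_ratio(coefficient):
    """Odds ratio implied by a logistic regression coefficient (assumed relation)."""
    return math.exp(coefficient)


def select(results, alpha=0.05):
    """Keep hyperparameters whose odds ratio exceeds 1 with p below alpha."""
    return [
        name
        for name, (coefficient, p_value) in results.items()
        if odds_ratio(coefficient) > 1 and p_value < alpha
    ]


selected = select(stage1)
print(selected)
```

Under these assumptions the rule keeps only the height shift, width shift, and zoom transformations, which agrees with the selected values in the table.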
Hyperparameter combinations of data augmentation selected in stage 1
| Comb. | He | W | Z | He (coded) | W (coded) | Z (coded) |
|---|---|---|---|---|---|---|
| 1 | 0 | 0 | 0 | 0 | 0 | 0 |
| 2 | 0 | 0 | 0.2 | 0 | 0 | 1 |
| 3 | 0 | 0.2 | 0 | 0 | 1 | 0 |
| 4 | 0 | 0.2 | 0.2 | 0 | 1 | 1 |
| 5 | 0.2 | 0 | 0 | 1 | 0 | 0 |
| 6 | 0.2 | 0 | 0.2 | 1 | 0 | 1 |
| 7 | 0.2 | 0.2 | 0 | 1 | 1 | 0 |
| 8 | 0.2 | 0.2 | 0.2 | 1 | 1 | 1 |
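The eight rows above are the full factorial of three two-level hyperparameters, and the 128 combinations mentioned in the abstract follow the same pattern with seven hyperparameters (2^7 = 128). A short sketch of that enumeration, with hypothetical variable names:

```python
from itertools import product

LEVELS = (0, 0.2)          # "off" and "on" values used in the table
NAMES = ("He", "W", "Z")   # height shift, width shift, zoom

# Every assignment of a level to each hyperparameter, in table order.
combinations = [
    dict(zip(NAMES, values)) for values in product(LEVELS, repeat=len(NAMES))
]

# Binary coding used in the last three columns: 1 if the transformation is on.
coded = [tuple(int(value > 0) for value in combo.values()) for combo in combinations]

for index, (combo, code) in enumerate(zip(combinations, coded), start=1):
    print(index, combo, code)
```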
Results of validation accuracy (%) and statistical metrics for each method (CNN architecture + data augmentation combination)
| Arch. | Comb. | 1 | 2 | 3 | 4 | 5 | Mean | OR | p-value |
|---|---|---|---|---|---|---|---|---|---|
| CNN8 | 1 | 78.0 | 78.0 | 78.0 | 76.0 | 82.0 | 78.4 | 1.000 | – |
| CNN8 | 2 | 86.0 | 82.0 | 78.0 | 78.0 | 76.0 | 80.0 | 1.102 | 0.66 |
| CNN8 | 3 | 86.0 | 82.0 | 86.0 | 86.0 | 84.0 | 84.8 | 1.537 | 0.07 |
| CNN8 | 4 | 92.0 | 88.0 | 86.0 | 84.0 | 92.0 | 88.4 | 2.099 | |
| CNN8 | 5 | 84.0 | 90.0 | 80.0 | 86.0 | 86.0 | 85.2 | 1.586 | |
| CNN8 | 6 | 84.0 | 84.0 | 84.0 | 86.0 | 82.0 | 84.0 | 1.445 | 0.11 |
| CNN8 | 7 | 90.0 | 92.0 | 88.0 | 90.0 | 88.0 | 89.6 | 2.374 | |
| CNN8 | 8 | 88.0 | 88.0 | 86.0 | 90.0 | 86.0 | 87.6 | 1.946 | |
| DenseNet-121 | 1 | 88.0 | 84.0 | 90.0 | 88.0 | 84.0 | 86.8 | 1.000 | – |
| DenseNet-121 | 2 | 90.0 | 86.0 | 94.0 | 92.0 | 84.0 | 89.2 | 1.256 | 0.41 |
| DenseNet-121 | 3 | 92.0 | 92.0 | 88.0 | 90.0 | 86.0 | 89.6 | 1.310 | 0.33 |
| DenseNet-121 | 4 | 90.0 | 88.0 | 86.0 | 90.0 | 92.0 | 89.2 | 1.256 | 0.41 |
| DenseNet-121 | 5 | 86.0 | 90.0 | 94.0 | 88.0 | 86.0 | 88.8 | 1.206 | 0.49 |
| DenseNet-121 | 6 | 90.0 | 88.0 | 90.0 | 88.0 | 92.0 | 89.6 | 1.310 | 0.33 |
| DenseNet-121 | 7 | 90.0 | 94.0 | 92.0 | 92.0 | 90.0 | 91.6 | 1.658 | 0.09 |
| DenseNet-121 | 8 | 92.0 | 96.0 | 96.0 | 92.0 | 94.0 | 94.0 | 2.382 | |
| MobileNet | 1 | 76.0 | 74.0 | 76.0 | 78.0 | 76.0 | 76.0 | 1.000 | – |
| MobileNet | 2 | 84.0 | 88.0 | 88.0 | 88.0 | 90.0 | 87.6 | 2.231 | |
| MobileNet | 3 | 82.0 | 84.0 | 90.0 | 82.0 | 78.0 | 83.2 | 1.564 | |
| MobileNet | 4 | 92.0 | 88.0 | 90.0 | 88.0 | 90.0 | 89.6 | 2.720 | |
| MobileNet | 5 | 82.0 | 80.0 | 78.0 | 84.0 | 82.0 | 81.2 | 1.364 | 0.16 |
| MobileNet | 6 | 84.0 | 86.0 | 86.0 | 86.0 | 90.0 | 86.4 | 2.006 | |
| MobileNet | 7 | 88.0 | 92.0 | 88.0 | 86.0 | 94.0 | 89.6 | 2.721 | |
| MobileNet | 8 | 92.0 | 90.0 | 90.0 | 92.0 | 90.0 | 90.8 | 3.117 | |
Bold values indicate the data augmentation combinations with p < 0.05
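The OR values in this table appear consistent with the ratio between the odds of a correct classification for a given combination and the odds for combination 1 of the same architecture, both taken from the mean validation accuracy. The sketch below checks that reading on two CNN8 rows; it is an interpretation of the table rather than the authors' stated procedure, and the function names are hypothetical.

```python
def odds(accuracy_percent):
    """Odds of a correct prediction for an accuracy given in percent."""
    p = accuracy_percent / 100.0
    return p / (1.0 - p)


def odds_ratio_vs_baseline(mean_accuracy, baseline_accuracy):
    """Odds of the combination divided by the odds of the baseline combination."""
    return odds(mean_accuracy) / odds(baseline_accuracy)


# Validation accuracy (%) over the five repetitions, CNN8, combinations 1 and 7.
cnn8_runs = {
    1: [78.0, 78.0, 78.0, 76.0, 82.0],
    7: [90.0, 92.0, 88.0, 90.0, 88.0],
}

means = {comb: sum(runs) / len(runs) for comb, runs in cnn8_runs.items()}
or_comb7 = odds_ratio_vs_baseline(means[7], means[1])

print(means, round(or_comb7, 3))
```

An odds ratio above 1 therefore means the combination improved the odds of a correct prediction relative to training without data augmentation (combination 1).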
Fig. 6 Examples of images generated by data augmentation for class 0 (without vegetation on the building facade)
Fig. 7 Examples of images generated by data augmentation for class 1 (with vegetation on the building facade)
Maximum accuracy in the test step in each of the repetitions and respective mean accuracy (M) for first small dataset
| Arch. | C. | 1 | 2 | 3 | 4 | 5 | M. |
|---|---|---|---|---|---|---|---|
| CNN8 | P | 87.8 | 92.2 | 93.3 | 91.1 | ||
| CNN8 | L | 91.1 | 80.0 | 93.3 | 91.1 | 93.3 | 89.8 |
| DenseNet | P | 77.8 | 77.8 | 73.3 | 83.3 | 65.6 | 75.6 |
| DenseNet | L | 87.8 | 77.8 | 83.3 | 71.1 | 70.0 | 78.0 |
| MobileNet | P | 71.1 | 65.6 | 81.1 | 74.4 | 63.3 | 71.1 |
| MobileNet | L | 54.4 | 87.8 | 58.9 | 55.6 | 53.3 | 62.0 |
Bold values indicate the best result of test accuracy and mean accuracy
Comparison between methods recommended by the data augmentation configurations (Proposed (P) and Literature (L)) and architectures
Confusion matrix with the best results for the test step (first small dataset)
Bold values indicate the number of true positives (TP) and true negatives (TN)
Selected hyperparameters for first small dataset
| Hyperparameter | Recommendation |
|---|---|
| Architecture | CNN8 |
| Height Shift Range (He) | 0.2 |
| Width Shift Range (W) | 0.2 |
| Zoom Range (Z) | 0.2 |
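As an illustration, the recommended settings map directly onto keyword arguments of the Keras `ImageDataGenerator` class. The sketch below is an assumption about how such a configuration could be written, not the authors' training script: it leaves rotation, shear, and both flips disabled, and it imports Keras lazily so the configuration can be inspected without TensorFlow installed. `ImageDataGenerator` is deprecated in recent TensorFlow releases, so this assumes a version that still ships it; the names `augmentation_kwargs` and `build_generator` are hypothetical.

```python
# Recommended data augmentation settings; the other transformations stay off.
augmentation_kwargs = {
    "rotation_range": 0,
    "horizontal_flip": False,
    "vertical_flip": False,
    "height_shift_range": 0.2,
    "shear_range": 0,
    "width_shift_range": 0.2,
    "zoom_range": 0.2,
}


def build_generator(kwargs=None):
    """Create a Keras ImageDataGenerator from the settings (requires TensorFlow)."""
    # Imported here so that reading the configuration does not need TensorFlow.
    from tensorflow.keras.preprocessing.image import ImageDataGenerator

    return ImageDataGenerator(**(kwargs or augmentation_kwargs))


enabled = sorted(name for name, value in augmentation_kwargs.items() if value)
print(enabled)
```

Calling `build_generator()` in an environment with TensorFlow returns a generator whose `flow` or `flow_from_directory` methods yield randomly shifted and zoomed copies of the training images.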
Results of data augmentation hyperparameter tuning with logistic regression (second small dataset)
| Hyperparameter | Coefficient | p-value | Selected value | Coded level | OR |
|---|---|---|---|---|---|
| Rotation R. (R) | -0.245 | 0.00 | 0 | 0 | 0.783 |
| Hor. Flip (H) | 0.071 | 0.07 | False | 0 | 1.074 |
| Vertical Flip (V) | -0.027 | 0.48 | False | 0 | 0.973 |
| Height S. R. (He) | -0.008 | 0.85 | 0 | 0 | 0.992 |
| Shear Range (S) | 0.089 | 0.02 | 0.2 | 1 | |
| Width S. R. (W) | -0.024 | 0.53 | 0 | 0 | 0.976 |
| Zoom Range (Z) | -0.091 | 0.02 | 0 | 0 | 0.913 |
Bold value indicates the hyperparameter with OR > 1 and p < 0.05
Maximum accuracy in the test step in each of the repetitions and respective mean accuracy (M) for second small dataset
| Arch. | C. | 1 | 2 | 3 | 4 | 5 | M. |
|---|---|---|---|---|---|---|---|
| CNN8 | P | 83.3 | 86.7 | 83.3 | |||
| CNN8 | L | 56.7 | 70.0 | 66.7 | 56.7 | 73.3 | 64.7 |
| DenseNet | P | 70.0 | 83.3 | 90.0 | 90.0 | 70.0 | 80.7 |
| DenseNet | L | 86.7 | 86.7 | 90.0 | 80.0 | 86.7 | 86.0 |
| MobileNet | P | 70.0 | 70.0 | 60.0 | 70.0 | 63.3 | 66.7 |
| MobileNet | L | 70.0 | 80.0 | 73.3 | 83.3 | 70.0 | 75.3 |
Bold values indicate the best result of test accuracy and mean accuracy
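The M. column is the arithmetic mean of the five repetitions, rounded to one decimal place. A small sketch that recomputes it for the rows whose five repetitions are all listed above (the dictionary name is hypothetical):

```python
from statistics import mean

# Test accuracy (%) per repetition, second small dataset.
# Keys are (architecture, configuration): P = proposed, L = literature.
second_dataset = {
    ("CNN8", "L"): [56.7, 70.0, 66.7, 56.7, 73.3],
    ("DenseNet", "P"): [70.0, 83.3, 90.0, 90.0, 70.0],
    ("DenseNet", "L"): [86.7, 86.7, 90.0, 80.0, 86.7],
    ("MobileNet", "P"): [70.0, 70.0, 60.0, 70.0, 63.3],
    ("MobileNet", "L"): [70.0, 80.0, 73.3, 83.3, 70.0],
}

mean_accuracy = {key: round(mean(runs), 1) for key, runs in second_dataset.items()}

for (architecture, config), value in mean_accuracy.items():
    print(architecture, config, value)
```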
Comparison between methods recommended by the data augmentation configurations (Proposed (P) and Literature (L)) and architectures
Comparison of this proposal with different papers that applied CNNs in the image processing of building construction: I [32], II [3], III [48], IV [36] and V [54].
| | Proposed | I | II | III | IV | V | |
|---|---|---|---|---|---|---|---|
| Classification | – | – | – | – | |||
| Detection | – | – | – | ||||
| Segmentation | – | – | – | – | |||
| Crack detection | – | – | – | ||||
| Bridge inspection | – | – | – | – | – | ||
| Roofs defects classification | – | – | – | – | – | ||
| Vegetation in facades | – | – | – | – | |||
| Rotation Range | |||||||
| Horizontal Flip | – | – | |||||
| Vertical Flip | – | – | – | – | |||
| Height Shift Range | – | – | – | – | |||
| Shear Range | – | – | – | ||||
| Width Shift Range | – | – | – | – | |||
| Zoom Range | – | – | – | – | |||
| Others | – | – | – | – | |||
| Yes | – | – | – | ||||
| No | – | – | – |