Jun Zhu, Ziwu Pan, Hang Wang, Peijie Huang, Jiulin Sun, Fen Qin, Zhenzhen Liu.
Abstract
As tea is an important economic crop in many regions, efficient and accurate methods for remotely identifying tea plantations are essential for the implementation of sustainable tea practices and for periodic monitoring. In this study, we developed and tested a method for tea plantation identification based on multi-temporal Sentinel-2 images and a multi-feature Random Forest (RF) algorithm. We used phenological patterns of tea cultivation in China's Shihe District (such as the multiple annual growing, harvest, and pruning stages) to extract multi-temporal Sentinel-2 MSI bands, their first spectral derivatives, NDVI, and textures, together with topographic features. We then assessed feature importance using RF analysis; the optimal combination of features was used as the input to RF classification to extract tea plantations in the study area. A comparison of our results with those achieved using the Support Vector Machine method and with statistical data from local government departments showed that our method had a higher producer's accuracy (96.57%) and user's accuracy (96.02%). These results demonstrate that: (1) multi-temporal and multi-feature classification can improve the accuracy of tea plantation recognition, (2) RF feature importance analysis can effectively reduce feature dimensions and improve classification efficiency, and (3) the combination of multi-temporal Sentinel-2 images and the RF algorithm improves our ability to identify and monitor tea plantations.
Keywords: China; Random Forest algorithm; Sentinel-2; feature selection; remote sensing; tea plantation identification
Year: 2019 PMID: 31060327 PMCID: PMC6540259 DOI: 10.3390/s19092087
Source DB: PubMed Journal: Sensors (Basel) ISSN: 1424-8220 Impact factor: 3.576
Figure 1. Study area location in Henan Province, China, and digital elevation model (DEM).
Figure 2. Annual growth, picking, and pruning stages of tea in the study area delineated by the first ten (E), middle ten (M), and last ten (L) days of each month.
Figure 3. Effect of pruning on tea plantations in the study area: (a,b) field photos of tea plantations before and after pruning, respectively; (c,d) Sentinel-2 false colour images of tea plantations before and after pruning, respectively.
Figure 4. Distribution of samples in the study area: (a) 20,565 training sample pixels, of which 5654 are tea plantations; (b) 8756 validation sample pixels, of which 2449 are tea plantations.
Figure 5. Flowchart for the tea plantation identification method proposed in this study.
Figure 6. Spectral curves of the eight LULC types in the study area on (a) 18 April 2018, (b) 12 June 2018, (c) 15 September 2017, and (d) 19 December 2017.
Figure 7. NDVI values for the eight LULC types on the four image dates.
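The NDVI values referred to here can be illustrated with a minimal sketch. It assumes the conventional Sentinel-2 band pair (B8 as near-infrared, B4 as red); the record does not state which bands the authors used, and the reflectance values below are invented for illustration only. The difference feature mirrors the `Ndvi_12-19 - Ndvi_6-12` variable listed in the feature table.

```python
def ndvi(nir, red):
    """NDVI for one pixel from near-infrared and red reflectance.

    Assumption: nir is Sentinel-2 B8 and red is B4 (the usual pairing).
    """
    total = nir + red
    if total == 0:
        return 0.0  # assumption: pixels with no signal are reported as 0
    return (nir - red) / total


def ndvi_difference(ndvi_later, ndvi_earlier):
    """Seasonal NDVI change, e.g. a December value minus a June value."""
    return ndvi_later - ndvi_earlier


# Hypothetical reflectance values for a single pixel on two dates.
ndvi_june = ndvi(nir=0.50, red=0.10)
ndvi_december = ndvi(nir=0.40, red=0.10)
print(round(ndvi_june, 4), round(ndvi_december, 4))
print(round(ndvi_difference(ndvi_december, ndvi_june), 4))
```

A multi-date difference of this kind is one simple way to encode the phenological contrast that the study relies on.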
Feature parameters extracted from the four Sentinel-2 images and topographic features.
| Feature Type | Feature Phase | Feature Name | Feature Variable | Feature Number |
|---|---|---|---|---|
| Spectral feature | 2018-4-18, | Reflectance | B2, B3, B4, B5, B6, B7, B8, B8A | 32 |
| | | First derivative spectral | Der1_B2, Der1_B3, Der1_B4, Der1_B5, Der1_B6, Der1_B7, Der1_B8, Der1_B8A | 32 |
| Vegetation index | 2018-6-12, | NDVI | Ndvi_12-19, Ndvi_12-19 - Ndvi_6-12 | 2 |
| Texture feature (GLCM) | 2018-4-18, | Mean | Mea_B2, Mea_B3, Mea_B4, Mea_B5, Mea_B6, Mea_B7, Mea_B8, Mea_B8A | 32 |
| | | Variance | Var_B2, Var_B3, Var_B4, Var_B5, Var_B6, Var_B7, Var_B8, Var_B8A | 32 |
| | | Contrast | Con_B2, Con_B3, Con_B4, Con_B5, Con_B6, Con_B7, Con_B8, Con_B8A | 32 |
| | | Homogeneity | Hom_B2, Hom_B3, Hom_B4, Hom_B5, Hom_B6, Hom_B7, Hom_B8, Hom_B8A | 32 |
| | | Dissimilarity | Dis_B2, Dis_B3, Dis_B4, Dis_B5, Dis_B6, Dis_B7, Dis_B8, Dis_B8A | 32 |
| | | Correlation | Cor_B2, Cor_B3, Cor_B4, Cor_B5, Cor_B6, Cor_B7, Cor_B8, Cor_B8A | 32 |
| | | Entropy | Ent_B2, Ent_B3, Ent_B4, Ent_B5, Ent_B6, Ent_B7, Ent_B8, Ent_B8A | 32 |
| | | Angular second moment | Asm_B2, Asm_B3, Asm_B4, Asm_B5, Asm_B6, Asm_B7, Asm_B8, Asm_B8A | 32 |
| Topographic feature | - | Elevation | Ele | 1 |
| | | Slope | Slo | 1 |
| | | Aspect | Asp | 1 |
| Sum | | | | 325 |
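The total of 325 features can be checked with a short tally. This sketch assumes that each per-band count of 32 comes from 8 bands on each of the 4 image dates, which is consistent with the table but is an inference rather than a statement from the record.

```python
# Band and measure names are taken from the feature table above.
BANDS = ["B2", "B3", "B4", "B5", "B6", "B7", "B8", "B8A"]
GLCM_MEASURES = [
    "Mean", "Variance", "Contrast", "Homogeneity",
    "Dissimilarity", "Correlation", "Entropy", "Angular second moment",
]
N_DATES = 4  # assumption: one value per band for each of the four images

feature_counts = {
    "reflectance": len(BANDS) * N_DATES,
    "first_derivative": len(BANDS) * N_DATES,
    "ndvi": 2,
    "glcm_texture": len(GLCM_MEASURES) * len(BANDS) * N_DATES,
    "topographic": 3,  # elevation, slope, aspect
}

total_features = sum(feature_counts.values())
print(feature_counts)
print(total_features)
```

Under the same assumption, the tally also matches the dimensions of the combined feature models: 64 for the spectral set, 256 for texture, and 69 for spectral plus NDVI plus topography.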
Figure 8. Effect of decision tree number on overall classification accuracy and modeling time in the RF model used in this study.
Eight groups of feature models used for accuracy analysis.
| Feature Model | Feature Dimension | Description |
|---|---|---|
| S1 | 16 | 8-band spectral features on 2018-4-18 |
| S2 | 16 | 8-band spectral features on 2018-6-12 |
| S3 | 16 | 8-band spectral features on 2017-9-15 |
| S4 | 16 | 8-band spectral features on 2017-12-19 |
| S | 64 | 8-band spectral features of all four images |
| GLCM | 256 | 8-band texture features of all four images |
| S+NDVI+DEM | 69 | 8-band spectral features of all four images + Vegetation index features + Topographic features |
| S+NDVI+DEM+GLCM | 325 | 8-band spectral features of all four images + Vegetation index features + Topographic features + 8-band texture features of 4 phases |
Comparison of classification accuracy of different feature models using the RF method.
| Feature Model | Monoculture Tea Plantation PA/% | Monoculture Tea Plantation UA/% | Polyculture Tea Plantation PA/% | Polyculture Tea Plantation UA/% | OA/% | Kappa |
|---|---|---|---|---|---|---|
| S1 | 75.46 | 71.76 | 59.11 | 61.71 | 86.28 | 0.6890 |
| S2 | 87.13 | 83.92 | 69.28 | 58.83 | 89.06 | 0.7585 |
| S3 | 80.92 | 83.00 | 74.77 | 63.24 | 88.88 | 0.7512 |
| S4 | 85.56 | 77.71 | 73.60 | 70.63 | 89.60 | 0.7693 |
| S | 93.47 | 92.71 | 81.19 | 84.45 | 95.89 | 0.9059 |
| GLCM | 89.14 | 91.55 | 76.64 | 74.97 | 93.38 | 0.8485 |
| S+NDVI+DEM | 93.91 | 92.92 | 82.01 | 83.08 | 96.05 | 0.9099 |
| S+NDVI+DEM+GLCM | 94.85 | 93.56 | 82.24 | 85.02 | 96.33 | 0.9163 |
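The accuracy measures reported in this table (producer's accuracy PA, user's accuracy UA, overall accuracy OA, and Kappa) follow standard definitions, which can be sketched as below. The confusion matrix is invented for illustration, and the row/column orientation (rows as reference classes, columns as predicted classes) is an assumed convention, not one confirmed by the record.

```python
def accuracy_metrics(matrix):
    """Return per-class PA and UA plus OA and Cohen's kappa.

    Assumes matrix[i][j] counts pixels of reference class i that were
    predicted as class j, and that every class has at least one pixel.
    """
    n_classes = len(matrix)
    total = sum(sum(row) for row in matrix)
    row_sums = [sum(row) for row in matrix]
    col_sums = [sum(matrix[i][j] for i in range(n_classes)) for j in range(n_classes)]
    diagonal = [matrix[k][k] for k in range(n_classes)]

    pa = [diagonal[k] / row_sums[k] for k in range(n_classes)]  # 1 - omission error
    ua = [diagonal[k] / col_sums[k] for k in range(n_classes)]  # 1 - commission error
    oa = sum(diagonal) / total
    # Agreement expected by chance, from the marginal totals.
    expected = sum(r * c for r, c in zip(row_sums, col_sums)) / total ** 2
    kappa = (oa - expected) / (1 - expected)
    return {"PA": pa, "UA": ua, "OA": oa, "kappa": kappa}


# Hypothetical two-class matrix: class 0 = tea plantation, class 1 = others.
example = [[90, 10],
           [5, 95]]
result = accuracy_metrics(example)
print(result)
```

PA is computed per reference class and UA per predicted class, which is why the two can differ for the same class, as they do throughout the table.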
Accuracy comparison of classification results with different numbers of optimum features.
| Mean Value of Feature Importance | Feature Dimension | Monoculture Tea Plantation PA/% | Monoculture Tea Plantation UA/% | Polyculture Tea Plantation PA/% | Polyculture Tea Plantation UA/% | OA/% | Kappa |
|---|---|---|---|---|---|---|---|
| ≥1.00 | 10 | 93.28 | 90.28 | 78.39 | 78.76 | 95.01 | 0.8870 |
| ≥0.90 | 17 | 94.10 | 91.46 | 80.37 | 81.71 | 95.68 | 0.9020 |
| ≥0.80 | 28 | 94.29 | 91.75 | 81.66 | 84.62 | 96.05 | 0.9100 |
| ≥0.75 | 39 | 93.28 | 91.81 | 81.19 | 82.94 | 95.91 | 0.9068 |
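The thresholding behind this table, keeping only features whose mean importance meets a cutoff, can be sketched as follows. The importance scores are made up for illustration and the helper `select_features` is hypothetical; the record does not give the per-feature values or the authors' implementation.

```python
def select_features(importance, threshold):
    """Keep features with importance >= threshold, most important first."""
    kept = [name for name, score in importance.items() if score >= threshold]
    return sorted(kept, key=lambda name: importance[name], reverse=True)


# Hypothetical mean importance scores; names borrow the table's naming style.
example_importance = {
    "Ndvi_12-19": 1.20,
    "Ele": 1.05,
    "B8": 0.92,
    "Mea_B4": 0.81,
    "Asp": 0.40,
}

for threshold in (1.00, 0.90, 0.80):
    selected = select_features(example_importance, threshold)
    print(threshold, len(selected), selected)
```

Lowering the cutoff admits more features, which matches the table's growth from 10 to 39 dimensions; in the reported results the ≥0.80 cutoff (28 features) gives the highest OA and Kappa.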
Figure 9. Feature importance ranking for the optimal feature combination.
Figure 10. Final map of tea plantations in the study area.
Comparison of classification accuracies of the RF and SVM methods.
| Classes | RF PA/% | RF UA/% | SVM PA/% | SVM UA/% |
|---|---|---|---|---|
| Tea plantation | 96.57 | 96.02 | 92.45 | 92.45 |
| Others | 98.45 | 98.67 | 97.97 | 97.09 |

| Measure | RF | SVM |
|---|---|---|
| OA/% | 97.92 | 96.43 |
| Kappa | 0.9485 | 0.9107 |
| Feature dimension | 28 | 28 |