| Literature DB >> 28695041 |
Ram C. Sharma, Keitarou Hara, Hidetake Hirayama.
Abstract
This paper presents the performance evaluation of a number of machine learning classifiers for discriminating among vegetation physiognomic classes using satellite-based time series of surface reflectance data. The research dealt with the discrimination of six vegetation physiognomic classes: Evergreen Coniferous Forest, Evergreen Broadleaf Forest, Deciduous Coniferous Forest, Deciduous Broadleaf Forest, Shrubs, and Herbs. Rich feature data were prepared from time series of the satellite data for discriminating and cross-validating the vegetation physiognomic types using a machine learning approach. A set of machine learning experiments, comprising a number of supervised classifiers with different model parameters, was conducted to assess how the discrimination of vegetation physiognomic classes varies with classifiers, input features, and ground truth data size. The performance of each experiment was evaluated using the 10-fold cross-validation method. The experiment using the Random Forests classifier provided the highest overall accuracy (0.81) and kappa coefficient (0.78). However, the accuracy metrics did not vary much across experiments. In contrast, they were very sensitive to the input features and the size of the ground truth data. The results obtained in the research are expected to be useful for improving vegetation physiognomic mapping in Japan.
Year: 2017 PMID: 28695041 PMCID: PMC5485338 DOI: 10.1155/2017/9806479
Source DB: PubMed Journal: Scientifica (Cairo) ISSN: 2090-908X
Vegetation physiognomy types used in the research.
| Physiognomy types | Description |
|---|---|
| (1) Evergreen Coniferous Forest (ECF) | Forests dominated by conifer trees that retain leaves throughout the year |
| (2) Evergreen Broadleaf Forest (EBF) | Forests dominated by broadleaf trees that retain leaves throughout the year |
| (3) Deciduous Coniferous Forest (DCF) | Forests dominated by conifer trees that shed leaves seasonally |
| (4) Deciduous Broadleaf Forest (DBF) | Forests dominated by broadleaf trees that shed leaves seasonally |
| (5) Shrubs | Woody vegetation, evergreen or deciduous, less than 3 meters tall with more than 10% cover |
| (6) Herbs | Land covered by natural grasses or herbs with more than 10% cover |
Description of the input features used in the research.
| Features | Monthly composites | Percentile composites | Subtotal features |
|---|---|---|---|
| Spectral bands: Red, Near Infrared, Blue, Green, Mid Infrared, Shortwave Infrared 1, and Shortwave Infrared 2 | 12 × 7 | 11 × 7 | 161 |
| Spectral indices: NDVI, EVI, and LSWI | 12 × 3 | 11 × 3 | 69 |
| Total features | | | 230 |
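The spectral indices listed in the table are computed per pixel from the surface reflectance bands. As an illustrative sketch (using the standard formulas for NDVI, EVI with MODIS coefficients, and LSWI; the paper does not reproduce the formulas, so these are assumptions):

```python
def ndvi(nir, red):
    """Normalized Difference Vegetation Index."""
    return (nir - red) / (nir + red)

def evi(nir, red, blue):
    """Enhanced Vegetation Index (standard MODIS coefficients assumed)."""
    return 2.5 * (nir - red) / (nir + 6.0 * red - 7.5 * blue + 1.0)

def lswi(nir, swir1):
    """Land Surface Water Index (NIR vs. Shortwave Infrared 1)."""
    return (nir - swir1) / (nir + swir1)

# Hypothetical reflectance values for a single vegetated pixel
nir, red, blue, swir1 = 0.45, 0.08, 0.04, 0.20
print(round(ndvi(nir, red), 3))   # → 0.698
print(round(lswi(nir, swir1), 3)) # → 0.385
```

Each index time series is then summarized into 12 monthly composites and 11 percentile composites, as tabulated above.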
List of experiments conducted in the research.
| Experiment | Classifier (model parameters) |
|---|---|
| 1 | |
| 2 | |
| 3 | Naive Bayes (algorithm = Gaussian) |
| 4 | Random Forests (trees = 10) |
| 5 | Random Forests (trees = 50) |
| 6 | Random Forests (trees = 100) |
| 7 | Support Vector Machines (kernel = linear) |
| 8 | Multilayer Perceptron (hidden units = 100; hidden layers = 1) |
| 9 | Multilayer Perceptron (hidden units = 100; hidden layers = 3) |
| 10 | Multilayer Perceptron (hidden units = 150; hidden layers = 5) |
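The paper does not state which software was used to run these experiments; assuming scikit-learn, the classifier configurations for experiments 3–10 could be sketched as follows (experiments 1 and 2 are omitted because their descriptions are not given in the record):

```python
from sklearn.ensemble import RandomForestClassifier
from sklearn.naive_bayes import GaussianNB
from sklearn.neural_network import MLPClassifier
from sklearn.svm import SVC

# One entry per experiment number from the table above
experiments = {
    3: GaussianNB(),
    4: RandomForestClassifier(n_estimators=10, random_state=0),
    5: RandomForestClassifier(n_estimators=50, random_state=0),
    6: RandomForestClassifier(n_estimators=100, random_state=0),
    7: SVC(kernel="linear"),
    8: MLPClassifier(hidden_layer_sizes=(100,)),           # 1 hidden layer
    9: MLPClassifier(hidden_layer_sizes=(100, 100, 100)),  # 3 hidden layers
    10: MLPClassifier(hidden_layer_sizes=(150,) * 5),      # 5 hidden layers
}
```

The `random_state` arguments are added here only to make the sketch reproducible; they are not specified in the paper.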
Cross-validation results (size of the ground truth data set for each class = 300). The standard deviations (SD) across the 10 cross-validation folds for the optimum number of features are also shown.
| Experiment | Max. overall accuracy | Overall accuracy SD | Max. kappa coefficient | Kappa coefficient SD | Optimum number of features |
|---|---|---|---|---|---|
| 1 | 0.79 | 0.03 | 0.75 | 0.04 | 92 |
| 2 | 0.79 | 0.03 | 0.75 | 0.04 | 99 |
| 3 | 0.70 | 0.03 | 0.64 | 0.04 | 106 |
| 4 | 0.80 | 0.03 | 0.76 | 0.03 | 153 |
| 5 | 0.81 | 0.03 | 0.77 | 0.03 | 98 |
| 6 | 0.81 | 0.03 | 0.78 | 0.03 | 160 |
| 7 | 0.80 | 0.03 | 0.76 | 0.04 | 210 |
| 8 | 0.79 | 0.04 | 0.75 | 0.05 | 211 |
| 9 | 0.80 | 0.03 | 0.76 | 0.04 | 211 |
| 10 | 0.80 | 0.03 | 0.76 | 0.04 | 134 |
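The evaluation procedure behind this table (10-fold cross-validation reporting overall accuracy and kappa coefficient) can be sketched as follows. This is a minimal illustration assuming scikit-learn and a synthetic stand-in for the 230-feature, 6-class data set, not the authors' actual pipeline:

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score, cohen_kappa_score
from sklearn.model_selection import StratifiedKFold

# Synthetic placeholder data: 6 balanced classes, as in the study design
X, y = make_classification(n_samples=600, n_features=30, n_informative=15,
                           n_classes=6, random_state=0)

accs, kappas = [], []
cv = StratifiedKFold(n_splits=10, shuffle=True, random_state=0)
for train_idx, test_idx in cv.split(X, y):
    clf = RandomForestClassifier(n_estimators=100, random_state=0)
    clf.fit(X[train_idx], y[train_idx])
    pred = clf.predict(X[test_idx])
    accs.append(accuracy_score(y[test_idx], pred))
    kappas.append(cohen_kappa_score(y[test_idx], pred))

# Mean and standard deviation across the 10 folds, as reported in the table
print(f"overall accuracy = {np.mean(accs):.2f} (SD {np.std(accs):.2f})")
print(f"kappa            = {np.mean(kappas):.2f} (SD {np.std(kappas):.2f})")
```

On the paper's real data, the same loop would additionally be repeated over feature subsets to locate the optimum number of features for each experiment.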
Figure 1. Confusion matrices for each experiment, computed for the optimum number of features.
Figure 2. Variation of the kappa coefficient with increasing number of input features.
Figure 3. Variation of the kappa coefficient with increasing ground truth data size.