Hailu Shiferaw, Woldeamlak Bewket, Sandra Eckert.
Abstract
In recent years, an increasing number of distribution maps of invasive alien plant species (IAPS) have been published using different machine learning algorithms (MLAs). However, for designing spatially explicit management strategies, distribution maps should include information on the local cover/abundance of the IAPS. This study compares the performance of seven algorithms: five MLAs (gradient boosting machine in two different implementations, random forest, support vector machine, and deep learning neural network), one ensemble model, and a generalized linear model. It thereby identifies the best-performing ones for mapping the fractional cover/abundance and distribution of an IAPS, in this case Prosopis juliflora (Sw.) DC. Field-level Prosopis cover and spatial datasets of seventeen biophysical and anthropogenic variables were collected, processed, and used to train and validate the algorithms in order to generate fractional cover maps of Prosopis in the dryland ecosystem of the Afar Region, Ethiopia. Out of the seven tested algorithms, random forest performed best, with an accuracy of 92% and sensitivity and specificity >0.89. The next best-performing algorithms were the ensemble model and gradient boosting machine, with accuracies of 89% and 88%, respectively. The other tested algorithms achieved comparatively low performance. The strong explanatory variables for Prosopis distribution in all models were NDVI, elevation, distance to villages, and distance to rivers, as well as rainfall, temperature, and near-infrared and red reflectance, whereas topographic variables, except for elevation, did not contribute much to the current distribution of Prosopis. According to the random forest model, a total of 1.173 million ha (12.33% of the study region) was found to be invaded by Prosopis to varying degrees of cover. Our findings demonstrate that MLAs can be successfully used to develop fractional cover maps of plant species, particularly IAPS, so as to design targeted and spatially explicit management strategies.
Keywords: Afar Region; Ethiopia; Prosopis juliflora; dryland ecosystems; fractional cover mapping; invasive alien plant species; machine learning algorithms
Year: 2019 PMID: 30891200 PMCID: PMC6405495 DOI: 10.1002/ece3.4919
Source DB: PubMed Journal: Ecol Evol ISSN: 2045-7758 Impact factor: 2.912
Figure 1. Location of the study area, Afar National Regional State, in Ethiopia (a), and photos of Prosopis plants (b). The detailed map shows the main towns, roads, and rivers, as well as the locations where Prosopis was first introduced. The shading indicates elevation, ranging from 175 m below sea level (dark gray) to 2,992 m above sea level (white).
List of spatial data and explanatory variables used for the modeling of Prosopis fractional cover
| Variable abbreviations | Description | Source |
|---|---|---|
| Rain | Mean annual rainfall | Ethiopian National Meteorol. Agency |
| Temp | Mean monthly temperature | |
| LSTd | Monthly land surface temperature during daytime; for the modeling 5‐year averages were calculated | MODIS, NASA |
| LSTn | Monthly land surface temperature during nighttime; for the modeling 5‐year averages were calculated | MODIS, NASA |
| PAN | Panchromatic reflectance | Landsat 8 OLI, USGS |
| Red | Red reflectance | Landsat 8 OLI, USGS |
| NIR | Near‐infrared reflectance | Landsat 8 OLI, USGS |
| SWIR1 | Shortwave‐infrared band 6 reflectance | Landsat 8 OLI, USGS |
| NDVI | Normalized difference vegetation index | |
| Elevation | Shuttle Radar Topography Mission digital elevation model (30 m spatial resolution) | USGS |
| Slope | Derived from elevation | |
| Relief | Derived from elevation (contour) differences | Adediran, Parcharidis, Poscolieric, and Pavlopoulos |
| Landform | Topographic position index derived from elevation, aspect and slope | Dikau |
| Rugged | An index derived from elevation | Riley, DeGloria, & Elliot |
| DistRoad | Distances derived from road network data | Ethiopian Road Authority |
| DistVillage | Distances derived from settlement data | EthioGIS and Central Statistical Agency |
| DistRiver | Distances derived from data on watercourses | EthioGIS |
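The NDVI variable in the table has no source listed; it is presumably derived from the Red and NIR reflectance bands listed above. A minimal sketch of the standard NDVI formula, (NIR − Red) / (NIR + Red), follows. The function name and the reflectance values are hypothetical and serve only as an illustration; they are not taken from the study.

```python
def ndvi(nir: float, red: float) -> float:
    """Normalized difference vegetation index from NIR and red reflectance."""
    denom = nir + red
    if denom == 0:
        # NDVI is undefined when both reflectances are zero
        return float("nan")
    return (nir - red) / denom


# Hypothetical reflectance values, for illustration only
print(round(ndvi(0.45, 0.08), 3))  # 0.698
```

Values close to 1 indicate dense green vegetation, while values near or below 0 indicate bare soil or water.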
Parameters used to assess model performance
| Performance parameter | Description | Sources |
|---|---|---|
| Confidence interval (CI) | It provides a range of values within which the population parameter is likely to lie. In a normal distribution, the general expression of the confidence interval is: Estimate ± margin of error | Newcombe |
| Correlation | Agreement between fractional cover measured in the field samples and the predicted fractional cover for the same samples | Harrington |
| Sensitivity | Known as true‐positive rate (TPR); measures the proportion of positives that were correctly identified as locations where Prosopis is present | Metz |
| Specificity | Known as true‐negative rate (TNR); measures the proportion of negatives that were correctly identified as locations where Prosopis is absent | Fuchs et al. |
| Accuracy | Class accuracy is calculated by dividing the number of correct pixels in that category by the total number of pixels in either the corresponding row or the corresponding column; it indicates the probability of a reference pixel being correctly classified and is really a measure of omission error. Calculated as: | Congalton |
| AUC | Area under the receiver operating characteristics (ROC) curve; indicates the model's accuracy in handling true values (presence of Prosopis) | Landis & Koch |
| Kappa coefficient | Statistical measure of inter‐rater agreement, excluding agreements occurring by chance. It is calculated in a confusion matrix as | Metz |
| Balanced accuracy | Average of all class accuracies; takes into account unbalanced class sizes. In our case, with two classes (presence and absence of Prosopis), it is the mean of sensitivity and specificity | Brodersen, Ong, Stephan, and Buhmann |
| Threshold (max @ TPR + TNR) | Maximum value at which the true‐positive rate (TPR, or sensitivity) and the true‐negative rate (TNR, or specificity) intersect. It is often used as a threshold level in dichotomies. In our case, values above the threshold indicate that Prosopis is present | Metz |
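Several of the parameters above can be derived from a two-class confusion matrix (presence and absence). The sketch below uses the standard textbook definitions, which may differ in notation from the formulas the authors applied. The class name, function name, and counts are hypothetical, and the code assumes that both classes contain at least one sample and that the model predicts both classes.

```python
from dataclasses import dataclass


@dataclass
class Confusion:
    tp: int  # presence predicted, presence observed
    fp: int  # presence predicted, absence observed
    fn: int  # absence predicted, presence observed
    tn: int  # absence predicted, absence observed


def metrics(c: Confusion) -> dict:
    n = c.tp + c.fp + c.fn + c.tn
    sensitivity = c.tp / (c.tp + c.fn)  # true-positive rate
    specificity = c.tn / (c.tn + c.fp)  # true-negative rate
    accuracy = (c.tp + c.tn) / n        # overall agreement
    # Agreement expected by chance, from the row and column marginals
    p_presence = ((c.tp + c.fp) / n) * ((c.tp + c.fn) / n)
    p_absence = ((c.fn + c.tn) / n) * ((c.fp + c.tn) / n)
    p_chance = p_presence + p_absence
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "accuracy": accuracy,
        "balanced_accuracy": (sensitivity + specificity) / 2,
        "kappa": (accuracy - p_chance) / (1 - p_chance),
        "pos_pred_value": c.tp / (c.tp + c.fp),
        "neg_pred_value": c.tn / (c.tn + c.fn),
    }


# Hypothetical counts, for illustration only (not data from the study)
example = metrics(Confusion(tp=80, fp=10, fn=20, tn=90))
for name, value in example.items():
    print(f"{name}: {value:.3f}")
```

With these hypothetical counts, overall accuracy is 0.85 and chance agreement is 0.5, so kappa is 0.7, which shows how kappa discounts the agreement that would be expected by chance alone.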
Figure 2. Relative influence of explanatory variables in the different algorithms after removal of the least‐contributing ones: (a) generalized linear model (GLM), (b) gradient boosting machine (GBM), (c) gradient boosting machine using boosted regression trees package (GBM‐BRT), (d) random forest (RF), (e) support vector machine (SVM), (f) deep learning neural network (DNN), (g) ensemble model (ENS)
Summarized performance parameters of the evaluation of current fractional cover maps of Prosopis in the Afar Region produced by means of different models. Additionally, AUC plots for each model are provided in the Supporting information Figure S2
| Model type | 95% CI | Accuracy | Kappa | Balanced accuracy | Sensitivity | Specificity | Pos. pred. value | Neg. pred. value | AUC | Correlation | Threshold |
|---|---|---|---|---|---|---|---|---|---|---|---|
| GLM | 0.763, 0.834 | 0.801 | 0.498 | 0.744 | 0.612 | 0.882 | 0.651 | 0.858 | 0.852 | 0.564 | 0.285 |
| GBM | 0.837, 0.897 | 0.877 | 0.678 | 0.848 | 0.802 | 0.895 | 0.738 | 0.924 | 0.944 | 0.756 | 0.397 |
| GBM‐BRT | 0.761, 0.841 | 0.789 | 0.316 | 0.727 | 0.632 | 0.712 | 0.728 | 0.868 | 0.945 | 0.794 | 0.258 |
| RF | 0.897, 0.945 | 0.918 | 0.797 | 0.911 | 0.894 | 0.926 | 0.818 | 0.959 | 0.971 | 0.829 | 0.326 |
| SVM | 0.856, 0.907 | 0.872 | 0.677 | 0.827 | 0.817 | 0.918 | 0.864 | 0.891 | 0.876 | 0.741 | 0.151 |
| DNN | 0.689, 0.767 | 0.729 | 0.434 | 0.574 | 0.014 | 0.995 | 0.568 | 0.724 | 0.595 | 0.206 | 0.392 |
| Ensemble | 0.871, 0.925 | 0.891 | 0.771 | 0.873 | 0.846 | 0.919 | 0.808 | 0.939 | 0.962 | 0.841 | 0.349 |
DNN: deep learning neural network; ENS or ensemble: ensemble model; GBM: gradient boosting machine; GBM‐BRT: gradient boosting machine using boosted regression trees package; GLM: generalized linear model; RF: random forest; SVM: support vector machine.
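The Threshold and AUC columns can be illustrated with a short sketch. It assumes that continuous fractional cover predictions are dichotomized into presence and absence at the cutoff that maximizes TPR + TNR, following the "max @ TPR + TNR" reading of the threshold parameter, and it computes AUC as the probability that a randomly chosen presence sample receives a higher prediction than a randomly chosen absence sample. The function names, predictions, and labels are hypothetical; this is not the authors' implementation.

```python
def best_threshold(scores, labels):
    """Return (cutoff, TPR + TNR) for the cutoff that maximizes TPR + TNR.

    A sample is classified as presence when its score is >= the cutoff.
    """
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    best_cutoff, best_sum = None, -1.0
    for cutoff in sorted(set(scores)):
        tp = sum(1 for s, y in zip(scores, labels) if y == 1 and s >= cutoff)
        tn = sum(1 for s, y in zip(scores, labels) if y == 0 and s < cutoff)
        total = tp / n_pos + tn / n_neg
        if total > best_sum:  # on ties, the lowest cutoff is kept
            best_cutoff, best_sum = cutoff, total
    return best_cutoff, best_sum


def auc(scores, labels):
    """Area under the ROC curve via pairwise comparison (ties count half)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if p > q else 0.5 if p == q else 0.0
               for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical predicted cover fractions and observed presence (1) / absence (0)
scores = [0.05, 0.10, 0.20, 0.30, 0.35, 0.40, 0.60, 0.80]
labels = [0, 0, 0, 0, 1, 0, 1, 1]

cutoff, tpr_plus_tnr = best_threshold(scores, labels)
print(cutoff, round(tpr_plus_tnr, 3))  # 0.35 1.8
print(round(auc(scores, labels), 3))   # 0.933
```

The pairwise AUC is quadratic in the number of samples, which is acceptable for a small illustration; a rank-based formulation would be preferable for large validation sets.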
Figure 3. Current fractional cover maps of Prosopis distribution produced using different machine learning algorithms: (a) generalized linear model (GLM), (b) gradient boosting machine (GBM), (c) gradient boosting machine using boosted regression trees package (GBM‐BRT), (d) random forest (RF), (e) support vector machine (SVM), (f) deep learning neural network (DNN), (g) ensemble model (ENS)
Figure 4. Current fractional cover of Prosopis (after matching to the ground cover level) in the Afar Region according to the RF model. For better readability, fractional cover was grouped into six fractional cover classes