Narkis S Morales, Ignacio C Fernández, Victoria Baca-González.
Abstract
Environmental niche modeling (ENM) is commonly used to develop probabilistic maps of species distribution. Among available ENM techniques, MaxEnt has become one of the most popular tools for modeling species distribution, with hundreds of peer-reviewed articles published each year. MaxEnt's popularity is mainly due to its graphical interface and automatic parameter configuration. However, recent studies have shown that the default automatic configuration may not always be appropriate because it can produce non-optimal models, particularly when dealing with a small number of species presence points. Thus, the recommendation is to evaluate candidate combinations of parameters (feature classes and regularization multiplier) and select the most appropriate model. In this work we reviewed 244 articles published between 2013 and 2015 to assess whether researchers are following recommendations to avoid the default parameter configuration when dealing with small sample sizes, or whether they are using MaxEnt as a "black box tool." Our results show that authors evaluated the best feature classes in only 16% of the analyzed articles, the best regularization multipliers in 6.9%, and both parameters simultaneously in a meager 3.7% before producing the definitive distribution model. We also analyzed 20 articles to quantify the potential differences in resulting outputs when using the software default parameters instead of the alternative best model. Results from our analysis reveal important differences between the use of default parameters and the best model approach, especially in the total area identified as suitable for the assessed species and in the specific areas identified as suitable by both modelling approaches. These results are worrying, because publications are potentially reporting over-complex or over-simplistic models that can undermine the applicability of their results. Of particular importance are studies used to inform policy making. Therefore, researchers, practitioners, reviewers and editors need to be very judicious when dealing with MaxEnt, particularly when the modelling process is based on small sample sizes.
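The parameter evaluation recommended above amounts to enumerating candidate combinations of feature classes and regularization multipliers and scoring each one. The following is a minimal Python sketch of that enumeration, not the authors' protocol. The letters L, Q, P, T and H follow the usual MaxEnt abbreviations for linear, quadratic, product, threshold and hinge features; the multiplier values, the helper names `candidate_grid` and `pick_best`, and the scoring callback are assumptions made for illustration.

```python
from itertools import combinations

FEATURE_CLASSES = "LQPTH"   # linear, quadratic, product, threshold, hinge
MULTIPLIERS = (1, 2, 5)     # assumed example values, not a recommendation


def candidate_grid(features=FEATURE_CLASSES, multipliers=MULTIPLIERS):
    """Return every (feature-class combination, multiplier) pair."""
    grid = []
    for size in range(1, len(features) + 1):
        for combo in combinations(features, size):
            for rm in multipliers:
                grid.append(("".join(combo), rm))
    return grid


def pick_best(grid, score):
    """Return the candidate with the lowest score.

    `score` is a hypothetical callback taking (features, multiplier),
    for example an information criterion computed from a fitted model.
    Lower is treated as better here; that choice is an assumption.
    """
    return min(grid, key=lambda cand: score(*cand))


grid = candidate_grid()
print(len(grid), grid[:3])
```

In practice the `score` callback would fit and evaluate a model for each candidate; the sketch leaves that step abstract because it depends on the modelling software used.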
Keywords: Auto-features; Environmental niche modelling; Maximum entropy; Parameters configuration; Regularization multiplier; Species distribution; User-defined features
Year: 2017 PMID: 28316894 PMCID: PMC5354112 DOI: 10.7717/peerj.3093
Source DB: PubMed Journal: PeerJ ISSN: 2167-8359 Impact factor: 2.984
Figure 1. Number of published articles (2004–2015) containing both “MaxEnt” and “species distribution” within the topic in the Web of Knowledge Databases (see ‘Methods’ section for databases details).
Figure 2. PRISMA flow diagram of the search protocol used, following Moher et al. (2009).
Number of articles published during the years 2013–2015 available through the Web of Knowledge Databases.
Articles are presented per year and sample size.
| Year | Total articles | Articles (sample size > 90) | Articles (sample size ≤ 90) | Articles (no info) |
|---|---|---|---|---|
| 2013 | 246 | 176 | 65 | 5 |
| 2014 | 285 | 187 | 92 | 6 |
| 2015 | 285 | 186 | 87 | 12 |
Notes.
Only articles with sample size ≤ 90 were used for the analyses.
No info refers to articles that do not provide information about the sample size used for modelling.
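As a quick consistency check on the table above, the three per-year groups should add up to the yearly totals, and the group used for the analyses should add up to the 244 articles mentioned in the abstract. The short Python sketch below does that arithmetic; it assumes that the column holding 65, 92 and 87 is the sample size ≤ 90 group, which is the reading that reproduces the 244 figure.

```python
# (year, total, large-sample group, small-sample group, no info), as tabulated
rows = [
    (2013, 246, 176, 65, 5),
    (2014, 285, 187, 92, 6),
    (2015, 285, 186, 87, 12),
]

# Each row's three groups should account for every article of that year.
for year, total, large, small, no_info in rows:
    assert large + small + no_info == total, year

# Articles assumed to have sample size <= 90, summed over the three years.
small_total = sum(row[3] for row in rows)
print(small_total)
```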
Figure 3. Feature classes (A) and regularization multipliers (B) reported to be used for modelling in the analyzed articles.
Columns show the percentage of articles using user-defined, software default, and articles not providing information. Numbers on top of columns represent the number of articles pertaining to each category per year. Columns on the right of each category show the percentage and number of articles for the 2013–2015 period.
Figure 4. Replicability of the modelling process performed in analyzed articles.
Columns show the percentage of articles providing information about geographical coordinates (GC), feature classes (FC), and regularization multiplier (RM). Numbers above columns report the number of articles pertaining to each category. Only articles providing information regarding the three inputs (i.e., the GC + FC + RM column) are considered to provide enough information for replicating the modelling process.
Estimation of resulting differences when using MaxEnt’s default parameters or a best model approach for modelling species distribution.
Spatial correlation values are based on the spatial correlation analysis of MaxEnt’s logistic output. Fuzzy kappa was calculated after applying the 10 percentile training presence logistic threshold to generate the species distribution maps. Area values are based on binary maps generated after applying the same threshold. Best model parameters give the combination of feature classes and regularization multiplier of the model identified as performing best for each study case.
| Sample size | Spatial correlation | Fuzzy kappa | Area, default (km²) | Area, best model (km²) | Area, shared (km²) | Shared/not shared ratio | Best model parameters |
|---|---|---|---|---|---|---|---|
| 7 | 0.856 | 0.864 | 144,129 | 447,092 | 142,612 | 0.466 | T2 |
| 8 | 0.957 | 0.799 | 76 | 66 | 66 | 6.600 | Q5 |
| 9 | 0.905 | 0.797 | 15,907 | 9,771 | 9,212 | 1.270 | LQP5 |
| 10 | 0.943 | 0.781 | 861 | 1,939 | 843 | 0.758 | Q1 |
| 11 | 0.992 | 0.943 | 122,415 | 149,775 | 121,283 | 4.094 | L1 |
| 12 | 0.983 | 0.841 | 428,209 | 551,196 | 425,674 | 3.324 | L2 |
| 12 | 0.836 | 0.906 | 175,166 | 174,543 | 156,798 | 4.342 | TQ5 |
| 13 | 0.960 | 0.843 | 33,421 | 26,169 | 24,317 | 2.219 | TQ2 |
| 13 | 0.995 | 0.965 | 22,013 | 26,445 | 21,820 | 4.528 | LQ1 |
| 14 | 0.948 | 0.916 | 363 | 907 | 353 | 0.625 | LQP1 |
| 15 | 0.967 | 0.900 | 5,004 | 8,845 | 4,991 | 1.291 | QH2 |
| 16 | 0.769 | 0.652 | 13,466 | 28,948 | 12,848 | 0.768 | LQPT5 |
| 26 | 0.865 | 0.847 | 5,655,316 | 7,383,714 | 5,003,914 | 1.651 | QP1 |
| 26 | 0.945 | 0.705 | 32,020 | 36,420 | 28,695 | 2.597 | L2 |
| 31 | 0.937 | 0.879 | 243,764 | 248,513 | 196,113 | 1.960 | PT1 |
| 49 | 0.962 | 0.880 | 135,239 | 103,330 | 100,192 | 2.624 | PT1 |
| 54 | 0.945 | 0.858 | 2,491,722 | 1,723,084 | 1,598,103 | 1.569 | LQPT1 |
| 55 | 0.841 | 0.863 | 1,649,518 | 1,570,127 | 1,362,351 | 2.753 | TQ2 |
| 58 | 0.827 | 0.862 | 5,822,694 | 5,370,521 | 4,439,531 | 1.918 | T1 |
| 76 | 0.934 | 0.858 | 3,904,018 | 3,700,108 | 3,406,765 | 4.309 | TQ1 |
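The area comparison in this table can be illustrated with a short Python sketch. It rests on stated assumptions rather than on the authors' code: the threshold is taken as the score below which about 10% of training presences fall, maps are represented as plain lists of 0/1 cells, and the "not shared" area is taken to be the area predicted suitable by exactly one of the two models. All function names are hypothetical.

```python
def percentile_threshold(presence_scores, fraction=0.10):
    """Score below which roughly `fraction` of training presences fall."""
    if not presence_scores:
        raise ValueError("need at least one presence score")
    ordered = sorted(presence_scores)
    return ordered[int(fraction * len(ordered))]


def to_binary(scores, threshold):
    """Mark cells whose score reaches the threshold as suitable (1)."""
    return [1 if s >= threshold else 0 for s in scores]


def overlap_summary(default_map, best_map, cell_area_km2=1.0):
    """Return (shared area, not shared area, ratio) for two binary maps."""
    pairs = list(zip(default_map, best_map))
    shared = sum(1 for d, b in pairs if d and b)
    not_shared = sum(1 for d, b in pairs if bool(d) != bool(b))
    ratio = shared / not_shared if not_shared else float("inf")
    return shared * cell_area_km2, not_shared * cell_area_km2, ratio


def ratio_from_areas(default_area, best_area, shared_area):
    """Shared/not shared ratio from the three tabulated areas."""
    not_shared = default_area + best_area - 2 * shared_area
    return shared_area / not_shared


# First row of the table: sample size 7, best model T2.
print(round(ratio_from_areas(144_129, 447_092, 142_612), 3))
```

Under these assumptions the first row reproduces the tabulated ratio of 0.466. Some other rows differ in the last decimal place, which would be consistent with rounding of the published areas.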