Vitor Dias Tarli, Philippe Grandcolas, Roseli Pellens.
Abstract
Over the past two decades, the richness and potential of natural history collections (NHC) have been rediscovered and emphasized, promoting a revolution in access to species occurrence data and fostering the development of several disciplines. Nevertheless, because they are assembled erratically, NHC data are plagued by several biases. Understanding these biases is a major issue, particularly because ecological niche models (ENMs) assume that the data are unbiased. Accordingly, a recent body of research has focused on finding adequate methods for dealing with biased data and has proposed the use of filters in geographical and environmental space. Although the strength of filtering in environmental space has been shown with virtual species, it has not yet been tested with a real dataset that includes field validation. To contribute to this task, we compare a dataset from NHC with a recent targeted sampling of the cockroach genus Monastria Saussure, 1864 in the Brazilian Atlantic forest. We show that, despite strong similarities, the area modeled with NHC data was much smaller. These differences were due to strong climate biases, which increased the model's specificity and reduced its sensitivity. By applying two forms of rarefaction in environmental space, we show that deleting points at random in the most biased climate class is a powerful way to increase the model's sensitivity, making predictions match reality more closely.
Year: 2018 PMID: 30427869 PMCID: PMC6235285 DOI: 10.1371/journal.pone.0205710
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
The eight bioclim variables used in this study.
Abbreviation, full name, and minimum and maximum values of the occurrence records in the target sampling (TS) dataset and in the natural history collections and literature (NHC) dataset. The last columns give the difference between the two datasets (TS minus NHC) and the sum of these differences.
| Abbreviation | Variable | TS Min | TS Max | NHC Min | NHC Max | TS − NHC Min | TS − NHC Max | Sum |
|---|---|---|---|---|---|---|---|---|
| bio01 | Annual Mean Temperature | 154 | 242 | 152 | 255 | 2 | -13 | -11 |
| bio02 | Mean Diurnal Range | 63 | 130 | 64 | 140 | -1 | -10 | -11 |
| bio03 | Isothermality | 46 | 69 | 47 | 67 | -1 | 2 | 1 |
| bio05 | Max Temperature of Warmest Month | 233 | 321 | 248 | 338 | -15 | -17 | -32 |
| bio12 | Annual Precipitation | 1197 | 2102 | 1177 | 2171 | 20 | -69 | -49 |
| bio13 | Precipitation of Wettest Month | 173 | 313 | 132 | 338 | 41 | -25 | 16 |
| bio14 | Precipitation of Driest Month | 11 | 124 | 8 | 156 | 3 | -32 | -29 |
| bio15 | Precipitation Seasonality | 10 | 81 | 9 | 86 | 1 | -5 | -4 |
Temperature values are given in °C × 10, precipitation in mm.
bio02: mean of monthly (max temp − min temp).
bio03: (mean diurnal range / annual range) × 100.
bio15: coefficient of variation of monthly precipitation.
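The difference columns of this table are plain subtractions, which the following minimal Python sketch reproduces from the minima and maxima listed above. The dictionary layout and the function name `range_differences` are illustrative choices, not part of the original study.

```python
# Minimum and maximum of each bioclim variable, copied from the table above.
# Layout: abbreviation -> (TS min, TS max, NHC min, NHC max)
ranges = {
    "bio01": (154, 242, 152, 255),
    "bio02": (63, 130, 64, 140),
    "bio03": (46, 69, 47, 67),
    "bio05": (233, 321, 248, 338),
    "bio12": (1197, 2102, 1177, 2171),
    "bio13": (173, 313, 132, 338),
    "bio14": (11, 124, 8, 156),
    "bio15": (10, 81, 9, 86),
}


def range_differences(ts_min, ts_max, nhc_min, nhc_max):
    """Return (TS - NHC for the minima, TS - NHC for the maxima, their sum)."""
    d_min = ts_min - nhc_min
    d_max = ts_max - nhc_max
    return d_min, d_max, d_min + d_max


differences = {name: range_differences(*vals) for name, vals in ranges.items()}

for name, (d_min, d_max, total) in differences.items():
    print(f"{name}: min {d_min:+d}, max {d_max:+d}, sum {total:+d}")
```

A negative difference for a maximum means that the NHC records reach higher values of that variable than the TS records do; a positive difference for a minimum means that the NHC records reach lower values.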
Fig 1. Distribution of the sampling records of Monastria in the Brazilian Atlantic forest.
Data from NHC: full circle; Data from TS: presence (full triangle), absence (empty triangle).
Fig 2. Ecological niche models of the cockroach Monastria in the Neotropical Atlantic forest.
Ecological niche of the cockroach Monastria in the Neotropical Atlantic Forest modeled with two different datasets. A) Data from TS; B) Data from NHC. Values of AUC training, test and area are the mean of 20 replicates.
Relative contributions and permutation importance of the variables used for modeling the niche of Monastria with data from the two datasets.
| Abbreviation | Variable | TS percent contribution | TS permutation importance | NHC percent contribution | NHC permutation importance |
|---|---|---|---|---|---|
| bio01 | Annual Mean Temp | 0.2 | 0.2 | 0.7 | 1.2 |
| bio02 | Mean Diurnal Range | 29.2 | 20.4 | 31.1 | 25.2 |
| bio03 | Isothermality | 1.4 | 7.6 | 24.5 | 48.2 |
| bio05 | Max Temp Warmest Month | 16.7 | 6.5 | 8.6 | 2.1 |
| bio12 | Annual Precipitation | 0.5 | 0.1 | 12.9 | 16 |
| bio13 | Precip of Wettest Month | 20.1 | 33.8 | 2.7 | 0.8 |
| bio14 | Precip of Driest Month | 27 | 27.9 | 18.8 | 1.9 |
| bio15 | Precip Seasonality | 4.9 | 3.6 | 0.7 | 4.6 |
Fig 3. Distribution of the nine species of Monastria in the ENM’s dataset from NHC.
In accordance with Articles 8.2 and 8.3 of the International Code of Zoological Nomenclature, the present publication is not issued for the purposes of zoological nomenclature, and the names or acts displayed are not available and are disclaimed.
Fig 4. The response curves of the eight bioclim variables used in this study.
The curves show the mean response of the 20 replicate MaxEnt runs (red) and the mean +/- one standard deviation (blue).
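As a rough illustration of how such a summary can be built, the sketch below computes a pointwise mean and a mean ± one standard deviation band across replicate curves. The three short curves are made-up numbers, not MaxEnt output, and the use of the sample standard deviation is an assumption, since the caption does not say which estimator was used.

```python
import statistics


def summarize_replicates(curves):
    """Pointwise mean, mean - SD and mean + SD over equal-length replicate curves."""
    mean_curve, lower, upper = [], [], []
    for values in zip(*curves):  # one tuple of replicate values per x position
        m = statistics.mean(values)
        sd = statistics.stdev(values)  # sample SD (assumption, see lead-in)
        mean_curve.append(m)
        lower.append(m - sd)
        upper.append(m + sd)
    return mean_curve, lower, upper


# Hypothetical replicate curves: one list per replicate, one value per x position.
curves = [
    [0.1, 0.5, 0.9],
    [0.2, 0.6, 0.8],
    [0.3, 0.4, 1.0],
]

mean_curve, lower, upper = summarize_replicates(curves)
```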
Values of biasd calculated with data from a target sampling (TS) and data from natural history collections and literature (NHC) for eight climatic variables used to estimate ENMs of Monastria in the Brazilian Atlantic forest.
Highest values are indicated in bold.
Variables: Bio01, Mean Annual Temperature; Bio02, Mean Diurnal Range in Temperature; Bio03, Isothermality; Bio05, Max Temperature of Warmest Month; Bio12, Annual Precipitation; Bio13, Precipitation of Wettest Month; Bio14, Precipitation of Driest Month; Bio15, Precipitation Seasonality.

| Climate class | Bio01 TS | Bio01 NHC | Bio02 TS | Bio02 NHC | Bio03 TS | Bio03 NHC | Bio05 TS | Bio05 NHC | Bio12 TS | Bio12 NHC | Bio13 TS | Bio13 NHC | Bio14 TS | Bio14 NHC | Bio15 TS | Bio15 NHC |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | -2.17 | -2.08 | -0.61 | 0.00 | 0.00 | 3.56 | 2.46 | -1.62 | -1.02 | -2.34 | -1.02 | -1.44 | -1.63 | -0.94 | -1.02 | 4.04 |
| 2 | -3.40 | -2.67 | 0.00 | -0.52 | 2.04 | 2.88 | 1.47 | 2.37 | -1.84 | -2.59 | -2.17 | 0.40 | 0.74 | 0.36 | -1.23 | 0.43 |
| 3 | 1.00 | 0.00 | 1.84 | 1.30 | 1.47 | 0.00 | -0.61 | 0.00 | 1.47 | 0.00 | 0.00 | 0.00 | 2.46 | 0.38 | -1.63 | -1.82 |
| 4 | -2.42 | 0.70 | 1.47 | 1.82 | 0.00 | 0.00 | -1.09 | -0.40 | 5.44 | -2.18 | ||||||
| 5 | -1.40 | 1.73 | -1.09 | -2.18 | 1.02 | 1.91 | -1.23 | -0.93 | 3.07 | 0.86 | -0.74 | -0.34 | 3.68 | -0.40 | 3.68 | 1.78 |
| 6 | -0.93 | 1.73 | -1.00 | 0.34 | -0.54 | -2.29 | -1.47 | 1.19 | -1.49 | 0.52 | 2.17 | -0.43 | 1.47 | 2.02 | -1.63 | 1.30 |
| 7 | -1.40 | 0.86 | -1.40 | -1.73 | 0.47 | 0.52 | 1.63 | -1.82 | -1.84 | -3.06 | -0.74 | 1.19 | -0.61 | 0.52 | -1.02 | -1.62 |
| 8 | -0.47 | 0.43 | -1.02 | -3.14 | -2.79 | -3.91 | -1.23 | -2.42 | -2.17 | -2.34 | 0.74 | 1.73 | -2.49 | -2.42 | -1.09 | 1.78 |
| 9 | -2.49 | -0.94 | -1.02 | -1.44 | -0.74 | -1.78 | -1.47 | -0.52 | -1.02 | -2.83 | 1.23 | -1.15 | -1.02 | 1.44 | -1.84 | -2.16 |
Fig 5. AUC training, AUC test and area estimated with NHC and literature data rarefied in two different ways.
ϒ Mean and SD (gray line) using a dataset in which points were deleted at random only from the most biased climate class of Annual Precipitation (class 4 in Table 2); × Mean and SD (black line) using a dataset in which points were deleted at random in the entire dataset. In both cases the same number of points was deleted. They represented 30, 40, 45 and 55% of the points in the most biased climate class. Dotted line: Mean values estimated with the entire dataset from NHC. Dashed Line: Mean values estimated with the entire dataset from TS.
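The two rarefaction schemes described in this caption can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' code: the records are synthetic, the `climate_class` field and both function names are hypothetical, and the source does not say which software was used to delete the points.

```python
import random


def rarefy_most_biased_class(records, biased_class, fraction, rng):
    """Delete `fraction` of the records falling in `biased_class`, at random.

    Returns the rarefied list and the number of records deleted.
    """
    in_class = [i for i, rec in enumerate(records)
                if rec["climate_class"] == biased_class]
    n_delete = round(fraction * len(in_class))
    dropped = set(rng.sample(in_class, n_delete))
    kept = [rec for i, rec in enumerate(records) if i not in dropped]
    return kept, n_delete


def rarefy_entire_dataset(records, n_delete, rng):
    """Delete `n_delete` records chosen at random from the whole dataset."""
    dropped = set(rng.sample(range(len(records)), n_delete))
    return [rec for i, rec in enumerate(records) if i not in dropped]


def count_in_class(records, climate_class):
    """Number of records assigned to `climate_class`."""
    return sum(rec["climate_class"] == climate_class for rec in records)


# Synthetic example (not the Monastria data): 100 records, 40 of them in class 4.
records = [{"id": i, "climate_class": 4 if i < 40 else 1 + i % 3}
           for i in range(100)]

rng = random.Random(0)  # fixed seed so the example is repeatable
targeted, n_deleted = rarefy_most_biased_class(
    records, biased_class=4, fraction=0.30, rng=rng)
untargeted = rarefy_entire_dataset(records, n_deleted, rng)
```

Both rarefied datasets end up the same size, which mirrors the caption's statement that the same number of points was deleted in both cases; only the targeted scheme is guaranteed to remove all of them from the most biased class.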
Results of two-way ANOVA comparing the effect of rarefaction on the collection data (See Fig 2 for more information).
| Source of variation | Mean Square | d.f. | F | Significance (p) |
|---|---|---|---|---|
| AUC Training | ||||
| Entire dataset X Most Biased dataset | 0.002 | 1 | 18.0288 | |
| Number of Points Deleted | 0.0003 | 3 | 2.7477 | |
| Interaction | 0.0002 | 3 | 2.1301 | 0.0987 |
| AUC Test | ||||
| Entire dataset X Most Biased dataset | 0.003 | 1 | 2.9773 | 0.0864 |
| Number of Points Deleted | 0.0021 | 3 | 2.0755 | 0.1058 |
| Interaction | 0.0046 | 3 | 4.5134 | |
| Area | ||||
| Entire dataset X Most Biased dataset | 500478428 | 1 | 3.1422 | 0.0782 |
| Number of Points Deleted | 1700097789 | 3 | 10.674 | |
| Interaction | 593681737 | 3 | 3.7274 |
Bold numbers indicate statistical significance (p < 0.05).
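In a two-way ANOVA the F ratio of an effect is its mean square divided by the residual mean square, so dividing each mean square in the table by the value that follows its degrees of freedom should return roughly the same residual mean square for every row of a given response. The sketch below runs that consistency check on the Area rows; it assumes that the value after d.f. is the F ratio, and the variable names are illustrative.

```python
# (mean square, value following d.f.) pairs copied from the Area rows above.
area_rows = {
    "entire_vs_most_biased": (500478428, 3.1422),
    "number_of_points_deleted": (1700097789, 10.674),
    "interaction": (593681737, 3.7274),
}


def implied_residual_ms(mean_square, f_ratio):
    """Residual mean square implied by F = MS_effect / MS_residual."""
    return mean_square / f_ratio


implied = {name: implied_residual_ms(ms, f) for name, (ms, f) in area_rows.items()}

# Relative spread between the largest and smallest implied residual mean square.
spread = (max(implied.values()) - min(implied.values())) / min(implied.values())

for name, value in implied.items():
    print(f"{name}: implied residual mean square = {value:.4g}")
print(f"relative spread = {spread:.2e}")
```

The three implied values agree closely, which supports reading that column as F ratios sharing one residual term.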