Sample selection bias and presence-only distribution models: implications for background and pseudo-absence data.

Steven J Phillips, Miroslav Dudík, Jane Elith, Catherine H Graham, Anthony Lehmann, John Leathwick, Simon Ferrier.

Abstract

Most methods for modeling species distributions from occurrence records require additional data representing the range of environmental conditions in the modeled region. These data, called background or pseudo-absence data, are usually drawn at random from the entire region, whereas occurrence collection is often spatially biased toward easily accessed areas. Since the spatial bias generally results in environmental bias, the difference between occurrence collection and background sampling may lead to inaccurate models. To correct the estimation, we propose choosing background data with the same bias as occurrence data. We investigate theoretical and practical implications of this approach. Accurate information about spatial bias is usually lacking, so explicit biased sampling of background sites may not be possible. However, it is likely that an entire target group of species observed by similar methods will share similar bias. We therefore explore the use of all occurrences within a target group as biased background data. We compare model performance using target-group background and randomly sampled background on a comprehensive collection of data for 226 species from diverse regions of the world. We find that target-group background improves average performance for all the modeling methods we consider, with the choice of background data having as large an effect on predictive performance as the choice of modeling method. The performance improvement due to target-group background is greatest when there is strong bias in the target-group presence records. Our approach applies to regression-based modeling methods that have been adapted for use with occurrence data, such as generalized linear or additive models and boosted regression trees, and to Maxent, a probability density estimation method. 
We argue that increased awareness of the implications of spatial bias in surveys, and possible modeling remedies, will substantially improve predictions of species distributions.
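
The contrast between the two kinds of background data can be illustrated with a minimal sketch. This is not the authors' code: the site names, species labels, and function names below are hypothetical toy values, and the sketch only shows how the background sample is assembled, not how a distribution model is fitted to it.

```python
import random
from collections import Counter


def random_background(cells, n, seed=0):
    """Uniform background: every cell in the region is equally likely to be drawn."""
    rng = random.Random(seed)
    return [rng.choice(cells) for _ in range(n)]


def target_group_background(records):
    """Target-group background: the sites of all occurrence records in the group.

    Sites are kept once per record, so heavily surveyed sites appear more often
    and the background inherits the sampling bias of the occurrence data.
    """
    return [site for _species, site in records]


# Hypothetical region: two easily accessed roadside cells and four remote cells.
cells = ["road_1", "road_2", "remote_1", "remote_2", "remote_3", "remote_4"]

# Hypothetical (species, site) records for a target group surveyed by similar methods.
records = [
    ("sp_a", "road_1"), ("sp_a", "road_2"), ("sp_b", "road_1"),
    ("sp_b", "road_1"), ("sp_c", "road_2"), ("sp_c", "remote_1"),
]

tg_background = target_group_background(records)
tg_counts = Counter(tg_background)

# Share of roadside sites in the target-group background versus in the region.
tg_road_share = sum(s.startswith("road") for s in tg_background) / len(tg_background)
region_road_share = sum(c.startswith("road") for c in cells) / len(cells)

uniform_background = random_background(cells, n=6, seed=0)
```

In this toy setup the target-group background is dominated by roadside sites, mirroring the survey bias in the occurrence records, whereas a uniform draw treats every cell in the region alike.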

Year:  2009        PMID: 19323182     DOI: 10.1890/07-2153.1

Source DB:  PubMed          Journal:  Ecol Appl        ISSN: 1051-0761            Impact factor:   4.657


  344 in total

1.  Mapping landscape friction to locate isolated tsetse populations that are candidates for elimination.

Authors:  Jérémy Bouyer; Ahmadou H Dicko; Giuliano Cecchi; Sophie Ravel; Laure Guerrini; Philippe Solano; Marc J B Vreysen; Thierry De Meeûs; Renaud Lancelot
Journal:  Proc Natl Acad Sci U S A       Date:  2015-11-09       Impact factor: 11.205

2.  Displaying bias in sampling effort of data accessed from biodiversity databases using ignorance maps.

Authors:  Alejandro Ruete
Journal:  Biodivers Data J       Date:  2015-07-28

3.  Plant and animal endemism in the eastern Andean slope: challenges to conservation.

Authors:  Jennifer J Swenson; Bruce E Young; Stephan Beck; Pat Comer; Jesús H Córdova; Jessica Dyson; Dirk Embert; Filomeno Encarnación; Wanderley Ferreira; Irma Franke; Dennis Grossman; Pilar Hernandez; Sebastian K Herzog; Carmen Josse; Gonzalo Navarro; Víctor Pacheco; Bruce A Stein; Martín Timaná; Antonio Tovar; Carolina Tovar; Julieta Vargas; Carlos M Zambrana-Torrelio
Journal:  BMC Ecol       Date:  2012-01-27       Impact factor: 2.964

4.  Abundance versus presence/absence data for modelling fish habitat preference with a genetic Takagi-Sugeno fuzzy system.

Authors:  Shinji Fukuda; Ans M Mouton; Bernard De Baets
Journal:  Environ Monit Assess       Date:  2011-11-09       Impact factor: 2.513

5.  Niches and distributional areas: concepts, methods, and assumptions.

Authors:  Jorge Soberón; Miguel Nakamura
Journal:  Proc Natl Acad Sci U S A       Date:  2009-09-23       Impact factor: 11.205

6.  Anthropogenic climate change drives shift and shuffle in North Atlantic phytoplankton communities.

Authors:  Andrew D Barton; Andrew J Irwin; Zoe V Finkel; Charles A Stock
Journal:  Proc Natl Acad Sci U S A       Date:  2016-02-22       Impact factor: 11.205

7.  Inference from presence-only data; the ongoing controversy.

Authors:  Trevor Hastie; Will Fithian
Journal:  Ecography (Cop.)       Date:  2013-08-01       Impact factor: 5.992

8.  Historical distribution of Sundaland's Dipterocarp rainforests at Quaternary glacial maxima.

Authors:  Niels Raes; Charles H Cannon; Robert J Hijmans; Thomas Piessens; Leng Guan Saw; Peter C van Welzen; J W Ferry Slik
Journal:  Proc Natl Acad Sci U S A       Date:  2014-11-10       Impact factor: 11.205

9.  Projected loss of a salamander diversity hotspot as a consequence of projected global climate change.

Authors:  Joseph R Milanovich; William E Peterman; Nathan P Nibbelink; John C Maerz
Journal:  PLoS One       Date:  2010-08-16       Impact factor: 3.240

10.  Re-shuffling of species with climate disruption: a no-analog future for California birds?

Authors:  Diana Stralberg; Dennis Jongsomjit; Christine A Howell; Mark A Snyder; John D Alexander; John A Wiens; Terry L Root
Journal:  PLoS One       Date:  2009-09-02       Impact factor: 3.240
