A simulation study of the use of temporal occupancy for identifying core and transient species.

Sara Snell Taylor^1, Jessica R Coyle^2, Ethan P White^3,4, Allen H Hurlbert^1,5.

Abstract

Transient species, which do not maintain self-sustaining populations in a system where they are observed, are ubiquitous in nature and their presence often impacts the interpretation of ecological patterns and processes. Identifying transient species from temporal occupancy, the proportion of time a species is observed at a given site over a time series, is subject to classification errors as a result of imperfect detection and source-sink dynamics. We use a simulation-based approach to assess how often errors in detection or classification occur in order to validate the use of temporal occupancy as a metric for inferring whether a species is a core or transient member of a community. We found that low detection increases error in the classification of core species, while high habitat heterogeneity and high detection increase error in classification of transient species. These findings confirm that temporal occupancy is a valid metric for inferring whether a species can maintain a self-sustaining population, but imperfect detection, low abundance, and highly heterogeneous landscapes may yield high misclassification rates.

Year: 2020 PMID: 33095844 PMCID: PMC7584212 DOI: 10.1371/journal.pone.0241198

Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240

Introduction

Understanding the processes underlying community assembly is one of the primary goals of community ecology. Traditional approaches make inferences about community processes based on the set of species identified as community members, typically those observed at a study site [1, 2]. Data on communities are typically gathered via field surveys at a given site for one or more time points. However, the record of species from such community surveys often includes transient species that do not maintain self-sustaining populations in that community [3]. A growing number of studies use temporal occupancy, or the proportion of a multi-year time series over which a species is observed, to determine which species are "core" members of their communities and which species are transient [3-9]. Temporal occupancy provides a quantitative measure of persistence within a community over time and its distribution tends to be bimodal [3]. Previous studies have used arbitrary thresholds (e.g. core species are those that occur in more than 50%, or 67%, or 75% of years [3, 5, 7]), and Snell Taylor et al. (2018) found that a wide range of ecological patterns were generally robust to the specific threshold used.

Nevertheless, ecological data collection is imperfect and using temporal occupancy to infer core or transient classification is susceptible to classification error. One type of error is inferring that a species is transient when it is actually a core member of the community. A self-sustaining species that is present on the landscape every year may fail to be observed in some years, and hence be misclassified as a transient species, for three primary reasons (Table 1). These missed detections can occur due to low population densities [10-12], less conspicuous morphology (e.g., drab plumage) or behavior (e.g., singing quietly or infrequently; [13, 14]), and habitat structure with characteristics that limit the distance over which individuals can be detected (e.g., dense vegetation which may obscure sightings and attenuate sound; [15-18]). This last possibility in particular may lead to potentially confounding gradients of average detectability along large-scale environmental gradients that range from open, low productivity deserts and grasslands to higher productivity forests. Although the effect of imperfect detectability on temporal occupancy and species classification is qualitatively understood, it is unclear how frequently and at what levels of detectability and abundance such errors occur.

Table 1

Ways that species can be correctly (cells B and C) or incorrectly (cells A and D; boxes in red) classified as maintaining a viable population based on temporal occupancy.

	Maintains a viable population R₀ ≥ 1, "core"	Does not maintain a viable population R₀ < 1, "transient"
Low temporal occupancy, inferred to be "transient"	A: Species that occur persistently at low density or that have traits making them difficult to detect	B: Species that only irregularly occur in the local habitat because they are poorly suited to that habitat
High temporal occupancy, inferred to be "core"	C: Core members of the community that maintain viable populations and are reliably observed almost every year	D: Species that occur regularly in the local habitat despite failing to maintain positive population growth rates due to repeated immigration from adjacent source habitat
Error rates	A / (A + C)	D / (B + D)

R₀ refers to the net reproductive rate of a species in a location.
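The two error rates in the last row of Table 1 can be computed directly from the four cell counts. The following is a minimal Python sketch, not the authors' code (their simulations are in R); the function name `error_rates` and the example counts are hypothetical and chosen only for illustration.

```python
def error_rates(a, b, c, d):
    """Return (core_error, transient_error) from the four Table 1 counts.

    a: truly core, inferred transient (error)
    b: truly transient, inferred transient (correct)
    c: truly core, inferred core (correct)
    d: truly transient, inferred core (error)
    """
    # Share of truly core species that were inferred to be transient.
    core_error = a / (a + c) if (a + c) > 0 else float("nan")
    # Share of truly transient species that were inferred to be core.
    transient_error = d / (b + d) if (b + d) > 0 else float("nan")
    return core_error, transient_error


# Hypothetical counts for one focal community (illustration only).
core_err, transient_err = error_rates(a=2, b=9, c=18, d=1)
print(core_err, transient_err)
```

Each rate is conditioned on the true status of the species (the column of Table 1), so the two rates have different denominators and need not sum to anything meaningful.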

The opposite classification error is also possible, where a species is inferred to be a core member of a community based on frequent occurrence in a time series, even though it does not maintain a locally viable population (Table 1). Individuals of some species are regularly observed in habitats in which they do not successfully reproduce because they disperse in from adjacent suitable habitat [19, 20]. For example, in plants, seeds might be regularly dispersed into inhospitable habitats [21] and in birds, younger and lower quality males are often displaced by dominant males to adjacent, suboptimal habitats [22]. In such cases, the temporal frequency with which a species is observed might be a poor indicator of the extent to which a species can actually maintain a viable population in that location.

Understanding the frequency of classification errors and the factors that affect those errors is critical for properly interpreting patterns based on temporal occupancy. Here, we use a simulation-based approach to examine community dynamics (based on death, birth, dispersal, and establishment) on complex, dual-habitat landscapes in which species’ habitat associations are known. We varied average species’ detectability and habitat heterogeneity of the simulated landscapes to assess how these variables affect rates of misclassification. We expect that core species are more likely to be misclassified as transients when either detectability or abundance is low. In contrast, we expect that species that do not successfully breed in a habitat are more likely to be misclassified as core members when the local community is embedded within a more heterogeneous landscape, which increases the likelihood of mass effects from adjacent habitats.

Methods

Simulation model

Each simulation began by generating an initial landscape, species pool, and global species abundance distribution (GSAD). The 32 x 32 pixel landscape was made up of two distinct habitat types, A and B, with a parameter for the proportion of the landscape made up of habitat type A (h; Fig 1A). Each grid cell represented a local community with a fixed community carrying capacity of 100 total individuals of any species. The species pool contained 40 total species, with half that could only reproduce successfully in habitat A and half that could only reproduce successfully in habitat B. The GSAD was a vector of relative species abundances assigned from a lognormal distribution that defined the relative probability that an immigrant from outside the landscape would belong to each species. Initially, the landscape was filled to carrying capacity with individuals drawn randomly from the GSAD.
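The initialization described above can be sketched as follows. This is an illustrative Python sketch under stated assumptions, not the authors' R implementation: the text does not say how habitat types are arranged in space or which lognormal parameters were used, so the independent per-cell habitat draw and the values `mu=0.0`, `sigma=1.0` are assumptions, and all names (`init_simulation`, `GRID`, `CAPACITY`) are hypothetical.

```python
import random

N_SPECIES = 40   # species pool size (from the text)
GRID = 32        # landscape is GRID x GRID pixels (from the text)
CAPACITY = 100   # individuals per local community (from the text)


def init_simulation(h, seed=0, mu=0.0, sigma=1.0):
    """Build an initial landscape, species pool, GSAD, and communities.

    h is the target proportion of habitat A. ASSUMPTION: each cell is
    assigned independently, so the realized proportion only approximates h.
    ASSUMPTION: lognormal parameters mu and sigma are placeholders.
    """
    rng = random.Random(seed)
    habitat = [["A" if rng.random() < h else "B" for _ in range(GRID)]
               for _ in range(GRID)]
    # Half of the species reproduce only in habitat A, half only in B.
    preferred = ["A"] * (N_SPECIES // 2) + ["B"] * (N_SPECIES // 2)
    # GSAD: lognormal draws normalized into relative abundances.
    raw = [rng.lognormvariate(mu, sigma) for _ in range(N_SPECIES)]
    total = sum(raw)
    gsad = [x / total for x in raw]
    # Fill every cell to carrying capacity with species drawn from the GSAD.
    communities = [[rng.choices(range(N_SPECIES), weights=gsad, k=CAPACITY)
                    for _ in range(GRID)] for _ in range(GRID)]
    return habitat, preferred, gsad, communities


habitat, preferred, gsad, communities = init_simulation(h=0.9, seed=1)
print(len(communities), len(communities[0]), len(communities[0][0]))
```

Each community is stored as a list of species indices, one entry per individual, which keeps the later death and establishment steps simple to express.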

Fig 1

Schematic documenting the events that occur in a single time step of the simulation, including death, birth, dispersal, and establishment.

See text for details.

In each time step, meant to represent one year, the following four processes were modeled:

Death. The probability of mortality for each individual at a time step was 0.5 (Fig 1B). Death rates were independent of the habitat type in which the species occurred.

Birth. All individuals occurring within their preferred habitat type produced two offspring per time step, while individuals occurring in a non-preferred habitat type did not reproduce. Offspring were termed “propagules” until they established in a community (see below; Fig 1C).

Dispersal. Newly generated propagules dispersed in random directions by a distance drawn from a half-Gaussian distribution with a mean of 1.24 grid cells (99% of movements result in dispersal distances ≤ 4 grid cells; Fig 1D). Established individuals (i.e. adults) only dispersed if they were in non-preferred habitats. We also explored dispersal kernels that were narrower (99% of movements within 2 grid cells) or broader (99% of movements within 8 grid cells) to confirm that results were qualitatively similar. Results for these simulations are presented in S1–S4 Figs, S1 and S2 Tables.

Establishment. Empty spaces in each community were colonized by either a migrant from outside the community (drawn probabilistically from the GSAD) with a constant immigration rate probability (0.001) or by an individual selected randomly from the pool of new or dispersing propagules. Once individuals became established, they only left their community via dispersal or death (Fig 1E). Propagules that did not establish were eliminated at the end of each time step.

We ran simulations for 200 time steps, which was long enough for species richness to achieve equilibrium in the landscape, and used the last 15 time steps to calculate temporal occupancy. Fifteen time steps represented an ecological dataset with a 15-year time series, a sampling period used in several previous studies which provides a reasonably high resolution estimate of temporal occupancy [7, 23]. Additionally, we calculated landscape-wide abundances for each species at the end of the simulation.

We ran 50 replicate simulations for values of h ∈ {0.5, 0.6, 0.7, 0.8, 0.9} to generate landscapes that were more (high h) or less (low h) homogeneous. For each simulation, we also imposed a stochastic detection process in which we varied the probability of detecting an individual known to be present, p, from 0.1 to 1.0 in increments of 0.1. Detection probability was assumed to be both species- and habitat-independent. This resulted in a vector of "observed" species abundances in each grid cell at each time step.
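Two of the stochastic draws described in this section, the half-Gaussian dispersal distance and the per-individual detection process, can be sketched in Python as below. This is an illustration under stated assumptions rather than the authors' R code: the names `SIGMA`, `Q99`, `disperse`, and `observe` are hypothetical, positions are kept as continuous coordinates (the text does not say how distances are mapped to grid cells or how landscape edges are handled), and detection is treated as an independent trial per individual. For a half-Gaussian, the mean equals σ·√(2/π), so a mean of 1.24 implies σ of about 1.55 grid cells and a 99th percentile of about 4.0 grid cells, which agrees with the figures quoted above.

```python
import math
import random
from statistics import NormalDist

MEAN_DISTANCE = 1.24  # mean dispersal distance in grid cells (from the text)

# Half-Gaussian: E[|Z|] = sigma * sqrt(2 / pi), so solve for sigma.
SIGMA = MEAN_DISTANCE / math.sqrt(2.0 / math.pi)
# P(|Z| <= q) = 0.99 corresponds to the 0.995 quantile of the full Gaussian.
Q99 = SIGMA * NormalDist().inv_cdf(0.995)


def disperse(x, y, rng):
    """Move a propagule from (x, y) in a uniformly random direction.

    ASSUMPTION: continuous coordinates; rounding to cells and edge
    handling are left to the caller.
    """
    distance = abs(rng.gauss(0.0, SIGMA))
    angle = rng.uniform(0.0, 2.0 * math.pi)
    return x + distance * math.cos(angle), y + distance * math.sin(angle)


def observe(abundance, p, rng):
    """Observed count when each individual is detected with probability p."""
    return sum(1 for _ in range(abundance) if rng.random() < p)


rng = random.Random(42)
print(round(SIGMA, 3), round(Q99, 2))
print(disperse(16.0, 16.0, rng), observe(30, 0.5, rng))
```

Applying `observe` to the true abundance of every species in a cell at every time step would yield the vector of "observed" abundances that the temporal occupancy calculation works from.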

Simulation analysis

We examined the temporal dynamics of species within a single, centrally located pixel for each simulation run. Based on the habitat type of the focal pixel, all species either could (core) or could not (transient) reproduce within that pixel and hence maintain a viable population. We refer to this as their biological, or true, status. In addition, each species was classified as core or transient based on temporal occupancy over the last 15 years of the simulation run. Species observed in five years or fewer (≤ 33%) were classified as transient while species observed in more than ten years (> 66%) were classified as core. For these analyses, we ignored the minority of species with intermediate temporal occupancy which could not be unambiguously assigned to core or transient status. Thus, each of the species we considered fell into one of the four categories shown in Table 1. For each simulation run, we calculated the rate of misclassifying core species and the rate of misclassifying transient species (Table 1). Error rates were examined as a function of average detection probability and landscape similarity in the 7 x 7 pixel region surrounding the focal pixel, which was calculated as the proportion of the regional window that was the same habitat type as the focal pixel. We used this regional window size because it reflects the area over which most colonization events to the focal pixel would originate. Number of species and classification error rates were predicted by detection probability and landscape similarity using ordinary least squares linear models. The relationship between species abundance and core species classification at detection = 0.5 was assessed using a generalized linear model with a logit link. Code for running these simulations in R is archived at https://github.com/ssnell6/CT-sim.
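The occupancy-based classification rule and the landscape similarity measure described above can be sketched as follows. This is a minimal Python illustration, not the archived R code; the function names are hypothetical. Counting the focal pixel itself as part of the 7 x 7 window is an assumption, although it is consistent with the similarity values reported in the Fig 2 caption (for example, 0.49 ≈ 24/49).

```python
def classify(observed_by_year):
    """Classify a species from its observed abundance in each year.

    Transient: observed in at most one third of years (<= 5 of 15).
    Core: observed in more than two thirds of years (> 10 of 15).
    Anything in between is 'intermediate' and is ignored in the analysis.
    """
    years = len(observed_by_year)
    years_present = sum(1 for n in observed_by_year if n > 0)
    # Integer comparisons avoid floating-point issues at the thresholds.
    if 3 * years_present <= years:
        return "transient"
    if 3 * years_present > 2 * years:
        return "core"
    return "intermediate"


def landscape_similarity(habitat, row, col, window=7):
    """Proportion of a window x window region matching the focal habitat.

    ASSUMPTION: the focal pixel is included in the window, and the window
    must lie fully inside the landscape.
    """
    half = window // 2
    if (row - half < 0 or col - half < 0
            or row + half >= len(habitat) or col + half >= len(habitat[0])):
        raise ValueError("window extends beyond the landscape")
    focal = habitat[row][col]
    cells = [habitat[r][c]
             for r in range(row - half, row + half + 1)
             for c in range(col - half, col + half + 1)]
    return sum(1 for cell in cells if cell == focal) / len(cells)


print(classify([3, 0, 1, 2, 4, 1, 0, 2, 5, 1, 1, 2, 3, 1, 2]))
```

Comparing the label returned by `classify` with the true status of each species (whether it can reproduce in the focal habitat) fills in the four cells of Table 1 for a given simulation run.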

Results

Communities in homogeneous landscapes (e.g., Fig 2a) typically had a large number of true core species and only a few true transient species at any given point in time (Fig 2b). Turnover in the identity of the transient species from one time step to the next resulted in a mode of low temporal occupancy within an overall bimodal distribution of temporal occupancy (Fig 2c). Communities in heterogeneous landscapes (e.g., Fig 2d) had more true transient species appear in their non-preferred habitat type in any given time step due to the greater area of potential sources of colonization (Fig 2e). Many of these transient species were maintained by repeated dispersal from the alternate habitat type in the surrounding landscape such that they had moderate to high values of temporal occupancy (Fig 2f).

Fig 2

(A) Sample landscape of one simulation run in which the proportion of the full landscape that was habitat A (in red) was set to 0.9. Landscape similarity around the focal pixel is 0.92. (B) Number of core species (that can reproduce in the red habitat, red line) and transient species (that cannot reproduce in the red habitat, gray line), plotted over time for the focal pixel from the landscape in (A). (C) Temporal occupancy distribution of the species in the focal pixel from the landscape in (A). Colors of the bars indicate the number of species according to which habitat type they can reproduce in. (D) Sample landscape of one simulation run in which the proportion of the landscape that was habitat A (red) was set to 0.5. Landscape similarity around the focal pixel is 0.49. (E) Number of core species (red line) and transient species (gray line), plotted over time for the focal pixel from the landscape in (D). (F) Temporal occupancy distribution of the species in the focal pixel from the landscape in (D). Colors of the stacked bars indicate the number of species according to which habitat type they can reproduce in.

Due to the large number of simulation replicates run, all statistical relationships examined had p < 2e-16 (S1 and S2 Tables), so we focus here on reporting only the sign of effects and the variance explained.

The number of true core species (those maintaining a locally viable population) observed in a pixel increased with detection probability, and even more so with landscape similarity (S5–S7 Figs). More variance in the number of true core species could be explained by landscape similarity (R² = 36%) than detection probability (R² = 2%). The number of true transient species (those not maintaining a viable population) observed increased with detection probability and decreased strongly with landscape similarity (S5–S7 Figs). More variance in the number of true transient species could be explained by landscape similarity (R² = 74%) than detection (R² = 5%).

Species that were true core members of the focal community were more likely to be incorrectly inferred as transient at low detection probabilities and low landscape similarities (Fig 3a). More variance in the proportion of misclassified core species could be explained by detection (R² = 46%) than landscape similarity (R² = 11%). Error rates were close to zero when landscape similarity was greater than 0.6 and detection probability was greater than 0.3 and increased most noticeably when detection probability was 0.1, the lowest detection rate examined.

Fig 3

Percent of biologically core (A) species that were incorrectly inferred to be transient and biologically transient (D) species that were incorrectly inferred to be core for each combination of detection probability and landscape similarity. The x-axis is the average species detection probability for the simulation run, while the y-axis is the proportion of a 7 x 7 landscape surrounding the focal pixel that is of the same habitat type. Line graphs (B, E) show the percent of incorrect classifications of core species (B) or transient species (E) for each detection probability at low (0.3, solid line) or high (0.8, dashed line) landscape similarity. Line graphs (C, F) show the percent of incorrect classifications of core species (C) or transient species (F) with increasing landscape similarity at low (0.1, solid line) or high (0.9, dashed line) detection probability.

Transient species that did not reproduce in the focal habitat but regularly occurred there were incorrectly inferred as core most often at high detection probabilities and low landscape similarities (Fig 3b). More variance in the proportion of misclassified transient species could be explained by landscape similarity (R² = 48%) than detection (R² = 13%). Error rates for classifying transient species were zero or near zero when landscape similarity was greater than 0.5. Transient species misclassification rates were greatest when landscape similarity was less than 0.4, where the majority of colonization events came from the opposite habitat type, such that poorly adapted species appeared in the focal habitat repeatedly over time. This was exacerbated at high detection probability, which ensured these true transient occurrences were observed and therefore misclassified.

Additionally, species with low landscape-wide abundance were more likely to be misclassified as transient when they were truly core members of their community, while the odds of misclassifying a core species were less than 13% for species whose abundance was at least 12% of the most abundant species (Fig 4).

Fig 4

Probability of correct classification of biologically core species based on temporal occupancy as a function of the log of landscape wide abundance (relative to the abundance of the most abundant species, 100%).

Dashed line indicates the location of the inflection point.

Results were similar using both narrower and broader dispersal kernels (S1–S4 Figs, S1 and S2 Tables), with narrow kernels having slightly more variance in classification rate than broader kernels.

Discussion

Several studies have used temporal occupancy to infer the persistence of populations over time and the degree to which a species can be considered a core member of a community in a particular location [3, 7–9, 23]. Our simulations showed that in many realistic scenarios, this is a valid approach, but also confirmed that temporal occupancy is subject to misclassification errors where core species are inferred to be transient and transient species are inferred to be core. As expected, low detection probabilities resulted in more frequent misclassification of core species as transient. Rare species were also more likely to be misclassified as transient. Low landscape similarity, when combined with high detection probabilities, resulted in transient species more frequently being misclassified as core.

Imperfect individual detection influenced the rate at which core species were misclassified as transients through failing to detect species when they were actually present. These species were more likely to be inferred as transient at lower detection probabilities. However, error rates for core species misidentified as transients were quite low as long as detection probabilities were greater than approximately 0.3. This threshold of 0.3 is at the low end of detection probabilities observed for most bird species, with most species exhibiting substantially higher rates of detection [24-26]. Specifically, Boulinier et al. (1998) found that across a range of habitats in North America, average detection probabilities for species richness estimates using the Breeding Bird Survey ranged from 0.65 to 0.85. Johnston et al. (2014) found that the least detectable family of birds was Paridae, which had a median detectability of 0.27, and the majority of other families had detection probabilities greater than 0.3. Overall, these findings suggest that the misclassification of core species is unlikely to be common except at unusually low detection probabilities that may be relevant for only a small minority of species.

Misclassification of core species as transients was also more common for species occurring at low abundance across the landscape, and in particular, for species with abundances less than 12% of the most abundant species. The probability of detecting a species with n individuals given an individual detection probability, p, will be 1 − (1 − p)^n, and thus this link between abundance, species detection, and potential misclassification of transient status is quite expected. Also, species that occur at low density may have large home ranges relative to the scale of the survey (e.g. woodpeckers, raptors), and may frequently be missed on surveys even if they are on territory and have high detectability when present. This is one reason that relatively small spatial scales have been shown to have fewer perceived core species and more perceived transient species compared to larger scales [3, 27].

The opposite classification error, misclassifying transient species as core species, was associated with high habitat heterogeneity. Species occurred in habitats to which they were poorly adapted because of dispersal from nearby source populations. The greater the surrounding area containing source populations, the greater the chance of repeated dispersal into nearby sink habitats causing the species to be regularly detected through time [20]. These errors became prevalent when 60% or more of the surrounding landscape was different from the focal habitat. Our simulation model assumed that dispersal of new propagules was random with respect to habitat type, but if dispersal was biased toward the preferred habitat type (which seems likely for organisms with active dispersal; i.e., Johnston et al. 2014), it would reduce the frequency of transient occurrences and therefore reduce observed error rates. The rate at which transient species were misclassified as core species also decreased with decreasing detection probability because at low detection probabilities, errors caused by repeated dispersal from adjacent source habitats were canceled out by detection errors. Overall, these results suggest that misclassification of transient species is unlikely to be common except in highly fragmented landscape configurations with unbiased dispersal.

Geographic patterns in the relative prevalence of core and transient species can influence our understanding of ecological communities when this distinction is not recognized [7, 23], especially if the probability of misclassification varies geographically. One likely source for this is detection probability, which is thought to vary along environmental gradients. In particular, it has been suggested that average detectability decreases along continental to global productivity gradients because species are more difficult to detect in more densely vegetated environments [17]. However, despite such a potential bias, past work has shown that there is typically a positive relationship between either temporal occupancy or species richness and remotely sensed proxies for productivity, meaning the observed patterns were opposite what would be predicted purely from a detectability effect [7, 17, 23]. As such, these patterns of occupancy and richness were observed despite, and not because of, geographic variation in detectability. Other studies have suggested that birds sing more frequently in densely forested habitats so aural-based sampling should not observe this effect in forests, but in open habitats [24]. In these cases, variation in detection probability alone has the potential to generate apparent patterns in richness or abundance, with misclassification rates of species varying across the gradient.

While we parameterized our simulation model to loosely reflect the biology of songbirds (e.g. reproductive rate, dispersal distance), the inferences that can be made from this simulation model are more broadly generalizable. We chose to focus on birds because they are highly mobile, can disperse widely, and have been studied empirically in this core-transient context [3, 7, 23]. These first two attributes make temporal occupancy particularly useful for identifying core and transient birds in communities, but also potentially more prone to errors due to source-sink dynamics.

Detectability is dependent on both species attributes and the environment. Some species are inherently more detectable due to variation in species color, size, and behavior. A large, colorful bird perched conspicuously or that sings loudly and frequently is detected more often than a little brown bird in the undergrowth, given they occur at equal densities. Our study is more relevant for considering how detection probability covaries along an environmental gradient, where detection probability likely varies on average across all species, than for considering how detection probability varies among species. Nevertheless, species known to have low detection probabilities will presumably require more targeted monitoring efforts, and temporal occupancy should be used with caution to infer population persistence and habitat suitability for such species.

The aim of our simulation model was to capture how frequently species are misclassified within the core-transient temporal occupancy framework. Therefore, we focused on landscape similarity and detectability, but other parameters could also play a role in determining the effectiveness of temporal occupancy. In our study, birth rates and death rates were constant, so increasing the birth rates or decreasing death rates of species occurring in their preferred habitats could allow specialists to reach equilibrium in a habitat more quickly, decreasing the number of transient species in the community. Additionally, varying immigration across species could enable one species to immigrate more effectively into new habitats than other species, but in our study, immigration and dispersal were analogous because both allowed species to colonize new habitats. We addressed alternative dispersal distances into new cells by varying the dispersal kernels in supplementary analyses, which demonstrated that only at very low dispersal do detection and landscape similarity affect core and transient classification.

In general, we found that temporal occupancy can reliably be used to infer habitat associations, as well as the likelihood of a species maintaining a viable population in the location where it was observed, under a broad range of conditions. Depending on the nature of the raw data available, occupancy modeling approaches (sensu MacKenzie et al. 2002) may have the potential to refine assignments of core and transient species status by directly accounting for detectability, and deserve further research in this context. The use of raw temporal occupancy may be most problematic in study systems made up of highly isolated habitat fragments where species commonly disperse from the surrounding landscape matrix, or in habitats or for species with uniformly low detection probabilities. Ecologists should explicitly consider whether detection probabilities vary across the environmental gradients in their study systems before using temporal occupancy. Considering the relationship of landscape similarity and detection in specific study systems will provide a guide for when and how to include temporal occupancy in ecological analyses.
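The link between abundance, detection, and misclassification of core species that the Discussion describes can be made concrete with a short calculation. If a species has n individuals present and each is detected independently with probability p, the species is detected in a given year with probability 1 − (1 − p)^n. The Python sketch below goes one step further under simplifying assumptions that are not part of the original simulation: local abundance n is the same every year and years are independent, so the number of years with a detection is binomial. The function names are hypothetical.

```python
import math


def species_detection_probability(p, n):
    """Probability that at least one of n individuals is detected."""
    return 1.0 - (1.0 - p) ** n


def prob_core_inferred_transient(p, n, years=15, max_years_transient=5):
    """Chance that a species present every year is observed in at most
    max_years_transient of the years, and so is labeled transient.

    ASSUMPTIONS: constant local abundance n and independent years, so the
    number of years with a detection is Binomial(years, q).
    """
    q = species_detection_probability(p, n)
    return sum(math.comb(years, k) * q ** k * (1.0 - q) ** (years - k)
               for k in range(max_years_transient + 1))


# Compare a single individual with ten individuals at low detectability.
for n in (1, 10):
    print(n, prob_core_inferred_transient(p=0.1, n=n))
```

Under these assumptions the misclassification probability falls steeply as either p or n increases, which matches the qualitative pattern reported for low detection probability and low abundance.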
Percent of biologically core (A) species that were incorrectly inferred to be transient and biologically transient (D) species that were incorrectly inferred to be core for each combination of detection probability and landscape similarity at a narrower dispersal kernel (99% of movements result in dispersal distances ≤ 2 grid cells). The x-axis is the average species detection probability for the simulation run, while the y-axis is the proportion of a 7 x 7 landscape surrounding the focal pixel that is of the same habitat type. Line graphs (B, E) show the percent of incorrect classifications of core species (B) or transient species (E) for each detection probability at low (0.3, solid line) or high (0.8, dashed line) landscape similarity. Line graphs (C, F) show the percent of incorrect classifications of core species (C) or transient species (F) with increasing landscape similarity at low (0.1, solid line) or high (0.9, dashed line) detection probability. (DOCX) Click here for additional data file.

Correct or incorrect classification of biologically core species based on temporal occupancy as a function of the log of landscape wide abundance (relative to the abundance of the most abundant species) at a narrower dispersal kernel (99% of movements result in dispersal distances ≤ 2 grid cells).

(DOCX) Click here for additional data file. Percent of biologically core (A) species that were incorrectly inferred to be transient and biologically transient (D) species that were incorrectly inferred to be core for each combination of detection probability and landscape similarity at a broader dispersal kernel (99% of movements result in dispersal distances ≤ 8 grid cells). The x-axis is the average species detection probability for the simulation run, while the y-axis is the proportion of a 7 x 7 landscape surrounding the focal pixel that is of the same habitat type. Line graphs (B, E) show the percent of incorrect classifications of core species (B) or transient species (E) for each detection probability at low (0.3, solid line) or high (0.8, dashed line) landscape similarity. Line graphs (C, F) show the percent of incorrect classifications of core species (C) or transient species (F) with increasing landscape similarity at low (0.1, solid line) or high (0.9, dashed line) detection probability. (DOCX) Click here for additional data file.

Correct or incorrect classification of biologically core species based on temporal occupancy as a function of the log of landscape wide abundance (relative to the abundance of the most abundant species) at a broader dispersal kernel (99% of movements result in dispersal distances ≤ 8 grid cells).

Mean number of biologically core (A) and biologically transient (D) species observed for each combination of detection probability and landscape similarity at the main analysis dispersal kernel (99% of movements result in dispersal distances ≤ 4 grid cells). Line graphs (B, E) show the mean count of core species (B) or transient species (E) for each detection probability at low (0.3, solid line) or high (0.8, dashed line) landscape similarity. Line graphs (C, F) show the mean count of core species (C) or transient species (F) with increasing landscape similarity at low (0.1, solid line) or high (0.9, dashed line) detection probability.

Mean number of biologically core (A) and biologically transient (D) species observed for each combination of detection probability and landscape similarity at a narrower dispersal kernel (99% of movements result in dispersal distances ≤ 2 grid cells). Line graphs (B, E) show the mean count of core species (B) or transient species (E) for each detection probability at low (0.3, solid line) or high (0.8, dashed line) landscape similarity. Line graphs (C, F) show the mean count of core species (C) or transient species (F) with increasing landscape similarity at low (0.1, solid line) or high (0.9, dashed line) detection probability.

Mean number of biologically core (A) and biologically transient (D) species observed for each combination of detection probability and landscape similarity at a broader dispersal kernel (99% of movements result in dispersal distances ≤ 8 grid cells). Line graphs (B, E) show the mean count of core species (B) or transient species (E) for each detection probability at low (0.3, solid line) or high (0.8, dashed line) landscape similarity. Line graphs (C, F) show the mean count of core species (C) or transient species (F) with increasing landscape similarity at low (0.1, solid line) or high (0.9, dashed line) detection probability.

Parameter estimates from linear models of the number of core and transient species as a function of detection and landscape similarity for all dispersal kernels.

Parameter estimates from linear models of the percent incorrect core and transient species as a function of detection and landscape similarity for all dispersal kernels.

25 Aug 2020

PONE-D-20-24435
A simulation study of the use of temporal occupancy for identifying core and transient species
PLOS ONE

Dear Dr. Snell Taylor,

Thank you for submitting your manuscript to PLOS ONE. After careful consideration, we feel that it has merit but does not fully meet PLOS ONE’s publication criteria as it currently stands. Therefore, we invite you to submit a revised version of the manuscript that addresses the points raised during the review process.

Both reviewers found the study interesting and generally well written. However both reviewers also felt some additional justification of the modelling framework is needed, and found the distinction between transient and sink species somewhat unclear.

Please submit your revised manuscript by Oct 09 2020 11:59PM. If you will need more time than this to complete your revisions, please reply to this message or contact the journal office at plosone@plos.org. When you're ready to submit your revision, log on to https://www.editorialmanager.com/pone/ and select the 'Submissions Needing Revision' folder to locate your manuscript file.

Please include the following items when submitting your revised manuscript:

A rebuttal letter that responds to each point raised by the academic editor and reviewer(s). You should upload this letter as a separate file labeled 'Response to Reviewers'.

A marked-up copy of your manuscript that highlights changes made to the original version. You should upload this as a separate file labeled 'Revised Manuscript with Track Changes'.

An unmarked version of your revised paper without tracked changes. You should upload this as a separate file labeled 'Manuscript'.

If you would like to make changes to your financial disclosure, please include your updated statement in your cover letter. Guidelines for resubmitting your figure files are available below the reviewer comments at the end of this letter.
If applicable, we recommend that you deposit your laboratory protocols in protocols.io to enhance the reproducibility of your results. Protocols.io assigns your protocol its own identifier (DOI) so that it can be cited independently in the future. For instructions see: http://journals.plos.org/plosone/s/submission-guidelines#loc-laboratory-protocols We look forward to receiving your revised manuscript. Kind regards, Patrick R Stephens, Ph.D. Academic Editor PLOS ONE Journal requirements: When submitting your revision, we need you to address these additional requirements. 1. Please ensure that your manuscript meets PLOS ONE's style requirements, including those for file naming. The PLOS ONE style templates can be found at https://journals.plos.org/plosone/s/file?id=wjVg/PLOSOne_formatting_sample_main_body.pdf and https://journals.plos.org/plosone/s/file?id=ba62/PLOSOne_formatting_sample_title_authors_affiliations.pdf 2. Please include captions for your Supporting Information files at the end of your manuscript, and update any in-text citations to match accordingly. Please see our Supporting Information guidelines for more information: http://journals.plos.org/plosone/s/supporting-information. Reviewers' comments: Reviewer's Responses to Questions Comments to the Author 1. Is the manuscript technically sound, and do the data support the conclusions? The manuscript must describe a technically sound piece of scientific research with data that supports the conclusions. Experiments must have been conducted rigorously, with appropriate controls, replication, and sample sizes. The conclusions must be drawn appropriately based on the data presented. Reviewer #1: Yes Reviewer #2: Yes ********** 2. Has the statistical analysis been performed appropriately and rigorously? Reviewer #1: Yes Reviewer #2: Yes ********** 3. Have the authors made all data underlying the findings in their manuscript fully available? 
The PLOS Data policy requires authors to make all data underlying the findings described in their manuscript fully available without restriction, with rare exception (please refer to the Data Availability Statement in the manuscript PDF file). The data should be provided as part of the manuscript or its supporting information, or deposited to a public repository. For example, in addition to summary statistics, the data points behind means, medians and variance measures should be available. If there are restrictions on publicly sharing data—e.g. participant privacy or use of data from a third party—those must be specified. Reviewer #1: Yes Reviewer #2: Yes ********** 4. Is the manuscript presented in an intelligible fashion and written in standard English? PLOS ONE does not copyedit accepted manuscripts, so the language in submitted articles must be clear, correct, and unambiguous. Any typographical or grammatical errors should be corrected at revision, so please note any specific errors here. Reviewer #1: Yes Reviewer #2: Yes ********** 5. Review Comments to the Author Please use the space provided to explain your answers to the questions above. You may also include additional comments for the author, including concerns about dual publication, research ethics, or publication ethics. (Please upload your review as an attachment if it exceeds 20,000 characters) Reviewer #1: This paper uses a simulation model to examine the roles that heterogeneity in both landscapes and detection probability influence categorization of species as either 'core' or 'transient' members of a community. The results of the model are in line with the authors expectations that core species will be misidentified as transient mostly due to lower detection probabilities, and that transient species will be misidentified as core mostly due to increased landscape heterogeneity. 
These results, though not too surprising, reinforce the notion that mis-classification of species is possible, and quantifies under which scenarios different mis-classifications are expected to occur. The model itself is well-described in the Methods section, and the subheadings make it easy to follow this description. The code is available in the supplemental material. The Methods and Results are written very clearly and succinctly, and would be possible to reproduce in any programming language. All in all, I thought this paper flowed very naturally. I have just a few questions:

1) Are 'transient' and 'sink' species the same thing (line 54)? How does this model compare or add to the source/sink dynamic models?

2) Is landscape heterogeneity the major driver of transient species occurrences in communities? I'm not too familiar with the literature on core/transient species in communities, but it seems like factors other than habitat preference might also drive the occurrence of a transient species in a community. For example, naturally low population sizes, meta-population dynamics, etc. This is beyond the scope of this model, but it would be good to mention other factors that influence whether a species is transient.

3) Most of the model parameters make sense and are explained clearly. But why did you choose a 7x7 pixel region surrounding the focal pixel to quantify heterogeneity? Did you make sure that this sized region was truly reflective of landscape heterogeneity on the whole grid? Given enough simulations, this might not be an issue, but maybe a little explanation about why you chose this size would be helpful.

Reviewer #2: This is a simulation study that aims to clarify potential biases associated with applying temporal occupancy approaches to classify core and transient species.
While the results are somewhat intuitive, as misclassification depends on both detection rates and the spatial proximity between habitats, putting these recommendations “out there” will be useful for many ecologists. I have a few suggestions, but I feel they are relatively minor. Sincerely, Jonathan Belmaker

Introduction: The ability to detect core or transient species will depend on how temporal occupancy is used to separate these groups. A few more words here would be useful for the naïve reader to understand how core and transient species are separated in other studies, and what approach will be used here.

Methods: Very specific birth, death, immigration rate and carrying capacity parameters were used. I do not believe this will change the results, but some sensitivity analyses to the modification of these parameters would increase the robustness of the findings (this was done for dispersal rate only).

For the simulations varying detection probability, the distribution of detection probabilities across species and habitats is not clear. Please clarify. I had to get to the discussion to understand that detection probability did not vary among species.

I would like to better understand the rational for this, as it seems less relevant to natural communities than models that vary detection among species. Furthermore, I would expect that when detection rates vary among species the misclassification of core and transient species may increase dramatically. I would strongly suggest to explore adding such heterogeneity in detection rate to the simulations.

Results: It does not make sense to use P values in simulation models (as one can always achieve a significant result by increasing sample size). I would delete the P values throughout the text.

Discussion: Occupancy modeling should improve imperfect detection. It is worth discussing how these methods may improve the separation between core and transient species given the results presented here.
The impact of rarity is strong and nicely seen in figure 5, but I feel does not receive sufficient attention in the discussion.

Figures and tables: I am not sure Figure 3 is very informative, as it is not the number of core and transient species that is interesting in this context but the misclassification rates. I would remove.

**********

6. PLOS authors have the option to publish the peer review history of their article (what does this mean?). If published, this will include your full peer review and any attached files. If you choose “no”, your identity will remain anonymous but your review may still be made public. Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy.

Reviewer #1: Yes: Andrew Laughlin

Reviewer #2: Yes: Jonathan Belmaker

[NOTE: If reviewer comments were submitted as an attachment file, they will be attached to this email and accessible via the submission site. Please log into your account, locate the manuscript record, and check for the action link "View Attachments". If this link does not appear, there are no attachment files.]

While revising your submission, please upload your figure files to the Preflight Analysis and Conversion Engine (PACE) digital diagnostic tool, https://pacev2.apexcovantage.com/. PACE helps ensure that figures meet PLOS requirements. To use PACE, you must first register as a user. Registration is free. Then, login and navigate to the UPLOAD tab, where you will find detailed instructions on how to use the tool. If you encounter any issues or have any questions when using PACE, please email PLOS at figures@plos.org. Please note that Supporting Information files do not need this step.

25 Sep 2020

For formatted responses, please see attached document.

I have just a few questions:

1) Are 'transient' and 'sink' species the same thing (line 54)? How does this model compare or add to the source/sink dynamic models?

Although we link the processes we describe to source-sink dynamics throughout the text, we agree that it was unclear to refer to the term "sink species", as “sink” is better used to describe the location than the species occurring in that location. To clarify, we have deleted the “or sink” phrase in this sentence.

2) Is landscape heterogeneity the major driver of transient species occurrences in communities? I'm not too familiar with the literature on core/transient species in communities, but it seems like factors other than habitat preference might also drive the occurrence of a transient species in a community. For example, naturally low population sizes, meta-population dynamics, etc. This is beyond the scope of this model, but it would be good to mention other factors that influence whether a species is transient.

Indeed, low population density and metapopulation dynamics do contribute to the likelihood of a species being classified as transient. We mention low population density on line 67 of the introduction and have expanded our discussion of this effect on lines 230-240 of the discussion. Metapopulation dynamics occur due to discrete populations connected to varying degrees by dispersal across a heterogeneous landscape. This is what we are explicitly modeling and what our simulated landscape with varying degrees of heterogeneity seeks to represent.

3) Most of the model parameters make sense and are explained clearly. But why did you choose a 7x7 pixel region surrounding the focal pixel to quantify heterogeneity? Did you make sure that this sized region was truly reflective of landscape heterogeneity on the whole grid? Given enough simulations, this might not be an issue, but maybe a little explanation about why you chose this size would be helpful.

Thank you for this comment.
Each pixel on the landscape contains an entire biological community of organisms and we wanted to represent the heterogeneity of the region from which the vast majority of dispersing propagules colonize. For the main dispersal kernel we examined, 99% of movements occurred within 4 grid cells. Had we chosen a larger window, landscape features outside of the likely range of colonization would influence our estimate of heterogeneity. We have added a sentence justifying this decision in lines 154-155.

Reviewer #2:

Introduction: The ability to detect core or transient species will depend on how temporal occupancy is used to separate these groups. A few more words here would be useful for the naïve reader to understand how core and transient species are separated in other studies, and what approach will be used here.

We have added text and citations to clarify how core and transient species have previously been separated in the introduction (ll. 59-61).

Methods: Very specific birth, death, immigration rate and carrying capacity parameters were used. I do not believe this will change the results, but some sensitivity analyses to the modification of these parameters would increase the robustness of the findings (this was done for dispersal rate only).

We appreciate the reviewer’s suggestions for sensitivity analyses, and we carried out this analysis for dispersal rate, which, of all the model parameters, is most directly connected to the process leading to the presence of transient species in habitats to which they are unsuited. The other parameters mentioned by the reviewer (birth rate, death rate, immigration rate, and carrying capacity) are largely independent of the dispersal and detection processes in our model that would affect the misclassification of species as core or transient.
As we mention in the discussion (lines 287-298), interspecific variability in these parameters and others might be interesting to consider for other reasons, but interspecific variation is beyond the scope of our present aims.

For the simulations varying detection probability, the distribution of detection probabilities across species and habitats is not clear. Please clarify. I had to get to the discussion to understand that detection probability did not vary among species.

On lines 137-139 of the methods section, we state that “Detection probability was assumed to be both species- and habitat-independent.”

I would like to better understand the rational for this, as it seems less relevant to natural communities than models that vary detection among species. Furthermore, I would expect that when detection rates vary among species the misclassification of core and transient species may increase dramatically. I would strongly suggest to explore adding such heterogeneity in detection rate to the simulations.

We appreciate the reviewer’s comment regarding detection probability, and they are absolutely correct that detectability varies among species empirically based on traits such as color, size, and behavior. Introducing interspecific variation in detection probabilities would undoubtedly help explain which species are being misclassified within the simulation, but this is not our aim and we take it as a given. We disagree with the reviewer’s intuition regarding the effect of heterogeneity in detectability across species within a simulation run. The fewest misclassifications of species as transient will occur when detectability is high for all species, the most misclassifications will occur when detectability is low for all species, and when there is a mix of high and low detectability species, the misclassification rate for the community will be intermediate.
Instead, we are interested in the consequences of imperfect detectability across an entire community, given that detectability may vary on average from community to community as a function of environmental characteristics. We have added text in the introduction on lines 71-73 to emphasize this motivation. Given these aims, and the fact that a “data point” in our analyses is a grid cell (characterized by mean detectability and landscape heterogeneity) and not a species, we find it most appropriate to vary detectability as we have, which provides a sense of the best and worst case scenarios.

Results: It does not make sense to use P values in simulation models (as one can always achieve a significant result by increasing sample size). I would delete the P values throughout the text.

Thank you for pointing this out. We now include a statement acknowledging this fact (lines 171-173), and we no longer refer to the P values in the results section of the manuscript.

Discussion: Occupancy modeling should improve imperfect detection. It is worth discussing how these methods may improve the separation between core and transient species given the results presented here.

We have added the following sentence to the Discussion (lines 302-305): "Depending on the nature of the raw data available, occupancy modeling approaches (sensu MacKenzie et al. 2006) may have the potential to refine assignments of core and transient species status by directly accounting for detectability, and deserve further research in this context." That said, there are many nuances regarding the types of data that occupancy modeling requires and the specific questions that it is able to answer, and we feel this extended discussion is tangential to the aims of our manuscript. Specifically, we would argue that “occupancy” as used in the occupancy estimation modeling literature is typically a description of spatial distribution, defined in MacKenzie et al. (2006) as the “proportion of an area occupied by a species or fraction of landscape units where the species is present.” Proper occupancy modeling requires repeat visits within a year in order to estimate detectability, and in datasets where sites are visited only once per year, it is more difficult to separate detectability and occupancy. Temporal occupancy is instead a proxy for population persistence over time that can be calculated for a much broader array of dataset types, and the point of this manuscript is that it performs reasonably well.

The impact of rarity is strong and nicely seen in figure 5, but I feel does not receive sufficient attention in the discussion.

Thank you for pointing this out. We have added a paragraph discussing this result on lines 230-240: "Misclassification of core species as transients was also more common for species occurring at low abundance across the landscape, and in particular, for species with abundances less than 12% of the most abundant species. The probability of detecting a species with n individuals given an individual detection probability, p, will be $1 - (1 - p)^{n}$, and thus this link between abundance, species detection, and potential misclassification of transient status is quite expected. Also, species that occur at low density may have large home ranges relative to the scale of the survey (e.g. woodpeckers, raptors), and may frequently be missed on surveys even if they are on territory and have high detectability when present. This is one reason that relatively small spatial scales have been shown to have fewer perceived core species and more perceived transient species compared to larger scales (Snell Taylor et al. 2018; Jenkins et al. 2018)."

Figures and tables: I am not sure Figure 3 is very informative, as it is not the number of core and transient species that is interesting in this context but the misclassification rates. I would remove.

We have moved Figure 3 to the supplemental information (Figure S5).
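The species-level detection expression quoted in the rarity response above can be illustrated with a short numerical sketch. This is not the authors' simulation code. The function names, the 15-year series length, and the cutoff of 11 observed years (more than two-thirds) are hypothetical choices made for illustration, and survey years are assumed independent with constant abundance.

```python
from math import comb


def species_detection_probability(p: float, n: int) -> float:
    """Chance that at least one of n individuals is detected in a survey.

    Assumes each individual is detected independently with probability p,
    which gives 1 - (1 - p)^n, the expression quoted in the response.
    """
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must lie in [0, 1]")
    if n < 0:
        raise ValueError("n must be non-negative")
    return 1.0 - (1.0 - p) ** n


def prob_observed_as_core(p: float, n: int, years: int = 15, min_years: int = 11) -> float:
    """Chance that a species present every year is seen in at least min_years.

    Hypothetical defaults: a 15-year series and a core cutoff of more than
    two-thirds of years (11 of 15). Years are treated as independent surveys
    with constant abundance n, so the count of detection years is binomial.
    """
    q = species_detection_probability(p, n)
    return sum(
        comb(years, k) * q**k * (1.0 - q) ** (years - k)
        for k in range(min_years, years + 1)
    )


if __name__ == "__main__":
    # Compare rare and common species at a low individual detection probability.
    for n in (1, 5, 25, 100):
        print(n,
              round(species_detection_probability(0.1, n), 3),
              round(prob_observed_as_core(0.1, n), 3))
```

Under these assumptions, the chance that a truly core species clears the occupancy cutoff rises steeply with abundance, which is the abundance-driven misclassification pattern the response describes.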
We feel it remains useful for contextualizing the data on misclassification rates in Figure 4 given that there is variation in the total number of transient and core species that could potentially be misclassified across parameter space.

Submitted filename: Response to reviewers.pdf

12 Oct 2020

A simulation study of the use of temporal occupancy for identifying core and transient species
PONE-D-20-24435R1

Dear Dr. Snell Taylor,

We’re pleased to inform you that your manuscript has been judged scientifically suitable for publication and will be formally accepted for publication once it meets all outstanding technical requirements. Within one week, you’ll receive an e-mail detailing the required amendments. When these have been addressed, you’ll receive a formal acceptance letter and your manuscript will be scheduled for publication.

An invoice for payment will follow shortly after the formal acceptance. To ensure an efficient process, please log into Editorial Manager at http://www.editorialmanager.com/pone/, click the 'Update My Information' link at the top of the page, and double check that your user information is up-to-date. If you have any billing related questions, please contact our Author Billing department directly at authorbilling@plos.org.

If your institution or institutions have a press office, please notify them about your upcoming paper to help maximize its impact. If they’ll be preparing press materials, please inform our press team as soon as possible -- no later than 48 hours after receiving the formal acceptance. Your manuscript will remain under strict press embargo until 2 pm Eastern Time on the date of publication. For more information, please contact onepress@plos.org.

Kind regards, Patrick R Stephens, Ph.D. Academic Editor PLOS ONE

Additional Editor Comments (optional):

Reviewers' comments: Reviewer's Responses to Questions Comments to the Author

1.
If the authors have adequately addressed your comments raised in a previous round of review and you feel that this manuscript is now acceptable for publication, you may indicate that here to bypass the “Comments to the Author” section, enter your conflict of interest statement in the “Confidential to Editor” section, and submit your "Accept" recommendation. Reviewer #1: All comments have been addressed Reviewer #2: All comments have been addressed ********** 2. Is the manuscript technically sound, and do the data support the conclusions? The manuscript must describe a technically sound piece of scientific research with data that supports the conclusions. Experiments must have been conducted rigorously, with appropriate controls, replication, and sample sizes. The conclusions must be drawn appropriately based on the data presented. Reviewer #1: Yes Reviewer #2: Yes ********** 3. Has the statistical analysis been performed appropriately and rigorously? Reviewer #1: Yes Reviewer #2: Yes ********** 4. Have the authors made all data underlying the findings in their manuscript fully available? The PLOS Data policy requires authors to make all data underlying the findings described in their manuscript fully available without restriction, with rare exception (please refer to the Data Availability Statement in the manuscript PDF file). The data should be provided as part of the manuscript or its supporting information, or deposited to a public repository. For example, in addition to summary statistics, the data points behind means, medians and variance measures should be available. If there are restrictions on publicly sharing data—e.g. participant privacy or use of data from a third party—those must be specified. Reviewer #1: Yes Reviewer #2: Yes ********** 5. Is the manuscript presented in an intelligible fashion and written in standard English? PLOS ONE does not copyedit accepted manuscripts, so the language in submitted articles must be clear, correct, and unambiguous. 
Any typographical or grammatical errors should be corrected at revision, so please note any specific errors here. Reviewer #1: Yes Reviewer #2: Yes ********** 6. Review Comments to the Author Please use the space provided to explain your answers to the questions above. You may also include additional comments for the author, including concerns about dual publication, research ethics, or publication ethics. (Please upload your review as an attachment if it exceeds 20,000 characters) Reviewer #1: (No Response) Reviewer #2: This is the second time I am reviewing this manuscript. In general, my previous comments were addressed (for a few small issues I do not completely agree, but this is a matter of opinion and I am willing to accept the authors’ response). I do not have any new suggestions I will be happy to see this accepted for publication. Sincerely, Jonathan Belmaker ********** 7. PLOS authors have the option to publish the peer review history of their article (what does this mean?). If published, this will include your full peer review and any attached files. If you choose “no”, your identity will remain anonymous but your review may still be made public. Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy. Reviewer #1: Yes: Andrew Laughlin Reviewer #2: Yes: Jonathan Belmaker 14 Oct 2020 PONE-D-20-24435R1 A simulation study of the use of temporal occupancy for identifying core and transient species Dear Dr. Snell Taylor: I'm pleased to inform you that your manuscript has been deemed suitable for publication in PLOS ONE. Congratulations! Your manuscript is now with our production department. If your institution or institutions have a press office, please let them know about your upcoming paper now to help maximize its impact. If they'll be preparing press materials, please inform our press team within the next 48 hours. 
Your manuscript will remain under strict press embargo until 2 pm Eastern Time on the date of publication. For more information please contact onepress@plos.org. If we can help with anything else, please email us at plosone@plos.org. Thank you for submitting your work to PLOS ONE and supporting open access. Kind regards, PLOS ONE Editorial Office Staff on behalf of Dr. Patrick R Stephens Academic Editor PLOS ONE

1. Explaining the excess of rare species in natural species abundance distributions.

2. A core-transient framework for trait-based community ecology: an example from a tropical tree seedling community.

3. The prevalence and impact of transient species in ecological communities.

4. Opposing mechanisms drive richness patterns of core and transient bird species.

5. Accounting for imperfect detection in ecology: a quantitative review.

6. The proportion of core species in a community varies with spatial scale and environmental heterogeneity.

7. Inferring species richness using multispecies occupancy modeling: Estimation performance and interpretation.