Abstract
The detection of target species is of paramount importance in ecological studies, with implications for environmental management and natural resource conservation planning. This is usually done by sampling the area: the species is detected if at least one individual is found in the samples. Green & Young (Green & Young 1993 Sampling to detect rare species. Ecol. Appl. 3, 351-356. (doi:10.2307/1941837)) introduce two models, one based on the Poisson distribution and one on the Negative Binomial distribution, to determine the minimum number of samples n ensuring that the probability of failing to detect the species, if it is actually present in the area, does not exceed a fixed threshold. We generalize them to two scenarios, one considering the area size N to be finite, and the other allowing detectability errors, with probability δ. The results in Green & Young are recovered by taking N → ∞ and δ = 0. Ignoring the finite size of the area, when it is known, leads to an overestimation of n, which is vital to avoid if sampling is expensive or difficult, while assuming that there are no detectability errors, when they do exist, produces an undesirable bias. Our approximation manages to skirt both problems, for both the Poisson and the Negative Binomial models.
Keywords: Negative Binomial; Poisson; detection error; sampling; target species
Year: 2022 PMID: 35958088 PMCID: PMC9364006 DOI: 10.1098/rsos.220046
Source DB: PubMed Journal: R Soc Open Sci ISSN: 2054-5703 Impact factor: 3.653
Figure 1. Example of an area A of size 64 m², from which six quadrats of 1 m² have been chosen at random. Here, N = 64.
Results of K = 10⁷ iterations of algorithm 1 for some values of N in the two examples of table 1, using both (2.1) and (1.1) to determine n. p is the probability of not detecting the species in area A, if present, from the n samples; it should be approximately equal to β and ideally no greater, although the simulation procedure may produce a value that narrowly violates this constraint.
| example 1: | | | example 2: | | |
|---|---|---|---|---|---|
| N | n (formula) | p | N | n (formula) | p |
| 3000 | 2330 (2.1) | 0.05003096 | 300 | 233 (2.1) | 0.05001881 |
| | 2996 (1.1) | 0.00021405 | | 300 (1.1) | 0.00000000 |
| 4000 | 2698 (2.1) | 0.04991710 | 400 | 270 (2.1) | 0.04977368 |
| | 2996 (1.1) | 0.03213956 | | 300 (1.1) | 0.03198875 |
| 5000 | 2876 (2.1) | 0.04994049 | 500 | 288 (2.1) | 0.04973971 |
| | 2996 (1.1) | 0.04360176 | | 300 (1.1) | 0.04333699 |
Two examples of how the overestimation of n by formula (1.1), relative to formula (2.1), decreases as N increases. In both cases, the maximum overestimation (attained at the smallest N) is greater than 28% (666/2330 ≈ 28.6% and 67/233 ≈ 28.8%).
| example 1: | | | example 2: | | |
|---|---|---|---|---|---|
| N | n from (2.1) | (1.1)–(2.1) | N | n from (2.1) | (1.1)–(2.1) |
| 3000 | 2330 | 666 | 300 | 233 | 67 |
| 3500 | 2543 | 453 | 350 | 255 | 45 |
| 4000 | 2698 | 298 | 400 | 270 | 30 |
| 4500 | 2805 | 191 | 450 | 281 | 19 |
| 5000 | 2876 | 120 | 500 | 288 | 12 |
| 5500 | 2921 | 75 | 550 | 293 | 7 |
| 6000 | 2950 | 46 | 600 | 295 | 5 |
| 6500 | 2968 | 28 | 650 | 297 | 3 |
| 7000 | 2979 | 17 | 700 | 298 | 2 |
| 7500 | 2986 | 10 | 750 | 299 | 1 |
| 8000 | 2990 | 6 | 800 | 299 | 1 |
| 8500 | 2992 | 4 | 850 | 300 | 0 |
| 9000 | 2994 | 2 | 900 | 300 | 0 |
| 9500 | 2995 | 1 | 950 | 300 | 0 |
| 10 000 | 2995 | 1 | 1000 | 300 | 0 |
| 10 500 | 2996 | 0 | 1050 | 300 | 0 |
| 11 000 | 2996 | 0 | 1100 | 300 | 0 |
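The pattern in the table can be reproduced with a short calculation. The sketch below is a reconstruction and not a quotation of formulae (1.1) and (2.1): it assumes that counts in unit quadrats are independent Poisson variables with mean m, and that the species is "present" when the N units of the area hold at least one individual. The values m = 0.001 and m = 0.01 with β = 0.05 are assumptions chosen because they are consistent with the tabulated n; the function names are hypothetical.

```python
import math

def n_unbounded(m, beta):
    """Smallest n with exp(-m*n) <= beta (area treated as unbounded)."""
    return math.ceil(-math.log(beta) / m)

def n_finite_area(m, beta, N):
    """Smallest n with P(no individual in n units | present in N units) <= beta.

    Under the Poisson assumption this conditional probability is
    (exp(-m*n) - exp(-m*N)) / (1 - exp(-m*N)).
    """
    return math.ceil(-math.log(beta + (1 - beta) * math.exp(-m * N)) / m)

# Assumed parameters (not stated in the text above).
m, beta = 0.001, 0.05
for N in (3000, 5000, 8000, 11000):
    n_fin = n_finite_area(m, beta, N)
    # Columns: N, n for the finite area, overestimation from ignoring N.
    print(N, n_fin, n_unbounded(m, beta) - n_fin)
```

Under these assumptions the gap between the two sample sizes shrinks as exp(-m*N) becomes negligible next to β, which is why the difference column reaches 0 for large N.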
Results of K = 10⁷ iterations of algorithm 2 for examples 3 and 4. p is the probability of not detecting the species in area A, if present, from the n samples, with δ the probability of not detecting any individual.
| example 3: | | | | |
|---|---|---|---|---|
| δ | n | p | n | p |
| 0.0000 | 2543 | 0.04980696 | 2996 | 0.02044645 |
| 0.0001 | 2543 | 0.05005280 | 2997 | 0.02037177 |
| 0.0005 | 2544 | 0.04998135 | 2998 | 0.02044940 |
| 0.0010 | 2545 | 0.04998002 | 2999 | 0.02043230 |
| 0.0050 | 2556 | 0.04984517 | 3011 | 0.02044562 |
| 0.0100 | 2568 | 0.04991535 | 3026 | 0.02041467 |
| 0.0500 | 2677 | 0.04993774 | 3154 | 0.02035112 |
| 0.1000 | 2825 | 0.05009235 | 3329 | 0.02046655 |
| example 4: | ||||
| 0.1 | 3278 | 0.04996806 | 3329 | 0.04766190 |
| 0.2 | 3688 | 0.04992643 | 3745 | 0.04749297 |
| 0.3 | 4214 | 0.05003753 | 4280 | 0.04770510 |
| 0.4 | 4917 | 0.04991602 | 4993 | 0.04761596 |
| 0.5 | 5900 | 0.04993702 | 5992 | 0.04759538 |
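The effect of δ on n and p can be illustrated with a closed-form sketch. It is not the source's algorithm 2: it assumes Poisson counts with mean m per unit quadrat, each individual missed independently with probability δ (so detected counts per sampled unit are Poisson with mean m(1 − δ)), and presence defined over the N units of the area. The values m = 0.001, β = 0.05 and N = 6000 are assumptions that happen to be consistent with the example 4 rows; the function names are hypothetical.

```python
import math

def n_required(m, beta, N=math.inf, delta=0.0):
    """Smallest n keeping the miss probability at or below beta."""
    tail = math.exp(-m * N) if math.isfinite(N) else 0.0  # P(area is empty)
    return math.ceil(-math.log(beta + (1 - beta) * tail) / (m * (1 - delta)))

def miss_probability(m, n, N=math.inf, delta=0.0):
    """P(nothing detected in n units | species present in the N units)."""
    tail = math.exp(-m * N) if math.isfinite(N) else 0.0
    return (math.exp(-m * (1 - delta) * n) - tail) / (1 - tail)

# Assumed parameters (not stated in the text above).
m, beta, N = 0.001, 0.05, 6000
for delta in (0.1, 0.3, 0.5):
    n_fin = n_required(m, beta, N, delta)     # finite area, with detection error
    n_inf = n_required(m, beta, delta=delta)  # area treated as unbounded
    print(delta, n_fin, round(miss_probability(m, n_fin, N, delta), 4),
          n_inf, round(miss_probability(m, n_inf, N, delta), 4))
```

In this sketch δ simply rescales the required n by 1/(1 − δ), so the sample size doubles at δ = 0.5, while the larger n obtained by ignoring N yields a miss probability somewhat below β.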
Figure 2. Pipeline for applying the approach in practice: finding the minimum number of samples n ensuring that the probability of failing to detect the species from them, if it is actually present in the area, does not exceed a fixed threshold β.
Summary of formulae to determine n. Taking the limit as r → +∞ in the formulae corresponding to the Negative Binomial (model B) we obtain those corresponding to the Poisson model (model A).
| model A: Poisson distribution Pois( | ||
|---|---|---|
| number of samples | scenario 1. Area size | |
| scenario 2 | ||
| model B: Negative Binomial distribution NB( | ||
| number of samples | scenario 1. Area size | |
| scenario 2 | ||
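The limit mentioned in the caption can be checked on the zero-count probability, which is what drives the sample size. The sketch below assumes the Negative Binomial is parametrized by its mean m and a dispersion parameter r (the parametrization used in the table is not recoverable here) and that the n samples are independent. It covers only the classical case N → ∞, δ = 0, and ignores the final rounding of n up to an integer.

```latex
P_{\mathrm{NB}}(X = 0) = \left(1 + \frac{m}{r}\right)^{-r}
\quad\Longrightarrow\quad
n_{\mathrm{NB}} = \frac{-\ln \beta}{r \ln\!\left(1 + \frac{m}{r}\right)},
\qquad
\lim_{r \to +\infty} r \ln\!\left(1 + \frac{m}{r}\right) = m
\quad\Longrightarrow\quad
n_{\mathrm{NB}} \;\to\; \frac{-\ln \beta}{m} = n_{\mathrm{Pois}} .
```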
Summary of estimates. Model A: w is the number of the n samples in which the presence of the species has been detected (equal to the number of samples containing at least one individual, if δ = 0). Model B: x₁, …, xₙ are the numbers of individuals of the species detected in each of the samples (equal to the numbers of individuals actually present in them, if δ = 0). The estimates use the sample mean and the sample (uncorrected) variance.
| model A: Pois( | model B: NB( | |
|---|---|---|
| scenario 2. | ||