The power to detect recent fragmentation events using genetic differentiation methods.

Michael W Lloyd, Lesley Campbell, Maile C Neel.

Abstract

Habitat loss and fragmentation are imminent threats to biological diversity worldwide and thus are fundamental issues in conservation biology. Increased isolation alone has been implicated as a driver of negative impacts in populations associated with fragmented landscapes. Genetic monitoring and the use of measures of genetic divergence have been proposed as means to detect changes in landscape connectivity. Our goal was to evaluate the sensitivity of Wright's F st, Hedrick's G'st, Sherwin's MI, and Jost's D to recent fragmentation events across a range of population sizes and sampling regimes. We constructed an individual-based model, which used a factorial design to compare effects of varying population size, presence or absence of overlapping generations, and presence or absence of population sub-structuring. Increases in population size, overlapping generations, and population sub-structuring each reduced F st, G'st, MI, and D. The signal of fragmentation was detected within two generations for all metrics. However, the magnitude of the change in each was small in all cases, and when N e was >100 individuals it was extremely small. Multi-generational sampling and population estimates are required to differentiate the signal of background divergence from changes in F st, G'st, MI, and D associated with fragmentation. Finally, the window during which rapid change in F st, G'st, MI, and D between generations occurs can be small, and if missed would lead to inconclusive results. For these reasons, use of F st, G'st, MI, or D for detecting and monitoring changes in connectivity is likely to prove difficult in real-world scenarios. We advocate use of genetic monitoring only in conjunction with estimates of actual movement among patches such that one could compare current movement with the genetic signature of past movement to determine whether there has been a change.

Year: 2013 PMID: 23704965 PMCID: PMC3660580 DOI: 10.1371/journal.pone.0063981

Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240

Introduction

Habitat loss and fragmentation are considered to be among the most imminent threats to biological diversity worldwide and thus are fundamental issues in conservation biology [1]–[4]. Fragmentation is a complex phenomenon that is simultaneously a consequence of habitat loss and a process in and of itself [5]–[7]. It is a function of the extensiveness of individual patches, distances among those patches [8]–[10], the nature of the intervening landscape [11], and how individual species are affected by each of those aspects [12]. Understanding the joint and independent effects of loss and configuration of the remaining habitat has long been a major focus of landscape ecology due to conservation implications e.g., [13]–[17]. Although the two phenomena are intertwined, when they are examined separately habitat loss has repeatedly been shown to have larger detrimental effects than fragmentation alone [5], [7], [18]–[21]. Still, increased isolation has been implicated as a driver of population extinctions [22], declining population size of interior species [13], [23], altered social behavior [24], reduced population viability [25], [26], demographic change in general [11], [27], [28], and spread of invasive species [29]. Reduced migration under lower levels of connectivity will have genetic consequences of reduced effective population size (N e) and increased rates of inbreeding and genetic drift within newly isolated habitat patches that will affect short- and long-term potential for survival [30]–[33]. Changes in landscape composition and configuration associated with the fragmentation process have been quantified and monitored using an extensive array of landscape indices [34]–[40]. Assessing the consequences of these changes for populations and processes fundamentally requires linking the structural attributes of landscape pattern with potential or actual movement of individuals among patches [8], [40]–[43]. 
Movement is often documented using habitat suitability, mark-recapture, radio-telemetry, experimental removal-recolonization studies [19], [41] and demographic monitoring [44]–[46]. Unfortunately, such studies can be so data- and time-intensive that there may be little practical application for conservation of most species e.g., [47], [48]. Observing physical movement of cryptic or primarily sessile organisms in which mobility is limited to particular life stages is especially challenging [49], [50]. Genetic monitoring [51] has been proposed as a minimally invasive, relatively cost-effective means of quantifying genetic effects of changes in landscape structure. Population genetic parameters may be more sensitive for detecting changes in connectivity than traditional demographic estimates that have large error components [52]. Thus, although in many cases conservation biologists are concerned about genetic diversity for its own sake, here we are interested in the potential for using genetic changes that result from fragmentation to quantify changes in the ecological process of movement. Direct genetic methods have been developed to detect actual dispersal events [53]–[57]. However, it is more common for investigators to document fragmentation using indirect methods that quantify the amount of divergence in populations in putatively fragmented habitat. Although potentially more powerful analytical methods have been developed [58]–[61] and are being tested [62], [63], most investigators use Wright’s F st [64] and its analogues [65]–[73]. Despite its fundamental importance and strong theoretical foundations, detecting genetic effects of fragmentation in the wild has not been as straightforward as one might expect. Attempts to link indices of landscape structure to ecological and evolutionary processes have not yielded consistent relationships and many empirical investigations of fragmentation fail to detect definitive effects [74]–[76]. 
In particular, empirical data are often equivocal relative to predictions of the impacts of fragmentation on genetic divergence. Inconsistent relationships may result from non-monotonic relationships between many landscape metrics and landscape configuration [10] or non-linear or threshold-like population responses along the fragmentation gradient. Additionally, not all habitat that is perceived as fragmented by humans is actually fragmented from the perspective of a species of interest, thus some investigations may be trying to quantify effects of fragmentation where it actually does not exist. As mentioned above, the point at which discrete patches are functionally fragmented depends on the scale at which a species perceives and interacts with the landscape [77]–[79]. For species in patchy habitats, connectivity ultimately depends on the degree to which land cover types between discrete patches are barriers, versus filters, versus easily traversable; information that is lacking for most species. Moreover, even if movement through a landscape is impeded or precluded through anthropogenic change, long-lived individuals that pre-date the fragmentation event would provide a genetic signature of connectivity that no longer exists [74]. These issues can be addressed through careful study design in which temporal and spatial sampling scales match potential scales of fragmentation based on the biology of the focal organism. Of greater concern is the potential that characteristics of F st-related values might make them insufficient for detecting habitat fragmentation on time scales that are relevant for conservation management. Wright’s F st and subsequent derivations have a number of specific assumptions that are almost always violated in natural systems and complicate interpretation of genetic divergence and gene flow among populations [80]–[83]. 
Because F st integrates over evolutionary time, it is difficult to separate current from historical processes based on a single estimate of pattern alone, and it may be slow to reflect changes in migration following a fragmentation event, especially if N e remains large. Additionally, the alleles that are most likely to be lost through drift are at low frequencies in populations and these alleles contribute little to F st values [84]. Slow response may also arise from the fact that when connectivity is only reduced rather than eliminated entirely, estimates of F st may remain close to zero [83]. Finally, measures of genetic structure (e.g., F st, G st, Φst) can be depressed when within-subpopulation heterozygosity or variance is high relative to among-subpopulation levels, which is common with highly diverse markers (e.g., microsatellites) [85]–[89]. F st–related measures calculated from such data will never approach unity regardless of the underlying patterns of allelic diversity, and they do not behave monotonically. Hedrick [86] sought to overcome the dependence of G st (a generalization of Wright's F st to include multiple alleles) on levels of heterozygosity by standardizing the measure against the maximum G st possible for the observed amount of heterozygosity. The resulting statistic, G’ st, varies from 0 to 1 in a way that better reflects the underlying patterns of genetic diversity [86], but remains fundamentally based on heterozygosity. Jost [85] proposed a measure of genetic divergence based on allelic diversity (D) that varies between 0 and 1 regardless of within-population heterozygosity, and it is suggested to better reflect population differentiation. Heller and Siegismund [90] found that values of Jost’s D calculated from data in 34 published studies were ∼60% greater than the corresponding G st values, and that G’ st values were ∼85% greater than G st.
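The dependence of G st on within-population heterozygosity, and the way G’ st and D remove it, can be illustrated with a minimal single-locus sketch. It uses the basic population-level formulas of Nei, Hedrick, and Jost on known allele frequencies; it is not the Weir and Cockerham θ or the D est_Chao estimator used in this study, and the function names and the two example populations are hypothetical.

```python
from typing import Dict, List


def expected_heterozygosity(freqs: Dict[str, float]) -> float:
    """Expected heterozygosity, 1 - sum(p_i^2), for one set of allele frequencies."""
    return 1.0 - sum(p * p for p in freqs.values())


def differentiation(pops: List[Dict[str, float]]) -> Dict[str, float]:
    """G_st, Hedrick's G'_st and Jost's D for one locus.

    pops holds one allele-frequency dict per population (equal weighting).
    Edge cases (H_s = 1 or H_t = 0) are not handled in this sketch.
    """
    k = len(pops)
    # Mean within-population heterozygosity.
    hs = sum(expected_heterozygosity(p) for p in pops) / k
    # Total heterozygosity from the mean allele frequencies.
    alleles = set().union(*pops)
    mean_freqs = {a: sum(p.get(a, 0.0) for p in pops) / k for a in alleles}
    ht = expected_heterozygosity(mean_freqs)

    gst = (ht - hs) / ht
    g_prime_st = gst * (k - 1 + hs) / ((k - 1) * (1 - hs))
    jost_d = ((ht - hs) / (1 - hs)) * (k / (k - 1))
    return {"gst": gst, "g_prime_st": g_prime_st, "jost_d": jost_d}


# Two hypothetical populations that share no alleles, each with ten
# equally frequent alleles (a highly variable, microsatellite-like locus).
pop_a = {f"a{i}": 0.1 for i in range(10)}
pop_b = {f"b{i}": 0.1 for i in range(10)}
result = differentiation([pop_a, pop_b])
print({name: round(value, 4) for name, value in result.items()})
```

In this example the populations are completely differentiated, yet G st is only about 0.05 because H s is 0.9, whereas G’ st and D both equal 1.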
The increased magnitude of both G’ st and Jost’s D and the potentially wider range of values may provide greater ability to detect recent fragmentation events. Additionally, D is expected to be more sensitive because it is calculated based on allele diversity, which will decline more rapidly than heterozygosity [84]. More recently, Sherwin has proposed a standardized mutual information (MI) index [91], [92] based on Shannon’s index that also varies between 0 and 1 and is independent of heterozygosity, with the added property of weighting all alleles according to their frequency (i.e., neither favoring rare nor common alleles). Because we were interested in effects of fragmentation independent of habitat loss, we evaluated the ability to detect genetic effects of fragmentation with F st, G’ st, MI, and D over timeframes associated with anthropogenic habitat modification (i.e., <200 years) while controlling for population size. The number of generations necessary to make such an evaluation renders the task infeasible in a field setting. Therefore, we developed an individual-based population model to simulate genetic divergence among recently fragmented populations and measured F st, G’ st, MI, and D over time. Potential for detecting change in these metrics will vary based on the amount and nature of migration among populations; therefore, we simulated two severe cases of fragmentation. In the first, migration among a set of historically panmictic populations was abruptly and completely stopped. In the second, limited gene flow among populations was allowed and subsequently ceased. The first scenario provides the most favorable situation for detecting change: going from a base condition of a Wright-Fisher population to complete isolation. The second provides a more realistic starting condition in which there is a pre-existing level of divergence among populations onto which anthropogenic fragmentation is imposed.
We complement a recent investigation of the effect of dispersal distance among individuals on the time required to detect an abrupt barrier to gene flow [63] by examining multiple discrete populations and by quantifying the influence of population size, overlapping generations, and sampling effort in terms of individuals and loci on the ability to detect a significant change in four measures: F st, G’ st, MI, and Jost’s D.

Methods

Model Description

We generated six homogeneous panmictic populations of equal size at the start of each run. Panmixia among populations was created by allowing mating at random among individuals in all populations. The model allows variation in distances among individual population pairs, but for the purposes of this evaluation all populations were equally isolated. Census size maxima (N max) within populations were set to 25, 75, 100, 500, 1000, and 3000 individuals (N e was subsequently calculated), which encompasses the size ranges of populations of most plant species listed under the U.S. Endangered Species Act (Neel, unpublished data) and 71% of minimum viable population estimates for plant species worldwide [93]. Initial size of each population was set to 75% of the size limit for each run, and the size cap was reached within one or two generations. At initiation, individuals were assigned two alleles at each of 20 unlinked microsatellite loci. Allele size ranged between 5 and 50 repeat units. Alleles for each locus could take on any value within the given range and were drawn from a normal distribution with parameters μ = mean of the size range of the locus and σ² = 5. Drawing initial allele frequencies from a normal distribution allows for accurate simulation of the stepwise mutational model of microsatellite evolution throughout a simulation [94]. These starting conditions yielded between 7 and 42 alleles per locus at the start of each simulation, depending on the population size. Mutations occurred at a rate of 0.004 per gamete transfer event [94]. By using a stepwise mutational model of microsatellite evolution, small changes in allelic state were more likely than large changes and the direction of mutation tended toward the mean of the size range of each locus [94].
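The allele initialization and mutation steps described above can be sketched as follows. This is an illustrative reading rather than the model code used in the study: the mutation figure is treated as a per-gamete probability of 0.004, mutations are restricted to single-repeat steps, and the rounding, clamping, and `bias` parameter are assumptions introduced here.

```python
import random

MIN_REPEATS, MAX_REPEATS = 5, 50   # allele size range given in the text
SIGMA = 5 ** 0.5                   # the text gives the variance, sigma^2 = 5
MUTATION_RATE = 0.004              # assumed to be a per-gamete probability


def initial_allele(rng, lo=MIN_REPEATS, hi=MAX_REPEATS):
    """Draw a starting allele size from a normal centred on the locus range."""
    mean = (lo + hi) / 2
    size = round(rng.gauss(mean, SIGMA))
    # Clamping to the allowed range is an assumption of this sketch.
    return max(lo, min(hi, size))


def mutate(allele, rng, lo=MIN_REPEATS, hi=MAX_REPEATS, bias=0.6):
    """Apply a single-step stepwise mutation with probability MUTATION_RATE.

    bias (hypothetical) is the probability that a mutation moves the allele
    toward the mean of the locus size range rather than away from it.
    """
    if rng.random() >= MUTATION_RATE:
        return allele  # no mutation during this gamete transfer
    mean = (lo + hi) / 2
    if allele < mean:
        p_up = bias
    elif allele > mean:
        p_up = 1 - bias
    else:
        p_up = 0.5
    step = 1 if rng.random() < p_up else -1
    return max(lo, min(hi, allele + step))


rng = random.Random(42)
# One diploid individual: two alleles at each of 20 unlinked loci.
genotype = [(initial_allele(rng), initial_allele(rng)) for _ in range(20)]
gamete = [mutate(rng.choice(pair), rng) for pair in genotype]
print(genotype[:3], gamete[:3])
```

Restricting mutations to one repeat unit is the simplest way to make small changes more likely than large ones; a multi-step model would need a step-size distribution that the text does not specify.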
Individuals were simulated to be hermaphroditic, annual plants that were self-compatible but that did not self-fertilize more than would be expected at random; the amount of selfing therefore depended upon population size. All individuals had an equal probability of mating each generation. Individuals from within a population had an equal probability of being a father for all individuals within that population. The proportion of individuals contributing seed to the next generation was drawn from a normal distribution with the parameters μ = 50% of total population size and σ² = 1. The number of seeds produced per female was drawn from a normal distribution with parameters μ = 35 and σ² = 5 to provide stochastic variation around a likely number of seeds per plant. Each seed had a randomly selected father. When a seed bank was included in the model, those seeds not germinating entered the seed bank; otherwise, seeds that did not germinate immediately were removed. Germination potential of seeds in the seed bank decreased over time following a negative exponential function. As the size of each population approached the population size limit, the number of viable seeds produced was reduced to reflect density dependence [95]. Each cap size was run under four conditions that independently varied presence or absence of a seed bank (i.e., overlapping versus non-overlapping generations) and presence or absence of preexisting population structure prior to population isolation. To simulate absence of population structure, panmictic populations were immediately isolated to yield an abrupt fragmentation event with the highest likelihood of being detected. To more closely reflect realistic conditions, we simulated preexisting population structure by limited seed and pollen migration as described below for 500 generations prior to stopping all migration. At least 85% of pollen grains remained within a population and 15% had some probability of moving.
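A minimal sketch of the within-population mating step may help make the random-selfing assumption concrete. It assumes that fathers are drawn uniformly from all individuals, mother included, so that the expected selfing rate is 1/N; the function name and the truncation of draws at zero are hypothetical, and germination, the seed bank, dispersal, and density dependence are omitted.

```python
import random


def produce_seeds(population_size, rng):
    """Return (mother, father) index pairs for one generation's seed crop."""
    # Number of seed parents: normal with mean 50% of N and variance 1.
    n_mothers = round(rng.gauss(0.5 * population_size, 1.0))
    n_mothers = max(0, min(population_size, n_mothers))
    mothers = rng.sample(range(population_size), n_mothers)

    seeds = []
    for mother in mothers:
        # Seeds per mother: normal with mean 35 and variance 5.
        n_seeds = max(0, round(rng.gauss(35, 5 ** 0.5)))
        for _ in range(n_seeds):
            # Any individual can be the father, so selfing occurs only
            # at the rate expected by chance (about 1 / population_size).
            father = rng.randrange(population_size)
            seeds.append((mother, father))
    return seeds


rng = random.Random(7)
for n in (25, 500):
    crop = produce_seeds(n, rng)
    selfed = sum(mother == father for mother, father in crop)
    print(n, len(crop), round(selfed / len(crop), 4))
```

Under this reading the smallest populations (N max = 25) self at roughly 4% while the largest (N max = 3000) self at well under 0.1%, which is one way selfing can depend on population size without an explicit selfing parameter.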
Probability of dispersal from a population followed a Laplace distribution (μ = 0.4, b = 0), a commonly used dispersal kernel for plants that reflects a range of common dispersal syndromes [96]–[99]. Seeds produced from matings within populations could either stay within the population in which they were generated or disperse. Probability of seed dispersal followed the same dispersal kernel described above. After the dispersal step, seeds had a 10% chance of germinating the year after they were produced, and their ultimate fate depended on whether or not generations overlapped. Although the specific values for seed production, seed germination, and pollen and seed dispersal were arbitrary, they were within the range of values that have been documented for plant species [100]–[106]. Simulations with preexisting population structure ran under the above conditions for 500 generations prior to complete isolation; those that began from panmixia were immediately isolated. Following isolation in both simulation types, the model proceeded for 200 additional generations with no migration among the six populations. We conducted 200 independent simulations for each of the four conditions for each of the six population size caps, yielding 24 model configurations. The resulting 4,800 simulations were run on The Lattice Project, a Grid computing system [107]–[110]. During simulations, individual populations were allowed to go extinct and to be recolonized with migrants from other populations (when migration was allowed) or from the seed bank (when overlapping generations were present). At small population sizes, individual populations would frequently go extinct. When all populations went extinct, the simulation was restarted. However, extinction of all six populations occurred in only ∼1/100 cases. We determined the total number of alleles and the observed (H o) and expected (H e) heterozygosity at each generation.
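The factorial design described above can be enumerated in a few lines to confirm the run counts. The variable and function names are illustrative only.

```python
from itertools import product

N_MAX = [25, 75, 100, 500, 1000, 3000]   # population size caps
SEED_BANK = [False, True]                # non-overlapping vs overlapping generations
PRIOR_STRUCTURE = [False, True]          # panmixia vs 500 generations of limited migration
REPLICATES = 200


def generations(prior_structure):
    """Total generations simulated in one run: optional burn-in plus isolation."""
    burn_in = 500 if prior_structure else 0
    return burn_in + 200


configs = list(product(N_MAX, SEED_BANK, PRIOR_STRUCTURE))
total_runs = len(configs) * REPLICATES
print(len(configs), total_runs)  # 24 configurations, 4800 runs
```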
In simulations without overlapping generations, we calculated the inbreeding N e at each generation from the mean number of progeny and the variance in the number of progeny at each generation [111]. In simulations with overlapping generations, N e was calculated as N e = T(N b), where T is generation time, defined as the average age of parents including dormancy [112] and calculated following Vitalis et al. [113], and N b is the effective number of breeders in a given year [114]. Effective population size for each population and for each run was calculated as the harmonic mean across all generations and then averaged across simulation runs. At each generation we calculated Weir and Cockerham’s [115] unbiased estimate θ, Hedrick’s G’ st [86], Sherwin’s standardized MI, and Jost’s D [85] using the estimator D est_Chao following Chao et al. [116]. We estimated the four measures from the total number of individuals using all 20 loci at each generation to provide the census or “true” estimate of θ, G’ st, MI, and D est_Chao for comparison with the subsamples of individuals and loci discussed below. We assessed the number of generations required for θ, G’ st, MI, and D est_Chao to reach equilibrium by visually assessing asymptotic behavior. We used Fisher’s exact tests to assess whether each estimated value was significantly different from 0. Under the null hypothesis that individuals were members of a single global population, individuals were randomly reallocated to populations while maintaining sample sizes at the realized values, and each statistic was recalculated [117]. The actual value for each run was compared with the distribution of 2000 such randomizations to obtain a p-value. The number of generations after population isolation at which θ, G’ st, MI, and D est_Chao became significantly different from values at the last time-step with gene flow was tested using a one-way Dunnett multiple mean comparison test in R v2.14.1 [118].
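The randomization procedure described above can be sketched generically. The example below is a simplification under stated assumptions: it permutes allele copies at a single locus rather than diploid multilocus individuals, it uses Nei's G st as a stand-in statistic, and it uses the common (hits + 1)/(permutations + 1) p-value convention, none of which is confirmed as the procedure used in this study.

```python
import random
from collections import Counter


def gst(pops):
    """Nei's G_st at one locus from lists of allele labels, one list per population."""
    k = len(pops)
    freqs = [{a: c / len(p) for a, c in Counter(p).items()} for p in pops]
    hs = sum(1 - sum(f * f for f in fr.values()) for fr in freqs) / k
    alleles = set().union(*freqs)
    ht = 1 - sum((sum(fr.get(a, 0.0) for fr in freqs) / k) ** 2 for a in alleles)
    return 0.0 if ht == 0 else (ht - hs) / ht


def permutation_p_value(pops, statistic=gst, n_perm=2000, seed=0):
    """One-sided p-value for the observed statistic under random reallocation."""
    rng = random.Random(seed)
    observed = statistic(pops)
    pooled = [a for p in pops for a in p]   # treat all samples as one global population
    sizes = [len(p) for p in pops]          # keep the realized sample sizes
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        shuffled, start = [], 0
        for n in sizes:
            shuffled.append(pooled[start:start + n])
            start += n
        # Small tolerance guards against floating-point noise at ties.
        if statistic(shuffled) >= observed - 1e-12:
            hits += 1
    return (hits + 1) / (n_perm + 1)


# Hypothetical samples: two fixed-difference populations vs two identical ones.
print(permutation_p_value([["A"] * 20, ["B"] * 20]))
print(permutation_p_value([["A", "B"] * 10, ["A", "B"] * 10]))
```

The first call gives a very small p-value because almost no random reallocation reproduces complete separation; the second gives a p-value of 1 because the observed statistic is zero.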
To determine the power to detect differences we calculated the proportion of runs at each generation that was significantly different from 0. The magnitude and rate of change between consecutive generations was calculated for the first 24 generations following fragmentation for all simulations. We sampled factorial combinations of 10, 15, and 20 loci, and 20, 30, and 50 individuals (as allowed by total maximum population sizes) at every generation over the course of each simulation run. To evaluate the effect of sample size on potential to detect fragmentation, we compared estimates of θ, G’ st, MI, and D est_Chao calculated for all factorial combinations of individuals and loci to the corresponding census value using a Tukey multiple comparison test in R. In addition, we tested estimates of each measure from all factorial combinations for significant departure from 0 using the methods described above.
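Power as defined above is simply the fraction of replicate runs whose test is significant at each generation. A minimal sketch follows; the significance threshold of 0.05 and the example p-values are assumptions made for illustration, not values taken from the study.

```python
def power_by_generation(p_values, alpha=0.05):
    """Proportion of runs with p < alpha at each generation.

    p_values[g] is a list holding one p-value per simulation run at generation g.
    alpha = 0.05 is an assumed threshold.
    """
    return [sum(p < alpha for p in runs) / len(runs) for runs in p_values]


# Hypothetical p-values for four runs at two successive generations.
example = [
    [0.50, 0.20, 0.01, 0.60],   # generation 1: one of four runs significant
    [0.01, 0.02, 0.30, 0.04],   # generation 2: three of four runs significant
]
print(power_by_generation(example))  # [0.25, 0.75]
```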

Results

All Individuals and Loci

As expected, the number of alleles, H o and H e tended to be higher through time in larger populations (Figure 1). Model runs with overlapping and non-overlapping generations yielded similar average allelic diversity for any given N max (2–42 alleles per locus). However, model runs with overlapping generations tended to yield higher average H o and H e through time than did runs with non-overlapping generations, and differences were more pronounced at smaller N max (Figure 1).

Figure 1

Values of Na, Ho, and He for 20 loci and all individuals across all simulation conditions.

Lines from top to bottom represent N max values of 3000, 1000, 500, 100, 75, and 25 individuals.

In the absence of overlapping generations, the harmonic mean values of N e estimates for each of the six subpopulations based on all individuals averaged over all runs were 13, 40, 52, 265, 531, and 1601 individuals. These N e values represented roughly half the actual N max values of 25, 75, 100, 500, 1000, and 3000, respectively. With overlapping generations, the harmonic mean of N e estimates for each subpopulation averaged over all runs was roughly twice the N max: 43, 143, 193, 975, 1994, and 5994 individuals, respectively. As expected from theory, behavior of θ, G’ st, MI, and D est_Chao at a given time point depended on three factors: N max, presence or absence of overlapping generations, and presence or absence of population sub-structuring prior to fragmentation. Smaller N max predictably yielded larger values for any given time step (Figures 2, 3, 4, 5) except for D est_Chao when N max = 25 and generations did not overlap. For a given N max, measures were most often lower in simulations with overlapping generations than in those without (Figures 2, 3, 4, 5). In simulations with population sub-structuring prior to fragmentation, θ and G’ st values followed similar trajectories to those in which isolation occurred immediately after a period of panmixia (Figures 2 & 3). D est_Chao and MI values after isolation were lower when prior population sub-structuring was included whereas θ and G’ st were of similar magnitude (Figures 4 & 5).
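The N e values reported in this section can be checked against N max with simple arithmetic, and the sketch below also shows why a harmonic mean across generations is sensitive to occasional small values. The N e and N max figures are those reported in the text; the three-generation example passed to `harmonic_mean` is hypothetical.

```python
from statistics import harmonic_mean

N_MAX = [25, 75, 100, 500, 1000, 3000]
NE_NON_OVERLAPPING = [13, 40, 52, 265, 531, 1601]
NE_OVERLAPPING = [43, 143, 193, 975, 1994, 5994]


def ratios(ne, n_max):
    """Ratio of effective to maximum census size for each size cap."""
    return [e / n for e, n in zip(ne, n_max)]


print([round(r, 2) for r in ratios(NE_NON_OVERLAPPING, N_MAX)])  # close to 0.5
print([round(r, 2) for r in ratios(NE_OVERLAPPING, N_MAX)])      # about 1.7 to 2.0

# A single small generation pulls the harmonic mean down sharply
# (the arithmetic mean of these three values would be 70).
print(harmonic_mean([100, 100, 10]))
```

The ratios for overlapping generations approach 2 only at the larger size caps; at N max = 25 the ratio is about 1.7.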