Literature DB >> 30976301

Genetic signatures of small effective population sizes and demographic declines in an endangered rattlesnake, Sistrurus catenatus.

Michael Sovic^1,2, Anthony Fries^1,3, Scott A Martin¹, H Lisle Gibbs¹.

Abstract

Endangered species that exist in small isolated populations are at elevated risk of losing adaptive variation due to genetic drift. Analyses that estimate short-term effective population sizes, characterize historical demographic processes, and project the trajectory of genetic variation into the future are useful for predicting how levels of genetic diversity may change. Here, we use data from two independent types of genetic markers (single nucleotide polymorphisms [SNPs] and microsatellites) to evaluate genetic diversity in 17 populations spanning the geographic range of the endangered eastern massasauga rattlesnake (Sistrurus catenatus). First, we use SNP data to confirm previous reports that these populations exhibit high levels of genetic structure (overall Fst = 0.25). Second, we show that most populations have contemporary Ne estimates <50. Heterozygosity-fitness correlations in these populations provided no evidence for a genetic cost to living in small populations, though these tests may lack power. Third, model-based demographic analyses of individual populations indicate that all have experienced declines, with the onset of many of these declines occurring over timescales consistent with anthropogenic impacts (<200 years). Finally, forward simulations of the expected loss of variation in relatively large (Ne = 50) and small (Ne = 10) populations indicate they will lose a substantial amount of their current standing neutral variation (63% and 99%, respectively) over the next 100 years. Our results argue that drift has a significant and increasing impact on levels of genetic variation in isolated populations of this snake, and efforts to assess and mitigate associated impacts on adaptive variation should be components of the management of this endangered reptile.

Entities: Chemical Disease Species

Keywords: RADSeq; Sistrurus catenatus; contemporary effective population size; eastern massasauga rattlesnake; forward genetic simulations; historical demographic modeling; microsatellites

Year: 2019 PMID： 30976301 PMCID： PMC6439488 DOI： 10.1111/eva.12731

Source DB: PubMed Journal: Evol Appl ISSN： 1752-4571 Impact factor: 5.183

INTRODUCTION

Small, recently isolated populations face the risk of reduced viability over time due to genetic costs of inbreeding (Allendorf, Luikart, & Aitken, 2013). Habitat fragmentation can reduce gene flow between populations and decrease population size. This in turn increases the magnitude of genetic drift and the negative impact of inbreeding depression on survival and reproduction, and can lead to populations entering an “extinction vortex” (Gilpin & Soulé, 1986). Genetic analyses of historical and contemporary features of populations can provide information that is useful in assessing whether small isolated populations are at risk of high levels of drift and inbreeding depression, and thus should be significant components of conservation plans for endangered species (Jamieson & Allendorf, 2012). One population feature that is important to understand in this context is effective population size (Ne) (Luikart, Ryman, Tallmon, Schwartz, & Allendorf, 2010), which measures the strength of genetic drift in a population. This is expected to reflect the rate at which variation is lost from populations and, in turn, may predict their ability to respond to future environmental change (Allendorf et al., 2013). Effective size is estimated over historical or contemporary timescales (Hare et al., 2011). Historical (long‐term) estimates of Ne evaluate the parameter over evolutionary timescales and generally estimate effective size at a species‐wide level (Charlesworth & Willis, 2009). In contrast, contemporary (current) measures of Ne estimate the parameter over short timescales (<5 generations) and so are most relevant to conservation and wildlife management applications (Hare et al., 2011). The development of estimators of contemporary Ne using genetic data is an active area of research in conservation genetics (Waples, 2016; Waples, Larson, & Waples, 2016). Recently, Gilbert and Whitlock (2015) simulated data under a variety of demographic scenarios to evaluate different estimators of contemporary Ne and concluded that an estimator based on genome‐wide estimates of linkage disequilibrium (LDNe, Waples & Do, 2008), as implemented in the program NeEstimator (Do, Waples, & R.S., Peel, D., Macbeth, G.M., Tillett, B.J., and Ovenden, J.R., 2014) was accurate in small isolated populations. Likewise, Wang (2016) validated the use of a sibship‐based estimator implemented in the program Colony (Jones & Wang, 2010) under similar demographic scenarios. Both methods have been widely used for estimating contemporary Ne for captive and wild populations (Husemann, Zachos, Paxton, & Habel, 2016). Analysis of observed distributions of genetic polymorphism can also yield insights into the demographic history of populations (Lawton‐Rauh, 2008; Marjoram & Tavare, 2006). To date, these analyses have primarily made qualitative inferences of changes in population size over time by comparing observed and expected patterns of diversity at small numbers of microsatellite loci (Garza & Williamson, 2001; Peery et al., 2012). However, these approaches can have low statistical power to detect even severe bottlenecks in natural populations (Busch, Waser, & DeWoody, 2007; Peery et al., 2012). Increasingly, alternative coalescent‐based modeling approaches are being used (i.e., Storz & Beaumont, 2002; Excoffier et al., 2013), which have the advantage of using an explicit statistical framework to test diverse demographic scenarios. These approaches have been used to detect significant demographic events over recent timescales, including those within the range of significant human impacts to the landscape (Goossens et al., 2006; Harris et al., 2016; Hoffman, Grant, Forcada, & Phillips, 2011). When combined with other information, the inferred timing of demographic events can be used to assess whether humans have played a significant role in those events (Goossens et al., 2006) or whether they are instead more likely due to natural large‐scale events, such as those associated with historical changes in climate (Tucker, Schwartz, Truex, Pilgrim, & Allendorf, 2012). Finally, projecting future changes in genetic diversity can identify genetic risks faced by populations that have recently declined (Amos & Balmford, 2001). This issue is especially important when recent declines have led to populations not in genetic equilibrium. Populations with a small contemporary effective size may retain the genetic profile of a larger population after the decline, and this can lead to overestimates of the amount of variation present relative to present and future population sizes. Demographic simulations can help assess the rate at which existing levels of variation will decline and provide insights into an effective timeline for management activities such as genetic rescue (Frankham, 2015; Tallmon, Luikart, & Waples, 2004). One difficulty is that while the ultimate goal of management activities is the conservation of adaptive variation, these simulations are based on neutral evolutionary processes like genetic drift and focus on patterns of change in neutral variation (Carvajal‐Rodríguez, 2010). Nonetheless, they provide a time frame in which adaptive variation may be lost under the assumption that levels of adaptive and neutral variation covary (Reed & Frankham, 2003). The eastern massasauga rattlesnake (Sistrurus catenatus) is a small short‐lived (generation time ~2 years) venomous snake found throughout eastern North America. In the United States, it primarily exists in relatively small, isolated patches of habitat surrounded by heavily modified landscapes (Szymanski et al., 2016). Population declines throughout the range due to habitat fragmentation and destruction have led to the listing of this species under the US Endangered Species Act (US Fish & Wildlife Service, 2016) in the United States and as a Species at Risk (Government of Canada, 2009) in Canada. This species exhibits little phylogeographic structure across its range (Sovic, Fries, & Gibbs, 2016), and so important management units within this species may best align with existing individual populations. Previous work at the population level (Chiucchi & Gibbs, 2010; Gibbs & Chiucchi, 2012) identified high levels of genetic structure among populations that have persisted over long periods of time, large variation in long‐term Ne between populations, and little evidence for recent declines in population sizes. However, these results were based on microsatellite‐based variation alone and used qualitative methods of analyses for inferring demographic patterns. In particular, measures of effective population size were historical long‐term measures. These may not reflect the strength at which drift is currently operating in these populations, making them uninformative for present‐day management decisions for this species. Here, we use two independent genetic datasets to make conservation‐relevant inferences about genetic and demographic characteristics of individual populations across the range of this species. Our specific goals are to (a) evaluate the degree of genetic isolation among populations using a new genome‐scale dataset; (b) estimate contemporary effective population sizes to generate a better understanding of the present and future importance of drift in these populations; (c) re‐evaluate previous results (Gibbs & Chiucchi, 2012) that the genetic cost to living in small populations is small by using new estimates of individual heterozygosity from RADSeq loci in heterozygosity–fitness correlation (HFC) analyses (Chapman, Nakagawa, Coltman, Slate, & Sheldon, 2009); (d) test for evidence of past population size changes and estimate the timescale on which inferred changes occurred; and (e) use simulations to project the rate at which existing levels of variation will be lost in populations. Our study represents the most comprehensive analyses of the present‐day and future genetic demography of existing populations of S. catenatus and provides information useful for developing the Recovery Plan for this species (US Fish and Wildlife Service, in preparation).

METHODS

Samples

We analyzed genetic data from S. catenatus individuals from 17 populations spanning the geographic range of this species (Figure 1; Table 1). Individual populations were defined as sets of randomly collected adult samples from within a single geographic location ≤3 km2 in area for which sample sizes were sufficiently large to conduct the analyses described below (see Table 1 for marker‐specific sample sizes for individual populations). Samples from most populations were collected across multiple years and/or consisted of multiple body size classes from a single year and so represent adults from a mixture of age classes. This is relevant to interpreting estimates of contemporary effective population size (see below). These populations were the same as those analyzed by Chiucchi and Gibbs (2010) with the difference that samples from one additional population from Spring Valley, Ohio (SPVY—see Figure 1), were analyzed using RADSeq data.

Figure 1

Map showing locations of populations of Sistrurus catenatus analyzed in this study. Summaries of genetic variation for RADSeq and microsatellite loci are given in Table 1

Table 1

Estimates of genetic variation for each population based on microsatellite and polymorphic RADSeq loci. Microsatellite results are based on data from 17 microsatellite loci reported in Chiucchi and Gibbs (2010)

Population	Code	Microsatellite					RAD
Population	Code	N	Ho	He	FIS	AR	N	N loci (100%)	Ho	He	Fis	AR
Southern Illinois
South Shore State Park	SSSP	18	0.646	0.679	0.050	3.70	5	537 (59%)	0.329	0.400	0.180	1.44
Eldon Hazlet State Park	EHSP	14	0.738	0.759	0.029	4.35	8	1,183 (29%)	0.287	0.350	0.170	1.48
Western and Central Ohio
Prairie Road Fen	PRDF	21	0.535	0.588	0.093	3.14	21	510 (43%)	0.332	0.363	0.070	1.33
Spring Valley Wildlife Area	SPVY	0	‐	‐	‐	‐	7	395 (57%)	0.340	0.395	0.127	1.32
Killdeer Plains Wildlife Area	KLDR	68	0.749	0.768	0.024	4.82	27	1,372 (25%)	0.274	0.296	0.062	1.49
Willard Marsh Wildlife Area	WLRD	15	0.676	0.663	−0.043	4.34	10	617 (60%)	0.300	0.345	0.123	1.44
Northeast Ohio
Grand River Lowlands 1	GRL−1	18	0.576	0.571	−0.009	3.16	20	733 (32%)	0.364	0.347	−0.064	1.36
Grand River Lowlands 2	GRL−2	18	0.535	0.508	−0.055	2.79	16	533 (45%)	0.342	0.341	−0.001	1.31
Grand River Lowlands 3	GRL−3	20	0.642	0.626	−0.037	3.57	19	610 (47%)	0.321	0.330	0.020	1.37
Western Pennsylvania
State Games Lands 95	GLAD	6	0.526	0.576	0.084	3.02	5	351 (60%)	0.373	0.402	0.074	1.27
Venango County	VNGO	7	0.532	0.606	0.132	3.55	7	410 (59%)	0.349	0.386	0.103	1.34
Jennings Environmental Education Center	JENN	9	0.595	0.688	0.143	3.58	10	410 (59%)	0.352	0.365	0.039	1.31
New York
Bergen Swamp	BERG	20	0.558	0.545	−0.029	2.75	14	371 (50%)	0.323	0.344	0.054	1.22
Cicero Swamp	CCRO	62	0.538	0.544	0.011	2.94	27	635 (32%)	0.298	0.315	0.037	1.22
Ontario
Bruce Peninsula Nat. Park	BPNP	20	0.708	0.720	0.016	4.46	27	886 (35%)	0.273	0.304	0.083	1.45
Beausoleil Island, Georgian Bay Is. Nat. Park	BEAU	15	0.616	0.657	0.046	3.80	22	666 (45%)	0.291	0.318	0.081	1.38
Killbear Provincial Park	KBPP	20	0.608	0.600	−0.014	3.49	18	672 (48%)	0.288	0.324	0.110	1.40

Code is the identifier used for each population on Figure 1; N = number of individuals genotyped; Ho and He are observed and expected heterozygosity, Fis is the fixation index, and AR is allelic richness which were calculated using Arlequin (Excoffier & Lischer, 2010). N loci (100%) is the number of polymorphic RAD loci scored in individuals in that population (% scored in all individuals).

Map showing locations of populations of Sistrurus catenatus analyzed in this study. Summaries of genetic variation for RADSeq and microsatellite loci are given in Table 1 Estimates of genetic variation for each population based on microsatellite and polymorphic RADSeq loci. Microsatellite results are based on data from 17 microsatellite loci reported in Chiucchi and Gibbs (2010) Code is the identifier used for each population on Figure 1; N = number of individuals genotyped; Ho and He are observed and expected heterozygosity, Fis is the fixation index, and AR is allelic richness which were calculated using Arlequin (Excoffier & Lischer, 2010). N loci (100%) is the number of polymorphic RAD loci scored in individuals in that population (% scored in all individuals).

Genetic data

We analyzed individuals (n = 385) using two multilocus genetic datasets: One based on DNA microsatellite loci previously analyzed by Chiucchi and Gibbs (2010) and a new dataset based on restriction site‐associated DNA loci (RADSeq—Andrews, Good, Miller, Luikart, & Hohenlohe, 2016). Sixty percent (n = 229) of individuals were characterized using both genetic markers. Both datasets were used for estimating contemporary effective population sizes. Only the RADSeq dataset was used for the historical demographic analyses, which are based on modeling site frequency spectra (SFS) (see below). For the microsatellites, we used previously published data from 17 loci reported in Chiucchi and Gibbs (2010). For RADSeq loci, we generated data following the protocols described in Sovic et al. (2016). Briefly, high‐quality genomic DNA was extracted from blood or tissue samples, and double‐digest RADSeq libraries (Peterson, Weber, Kay, Fisher, & Hoekstra, 2012) were generated with EcoRI and SbfI restriction enzymes (New England Biolabs, Ipswich, MA) and a modified version of the RADSeq protocol described in DaCosta and Sorenson (2014). For details of the library preparation methods, see the Supplemental Information section of Sovic et al. (2016). Pooled libraries of up to 36 individuals were sequenced in single‐end 50‐bp runs as partial lanes on an Illumina HiSeq 2,500 sequencer. De novo locus assembly, SNP identification, and genotyping of RADSeq loci were carried out on the raw fastq data using AftrRAD version 5.0 (Sovic, Fries, & Gibbs, 2015). Unless otherwise specified, default settings were used in the AftrRAD run. Specifically, reads containing one or more bases for which the quality (Phred) score was below 20 were removed from analyses (minQual = 20). Reads that shared 90% or more sequence similarity after accounting for indels were assigned as allelic variants at a given locus as part of the de novo assembly (minIden = 90). After assembly, genotypes were called at loci for which there was a minimum of five reads in the individual (MinReads = 5). Loci with <5 reads were scored as missing data. Within each population, we then retained for analyses only those loci that were scored in all samples. Genotypes were assigned based on read (haplotype) identity for all analyses, with the exception of site frequency spectra used for model testing in fastsimcoal, which were constructed by selecting the first SNP from each locus (see details below).

Population structure

Repeated analyses based on DNA‐based markers such as RAPDs (Lougheed, Gibbs, Prior, & Weatherhead, 2000) and microsatellites (Chiucchi & Gibbs, 2010) have shown high levels of genetic structure and low migration rates between individual populations of Sistrurus catenatus. To assess whether RADSeq data showed similar levels of genetic structure, we used hierfstat (Goudet, 2005) to estimate pairwise and overall Fst for both the RADSeq dataset newly reported here and, for comparative purposes, for the microsatellite data reported by Chiucchi and Gibbs (2010). We calculated Pearson's r correlation and associated significance levels between pairwise Fst values generated for microsatellite and RADSeq data using the rcorr function in the R base package (R Development Core Team, 2015).

Effective population size

We assumed that each population was essentially genetically isolated due to high degrees of genetic structure between populations and low levels of genetically effective migration inferred by Chiucchi and Gibbs (2010). As such, we analyzed each set of samples from a single location as a genetically isolated population. We calculated contemporary effective population sizes using two different estimators for both the RADSeq and microsatellite datasets. First, we estimated contemporary Ne using the LDNe method (Waples & Do, 2008), as implemented in the program NeEstimator (Do et al., 2014). This method estimates Ne based on patterns of linkage disequilibrium between loci and was shown to perform well relative to other methods when calculating Ne under scenarios of low Ne and low migration rates (Gilbert & Whitlock, 2015). We used a “two allele” minimum for each locus within each population based on the recommendations of Waples and Do (2010) relative to the sample size of individuals (≤25) in almost all our populations. Confidence intervals for Ne values were estimated using a parametric approach implemented in the program. Second, we used Colony (Jones & Wang, 2010), which uses a maximum‐likelihood‐based method to calculate Ne based on inferred sibship frequencies among the samples, with associated confidence intervals obtained through bootstrap resampling.

Heterozygosity–fitness correlations

Snout–vent length (SVL) and body mass data were available for a subset (N = 91) of the individuals genotyped for RAD variation. These included samples from SSSP (N = 5), BERG (N = 7), CCRO (N = 18), GRL‐2 (N = 7), GRL‐3 (13), SPVY (N = 6), WLRD (N = 5), GLAD (N = 5), JENN (N = 8), VNGO (N = 7), and PRDF (N = 10). We log‐transformed (ln) these data and performed a major axis regression in the R package lmodel2 (Legende, 2018) to obtain the slope of the best‐fit line between these variables. The slope of this line (bSMA) was used to calculate the Scaled Mass Index (SMI, Peig & Green, 2009) according to the equation SMI =BMi(SVL0/SVLi)bSMA, where BMi is the mass of the sample, SVL0 is the mean SVL across all samples, and SVLi is the SVL of sample i. We then used major axis regression to evaluate the relationship between these SMI values and average genome‐wide SNP heterozygosity, which was estimated for each individual from RADSeq data using scripts included with AftrRAD.

Model‐based analyses of historical demography

We used historical demographic analyses to test between models of population history. Our goal was to assess whether evidence exists for declines in population size within each population. For declining populations, we assessed whether the declines occurred over a timescale consistent with anthropogenic impacts (arbitrarily defined as <200 years—see Discussion) or were better explained by historical factors occurring over longer timescales. To avoid problems associated with model overparameterization, we analyzed the simplest models possible that captured the key features of the process (population size change) we were interested in exploring. Specifically, we tested between two simple demographic models (Figure 2). The first (“null model”) represents a population of constant size and is defined by a single parameter, the current population size (NCURR). The second (“bottleneck model”) is characterized by three parameters: the current population size (NCURR), a time of instantaneous population size change (TBOT), and a population size prior to TBOT (NANCES). Note that although we refer to this model as the “bottleneck model,” there are no restrictions regarding the direction of the population size change, as it allows for increasing population size in addition to decreasing population size. Harris et al. (2016) used a similar approach to assess the impact of urbanization on white‐footed mice (Peromyscus leucopus) in New York City. Prior ranges for the three parameters “NCURR,” “NANCES,” and “TBOT” were 10–50,000, 10–50,000, and 1–10,000, respectively, and were not bounded on the upper end.

Figure 2

Historical demographic models analyzed for each population using fastsimcoal. Note that for the “bottleneck model,” there are no restrictions regarding the direction of the population size change, as it allows for both increases or decreases in population size. Table S3 shows the results for model comparisons for each population based on AIC. Parameter estimates for the best‐fit models are given in Table S4 We tested among the competing historical demographic models with fastsimcoal 2.5.2 (Excoffier, Dupanloup, Huerta‐Sanchez, Sousa, & Foll, 2013), which uses maximum‐likelihood methods to perform model selection and parameter estimation based on the site frequency spectrum (sfs) calculated from the population genotypes. This approach has advantages over qualitative analyses of demography based on microsatellite data (e.g., Kimmel et al., 1998) by allowing the evaluation of the statistical fit of different demographic models using a genomic‐scale dataset summarized with a single metric. Folded site frequency spectra (calculated from observed counts of the minor allele) were generated separately for each of the 17 populations from SNPs scored in all sampled individuals. For loci with more than one variable site, only the first SNP at each locus was retained. For each of these populations, we performed 100 fastsimcoal runs (30 ECM cycles and 1e5 simulations per run) under each of the two models. The maximum‐likelihood runs under each model were then compared using AIC to select the optimal model for each population. Once the optimal model was identified for each population, we performed additional fastsimcoal analyses to estimate parameters within each population under the optimal model. To obtain point estimates, we used site frequency spectra that included all sites, performed 100 independent fsc runs for each population, and obtained parameter estimates by selecting the values associated with the maximum‐likelihood run. To evaluate the variability of these parameter estimates, we generated 50 bootstrap resampled datasets, each with 10% fewer sites than the full dataset, from the observed sfs from each population, and performed fsc analyses on each these 50 resampled datasets (also 100 runs for each dataset; 30 ECM cycles and 1e5 simulations per run). A key parameter in these models is the genome‐wide mutation rate. Because we are not aware of any direct estimates for genome‐wide mutation rates for snakes, we conducted a series of preliminary analyses using paired sets of models and datasets that only varied in their use of “high” and “low” mutation rates. For the “high” mutation rate, we used 2.5 × 10−8 per site/generation as estimated for humans (Nachman & Crowell, 2000) and used in Sovic et al. (2016). For the “low” mutation rate, we used 2.1 × 10−9 per site/generation as used for a recent study on frogs by Thomé and Carstens (2016) and used in Gibbs et al. (2018). We then compared the model likelihoods for each analysis and found that the use of the high mutation rate consistently generated significantly better model likelihoods (results not shown). Consequently, we used the rate of 2.5 × 10−8 mutations per site/generation in our subsequent analyses. Finally, as described by Sovic et al. (2016), we followed the approach of Lande, Engen, and Saether (2003) to convert estimates of time from generations to years, but used a lower estimate of age at first reproduction (2 years) which is based on data collected from the populations we studied (G. Lipps, unpublished data). This resulted in an estimated generation time of 2.03 years. We stress that although we have conducted independent historical demographic analyses on each population the degree to which each population represents independent entity over historical timescales is unclear. Thus, as a whole the results of these analyses should be taken as a qualitative survey of the types of demographic histories experienced by an unknown number of historically independent populations across the range of this species.

Potential impact of Fst‐outlier loci on demographic results

Our analyses of population structure, size, and historical demography assume that the polymorphism measured represents neutral variation, yet loci showing high levels of divergence between populations (outlier loci) may represent loci under divergent selection. We explored the sensitivity of our demographic analyses to such loci in two ways. First, we used Outflank version 0.2 (Whitlock & Lotterhos, 2015) with default parameter settings to identify highly divergent loci across populations. This program represents a “second‐generation” Fst‐outlier detection program that has much lower false‐positive rates, yet comparable power as compared to earlier programs. For the Outflank dataset, we allowed up to 10% missing data at each locus. Second, because characteristics of our study system (high levels of genetic structure and small numbers of loci and populations) may make detection of outliers problematic (Lotterhos & Whitlock, 2015), we also conducted a sensitivity analyses in which we chose a set of three population pairs that represent a range of sizes and interpopulation distances, and then assessed the impact of deleting a portion of high Fst loci from the dataset on demographic estimates. Specifically, we used Genepop (Rousset, 2008) to calculate Fst for each locus in each of the following pairwise comparisons: PRDF/BERG, KLDR/KBPP, GRL‐1/GRL‐2. We then identified the loci associated with the top 5% of Fst values, removed these loci from the datasets, and used these modified datasets to calculate effective population size and perform historical demographic analyses with the same methods used for the full datasets.

Projecting loss of genetic variation

We assessed how existing levels of variation might change in the future for a range of contemporary effective sizes given no change in current sizes. To do this, we conducted forward genetic simulations in simuPOP with overlapping generations (Peng & Kimmel, 2005). SimuPOP is a flexible forward‐time simulator of demographic scenarios that uses life history information from the species under study. We first generated models that incorporated estimates of key features of Sistrurus demography (Jones et al., 2012) (see below). Next, we ran 10 iterations for each of three fixed Ne values (50, 30, and 10) that represent a range of empirically estimated values for Ne. For each iteration, 1,000 independent bi‐allelic loci were initialized for every individual with a starting frequency of p = 0.81 and q = 0.19, corresponding to a starting He of 0.30, which approximates the currently observed He in polymorphic RADSeq loci (Table 1). Simulations were run for 100 time steps, where one time step equaled one year. Individuals did not reproduce until year 2 and died after year 5. We based our choices of minimum reproductive ages and maximum ages based on previously published demographic data. While S. catenatus may live up to 12 years, less than 70% of a breeding cohort is expected to reach age 5 (mean annual survival of 0.67; Jones et al., 2012). During each time step, individuals were aged by one year, and those greater than 5 years old removed from the simulated population. Random pairs of adults in the population were then chosen to reproduce until the population returned to the fixed population size. After new individuals were generated, individual heterozygosity was exported every time step for later analysis. For each run, individual heterozygosity was used to calculate the average heterozygosity for each year. Standard deviation of the observed heterozygosity was calculated for each year from the 10 iterations and used to generate 95% confidence intervals. We compared our simulation results with a widely used analytical approach for estimating loss of variation based on population genetic theory (James, 1970) that has been used for estimating changes in heterozygosity in endangered species (Amos & Balmford, 2001). Specifically, James (1970) proposed using the equation Ht = H0 (1−1/2Ne) to describe changes in heterozygosity over time in bottlenecked populations where H0 is initial heterozygosity and Ht is heterozygosity t generations after the instantaneous decline of a large population to size Ne. A key difference between the two methods is that compared to the simulation‐based approach in SimuPOP, this analytical approach assumes no overlapping generations with all individuals only reproducing in a single generation. As with our simuPOP runs, we fixed the initial H0 at 0.3 and then projected the loss of variation using the equation above for three fixed values of Ne: 50, 30, and 10.

RESULTS

Genetic markers

For the RADSeq data, a mean of 1,037,766 sequence reads were assigned to each of the 263 samples (range 216,381–4,239,460). The average numbers of total non‐paralogous loci and polymorphic loci per population were 52,815 and 641, respectively (Table 1). For a given population, the median read depth ranged from 48 to 61 reads per polymorphic locus, and the percentage of polymorphic loci that were scored for all sampled individuals ranged from 25% to 60% across the 17 datasets (Table 1). These values for coverage are consistent with those reported in Sovic et al. (2016), while the number of polymorphic loci used in the analyses here is smaller than for the previous study. This is because datasets in this study represent only intra‐population variation, while the datasets for the previous study included both intra‐ and interpopulation variation.

Genetic structure

The overall Fst for RADSeq data across all populations was 0.283 with pairwise values for individual populations ranging from 0.083 to 0.398 (Table S1a). These values are similar to values estimated from microsatellite data (overall value: 0.223; range of pairwise values: 0.088–0.51; Table S1b). Fst values based on RADSeq and microsatellite data for specific pairwise comparisons are significantly correlated with each other (r = 0.74; p < 0.05) with microsatellite‐based values averaging ~22% less than matched RADSeq‐based values. The variation in pairwise values is similar, with coefficients of variation of 28.1% and 31.6% for pairwise Fst values based on RADSeq and microsatellite data, respectively. We note that direct comparisons of Fst values based on multi‐allelic microsatellite loci and bi‐allelic SNP loci are problematic because differences in Fst based on each type of marker do not scale in the same way (Frankham et al., 2017). However, qualitatively, the RADSeq data support previous results from the microsatellite analyses of Chiucchi and Gibbs (2010) that these populations are genetically distinct and demographically isolated from each other.

Estimates of effective population size

Point estimates of the effective population sizes for individual populations vary depending on the type of genetic data and method used to generate the estimates (Figure 3; Table S2). A small number of populations (RADSeq: EHSP and VNGO; microsatellites: BPNP and GLAD) yielded estimates that were 1–2 orders of magnitude greater than other populations or were estimated as infinite in size—these are coded as NA in Table S2, and omitted from Figure 3. Estimates from LDNe for RADSeq data range between 2 (JENN) to 48 (BPNP) with an overall mean (±SE) of 17.3 ± 3.5. Microsatellite‐based estimates of LDNe ranged from 3 (GRL‐1) to 64 (JENN) with an overall mean that is slightly higher (20.6 ± 3.6 SE). For estimates based on RADSeq data, all populations had values <50, whereas for microsatellite data, 13 of 15 (87%) populations were below this threshold. These summaries include all populations including those where estimates were not available for one type of data. When paired estimates for single populations are compared, mean values based on RADSeq data are slightly lower (RADSeq: 15.6 ± 3.5; microsatellites: 21.1 ± 5). Paired values based on each type of data are not significantly correlated with each other (r = 0.11; p > 0.05).

Figure 3

Contemporary effective population size estimates based on LDNe generated from microsatellites (hollow triangles) and RADSeq data (filled triangles) for 17 populations of s. catenatus. Confidence intervals with an upper bound of infinity are represented by dashed lines extending to the top of the plot. Only RADSeq data are available for the population from SPVY For sibship frequency‐based measures of Ne, point estimates range from 8 (SPVY) to 140 (BPNP) (mean: 36.38 ± 10.2) for RADSeq data, and 14 (GRL‐1) to 95 (BPNP) (mean: 44.06 + 5.8) for microsatellite data. Proportionally fewer populations (RADSeq: 77% [10/13]; microsatellites: 75% [12/16]) had sibship‐based values <50. Paired estimates show similar patterns (RADSeq: mean = 29.55 ± 8.5; microsatellites: 35.36 ± 10.2) but, in contrast to LDNe estimates, are highly correlated with each other (r = 0.89; p < 0.05). In general, although estimates of contemporary Ne vary depending on method and marker type, most populations have small effective sizes of <50 with only a few populations having larger sizes. There was no significant relationship between our estimate of individual fitness (standardized mass index−SMI) and genome‐wide individual heterozygosity estimates based on RADSeq data (r 2 = 0.001; p = 0.418). We also performed regression analyses on individual populations with a minimum of five samples, but as for the full dataset, no significant positive relationships were detected (results not shown).

Demographic modeling

Statistical comparisons of the two demographic models in Figure 2 show strong support (AIC ~1) for the model incorporating population size change as best‐fitting the data for all individual populations (Table S3). Estimates of ancestral population size (Na) averaged 39,197 (range 21,073–56,684) and were consistently higher than those for current population size (Nc), which had a mean of 2,445 (range 10–12,583), suggesting all populations were best modeled as having experienced a population decline. In terms of the timing of this decline, joint inspection of the point estimates relative to the parameter distributions from the resampled datasets shows two scenarios (Figure 4). For 12 populations, there is a close correspondence between the point estimate of TBOT (in generations) from the original data and the distribution of TBOT values from the resampled data. In contrast, for five populations (EHSP, PRDF, WLRD, GRL‐2, and BEAU), the point estimate does not mirror the resampled distribution, and therefore, TBOT cannot be inferred with confidence. For populations with coincident point estimates and distributions, eight populations have small values (<100 generations) suggesting recent declines within the past 200 years (SSSP, SPVY, GRL‐1, GRL‐3, GLAD, VNGO, JENN, and BERG), while four populations have relatively large values (>2,000 generations or>4,000 years) suggesting declines on historical timescales. Thus, for populations for which TBOT can be inferred with confidence, a majority (8/12; 67%) showed declines during the time period in which humans have had strong impacts on the landscape, while four showed declines on historical timescales.

Figure 4

Histograms representing maximum‐likelihood estimates from fastsimcoal of the number of generations in the past at which a decline in population size occurred for each of the 17 populations. The dotted gray line indicates the bin containing the point estimate (see Table S4), and distributions reflect estimates from 50 bootstrapped datasets for each population. Times along the x‐axis are binned into intervals of 100 generations. The bin of 0–100 generations, roughly corresponding to an anthropogenic timeframe (~200 years), is indicated with red. All estimates exceeding 2,000 generations are combined into the 2,000 generation bin

Lack of impact of Fst‐outlier loci on demographic estimates

None of the 701 loci included in the Outflank analysis of the 17 populations were flagged as Fst‐outliers. Sensitivity analyses for six populations suggested that removal of loci associated with the top 5% of Fst values based on pairwise population comparisons had little effect on our results. Specifically, for all six populations, estimates of LDNe, and model choice results and parameter estimates from fsc were identical or very similar to those from the full dataset (results not shown). These findings indicate that even if our datasets contain subsets of loci under divergent selection not detected by our Fst‐outlier tests due to a lack of power, it is unlikely such loci are having an impact on our demographic inferences.

Projected loss of variation

All simulated populations lost variation over time, with the greatest loss occurring in the simulation based on the smallest fixed value of Ne (Figure 5). Based on point estimates of He at 0, 30, and 50 generations, populations with an Ne value of 50 lost 21% of their initial variation, those with Ne = 30 lost 33% of their variation, and populations with Ne = 10 lost 68% of their variation over roughly 100 years given our estimate of the generation time of this snake. These estimates of expected loss of He were substantially less than the predicted losses over the same period of time based on the analytical formula that assumes no overlapping generations, which showed declines of He that varied from 63% (Ne = 50) to 99% (Ne = 10) (Figure 5). This demonstrates the importance of incorporating realistic demographic parameters into models used for projecting loss of variation. Overall, these results suggest that nearly all populations are not at genetic equilibrium relative to their current effective size but have excess polymorphism which they will lose due to drift at varying rates over the next 100 years. This implies that most have undergone recent declines in census size (reflected in estimates of current effective size), but these declines have yet to influence the standing genetic variation present in a given population.

Figure 5

Results of simulations showing the projected loss of existing levels of variation (estimated as He) over time for populations of varying effective size. Trajectories are shown for fixed Ne values of 50, 30, and 10 individuals from results based on the SimuPOP simulations (point estimates shown by solid line +95% CI polygons) or the analytical formula for loss of He over time in a bottlenecked population

DISCUSSION

Estimation of contemporary Ne

A major result is that point estimates of contemporary Ne are small (<50) for the majority of populations sampled from across the range of Sistrurus catenatus. For example, estimates based on LDNe analysis of RADSeq data show that mean Ne is ~22 individuals (range: 2–48), although the confidence estimates for individual populations can be large. This result of small Ne values holds regardless of the method used to generate the estimates (i.e., LDNe vs. Colony), or the type of genetic marker (microsatellites vs. RADSeq), and suggests that genetic drift is poised to have a significant impact on the maintenance of genetic variation in these populations. These estimates are much smaller than the effective sizes based on microsatellite variation reported by Gibbs and Chiucchi (2012) for many of the same populations reported here and, in some cases, for estimates of Ne based on historical demographic modeling of RADSeq‐based polymorphism by Sovic et al. (2016) (and here). This is because these studies estimate Ne over different timescales using different methodologies, and so values from each set of studies are not directly comparable (Hare et al., 2011). For example, the estimates of Ne in Chiucchi and Gibbs (2010) are based on results from Migrate (Beerli, 2009) which generates estimates of long‐term effective size. These reflect the amount of genetic diversity (estimated as)maintained in a population at evolutionary equilibrium due to the joint effect of drift and mutation (µ) over a period of approximately 4Ne generations ( = 4Neµ). Likewise, the estimates of Ne using fastsimcoal generated by Sovic et al. (2016) and here also represent historical estimates of effective population size over many past generations. These long‐term estimates have conservation value in potentially representing a measure of population size that predates human impacts. As such, they could represent a historical target for assessing current management goals (Hare et al., 2011) or for evaluating the magnitude of recent anthropogenic impacts relative to historical population sizes (Alter, Rynes, & Palumbi, 2007). Long‐term estimates can also be useful for species‐level comparisons relevant to conservation. For example, Sovic et al. (2016) compared long‐term effective sizes and population dynamics of S. catenatus with the closely related but non‐threatened Western Massasauga Rattlesnake (S. tergeminus). The demographic patterns they detected were consistent with differences in the conservation status of each species. The long‐term effective population size was much smaller for the representative S. catenatus population than for the S. tergeminus population, and the S. tergeminus population was best modeled as a growing population, whereas S. catenatus showed evidence of a long‐term population decline. In contrast, the short‐term estimates of Ne based on single sample estimators reported here measure effective population size over much more recent timescales—approximately the period of sample collection, and hence are useful for different conservation purposes (Hare et al., 2011). Specifically, in species with overlapping generations, they generate an estimate of Ne that is approximately equal to Nb * generation time, where Nb is the effective number of breeding individuals in a given year. As such, short‐term estimates are more useful for estimating short‐term abundance using empirically derived estimates of the ratio of Ne/Nc, where Nc is population census size (Palstra & Ruzzante, 2008), and evaluating the degree to which genetic drift may erode the adaptive genetic potential of small populations over timescales relevant to management activities (Lynch, Conery, & Burger, 1995). Because our analyses use mixtures of age classes of individuals from each population, they may underestimate true values of contemporary Ne (Waples, Antao, & Luikart, 2014). The degree of this bias is related to the ratio of adult lifespan/generation time, which we estimate as ~4. This assumes a two‐year generation time (see above) and uses the range‐wide estimate of annual survivorship (0.67—Jones et al., 2012) to project when 95% of a hypothetical cohort of snakes would have died. Given this value, extrapolations from the trends shown in Figure 6 in Waples et al. (2014) suggest that our estimates of Ne based on LDNe may be underestimated by as much as 25%. Two other studies have recently reported contemporary effective size estimates for this species using microsatellite genetic data analyzed using the LDNe program. Both show estimates broadly similar to the values reported in this paper. Bradke et al. (2018) reported values of 29.5 (95% CI = 22.2–40.5) and 44.2 (95% CI = 29.7–73.4) for two populations in Michigan. Baker et al. (2018) reported Ne values of ~30 individuals (95% CIs ~15–47) for each of three point estimates over a 10‐year period (2002 to 2012) for the SSSP population in Illinois reported here. As described by Bradke et al. (2018), these values fall within the range of LDNe estimates found for other threatened and non‐threatened snakes (see Table 3 in Bradke et al. (2018)). A general benchmark for evaluating whether species are at short‐term genetic risk comes from the widely debated “50/500 rule” first proposed by Franklin (1980) for assessing the minimum viable size for endangered species. Briefly, the rule proposed that to avoid the deleterious effects of inbreeding depression, populations should be maintained at a minimum short‐term effective size of at least 50 individuals. The validity of 50 individuals as a specific numerical target has been much discussed (Frankham, Bradshaw, & Brook, 2014; Jamieson & Allendorf, 2012) and argued to be too small to achieve its purpose (Frankham et al., 2014). Regardless, our empirical finding that most populations of S. catenatus have a contemporary Ne of <50 suggests that these populations are potentially at risk of the deleterious effects of inbreeding depression (but see Wood, Yates, & Fraser, 2016) and that conservation planning for this species needs to take this possibility into account. Our results reinforce those of Gibbs and Chiucchi (2012) who also found little evidence of genetic costs of inbreeding based on heterozygosity–fitness correlations (HFCs) (Chapman et al., 2009). We note that the datasets used in the two studies are not independent. Many of the same individuals (and their estimates of body condition) were used in both analyses, with the difference being in the type and number of genetic loci used to estimate individual heterozygosity. We echo the cautions of Gibbs and Chiucchi (2012) that body condition is an indirect measure of individual fitness in these animals and that small sample sizes combined with small effect sizes may reduce the statistical power of both analyses to detect effects. A more direct way to test whether populations are currently experiencing significant mutational load would be future genome‐level assessments of whether deleterious mutations such as loss of functional variants (e.g., Rogers & Slatkin, 2017) occur in individuals, and whether the frequency of such variants is greater in small versus large populations (Perrier, Ferchaud, Sirois, Thibault, & Bernatchez, 2017).

Historical demography

Our analysis of historical demography suggests that range‐wide, all populations have undergone a decline in population size. However, based on point estimates, the timing of the population reduction varies: Eight populations show a decline within the past 200 years, whereas four others experienced declines >2,000 years bp. This suggests that there are two different classes of drivers for the population declines. The first operates on contemporary timescales that are coincident with the colonization and subsequent landscape modification by European settlers in North America, which has occurred within the past 200 years (Pielou, 1991; Schmidt, 1938). In contrast, the older dates correspond with large‐scale environmental changes related to climate that have previously been hypothesized to impact the distribution of this (Cook, 1992) and other species in this region of North America (Soltis, Morris, McLachlan, Manos, & Soltis, 2006). There are no obvious spatial patterns in populations showing recent versus historical declines that might suggest a broad‐scale geographic cause to such variation. Populations within the same geographic regions show either different (e.g., EHSP and SSSP) or similar (e.g., VNGO, GLAD, and JENN) timescales for declines. This lack of a pattern suggests that local conditions may play a primary role in determining the timescale of population declines. This hypothesis could be explored by incorporating a historical dimension to evaluating habitat suitability in the local area where each population is found (e.g., McClusky et al., 2018). These results are at odds with previous work that showed limited evidence for population declines in these populations using different methods of analyses and microsatellite genetic data. Specifically, Chiucchi and Gibbs (2010) used allele distribution tests implemented in Bottleneck (Luikart, Allendorf, Cornuet, & Sherwin, 1998) and only found significant results supporting recent bottlenecks in three of 16 populations. We suggest that the SFS‐based tests used here are more sensitive because they employ an explicit model‐testing framework, use a much greater amount of genetic data, and examine the possibility of declines over a broader timescale. Other studies using similar approaches have found evidence for either recent or historical bottlenecks in other endangered vertebrates (Dussex, Rawlence, & Robertson, 2015; Goossens et al., 2006; Salmona et al., 2012; Tucker et al., 2012; Zhu et al., 2013), but in general, the focus is on one or a few populations of each species. Our study differs in that it examines evidence for declines across a relatively large number of independent populations, and shows that the timing of significant declines varies across populations. This suggests that characterizing species as impacted only at historical or contemporary timescales is simplistic and that a given species may experience both types of declines. Identifying populations that have experienced significant recent declines may represent a way of prioritizing certain populations over others in terms of the impacts of anthropogenic effects on the viability of specific populations.

Projecting future levels of genetic variability

Our simulations suggest that if there are no changes in key population parameters like size and levels of migration, most Sistrurus catenatus populations will lose 20% or more of their existing variation over the next 100 years. These conclusions are based on several assumptions. First, they assume populations will remain no larger than their currently estimated effective size with no increase in migration. This is reasonable given that trends over time are for populations to remain isolated with the same or declining numbers Szymanski et al.., 2016). Second, our analyses are based on the evolutionary dynamics of what we assume is neutral variation and not the adaptive variation that is used to delineate adaptive conservation units within threatened taxa (Funk, McKay, Hohenlohe, & Allendorf, 2012). This is a common problem in conservation genetic studies in that the genes that underlie adaptive variation are difficult to identify. However, Reed and Frankham (2003) have provided evidence that the two types of variation are broadly correlated with each other. In Sistrurus, studies of adaptive variation within populations are scarce, but Jaeger et al. (2016) have documented variation in MHC loci in three populations in Illinois that showed qualitative association with variation in six microsatellite loci. In addition, Ochoa et al. (in prep.) have found evidence for polymorphism in genes associated with venom proteins. In general, these findings suggest that adaptive variation is present in existing populations of Sisturus and that high levels of drift have yet to purge this variation. However, whether the rate at which it is expected to be lost parallels the results of our simulations is unclear. Because selection generally acts to retard the impact of drift on the loss of variation within populations, we suspect that the timeline for loss of variation established by our simulations may be conservative. These results have conservation implications. They suggest that existing populations of Sistrurus may not yet have paid the true genetic cost of living in their current size population, but that this cost could be “paid” in the near future. As such, despite their history of living in small populations over historical timescales (Chiucchi & Gibbs, 2010), they may be moving to a new equilibrium with respect to gene dynamics, with issues to do with losses of adaptive genetic variation becoming more prominent in the near future. In other words, many of these populations may be poised to enter the extinction vortex (Gilpin & Soulé, 1986) but have yet to experience the fitness costs that will act in a positive feedback loop to drive populations to even smaller sizes. The simulations may also provide a timeline that could guide the timing of introductions associated with genetic rescue (Tallmon et al., 2004) in the event it is adopted as a viable management option (Ralls et al., 2017). Specifically, our simulations suggest that as a guide to preserve 50% of the existing variation in the very smallest populations (Ne = 10), new genetic variation would need to be introduced within the next 40 years, whereas for the largest populations (Ne = 50), this could be delayed for as long as >100 years. Genetic introductions have been used to increase population viability in other highly inbred snake populations through the introduction and subsequent removal of radio‐tagged males (Madsen, Shine, Olsson, & Wittzell, 1999; Madsen, Ujvari, & Olsson, 2004). We feel that a similar approach could be used to supplement variation in populations of Sistrurus, especially given the fact that this species represents a taxon where outbreeding depression is not expected to be a significant issue (Frankham, 2015).

CONFLICT OF INTEREST

None declared. Click here for additional data file.

6 in total

1. Genomic signatures of drift and selection driven by predation and human pressure in an insular lizard.

Authors: Marta Bassitta; Richard P Brown; Ana Pérez-Cembranos; Valentín Pérez-Mellado; José A Castro; Antònia Picornell; Cori Ramon
Journal: Sci Rep Date: 2021-03-17 Impact factor: 4.379

2. Comprehensive genotyping of a Brazilian cassava (Manihot esculenta Crantz) germplasm bank: insights into diversification and domestication.

Authors: Alex C Ogbonna; Luciano Rogerio Braatz de Andrade; Lukas A Mueller; Eder Jorge de Oliveira; Guillaume J Bauchet
Journal: Theor Appl Genet Date: 2021-02-11 Impact factor: 5.699

3. Bioindicator snake shows genomic signatures of natural and anthropogenic barriers to gene flow.

Authors: Damian C Lettoof; Vicki A Thomson; Jari Cornelis; Philip W Bateman; Fabien Aubret; Marthe M Gagnon; Brenton von Takach
Journal: PLoS One Date: 2021-10-29 Impact factor: 3.240

4. Intact landscape promotes gene flow and low genetic structuring in the threatened Eastern Massasauga Rattlesnake.

Authors: Nathan Kudla; Eric M McCluskey; Vijay Lulla; Ralph Grundel; Jennifer A Moore
Journal: Ecol Evol Date: 2021-05-02 Impact factor: 2.912

5. Genotyping-in-Thousands by sequencing reveals marked population structure in Western Rattlesnakes to inform conservation status.

Authors: Danielle A Schmidt; Purnima Govindarajulu; Karl W Larsen; Michael A Russello
Journal: Ecol Evol Date: 2020-06-02 Impact factor: 2.912

6. Limited gene flow and pronounced population genetic structure of Eastern Massasauga (Sistrurus catenatus) in a Midwestern prairie remnant.

Authors: Whitney J B Anthonysamy; Michael J Dreslik; Sarah J Baker; Mark A Davis; Marlis R Douglas; Michael E Douglas; Christopher A Phillips
Journal: PLoS One Date: 2022-03-24 Impact factor: 3.240

6 in total