Literature DB >> 31278891

Replicated Landscape Genomics Identifies Evidence of Local Adaptation to Urbanization in Wood Frogs.

Jared J Homola^1,2, Cynthia S Loftin³, Kristina M Cammen⁴, Caren C Helbing⁵, Inanc Birol⁶, Thomas F Schultz⁷, Michael T Kinnison¹.

Abstract

Native species that persist in urban environments may benefit from local adaptation to novel selection factors. We used double-digest restriction-side associated DNA (RAD) sequencing to evaluate shifts in genome-wide genetic diversity and investigate the presence of parallel evolution associated with urban-specific selection factors in wood frogs (Lithobates sylvaticus). Our replicated paired study design involved 12 individuals from each of 4 rural and urban populations to improve our confidence that detected signals of selection are indeed associated with urbanization. Genetic diversity measures were less for urban populations; however, the effect size was small, suggesting little biological consequence. Using an FST outlier approach, we identified 37 of 8344 genotyped single nucleotide polymorphisms with consistent evidence of directional selection across replicates. A genome-wide association study analysis detected modest support for an association between environment type and 12 of the 37 FST outlier loci. Discriminant analysis of principal components using the 37 FST outlier loci produced correct reassignment for 87.5% of rural samples and 93.8% of urban samples. Eighteen of the 37 FST outlier loci mapped to the American bullfrog (Rana [Lithobates] catesbeiana) genome, although none were in coding regions. This evidence of parallel evolution to urban environments provides a powerful example of the ability of urban landscapes to direct evolutionary processes. © The American Genetic Association 2019.

Entities: Chemical Disease Species

Keywords: Molecular adaptation and selection; RAD-seq; amphibian; anthropogenic evolution; rapid evolution; urban ecology; vernal pool

Year: 2019 PMID： 31278891 PMCID： PMC6785938 DOI： 10.1093/jhered/esz041

Source DB: PubMed Journal: J Hered ISSN： 0022-1503 Impact factor: 2.645

Urban environments are expanding worldwide. Currently, 54% of the world’s human population is located in urban areas, and the United Nations projects that the rate of urban habitation will grow to 66% by 2050 while the global human population adds 2.5 billion people (UN DESA 2014). The proportion of the landscape influenced by urban areas is also growing as cities become increasingly diffuse (Seto et al. 2010). These shifts have altered land cover in ways that decrease habitability for some species while increasing it for others, consequently reshaping both the composition and amount of biodiversity in affected regions (McKinney 2002; Piano et al. 2017). For many species that persist in the altered habitats, some level of biological adaptation is necessary for survival (Johnson and Munshi-South 2017). Recognition of contemporary evolution as a driver of adaptation over observable time scales has grown in recent decades (Hendry and Kinnison 2001; Kinnison and Hendry 2001; Kinnison et al. 2007). Increasingly, contemporary evolution is suggested as common and potentially important for species experiencing selection from rapid environmental change in urbanizing environments (Donihue and Lambert 2015; Alberti et al. 2017). Cases of directional selection linked to urbanization-specific environmental factors include the classic response of peppered moth (Biston betularia) melanin concentrations, which facilitated color shifts to better camouflage individuals against the soot-covered environments that came and receded with the industrial revolution and subsequent environmental protections (Kettlewell 1956; Cook et al. 2012), and larger beak sizes in house finches (Carpodacus mexicanus) putatively owing to the abundant availability of a nonendemic food source, sunflower seeds in bird feeders (Badyaev et al. 2008). In comparison to studies of phenotypic trait changes, few studies have examined urbanization-related changes at the genetic level by identifying allele frequency changes at loci underpinning certain traits. Harris and Munshi-South (2017) found evidence of selection on metabolic pathways potentially related to novel diets in urban ecosystems for white-footed mice (Peromyscus leucopus) based on an analysis of transcriptomic single nucleotide polymorphisms (SNPs). Low coverage whole-genome sequencing (Reid et al. 2016) and restriction-side associated DNA (RAD) sequencing (Osterberg et al. 2018) have been used to identify genetic adaptations associated with a tolerance to polycyclic aromatic hydrocarbons associated with urban pollution in killifish (Fundulus heteroclitus). In addition to presenting strong selection, urban environments can affect the genetic variation of populations in other ways, including reductions in gene flow, increases in genetic drift, and losses of genetic diversity (reviewed in Johnson and Munshi-South 2017). Given these expectations for the consistent presence of such effects across urban landscapes, it may be possible to detect generalizable species-specific genetic responses to urban environments. Landscape genomics provides a useful framework for understanding how heterogeneous landscapes, such as those associated with urbanization, influence spatial arrangements of adaptive genetic diversity (Schoville et al. 2012; Petren 2013; Storfer et al. 2015). Genome scans can be separated into 2 broad categories: those that detect selection based on genetic differentiation (e.g., FST outlier test) and those that evaluate associations between genetic and environmental variation (Lewontin and Krakauer 1973; Günther and Coop 2013; Rellstab et al. 2015). However, because they are correlative, both methods can lead to potential spurious inference if additional environmental factors vary with the gradient of interest. One potentially powerful way to overcome this limitation is to study instances of parallel evolution (Conte et al. 2012; Stern 2013). Simulations comparing various landscape genomic sampling approaches confirm that paired-replicate designs that sample across parallel adaptive gradients provide optimal power when using genome scans to detect selection (Lotterhos and Whitlock 2015). Replication has thus emerged as a recommended approach in landscape genomics as the field shifts from exploratory to more hypothesis-driven approaches (Hand et al. 2015; Rellstab et al. 2015; Storfer et al. 2018). Empirical applications are still uncommon, but include identification of genes tied to adaptation to freshwater environments in threespine stickleback (Gasterosteus aculeatus; Jones et al. 2012) and genes tied to adaptation to high elevation hypoxia in humans (Foll et al. 2014). Replication is not without challenges, however. While population replicates are typically chosen based on a focal environmental selective factor of interest, natural populations almost always differ in demographic attributes that can influence the capacity to detect signals of selection. For instance, differing effective population sizes or gene flow rates across replicates can cause populations to respond differently to a similar selective factor. Analytical strategies taken to detect genes under shared selection must somehow account for, or be robust to, such complications. This can be achieved with a replicated study design by standardizing observed selective effects across replicates and consequently interpreting locus-specific signals of selection based on their relative magnitudes rather than their absolute values. The objective of this research was to use a paired-replicated sampling approach to evaluate whether urban environments induce parallel evolution in wood frogs (Lithobates sylvaticus). Life history and behavioral characteristics such as short generation times (2–3 years; Berven 1990; Sagor et al. 1998), small home ranges (Berven and Grudzien 1990; Groff et al. 2017), and high rates of philopatry (Berven and Grudzien 1990) make wood frogs excellent candidate species for detecting evolutionary change over short time scales. Moreover, previous research suggests urban-associated landscape features such as roadways (Hall et al. 2017; Brady et al. In press) and open tree canopies (Skelly et al. 2014) can influence wood frog fitness-related characteristics such as growth and fecundity rates, potentially as a result of, or a precursor to selection. In light of previous research on the evolutionary consequences of urbanization in other species, we tested 2 hypotheses: 1) genome-wide genetic diversity would be less in urban relative to rural populations and 2) signals of selection associated with urbanization would be detectable. These 2 indicators of contemporary evolution can be helpful in understanding the capacity of wild populations to evolutionarily respond to anthropogenic selection factors.

Materials and Methods

Study Design and Sample Collection

Our sampling sites were chosen to achieve a paired site study design that included 4 replicates, each consisting of one rural and urban site in close geographic proximity in central and southern Maine (Figure 1). Populations are referenced using notation that combines their replicate number and rural/urban categorization (e.g., 1R, 1U, 2R, 2U, etc.). The nearest replicates (18.17 km between replicated 2 and 3) were spaced at a distance of >14 times the mean dispersal distance for wood frogs (1169 m ± 351 SD; Berven and Grudzien 1990), and within replicate distances averaged 6.25 km (range: 2.82–9.56 km). We selected this design to minimize differences in evolutionary history within replicates while maximizing differences among replicates, as suggested by Lotterhos and Whitlock (2015).

Figure 1.

Map of wood frog sampling sites located in Maine, USA. The replicated paired sampling design includes 4 rural (R1–R4) and 4 urban (U1–U4) sites, with sample collection site indicated by a white dot. Following initial selection based on apparent environmental characteristics, sampling sites were quantitatively evaluated for their conformance to a rural/urban dichotomy. Sites were characterized based on 4 landscape characteristics: 1) percent impervious surface within 1 km as determined with the Maine Department of Inland Fisheries and Wildlife 2016 impervious surface data layer (Jason Czapiga, Maine Department of Inland Fisheries and Wildlife); 2) percent canopy cover within 1 km, which was measured using the 2011 National Land Cover Database (NLCD; Homer et al. 2015); 3) distance to the nearest paved roadway, determined using the State of Maine’s NG911 roads dataset (https://geolibrary-maine.opendata.arcgis.com/datasets#data); and 4) percent of landscape within 1 km that is categorized in any of the 4 “developed” categories by the 2011 NLCD. We used ArcGIS 10.2 (ESRI, Redlands, CA) for processing all spatial data layers. Overall differences in urban and rural site characteristics were evaluated with principal component analysis (PCA) and a permutational multivariate analysis of variance (PERMANOVA) using the “adonis” function in the R package vegan (Oksanen et al. 2017). Wood frogs used in our analyses were collected as embryonic individuals from vernal pools in April and May of 2014, 2015, and 2016. We collected one individual per egg mass to avoid sampling full siblings, which can bias some genetic analyses (Rodríguez-Ramilo and Wang 2012; Peterman et al. 2016; but see Waples and Anderson 2017). Microsatellite genotypes collected for a concurrent study (Homola 2018) were used to perform sibship reconstruction using Colony (Wang 2004; Jones and Wang 2010) to ensure no siblings were included in analyses.

Genomic Data Collection and Analysis

Genomic DNA was extracted from whole embryos using a Qiagen DNEasy kit following the manufacturer’s guidelines. Sample quality and concentration were evaluated using a 1% agarose gel and a Qubit fluorometer using the dsDNA Broad Range Assay Kit (v1.0, Invitrogen/Life Sciences, Norwalk, CT). Double-digest RAD (ddRAD) libraries were built following a protocol modified from Peterson et al. (2012). First, we digested 100 ng of DNA for each sample using SbfI-HF and MspI restriction enzymes. We then ligated barcoded P1 adapters to SbfI cut sites and a P2 adapter to MspI sites before pooling 32 samples for subsequent amplification and sequencing. DNA underwent purification and size selection using 1.5× volume Sera-Mag SpeedBeads magnetic particles at multiple points following digestion, ligation, and amplification. In total, 3 ddRAD libraries with 32 individuals each were sequenced at the Duke Center for Genomic and Computational Biology (Durham, NC) using an Illumina HiSeq2000 to collect 50 bp single end reads. Genotyping was performed using Stacks v1.32 (Catchen et al. 2011). We used the process_radtags module to demultiplex reads using their barcodes, remove reads below the default quality threshold of Phred score <10 (Catchen et al. 2011), and removed reads without the SbfI cut site. Next, we used the denovo_map.pl script to run the Stacks pipeline, including removing highly repetitive RAD tags before loading the resulting catalog into a MySQL database with the load_radtags.pl script. We performed a sensitivity analysis on the Stacks assembly parameters that dictate the number of identical, raw reads required to create a stack (m), the number of mismatches allowed between loci when processing a single individual (M), and the number of mismatches allowed between loci when building the catalog (n; Paris et al. 2017; Shafer et al. 2017). We evaluated the default (m: 3, M: 2, n: 1) and neighboring parameter values and selected the combination of values that assembled an intermediate number of loci to minimize likelihoods of including loci assembled erroneously due to paralogous sequences, over-splitting of genomic sites into separate loci, or sequencing error (Paris et al. 2017). After creating the catalog of loci with all genotyped individuals, we used the populations module in Stacks to remove loci that were not genotyped for at least 9 individuals in every population, had a minor allele frequency less than 0.05, and with a stack depth less than 10. The first SNP in each RAD locus was written to a VCF file for downstream analyses. We refer to loci using the Stacks naming convention that provides the arbitrarily designated locus number followed by an underscore and the SNP’s position within the locus (e.g., 24302_14 references the 14th position of the 24 302 assembled locus).

Neutral Genetic Structure and Genetic Diversity

We used discriminant analysis of principal components (DAPC; Jombart et al. 2010) executed in the adegenet R package (Jombart 2008) to characterize the overall genetic differentiation among the study populations. DAPC summarizes genetic data using a PCA before conducting a discriminant analysis on retained principal components to create K genetic clusters that minimize variation within groups and maximize it among groups. This approach provides a means of evaluating whether our sampling design achieved our goal of more similarity in evolutionary history within replicates than among them. For instance, we expected to find each replicate population pairing placed into its own cluster that is unique from the other clusters in a K = 4 scenario and clear population-level differentiation using a K = 8 scenario. We also used DAPC to estimate individual reassignment probabilities as an indicator of our ability to differentiate among various grouping levels (environment type, population, and replicate). In addition to DAPC, we examined overall genetic differentiation among populations using pairwise FST values (Nei and Chesser 1983) calculated with the “genetic_diff” function in the R package vcfR (Knaus and Grünwald 2017). Differences in genetic diversity between urban and rural sites were quantified based on observed heterozygosities and nucleotide diversities after removing 37 loci with evidence of selection (see below). We constructed empirical cumulative distribution functions for each genetic diversity measure for each SNP. These distributions were compared between the rural and urban populations using the nonparametric Kolmogorov–Smirnov test (Massey 1951), which provides a test statistic D that quantifies the maximum absolute difference between each empirical cumulative distribution function, thereby providing a measure of effect size that is bound from 0 to 1.

Detection of Loci under Selection

We used an outlier approach for identifying loci showing signals of selection. The relatively isolated nature of vernal pools used for breeding by wood frogs promotes hierarchical genetic structuring patterns across their populations, providing an excellent scenario for detecting selection via outlier analyses (Berven and Grudzien 1990; Semlitsch 2008). Genetic differentiation (FST) was estimated for each locus within each replicated population pairing using the “genetic_diff” function in vcfR (Knaus and Grünwald 2017). Because the degree of neutral genetic variation differs among replicates, it is impossible to equitably identify a locus as an outlier across replicates based on FST values alone. Therefore, we standardized measures of differentiation across replicates by calculating a percentile for each locus based on the observed range of FST values within each replicate. For instance, an FST value of 0.2 could be in the 96th percentile (top 4 percent of FST values) for one replicate, while in the 78th percentile for another replicate with a broader range of values. Because we are interested in loci with consistently elevated FST values across replicates, we quantified the observed percentile variation by calculating the standard error (SE) of the percentiles for all 4 replicates for each locus. Based on the observed distribution of mean and SE of percentiles for each locus (e.g., the inflection point when plotting loci in order of increasing FST values), we chose to designate loci with a lower SE boundary above the 20th percentile as putative outliers with consistent signals of directional selection. We used 2 multivariate techniques to assess the strength of associations between the set of outlier loci and the urban-rural dichotomy. First, principal coordinate analysis (PCoA) was used to visualize the separation of rural and urban individuals based on genotypes at outlier loci. Next, we used DAPC to evaluate clustering of individuals into rural or urban populations in a K = 2 scenario. Additionally, we used group assignment probabilities as a measure of confidence in the ability of the outlier loci to naively reassign samples to either a rural or urban designation. Finally, a locus-specific loadings plot was examined to identify the specific loci most influential in separating individuals into rural or urban clusters. While the FST outlier analysis can identify signals of selection based on genetic differentiation, it does not account for whether selection is occurring in the same direction across all replicates (e.g., one genotype being consistently favored in an urban environment). Therefore, we also examined associations between individual loci and the rural/urban variable using logistic regression. Our approach is analogous to a case-control genome-wide association study (GWAS), wherein SNP allele frequencies are evaluated for associations with a dichotomous variable using logistic regression (Bush and Moore 2012). For this GWAS analysis, genotypes were recoded with the count of reference alleles to be reflective of an additive genetic model (e.g., AA = 0, AG = 1, GG = 2). The logistic regression was performed using the “glm” function in R and included environment type as the response and genotype and replicate number as predictors. We adjusted alpha levels using the false discovery rate (FDR) approach to account for multiple testing (Benjamini and Hochberg 1995). Loci identified as outliers were aligned to the draft North American bullfrog (Rana [Lithobates] catesbeiana) genome, version 3.1, for assignment of putative functional annotations (Hammond et al. 2017). Alignments were performed using the Bowtie short read aligner (Langmead et al. 2009) and considered valid if they contained 3 or fewer reportable alignments and had 3 or fewer mismatched bases. BEDTools “closest” was used to identify the nearest 2 annotated regions to wood frog outlier loci that aligned to reference scaffolds containing annotations (Quinlan and Hall 2010; Quinlan 2014).

Results

Site Characteristics and Genotyping

Sampled sites characterized as urban had greater nearby percent impervious surface, less canopy coverage, greater percent developed landscape, and shorter distance to the nearest paved road (Table 1). PCA indicated clear separation of the sites based on their urban/rural designation using the first 2 axes, which together explained 96% of the variation (Figure 2). Differentiation of the sites based on the measured urbanization-associated variables was confirmed with PERMANOVA (r2 = 0.80; P = 0.03).

Table 1.

Site locations and characteristics for 8 sampled wood frog populations

Replicate	Urban/rural	Latitude	Longitude	% Impervious	% Canopy cover	Dist. to road (m)	% Developed
1	Rural	44.8876	−68.7826	5	60	417	8
2	Rural	43.9355	−70.0082	3	59	358	10
3	Rural	43.8005	−70.1331	4	46	172	27
4	Rural	43.2808	−70.6904	3	75	275	8
1	Urban	44.8021	−68.7886	48	22	39	98
2	Urban	43.9168	−69.9847	23	41	83	64
3	Urban	43.8123	−70.1767	19	48	57	52
4	Urban	43.3199	−70.5947	18	59	158	43

Figure 2.

Plot of first 2 principal component axes that explain a total of 95.97% of environmental variation for 4 rural and 4 urban wood frog sampling sites based on 4 urbanization-associated characteristics. Factor loadings are provided in the inset table.

Site locations and characteristics for 8 sampled wood frog populations Plot of first 2 principal component axes that explain a total of 95.97% of environmental variation for 4 rural and 4 urban wood frog sampling sites based on 4 urbanization-associated characteristics. Factor loadings are provided in the inset table. We successfully sequenced 12 individuals from each population. Our Stacks sensitivity analysis indicated that variations in the m parameter led to the greatest fluctuations in the total number of loci assembled (Table 2). We selected the parameters values of m: 5, M: 4, and n: 2, which resulted in 1 167 060 loci, of which 182 316 loci contained at least one SNP. These parameter values produced an intermediate number of loci that should reduce the likelihood of erroneously over-merging paralogous loci or splitting single genomic sites into multiple RAD loci. Following filtering implementing in the populations module (as described in the Methods), we retained 8344 polymorphic loci for analyses. Individual sequencing depths averaged 29.2× and ranged from 9.3× to 65.3×.

Table 2.

Results of sensitivity analysis on 3 assembly parameters in Stacks

Parameters			Total loci	Loci w/ SNPs (%)
m ^a	M ^b	n ^c
7	4	1	835 153	146 972 (17.6)
5	4	2	1 167 060	182 316 (15.6)
5	2	2	1 199 056	171 016 (14.2)
3	4	2	1 687 085	234 005 (13.9)
3	2	2	1 725 367	223 881 (12.9)
3	2	1	1 833 950	198 692 (10.8)

Selected parameter settings for analysis are indicated in bold.

aMinimum number of identical, raw reads required to create a stack.

bNumber of mismatches allowed between loci when processing a single individual.

cNumber of mismatches allowed between loci when building the catalog.

Results of sensitivity analysis on 3 assembly parameters in Stacks Selected parameter settings for analysis are indicated in bold. aMinimum number of identical, raw reads required to create a stack. bNumber of mismatches allowed between loci when processing a single individual. cNumber of mismatches allowed between loci when building the catalog. As expected, the strength of genetic structuring varied within and among replicates. Structuring between replicates was strongly tied to geographic proximity. A K = 4 scenario showed replicates 1 and 4 to be clearly differentiated from the other replicates, whereas replicates 2 and 3 overlapped, reflecting their geographic proximity (Figures 1 and 3a). Replicate-based reassignment probabilities were strong, with replicate 3 at 95.8% and the others at 100%. A K = 8 scenario revealed considerable genetic distance between the 2 populations in replicate 1, whereas the populations within replicates 2 and 3 tended to group together, while those of replicate 4 in were in separate, but proximal clusters (Figure 3b). Despite relatively short genetic distances between some populations, individual reassignment probabilities remained high with all populations at 100% except 2R and 2U (91.7%) and 3R (83.3%). A Bayesian information criterion analysis (Jombart et al. 2010) used to gauge relative support for various numbers of clusters indicated strongest support for K = 4 and K = 5 scenarios (Supplementary Figure 1). Pairwise FST values were generally congruent with the K = 8 DAPC scenario, showing the most within-replicate differentiation exists in replicate 1, which is greater than some between-replicate FST values (Table 3).

Figure 3.

Results of discriminant analysis of principal components (DAPC) to analyze the neutral and putatively adaptive genetic variation for 8 wood frog populations (N = 12 each). Analyses included 8307 neutral (nonoutlier) loci, and individuals are coded for their (a) population and (b) replicate and population of origin. Panel c shows the first discriminant axis for all 96 individuals across the 37 loci with evidence of selection plotted based on their rural (blue) and urban (red) cluster assignments, and panel d shows the locus-specific loading plot of the analysis depicted in panel c. See online version for full color.

Table 3.

F ST values from each of 4 paired replicates that include 1 urban and 1 rural wood frog sampling site

		Rep. 1		Rep. 2		Rep. 3		Rep. 4
		Urban	Rural	Urban	Rural	Urban	Rural	Urban	Rural
Rep. 1	Urban	—	0.041	0.046	0.048	0.053	0.053	0.061	0.060
	Rural		—	0.039	0.040	0.045	0.045	0.054	0.052
Rep. 2	Urban			—	0.028	0.033	0.033	0.038	0.037
	Rural				—	0.033	0.032	0.036	0.036
Rep. 3	Urban					—	0.034	0.040	0.038
	Rural						—	0.040	0.038
Rep. 4	Urban							—	0.028
	Rural								—

Bold values highlight within-replicate FST values.

F ST values from each of 4 paired replicates that include 1 urban and 1 rural wood frog sampling site Bold values highlight within-replicate FST values. Results of discriminant analysis of principal components (DAPC) to analyze the neutral and putatively adaptive genetic variation for 8 wood frog populations (N = 12 each). Analyses included 8307 neutral (nonoutlier) loci, and individuals are coded for their (a) population and (b) replicate and population of origin. Panel c shows the first discriminant axis for all 96 individuals across the 37 loci with evidence of selection plotted based on their rural (blue) and urban (red) cluster assignments, and panel d shows the locus-specific loading plot of the analysis depicted in panel c. See online version for full color. We identified modest differences in genetic diversity between rural and urban populations. Although less diversity was observed for urban populations, effect sizes were small for both observed heterozygosity (D = 0.015; P < 0.001) and nucleotide diversity (D = 0.018; P < 0.001; Supplementary Figure 2). No clear patterns were observed when comparing rural and urban sites based on overall percent of polymorphic loci, percent of private alleles, and average frequency of the major allele (Table 4).

Table 4.

Replicate	Urban/rural	Sites^a	% Polymorphic^b	% Private^c	P ^d	H _o ^e	π ^f
1	Rural	14779	0.489	0.028	0.874	0.163	0.163
2	Rural	12813	0.558	0.043	0.870	0.161	0.180
3	Rural	10026	0.526	0.047	0.870	0.161	0.175
4	Rural	14669	0.630	0.071	0.851	0.187	0.201
1	Urban	14821	0.453	0.019	0.847	0.157	0.155
2	Urban	13502	0.558	0.039	0.868	0.167	0.181
3	Urban	14413	0.547	0.040	0.866	0.170	0.180
4	Urban	14661	0.601	0.052	0.854	0.181	0.194
Mean	Rural	13071 ± 1111	0.551 ± 0.030	0.047 ± 0.009	0.867 ± 0.005	0.168 ± 0.006	0.180 ± 0.008
	Urban	13082 ± 1116	0.542 ± 0.037	0.045 ± 0.011	0.860 ± 0.006	0.167 ± 0.007	0.178 ± 0.009

aNumber of polymorphic nucleotide sites.

bPercent of polymorphic sites.

cPercent of variable sites with unique alleles to each population.

dAverage frequency of the major allele.

eAverage observed heterozygosity per locus.

fAverage nucleotide diversity.

Genetic diversity measures for 12 individuals sequenced at each site for nucleotide positions present in at least 6 of 8 wood frog populations, including mean ± standard error for each measure within rural and urban categories aNumber of polymorphic nucleotide sites. bPercent of polymorphic sites. cPercent of variable sites with unique alleles to each population. dAverage frequency of the major allele. eAverage observed heterozygosity per locus. fAverage nucleotide diversity.

Loci under Selection

We identified 37 outlier loci using our percentile-based outlier approach for finding loci with consistently elevated genetic differentiation among replicates (Figure 4; Supplementary Figure 3). In several cases, other loci were found to have high mean FST percentile values; however, they were not considered outliers because their wide SE bounds indicated they were not outliers consistently across replicates. Using the 37 outlier loci, PCoA (Supplementary Figure 3) and DAPC indicated separation of individuals into rural and urban groups (Figure 3C), and DAPC provided correct reassignment of 87.5% of rural samples and 93.8% of urban samples. High loading values for 4 loci suggested particularly strong associations with the rural/urban environment type (Figure 3D). Fifteen of the 37 outlier loci mapped to the American bullfrog genome, including 3 that aligned to scaffolds containing annotations associated with hypothetical proteins that are similar to 40A ribosomal protein S21, thyroglobulin, and brain-specific homeobox/POU domain protein 3 (Supplementary Table 1). Without a linkage map it is not possible to evaluate whether other loci with evidence of selection are linked to coding regions or to each other; however, they all mapped to different scaffolds, reducing the likelihood of physical linkage among them.

Figure 4.

Distribution of (a) mean FST percentiles for 8344 wood frog loci with standard error bars depicting variation among 4 population replicates and (b) the same mean percentile values for each locus plotted based on their ranked order and showing a sharp increase for a subset of loci with relatively high average FST values. Points and bars with black coloration indicate loci with high mean FST percentiles and low variation among replicates, suggesting evidence of selection. Logistic regression GWAS analyses identified significant urbanization associations for 337 of the 8344 tested loci, including 12 of the 37 identified using our percentile-based outlier approach. Based on joint probability expectations, it is highly unlikely (binomial test, P < 0.001) that 12 loci would overlap between these 2 tests by chance. Three of the 4 loci with high DAPC loadings were also found to be significant in the GWAS analyses. However, support for associations identified using GWAS is only marginal because none of the 337 loci reached the threshold for statistical significance (q value of 0.05) following FDR corrections. The nonsignificance of these results is likely associated with the relatively small sample sizes used in our analyses (Visscher et al. 2017); however, this does not preclude the utility of the GWAS analysis for evaluating consistency in the direction of selection for specific loci. Figure 5 illustrates genotype frequencies for the 12 loci with P <0.05 in logistic regression GWAS models and consistently elevated FST values among environment types. Although differences between rural and urban environments are evident for all 12 loci, changes in genotype frequency are especially pronounced in 3 cases (Figure 5). For 12782_12, the T/T genotype is nearly fixed in urban sites, whereas the locus is more diverse in rural sites. Similarly, 67705_12 displays a shift toward fixation for the G/G allele in the urban environment. Alternatively, 71089_22 is almost fixed for C/C in the rural environments, whereas C/T and C/C are more evenly proportioned in the urban setting. Allele frequency plots for each of these 12 loci are provided in Supplementary Figure 4.

Figure 5.

Variation in genotype frequency for 12 loci with evidence of selection due to the influence of urbanization based on FST outlier and GWAS analyses. The top and bottom of each box are the 25th and 75th percentiles, the band within each box is the median value, whiskers represent overall variation, and points are possible outliers beyond 1.5× observed interquartile range.

Discussion

Urbanization Influences Wood Frog Evolution

We detected genetic evidence of parallel directional selection associated with urbanization in wood frogs. With parallel evolutionary patterns unlikely to arise by chance (Samis et al. 2012; Santangelo et al. 2018), this outcome is strong evidence for the presence of environmentally driven selection (Hohenlohe et al. 2010; Reid et al. 2016). Comparisons of genotype frequencies between rural and urban environments for the 12 loci with evidence of selection based on both the FST outlier and GWAS analyses provide insight into the possible mode of selection. For instance, the shift from high genotypic variation in rural settings to a strongly favored genotype in urban settings for 12782_12 and 67705_12 suggests directional selection on standing genetic variation. In other cases, a clearly predominant genotype in the rural environment shifted to an intermediate frequency in urban settings, a pattern most clearly observed in loci 71089_22 and 20064_11, where balancing selection may be occurring, potentially associated with a release from a selective constraint present in the rural environment. These scenarios of apparent balancing selection may also be explained by ongoing directional selection. Standing genetic variation is generally expected to be a greater contributor to rapid adaptation than the introduction of novel alleles via mutation or gene flow (Barrett and Schluter 2008), a premise that is supported by our analyses, which did not identify loci that were fixed for a certain genotype in either the rural or urban environments. Because we cannot eliminate the possibility that some of our loci are linked to one another, nor assume that we have sampled every linkage group (Lowry et al. 2017), the number of candidate loci we identified should not be considered a proxy of the strength or ubiquity of selection across the genome of wood frogs in these populations. Environmental features associated with urbanization may increase the likelihood of acute selection factors generating the observed changes in genotype frequencies. For example, habitat fragmentation is commonly seen in urban environments (Zipperer et al. 2012; Leonard et al. 2017) and can result in decreases in gene flow as detected in a concurrent study for the wood frog populations described herein (Homola 2018), as well as other urban vernal pool amphibians (Cox et al. 2017). These increases in population isolation can shift the genetic foundation of traits under selection from highly polygenic to a simpler basis (Yeaman and Whitlock 2011), increasing the likelihood of finding strong single locus effects. Furthermore, recent isolation can disrupt migration-selection equilibrium, increasing the effects of selection, particularly when environmental differences become more dramatic between populations with diminished gene flow (Savolainen et al. 2013). In scenarios where gene flow may still be present, reductions in habitat quality associated with urbanization may lead to asymmetrical gene flow (i.e., source-sink dynamics) if urban populations are less productive. Such a scenario was traditionally thought to hamper local adaptation due to swamping of low frequency adaptive alleles in the sink population (reviewed in Lenormand 2002). However, recent empirical genomics work suggests local adaptation in the presence of gene flow is more common than theory predicts, even for asymmetrical gene flow scenarios (Tigano and Friesen 2016). Previous research on the effects of urbanization on wood frogs and other vernal pool amphibians provides some potential insights into the specific urban-related selection factors that may be causing the observed patterns. Roadside populations of wood frogs have been observed to have larger body sizes, higher fecundity, and better jumping performance than their nearby woodland counterparts (Brady et al. In press). These indicators of greater fitness in roadside populations is juxtaposed by other observations of reduced growth rates for larval wood frogs from these environments (Hall et al. 2017), suggesting road-associated effects may be variable across wood frog populations or life history stages. Brady (2012) found evidence of spotted salamanders (Ambystoma maculatum), a pool breeding amphibians that is often sympatric to wood frogs, adapting to runoff of salts used for deicing roadways into their vernal pool breeding habitats. They used reciprocal transplants to evaluate the survival of embryonic individuals between woodland and roadside environments, finding 25% greater survival for individuals that originated in a roadside pool. Road salts have also been associated with altered prey behavior and reduced growth in spotted salamanders (Petranka and Francis 2013). Our urban sites were consistently much closer to roadways than the rural sites, suggesting similar selection factors may be driving our observations. We also documented reduced canopy cover in the areas immediately surrounding the urban vernal pools. Declines in canopy cover have been associated with increased growth rates for wood frogs (Skelly et al. 2005; Schiesari 2006). Coupled with increases in water temperature and associated alterations in biotic communities in vernal pools with less tree cover (Skelly et al. 2014), a reduction in canopy coverage associated with urbanization is a plausible contributor to the observed selection patterns. Locations of some SNPs with evidence of selection in or near annotated homologous regions in American bullfrog provides some support for these selective patterns. RAD locus 63867 is in and near genes potentially associated with thyroglobulin, which is a thyroid hormone precursor, and thyroid pathways may be targets for road salt effects (Tollefsen et al. 2015). Locus 75170 is located near a putative brain-specific homeobox associated gene, which has been linked to photoreceptor cell development in Xenopus (D’Autilia et al. 2010) and zebrafish (Danio rerio; Schredelseker and Driever 2018). Likewise, selection on some photoreceptor genes might be expected under conditions of altered canopy and light regimes in urban settings (Skelly et al. 2014). We detected lesser amounts of genetic diversity in the urban populations, although the differences detected may have stronger statistical than biological significance. Our D values were all <0.02, indicating little appreciable difference between cumulative density functions and statistical significance likely inflated by the large sample of loci. While this difference in our populations is modest, a reduction in urban genetic diversity is logical for our populations given evidence of population size reductions (Rubbo and Kiesecker 2005; Skidds et al. 2007; Clark et al. 2008; Veysey et al. 2011) and increases in population isolation (Homola 2018) for urban vernal pool amphibians, which could diminish genetic diversity by amplifying the effects of genetic drift while reducing the effects of gene flow. Less genome-wide genetic diversity for urban relative to rural populations has been found in several species, including burrowing owls (Athene cunicularia; Mueller et al. 2018), great tits (Parus major; Perrier et al. 2018), and white-footed mice (P. leucopus; Munshi-South et al. 2016). The small difference in genetic diversity between urban and rural populations in our study may be associated with a time lag effect, with a greater disparity between habitat types possibly being observable in the future (Landguth et al. 2010; Epps and Keyghobadi 2015).

Advantages and Challenges of Replicated Landscape Genomics Sampling

Inferences from landscape genomics studies can be substantially strengthened when replicated study sites are well-distributed geographically. Our research benefited from replication in 3 ways. First, the inclusion of replicates provides a level of protection against high false positive (type I error) rates that occur when using genome scans for selection (Lotterhos and Whitlock 2014; Storfer et al. 2015). For instance, we observed several loci with high mean FST values within particular replicate-pairs that were not identified as universal outliers, suggesting these idiosyncratic cases of elevated differentiation were not consistently associated with general features of urbanization (Figure 4). Second, even when loci are accurately identified as experiencing selection, the ability to detect consistent changes in genotype frequency across multiple replicates reduces the possibility that the observed pattern is caused by an unmeasured environmental influence, population structure, or genetic drift. For example, several population pairs consistently experiencing genetic drift in the same direction are much less likely than shared selection compelling a certain genotype toward fixation. Rare scenarios, such as sampling population pairs occurring along a range expansion front with a shared colonization source, may cause common alleles to drift to fixation across separate populations; however, that scenario is unlikely for our study, where populations occur well within the species’ range boundary. Within-study replication also aids in the generalizability of results, providing confidence that a hypothesized environmental effect is apt to induce similar evolutionary processes in populations other than those included in analyses. An absence of a signal of selection in a replicated landscape genomics study, however, does not necessarily mean selection is not occurring. Some aspects of local adaptation could evolve through alternative genomic pathways, allowing the same adaptive phenotype to be developed via different genetic architectures (Ralph and Coop 2015; Bernatchez 2016). For these traits, a replicated study design may overlook important adaptive processes. Our percentile-based approach for identifying outlier loci was necessary because most popular genome scan approaches do not explicitly account for paired or replicated sampling schemes (but see Foll et al. 2014). For example, Flanagan and Jones (2017) identified limitations of the FST heterozygosity approach common to programs such as FDIST2 (Beaumont and Nichols 1996) and Lositan (Antao et al. 2008) when only 2 populations are sampled. They noted that FST values are constrained by the value of HT (i.e., global heterozygosity; Wright 1943), which is inherently limited when biallelic loci are analyzed in only 2 populations (Jakobsson et al. 2013). This can lead to an inaccurate estimation of confidence intervals and erroneous identification of outlier loci. OutFLANK (Whitlock and Lotterhos 2015), an approach that uses trimmed FST distributions to identify outlier loci, is inappropriate for 2 population scenarios owing to a lack of power to detect signals of selection. Replicated sampling can be problematic in other software such as PCAdapt (Luu et al. 2017), an outlier analysis that is commonly used for detecting evidence of selection and that uses an approach that assumes loci most strongly related to population structure are associated with local adaptation. This approach is challenging under replicated sampling, because genetic structure may be driven by unique components of selection, genetic drift, and gene flow in different replicates due to intrinsic characteristics of the populations (e.g., effective populations size) and their landscape context (e.g., gene flow permeability), making comparisons across replicates inequitable. We selected our sampling sites with a goal of achieving replicated pairings of sites, with greater divergence expected between replicates than within replicates. Such a sampling scheme should maximize the effects of evolutionary history within replicates, increasing the odds that apparent selection is related to urbanization (Lotterhos and Whitlock 2015). Despite achieving this desired spatial arrangement of sites, there was more genetic divergence within replicate 1 than between replicates 2 and 3 (Figure 3b). Although this did not preclude our ability to detect selection associated with urbanization, it provides an important reminder of the potential incongruity between geographic proximity and genetic structure when designing a replicated landscape genomic study. When vast differences in genetic similarity are present within replicates, demographic processes can obscure signals of selection (Lotterhos and Whitlock 2014; de Villemereuil et al. 2014). However, with sufficient replication, the nontarget noise associated with demographic processes likely can be minimized. The importance of choosing an appropriate spatial scale for a paired study design is also highlighted by Lotterhos and Whitlock (2015), who note the relevance of choosing a spatial scale within each pair that is reflective of gene flow-selection balance; however, without extensive prior study of the focal populations, this is unknown. Substantial disparities in gene flow-selection balance among replicates may result in difficulty in detecting consistent signals of selection if strong gene flow within a subset of replicates is able to swamp locally advantageous alleles that are selectively favored. Consistently high FST values for several loci in our study suggests similar gene flow-selection relationships likely exist among our analyzed replicates.

Future Directions and Conclusions

This study provides an empirical example of the capacity of anthropogenic landscape alterations to influence evolutionary processes in wild populations while describing an approach for detecting evidence of parallel evolution with a replicated sampling design. Landscape genomics provides a valuable set of tools for discovering the basis and limitations of evolutionary adaptation; however, study design and interpretation of results deserve careful consideration. Studies geared toward detecting parallel evolution provide a promising avenue for future research into the consequences of common habitat alterations, such as those observed with urbanization. More complex model-based approaches for detecting parallel evolution will likely provide a more nuanced understanding of these processes (Bailey et al. 2018; Santangelo et al. 2018). Additionally, continued efforts to produce annotated genomes for nonmodel species, such as wood frogs, will provide the foundation for a more precise understanding of where selection is occurring within genomes and importantly, which phenotypic traits are being influenced. The coming decades will continue to be defined by profound environmental changes that are fundamentally affecting ecosystems. Urbanizing landscapes are often assumed to represent degraded environments that are incompatible with wildlife. While urbanization has led to the repeated extirpation of some species (Er et al. 2005; Dolan et al. 2011; Aronson et al. 2014), others will likely adapt to these environmental changes. A consideration of the contributions of rapid evolution to the adaptive capacity of species alongside other processes such as range shifts and phenotypic plasticity will likely improve the efficacy of biodiversity conservation efforts (Ashley et al. 2003; Hendry et al. 2011).

Funding

This research was supported by the National Science Foundation (BCS-1313627). This project was also supported by the USDA National Institute of Food and Agriculture, Hatch (or McIntire-Stennis, Animal Health, etc.) project number #ME0-31706 through the Maine Agricultural & Forest Experiment Station, publication number 3675. Maine Department of Inland Fisheries and Wildlife provided support though the Cooperative Agreement with the US Geological Survey Maine Cooperative Fish and Wildlife Research Unit. J.J.H. was supported by US National Science Foundation Adaptation to Abrupt Climate Change IGERT program (DGE-1144423); and the University of Maine’s Janet Waldron Doctoral Research Fellowship. Click here for additional data file.

71 in total

1. An introduction to microevolution: rate, pattern, process.

Authors: A P Hendry; M T Kinnison
Journal: Genetica Date: 2001 Impact factor: 1.082

2. Reliable Detection of Loci Responsible for Local Adaptation: Inference of a Null Model through Trimming the Distribution of F(ST).

Authors: Michael C Whitlock; Katie E Lotterhos
Journal: Am Nat Date: 2015-09-15 Impact factor: 3.926

3. Adaptation from standing genetic variation.

Authors: Rowan D H Barrett; Dolph Schluter
Journal: Trends Ecol Evol Date: 2007-11-14 Impact factor: 17.712

4. Constraints on the FST-Heterozygosity Outlier Approach.

Authors: Sarah P Flanagan; Adam G Jones
Journal: J Hered Date: 2017-07-01 Impact factor: 2.645

5. Purging putative siblings from population genetic data sets: a cautionary view.

Authors: Robin S Waples; Eric C Anderson
Journal: Mol Ecol Date: 2017-02-06 Impact factor: 6.185

Review 6. Genomics of local adaptation with gene flow.

Authors: Anna Tigano; Vicki L Friesen
Journal: Mol Ecol Date: 2016-04-06 Impact factor: 6.185

7. DISPERSAL IN THE WOOD FROG (RANA SYLVATICA): IMPLICATIONS FOR GENETIC POPULATION STRUCTURE.

Authors: Keith A Berven; Thaddeus A Grudzien
Journal: Evolution Date: 1990-12 Impact factor: 3.694

Review 8. 10 Years of GWAS Discovery: Biology, Function, and Translation.

Authors: Peter M Visscher; Naomi R Wray; Qian Zhang; Pamela Sklar; Mark I McCarthy; Matthew A Brown; Jian Yang
Journal: Am J Hum Genet Date: 2017-07-06 Impact factor: 11.025

9. Evaluation of demographic history and neutral parameterization on the performance of FST outlier tests.

Authors: Katie E Lotterhos; Michael C Whitlock
Journal: Mol Ecol Date: 2014-04-11 Impact factor: 6.185

10. Reducing bias in population and landscape genetic inferences: the effects of sampling related individuals and multiple life stages.

Authors: William Peterman; Emily R Brocato; Raymond D Semlitsch; Lori S Eggert
Journal: PeerJ Date: 2016-03-14 Impact factor: 2.984

1 in total

Review 1. Evolutionary principles guiding amphibian conservation.

Authors: Maciej Pabijan; Gemma Palomar; Bernardo Antunes; Weronika Antoł; Piotr Zieliński; Wiesław Babik
Journal: Evol Appl Date: 2020-03-13 Impact factor: 5.183

1 in total