Yann Bourgeois, Robert P Ruggiero, Joseph D Manthey, Stéphane Boissinot.
Abstract
A better understanding of how selection and neutral processes affect genomic diversity is essential to gain insights into the mechanisms driving adaptation and speciation. However, the evolutionary processes affecting variation at a genomic scale have not been investigated in most vertebrate lineages. Here, we present the first population genomics survey using whole-genome resequencing in the green anole (Anolis carolinensis). Anoles have been intensively studied to understand mechanisms underlying adaptation and speciation. The green anole in particular is an important model to study genome evolution. We quantified how demography, recombination, and selection have led to the current genetic diversity of the green anole by using whole-genome resequencing of five genetic clusters covering the entire species range. The differentiation of green anole populations is consistent with a northward expansion from South Florida followed by genetic isolation and subsequent gene flow among adjacent genetic clusters. Dispersal out of Florida was accompanied by a drastic population bottleneck followed by a rapid population expansion. This event was accompanied by male-biased dispersal and/or selective sweeps on the X chromosome. We show that the interaction between linked selection and recombination is the main contributor to the genomic landscape of differentiation in the anole genome.
Keywords: Anolis carolinensis; divergence; recombination; selection
Year: 2019 PMID: 31134281 PMCID: PMC6681179 DOI: 10.1093/gbe/evz110
Source DB: PubMed Journal: Genome Biol Evol ISSN: 1759-6653 Impact factor: 3.416
Fig. 1.—Genetic structure in Anolis carolinensis from whole-genome SNP data. (A) Results from the DAPC analysis highlighting the five clusters inferred from the analysis of ∼6,500 SNPs thinned every 10 kb and with <20% missing data. The map reports the coordinates of the localities used in this study and the genetic clusters they belong to. (B) RAxML phylogeny based on one million SNPs randomly sampled across the genome. All 100 bootstrap replicates supported the reported topology, except for two nodes with support of 90 and 85. One individual from South Florida was removed due to a high rate of missing data. (C) Network representation of the relatedness between samples as inferred by SplitsTree v4. Color codes match those in parts (A) and (B).
Diversity and Tajima’s D (±SD) for Each of the Five Genetic Clusters, Averaged over Nonoverlapping 5-kb Windows across the Genome
| Statistics | CA | GA | EF | WF | SF |
|---|---|---|---|---|---|
| Nucleotide diversity | 0.00155±0.00154 | 0.00177±0.00153 | 0.00330±0.0021 | 0.00341±0.0022 | 0.00279±0.002 |
| Tajima’s D | −0.17±1.49 | 0.14±0.0015 | −0.73±0.002 | −0.80±0.0022 | −0.66±0.002 |
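As an illustration of the two statistics reported in this table, the sketch below computes per-site nucleotide diversity and Tajima's D for a single window, using the standard formulas from Tajima (1989). It is a minimal example, not the pipeline used in the study: the function name `diversity_and_tajimas_d` and the toy haplotypes are hypothetical, and the input is assumed to be phased, equal-length sequences with no missing data.

```python
from itertools import combinations
from math import sqrt


def diversity_and_tajimas_d(haplotypes):
    """Return (pi_per_site, S, D) for equal-length haplotype strings.

    Hypothetical helper for illustration; needs at least four haplotypes,
    because the variance terms of D vanish for smaller samples.
    """
    n = len(haplotypes)
    length = len(haplotypes[0])
    # S: number of segregating (polymorphic) sites in the window.
    s = sum(1 for i in range(length) if len({h[i] for h in haplotypes}) > 1)
    # pi: mean number of pairwise differences between haplotypes.
    pairs = list(combinations(haplotypes, 2))
    pi = sum(sum(a != b for a, b in zip(x, y)) for x, y in pairs) / len(pairs)
    if s == 0:
        return pi / length, s, None  # D is undefined without polymorphism
    # Constants from Tajima (1989).
    a1 = sum(1 / i for i in range(1, n))
    a2 = sum(1 / i**2 for i in range(1, n))
    b1 = (n + 1) / (3 * (n - 1))
    b2 = 2 * (n * n + n + 3) / (9 * n * (n - 1))
    c1 = b1 - 1 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1**2
    e1 = c1 / a1
    e2 = c2 / (a1**2 + a2)
    # D contrasts pi with Watterson's estimator S / a1.
    d = (pi - s / a1) / sqrt(e1 * s + e2 * s * (s - 1))
    return pi / length, s, d


# Toy window: every variant is a singleton, so D is negative.
pi_site, seg_sites, taj_d = diversity_and_tajimas_d(["AAA", "TAA", "ATA", "AAT"])
```

An excess of rare variants, as in the toy window, pushes D below zero, which is the direction observed for the three Florida clusters in the table.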
Fig. 2.—Variation in effective population sizes with time and comparison of drift between autosomes and sex-linked scaffolds. (A) Reconstruction of past variations in effective population sizes (Ne) inferred by SMC++. Dashed vertical lines correspond to the estimated splitting times between the five genetic clusters previously inferred. We assume a mutation rate of 2.1×10⁻¹⁰ per bp per generation and a generation time of 1 year. (B) Average branch lengths obtained from autosomal data and effective sex ratios (ESRs, ξ) inferred from KIMTREE. A set of 5,000 autosomal and 5,000 sex-linked markers were randomly sampled to create 50 pseudoreplicated data sets on which the analysis was run. The analysis was run on the three most closely related populations. Pie charts indicate the proportion of replicates for which we observed significant support (S<0.01) in favor of a biased sex-ratio.
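To show how the assumed mutation rate and generation time set the scale of these estimates, the sketch below applies the textbook neutral expectation π = 4Neμ to a diversity value. This is only a back-of-the-envelope check, not the SMC++ inference itself; the function names are hypothetical, and the input diversity (0.00330, the EF value from the diversity table) is used purely as an example.

```python
MU = 2.1e-10                 # mutations per bp per generation (from the caption)
GENERATION_TIME_YEARS = 1.0  # years per generation (from the caption)


def ne_from_diversity(pi, mu=MU):
    """Effective population size under neutral equilibrium: pi = 4 * Ne * mu."""
    return pi / (4.0 * mu)


def generations_to_years(generations, generation_time=GENERATION_TIME_YEARS):
    """Convert a time in generations to years."""
    return generations * generation_time


# Example: the EF cluster's mean nucleotide diversity.
ne_ef = ne_from_diversity(0.00330)  # roughly 3.9 million
```

With a generation time of 1 year, times expressed in generations and in years are numerically identical, which is why the two units can be read interchangeably in the demographic results.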
Fig. 3.—(A) Graphic description of the six categories of ∂a∂i models tested over pairs of green anole genetic clusters. Each model describes a scenario where two populations diverge from an ancestral one, with varying timing and strength of gene flow after their split. SI, strict isolation; AM, ancestral migration, where populations first exchange gene flow, which then stops Tiso generations ago; PAM, ancestral migration with two periods of contact lasting Tiso/2 generations; SC, secondary contact where populations still exchange gene flow at present time; PSC, secondary contact with two periods of contact lasting Tsc/2 generations; IM, isolation with constant migration and no interruption of gene flow. Models were constrained so that Tiso and Tsc lasted at least ∼50,000 years. Reproduced with the authorization of Christelle Fraïsse. (B) Fitting of the best models for the EF (N = 16) versus GA (N = 14) and EF versus WF (N = 8) comparisons. Both models fit the observed data sets as indicated by the similar spectra between observation and simulation. The “2N” suffix means that background selection was added to the base model by modeling heterogeneous effective population sizes across loci. The “2M2P” suffix means that heterogeneity in gene flow was incorporated into the model. The “ex” suffix means that exponential population size change was introduced in the base model.
Summary of Best-Supported Demographic Models
| Comparison | Model | Na12 | Na1 | Na2 | N1 | N2 | m2->1 | m1->2 | Tiso | Tsc | Tscg | P1 | P2 | nr | bf | O | logLikelihood | AIC |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| GA vs. EF | IM2M2Pex | 2,156,641 | 3,538,759 | 2,438,866 | 371,048 | 7,132,328 | 2.42E-07 | 2.57E-07 | NA |  | 1,615,603 | 0.83 | 0.95 | NA | NA | 0.97 | −1,045.99 | 2,113.97 |
| GA vs. EF | IMex | 2,156,641 | 5,971,380 | 2,137,291 | 364,329 | 6,755,470 | 1.93E-07 | 2.36E-07 | NA |  | 1,962,652 | NA | NA | NA | NA | 0.97 | −1,048.14 | 2,114.29 |
| GA vs. EF | SC2M2Pex | 2,156,641 | 4,214,885 | 2,296,060 | 367,662 | 7,049,892 | 2.12E-07 | 2.58E-07 |  |  | 1,723,615 | 0.91 | 0.95 | NA | NA | 0.97 | −1,046.01 | 2,116.01 |
| GA vs. EF | PSCex | 2,156,641 | 4,126,463 | 2,402,541 | 367,459 | 6,926,709 | 1.93E-07 | 2.35E-07 |  | 110,560 | 1,713,849 | NA | NA | NA | NA | 0.97 | −1,048.03 | 2,116.06 |
| GA vs. EF | SCex | 2,156,641 | 4,150,102 | 2,359,593 | 369,470 | 6,934,978 | 1.91E-07 | 2.36E-07 |  |  | 1,719,867 | NA | NA | NA | NA | 0.97 | −1,048.05 | 2,116.11 |
| GA vs. EF | PSC2M2Pex | 2,156,641 | 3,873,718 | 2,394,203 | 370,665 | 7,005,838 | 2.04E-07 | 2.58E-07 |  |  | 1,672,998 | 0.93 | 0.95 | NA | NA | 0.97 | −1,046.12 | 2,116.24 |
| GA vs. EF | IM2Nex |  |  | 2,766,204 | 360,422 | 6,961,498 | 1.99E-07 | 2.34E-07 | NA | 615,678 | 1,461,807 | NA | NA |  |  | 0.97 | −1,048.09 | 2,118.17 |
| EF vs. WF | PSC2N | 2,091,300 | NA | NA | 5,181,876 | 5,389,218 | 1.94E-06 | 6.93E-07 | 1,064,768 | 759,07 | NA | NA | NA | 0.60 | 0.25 | 0.98 | −669.91 | 1,357.81 |
| EF vs. WF | SC2N | 2,091,305 | NA | NA |  |  | 1.18E-06 | 4.25E-07 | 1,952,690 |  | NA | NA | NA | 0.57 | 0.26 | 0.98 | −676.66 | 1,371.33 |
Note.—PSC2N, secondary contact with two periods in isolation and heterogeneous effective population sizes across the genome; SCex, secondary contact with an episode of population expansion following secondary contact; SC2M2P and IM2M2P, models of secondary contact and constant gene flow with heterogeneous migration rates along the genome. The ancestral size (Na12) before the split was calculated from the SMC++ output to facilitate comparisons. For the -ex models, following the initial split, populations have a population size of Na1 and Na2 followed by an exponential growth that leads to their current sizes N1 and N2. In basic models, populations have constant sizes N1 and N2 since the split. nr, proportion of the genome displaying an effective population size of bf times the population size displayed by the remaining 1−nr fraction not affected by linked selection; O, proportion of sites for which the ancestral state was correctly inferred; P1 and P2, the proportion of sites resisting gene flow in populations 1 and 2; Tiso, total time spent in isolation. For the PSC model, populations are isolated twice in their history for Tiso/2 generations and are connected twice for Tsc/2 generations (see fig. 3). Tsc, time during which stable populations stay connected; Tscg, time since population size change (with gene flow). The total time during which populations were connected is Tscg+Tsc. For each model, the set of best estimates is shown. Uncertainties over parameters were measured by SDs obtained from 100 bootstrap replicates. No star, uncertainties are below ±20% of point estimates; *, uncertainties between ±20% and ±50%; **, uncertainties between ±50% and ±150%. Cells with “NA” values correspond to parameters that were not part of a given model.
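The model ranking in this table rests on the Akaike information criterion, AIC = 2k − 2 ln L, where k is the number of free parameters. The sketch below recomputes AIC from a log-likelihood and turns the reported AIC values for the GA vs. EF comparison into Akaike weights. The function names are hypothetical, and the parameter count k = 11 used for IM2M2Pex is an assumption back-calculated from the reported log-likelihood and AIC, not a number stated in the source.

```python
from math import exp


def aic(log_likelihood, n_params):
    """Akaike information criterion: 2k - 2 ln L."""
    return 2 * n_params - 2 * log_likelihood


def akaike_weights(aic_values):
    """Relative support for each model given a list of AIC values."""
    best = min(aic_values)
    rel = [exp(-0.5 * (a - best)) for a in aic_values]
    total = sum(rel)
    return [r / total for r in rel]


# AIC values as reported in the table for GA vs. EF.
reported = {
    "IM2M2Pex": 2113.97,
    "IMex": 2114.29,
    "SC2M2Pex": 2116.01,
    "PSCex": 2116.06,
    "SCex": 2116.11,
    "PSC2M2Pex": 2116.24,
    "IM2Nex": 2118.17,
}
weights = dict(zip(reported, akaike_weights(list(reported.values()))))

# Assumed k = 11 reproduces the reported AIC of IM2M2Pex up to rounding.
aic_im2m2pex = aic(-1045.99, 11)
```

Because the top GA vs. EF models differ by only a few AIC units, their weights are of similar magnitude, which is consistent with several models being listed as best supported rather than a single one.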
Fig. 4.—Summary statistics for recombination and differentiation along chromosomes. ρ = 4×Ne×r, with r the recombination rate per bp and per generation and Ne the effective population size for the EF cluster. FST and dXY are relative and absolute measures of differentiation that are correlated with the amount of shared heterozygosity and coalescence time across populations, respectively. We present differentiation for the three genetic clusters having diverged for the longest time period. Statistics were averaged over nonoverlapping 5-kb windows and a smoothing line was fit to facilitate visual comparison. Repetitive centromeric regions that are masked from the green anole genome are highlighted by black rectangles.
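Two of the quantities in this caption can be made concrete with a short sketch: recovering the per-bp recombination rate r from ρ = 4×Ne×r, and computing dXY as the mean number of pairwise differences per site between haplotypes drawn from two populations. The function names, the toy haplotypes, and the numeric inputs are hypothetical and chosen only for illustration; they are not values from the study.

```python
def recombination_rate(rho, ne):
    """Per-bp, per-generation recombination rate from rho = 4 * Ne * r."""
    return rho / (4.0 * ne)


def dxy(pop_x, pop_y):
    """Absolute divergence: mean pairwise differences per site between populations.

    Assumes phased, equal-length haplotype strings with no missing data.
    """
    length = len(pop_x[0])
    # Count mismatches over every between-population pair of haplotypes.
    diffs = sum(sum(a != b for a, b in zip(x, y)) for x in pop_x for y in pop_y)
    return diffs / (len(pop_x) * len(pop_y) * length)


# Toy window of four sites: one fixed difference plus one shared polymorphism.
example_dxy = dxy(["AAAA", "AAAT"], ["TAAA", "TAAT"])

# Hypothetical inputs: rho = 0.004 per bp and Ne = one million.
example_r = recombination_rate(0.004, 1_000_000)
```

Unlike FST, dXY does not depend on within-population diversity, so regions where linked selection reduces diversity can show elevated FST without elevated dXY.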