Roman Yukilevich¹, Luana S Maroja², Kim Nguyen¹, Syed Hussain¹, Preethi Kumaran¹.
Abstract
The rapid evolution of sexual isolation in sympatry has long been associated with reinforcement (i.e., selection to avoid maladaptive hybridization). However, there are many species pairs in sympatry that have evolved rapid sexual isolation without known costs to hybridization. A major unresolved question is what evolutionary processes are involved in driving rapid speciation in such cases. Here, we focus on one such system: the Drosophila athabasca species complex, which is composed of three partially sympatric and interfertile semispecies: WN, EA, and EB. To study speciation in this species complex, we assayed sexual and genomic isolation within and between these semispecies in both sympatric and allopatric populations. First, we found no evidence of reproductive character displacement (RCD) in sympatric zones compared to distant allopatry. Instead, semispecies were virtually completely sexually isolated from each other across their entire ranges. Moreover, using spatial approaches and coalescent demographic simulations, we detected either zero or only weak heterospecific gene flow in sympatry. In contrast, within each semispecies we found only random mating and little population genetic structure, except between highly geographically distant populations. Finally, we determined that speciation in this system is at least an order of magnitude older than previously assumed, with WN diverging first, around 200K years ago, and EA and EB diverging 100K years ago. In total, these results suggest that these semispecies should be given full species status and we adopt new nomenclature: WN-D. athabasca, EA-D. mahican, and EB-D. lenape. While the lack of RCD in sympatry and interfertility do not support reinforcement, we discuss what additional evidence is needed to further decipher the mechanisms that caused rapid speciation in this species complex.
Keywords: gene flow; genomic divergence; isolation by distance; population structure; range expansions; reproductive character displacement; secondary contact; sexual isolation
Year: 2018 PMID: 29531700 PMCID: PMC5838044 DOI: 10.1002/ece3.3893
Source DB: PubMed Journal: Ecol Evol ISSN: 2045-7758 Impact factor: 2.912
Figure 1. Geographical range map of the Drosophila athabasca species complex: D. athabasca (WN, blue), D. mahican (EA, red), and D. lenape (EB, orange) with specific locations shown as pie charts. Each pie chart represents the relative frequency of the three species in each location based on isofemale lines genotyped and/or phenotyped for species identity (sample size of identified lines used per location is shown in parentheses). See Materials and Methods for identification procedure. See Table S1 for detailed location information and description of all lines studied.
Results of IMa2 simulations for estimated time of divergence (T) and effective population size (Ne) for each species pair analyzed and comparison of our results to previous estimates
| Wong‐Miller et al. (2017) | Scenario 1 | Scenario 2 | Scenario 3 |
|---|---|---|---|
| 5.8 × 10⁻⁸ | 6.08 × 10⁻⁸ | 1.027 × 10⁻⁸ | 5.068 × 10⁻⁹ |
Estimated time of divergence is based on peak probability from IMa2 model with different scenarios (95% confidence intervals shown below each estimate). Note that the IMa2 program calculates the per locus mutation rate based on calculated sequence divergence per locus and a given time of divergence with an outgroup species (i.e., D. affinis; see Materials and Methods).
Values shown are averages of the independent estimates of Wong‐Miller et al. (2017) for autosome and X chromosome data (Wong‐Miller et al., 2017, assumed 5.8 × 10⁻⁹ per generation and 10 generations per year, so that u = 5.8 × 10⁻⁸ per site per year).
Assuming a hypothetical (and not previously supported) divergence time of 250,000 years between D. affinis and the athabasca species complex, chosen so that the resulting divergence time estimates approach those of Ford and Aquadro (1996) and Wong‐Miller et al. (2017).
Assuming a divergence time of 1.5 million years between D. affinis and the athabasca species complex, based on Carson's (1976) calibration using Hawaiian Drosophila divergence and geological island formation time, with an estimated Nei's D divergence rate of 0.00000025/generation (Nei's D of 1 per 2 million years, with two generations per year in slow‐breeding Hawaiian flies).
Assuming a divergence time of 3 million years between D. affinis and the athabasca species complex, which is slightly less than the average of two previously published nucleotide sequence‐based estimates: Beckenbach et al. (1993) (2.7 MY) and Russo et al. (2013) (3.6 MY).
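The unit conversions quoted in the calibration footnotes can be checked with a few lines of arithmetic. The following is only a sketch of that check; the variable names are illustrative and not taken from the paper, and all numbers are the ones quoted in the footnotes.

```python
# Mutation rate assumed by Wong-Miller et al. (2017), as quoted in the footnote:
# 5.8e-9 per generation and 10 generations per year.
mu_per_generation = 5.8e-9
generations_per_year = 10
mu_per_year = mu_per_generation * generations_per_year  # per site per year

# Carson's (1976) calibration, as quoted: Nei's D of 1 per 2 million years,
# with two generations per year.
nei_d_per_year = 1 / 2_000_000
nei_d_per_generation = nei_d_per_year / 2

# Scenario 3 uses 3 MY, described as slightly less than the average of
# the two published estimates (2.7 MY and 3.6 MY).
mean_published_my = (2.7 + 3.6) / 2

print(f"u per site per year:      {mu_per_year:.1e}")
print(f"Nei's D per generation:   {nei_d_per_generation:.8f}")
print(f"Mean published estimate:  {mean_published_my:.2f} MY")
```
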
Estimated Ne is the average between two simulations with different isofemale lines used for D. mahican that are sympatric with D. athabasca or with D. lenape, respectively: 812,902 (D. mahican–D. athabasca)/205,755 (D. mahican–D. lenape).
Estimated Ne is the average between two simulations with different isofemale lines used for D. mahican that are sympatric with D. athabasca or with D. lenape, respectively: 2,867,411 (D. mahican–D. athabasca)/1,583,023 (D. mahican–D. lenape).
Estimated Ne is the average between two simulations with different isofemale lines used for D. mahican that are sympatric with D. athabasca or with D. lenape, respectively: 9,207,490 (D. mahican–D. athabasca)/2,672,698 (D. mahican–D. lenape).
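The three Ne footnotes each quote two simulation values that were averaged. As a minimal sketch, the averages can be reproduced as below; the list follows the order of the footnotes, and which table cell each footnote belongs to is not recoverable from this record, so no scenario labels are assumed.

```python
# Pairs of Ne estimates in footnote order:
# (D. mahican lines sympatric with D. athabasca, D. mahican lines sympatric with D. lenape)
ne_pairs = [
    (812_902, 205_755),
    (2_867_411, 1_583_023),
    (9_207_490, 2_672_698),
]

# Average of the two simulations for each footnote.
ne_means = [sum(pair) / len(pair) for pair in ne_pairs]

for pair, avg in zip(ne_pairs, ne_means):
    print(f"{pair[0]:>10,} / {pair[1]:>10,} -> mean Ne = {avg:,.1f}")
```
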
Figure 5. Genetic structure and ancestry of 281 individuals of the Drosophila athabasca species complex based on 4,690 total variable base pairs with software STRUCTURE (v 2.3.4). Species designation is as follows: D. athabasca (WN, blue/turquoise), D. mahican (EA, red), D. lenape (EB, orange). K = 2–5 plots are shown with default software settings and 5,000 burn‐in period and 5,000 MCMC runs (see Materials and Methods). The specific plot shown is the highest probability run for each K among 20 runs per K (other runs not shown). On average, K = 3 had the maximal average value of Ln P(D) and delta K value (Evanno et al., 2005; Table S6). Samples are organized into 21 geographical populations (divided by black lines and abbreviations shown below the plots) and grouped into larger geographical regions across North America (shown above plots and designated with thick black bars).
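The delta K criterion cited in the caption can be illustrated with a short sketch. It assumes one commonly used formulation (the absolute second difference of the mean Ln P(D) across runs, divided by the standard deviation of Ln P(D) at that K); the function name and the Ln P(D) values are hypothetical and are not the values in Table S6.

```python
from statistics import mean, stdev

def delta_k(lnp_by_k):
    """Return {K: delta K} for every K that has both neighbours K-1 and K+1."""
    means = {k: mean(v) for k, v in lnp_by_k.items()}
    sds = {k: stdev(v) for k, v in lnp_by_k.items()}
    result = {}
    for k in sorted(lnp_by_k):
        if k - 1 in means and k + 1 in means and sds[k] > 0:
            second_diff = abs(means[k + 1] - 2 * means[k] + means[k - 1])
            result[k] = second_diff / sds[k]
    return result

# Hypothetical Ln P(D) values, three runs per K (illustration only).
runs = {
    2: [-5200.0, -5210.0, -5190.0],
    3: [-4800.0, -4802.0, -4798.0],
    4: [-4790.0, -4830.0, -4750.0],
    5: [-4785.0, -4900.0, -4700.0],
}

dk = delta_k(runs)
best_k = max(dk, key=dk.get)
print(dk, "best K:", best_k)
```

With runs for K = 2 to 5 only, delta K is defined for the interior values K = 3 and K = 4, because the second difference needs both neighbours.
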
Figure 2. Sexual isolation between and within species of Drosophila athabasca (WN), D. mahican (EA), and D. lenape (EB). The mean SI is shown with error bars indicating 95% confidence intervals across replicates. Number of total replicate tests per comparison shown next to each bar. Results are based mostly on multiple‐choice mating tests across multiple isofemale lines with few no‐choice tests (see Table S2 and text). See Materials and Methods on how mating tests were performed. See Table S2 for detailed description of lines used, specific matings, mating method used, and conditions for each replicate. ANOVA test: NS = not significantly different. Asterisk indicates significance with a post hoc Tukey test.
Figure 3. Sexual isolation between (panels: a–c) and within (panels: d–f) species: Drosophila athabasca (WN), D. mahican (EA), and D. lenape (EB). Sexual isolation indexes (SIs) between pairwise comparisons are plotted as a function of geographical distance (km). For multiple comparisons made between the same pairwise localities, the mean SI is shown with error bars indicating 95% confidence intervals across replicates. The number of replicate tests per comparison is shown next to each data point. R² values are shown for each plot (none of the relationships are significant). Results are based mostly on multiple‐choice and few no‐choice mating tests across multiple isofemale lines (see Table S2 and text).
Figure 4. Relationship between pairwise geographical distance (km) of conspecific populations (x‐axis) and average measures of sequence divergence (y‐axis) within each species: Drosophila athabasca (WN, left panels, blue squares), D. mahican (EA, center panels, red circles), and D. lenape (EB, right panels, orange triangles). Top panels (a–c) show Dxy (absolute measure of sequence divergence), and bottom panels (d–f) show Fst (relative measure of sequence divergence). Both measures of genetic divergence are averaged across all gene fragments. Only populations with greater than six chromosomes are considered. Mantel tests were performed to determine the significance of each relationship with 1,000 permutation tests of matrix correlations between genetic divergence (Dxy or Fst) and geographical distance. Significant positive correlations reveal evidence of isolation by distance. Trendlines are shown only to help show patterns. Note the scale of y‐axis.
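The Mantel test described in the caption can be sketched as follows. This is a minimal, one‐sided permutation version written with the Python standard library, not the authors' implementation; the function names, the seed, and the five‐population distance and divergence matrices are all hypothetical.

```python
import random
from statistics import mean

def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

def upper_triangle(matrix, order):
    """Flatten the upper triangle of a symmetric matrix after relabelling rows/columns."""
    n = len(order)
    return [matrix[order[i]][order[j]] for i in range(n) for j in range(i + 1, n)]

def mantel(divergence, distance, permutations=1000, seed=0):
    """Return (observed r, one-sided p) by permuting population labels of one matrix."""
    rng = random.Random(seed)
    labels = list(range(len(divergence)))
    dist_vec = upper_triangle(distance, labels)
    r_obs = pearson(upper_triangle(divergence, labels), dist_vec)
    hits = 0
    for _ in range(permutations):
        shuffled = labels[:]
        rng.shuffle(shuffled)
        if pearson(upper_triangle(divergence, shuffled), dist_vec) >= r_obs:
            hits += 1
    return r_obs, (hits + 1) / (permutations + 1)

# Hypothetical populations placed along a transect (km), with divergence
# increasing linearly with distance, i.e. a clean isolation-by-distance signal.
coords_km = [0, 100, 250, 600, 1500]
distance = [[abs(a - b) for b in coords_km] for a in coords_km]
divergence = [[0.0 if i == j else 0.002 + 1e-5 * distance[i][j] for j in range(5)]
              for i in range(5)]

r_obs, p_value = mantel(divergence, distance)
print(f"r = {r_obs:.3f}, one-sided p = {p_value:.4f}")
```

Permuting whole population labels, rather than individual matrix cells, is what keeps the test valid when pairwise distances share populations and are therefore not independent.
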
Figure 6. Phylogenetic and principal component clustering analyses of populations from the three species: Drosophila athabasca (WN, blue, squares), D. mahican (EA, red, circles), and D. lenape (EB, orange, triangles/diamonds). D. affinis was the outgroup. See text and Figure 1 for locations of abbreviations of specific locations. (a) Phylogeny inferred using the neighbor‐joining (NJ) method based on average pairwise Fst values across all 52 gene fragments (see Materials and Methods). Sum of branch length = 1.7 (MEGA7). Minimum number of sequences allowed per population was six chromosomes. Branch lengths are shown. Phylogeny based on Dxy distances showed qualitatively similar results (data not shown). (b) A maximum likelihood consensus phylogeny with bootstrap values based on 22,261‐bp consensus sequences across populations (D. mahican QB, MVS/MKSP, and SAIJ populations were not included due to small sample sizes). Bootstrap supports (500 replicates) are shown next to the branches. (c) Principal component analysis (PCA) based on covariances across a total of 1,820‐bp SNP sites across populations of each species (same populations as in panel b, except D. lenape LK population was not included due to small sample size). PC1 explains 36%, and PC2 explains 20% of total genetic variation across populations. Results were qualitatively similar when different nonoverlapping sets (500 bp per set) were used to generate PCA. Some sympatric locations are labeled.
Figure 7. Inferred ancestry (probability of being assigned to a given species) of 281 individuals based on 4,690 total variable base pairs with software STRUCTURE (v 2.3.4) for K = 3 run. Drosophila athabasca (WN, blue), D. mahican (EA, red), and D. lenape (EB, orange). Species designation for each isofemale line (represented by a sequenced individual) was established using phenotypic data (male courtship song, copulation duration, and sexual isolation) and geographical information (see Table S1 and text). Open circles represent individuals (lines) for which species designation using phenotypic/geographical data was not determined. Only individual isofemale lines that have a relatively low (70% or less) probability of being assigned to a given species are labeled.
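The labeling rule in the caption (lines with an assignment probability of 70% or less) amounts to a simple filter over per‐individual ancestry proportions. The sketch below assumes the rule is applied to each line's highest assignment probability; the function name, line identifiers, and ancestry values are hypothetical.

```python
def flag_ambiguous(ancestry, threshold=0.70):
    """Return {line_id: (best_species, probability)} for lines at or below the threshold."""
    flagged = {}
    for line_id, probs in ancestry.items():
        best_species = max(probs, key=probs.get)
        if probs[best_species] <= threshold:
            flagged[line_id] = (best_species, probs[best_species])
    return flagged

# Hypothetical K = 3 ancestry proportions for three isofemale lines.
ancestry = {
    "line_A": {"athabasca": 0.98, "mahican": 0.01, "lenape": 0.01},
    "line_B": {"athabasca": 0.10, "mahican": 0.65, "lenape": 0.25},
    "line_C": {"athabasca": 0.02, "mahican": 0.28, "lenape": 0.70},
}

flagged = flag_ambiguous(ancestry)
for line_id, (species, prob) in flagged.items():
    print(f"{line_id}: best assignment {species} at {prob:.2f}")
```
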
Figure 8. Relationship between the minimum geographical distance (km) of a focal population to the closest population of the other species and its average measure of sequence divergence to all heterospecific populations. Left panels (a and d) compare Drosophila athabasca and D. mahican, central panels (b and e) compare D. athabasca and D. lenape, and right panels (c and f) compare D. mahican and D. lenape. Blue squares represent D. athabasca (WN), red circles represent D. mahican (EA), and orange triangles represent D. lenape (EB) focal populations in each plot. Top panels (a–c) show Dxy, and bottom panels (d–f) show Fst. Both measures of genetic divergence are averaged across all sequenced genes. Note that these relationships avoid the problem of nonindependent data points by averaging Fst or Dxy values between each focal population with all heterospecific populations. Error bars indicate 95% confidence intervals between the focal population and heterospecific populations. Only populations with greater than three individuals are considered (i.e., D. mahican QB and RIVR/LNOM and D. lenape LK populations were excluded; results did not change when these were included). R² values are shown for each focal species (D. lenape trendlines in center panels are not shown due to very small range of geographical data points).
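The per‐population summary described in the caption (one point per focal population, rather than one point per population pair) can be sketched as below. This is only an illustration under stated assumptions: the function name, population labels, distances, and Fst values are hypothetical.

```python
from statistics import mean

def focal_summary(focal_pops, hetero_pops, dist_km, divergence):
    """For each focal population return
    (minimum distance to any heterospecific population,
     mean divergence to all heterospecific populations)."""
    summary = {}
    for f in focal_pops:
        nearest = min(dist_km[f, h] for h in hetero_pops)
        avg_div = mean(divergence[f, h] for h in hetero_pops)
        summary[f] = (nearest, avg_div)
    return summary

# Hypothetical example: two focal populations of one species (A1, A2)
# and two populations of another species (M1, M2).
dist_km = {("A1", "M1"): 0, ("A1", "M2"): 500,
           ("A2", "M1"): 1200, ("A2", "M2"): 1800}
fst = {("A1", "M1"): 0.30, ("A1", "M2"): 0.34,
       ("A2", "M1"): 0.32, ("A2", "M2"): 0.36}

summary = focal_summary(["A1", "A2"], ["M1", "M2"], dist_km, fst)
for pop, (nearest, avg_fst) in summary.items():
    print(f"{pop}: nearest heterospecific population {nearest} km, mean Fst {avg_fst:.2f}")
```

Collapsing each focal population to a single averaged value is what avoids reusing the same population in several data points; a sympatric focal population has a minimum distance of 0 km.
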