Quentin Rougemont, Camille Roux, Samuel Neuenschwander, Jérôme Goudet, Sophie Launey, Guillaume Evanno.
Abstract
Inferring the history of isolation and gene flow during species divergence is a central question in evolutionary biology. The European river lamprey (Lampetra fluviatilis) and brook lamprey (L. planeri) show low reproductive isolation but have highly distinct life histories, the former being parasitic-anadromous and the latter non-parasitic and freshwater resident. Here we used microsatellite data from six replicated population pairs to reconstruct their history of divergence using an approximate Bayesian computation framework combined with a random forest model. In most population pairs, scenarios of divergence with recent isolation were outcompeted by scenarios proposing ongoing gene flow, namely the Secondary Contact (SC) and Isolation with Migration (IM) models. The estimation of demographic parameters under the SC model indicated a time of secondary contact close to the time of speciation, explaining why SC and IM models could not be discriminated. In case of an ancient secondary contact, the historical signal of divergence is lost and neutral markers converge to the same equilibrium as under the less parameterized model allowing ongoing gene flow. Our results imply that models of secondary contacts should be systematically compared to models of divergence with gene flow; given the difficulty of discriminating among these models, we suggest that genome-wide data are needed to adequately reconstruct divergence history.
Keywords: Approximate Bayesian computation; Divergence history; Gene flow; Lampetra fluviatilis; Lampetra planeri; Random forest; Speciation
Year: 2016 PMID: 27077007 PMCID: PMC4830234 DOI: 10.7717/peerj.1910
Source DB: PubMed Journal: PeerJ ISSN: 2167-8359 Impact factor: 2.984
Figure 1. Map of sampling sites across the channel area.
River names match those given in Table 2 and FST values are given for each population pair.
Estimates of population genetic parameters for each pair of river and brook lamprey populations.
N, number of individuals used for ABC analysis; Ar, allelic richness; He, expected heterozygosity; GW, Garza-Williamson index. Populations are ordered by increasing genetic differentiation.
| Pop | River name | N | N | FST | Ar | Ar | He | He | GW | GW | Δ |
|---|---|---|---|---|---|---|---|---|---|---|---|
| OIR | Oir | 104 | 74 | 0.028 | 4.45 | 3.61 | 0.52 | 0.508 | 0.525 | 0.622 | 0.204 |
| BET | Bethune | 14 | 14 | 0.028 | 3.51 | 3.36 | 0.516 | 0.471 | 0.452 | 0.464 | 0.507 |
| RIS | Risle | 75 | 75 | 0.033 | 3.84 | 3.92 | 0.503 | 0.472 | 0.497 | 0.421 | 0.842 |
| HEM | Hem | 30 | 65 | 0.077 | 4.21 | 3.53 | 0.504 | 0.477 | 0.406 | 0.487 | 1.633 |
| AA | Aa | 34 | 69 | 0.084 | 4.21 | 3.76 | 0.514 | 0.522 | 0.406 | 0.505 | 0.915 |
| BRE | Bresle | 93 | 80 | 0.091 | 4.14 | 4.91 | 0.49 | 0.49 | 0.466 | 0.263 | 34.37 |
Notes.
For the ABC inference, river lamprey individuals from the AA and Hem rivers (FST = 0) were pooled to obtain a sample size similar to that of brook lampreys.
Brook lamprey samples from the AA and Hem rivers are composed of upstream and downstream samples from the Rougemont et al. (2015) study.
Figure 2. Different scenarios of divergence between L. planeri and L. fluviatilis.
Five models with different parameters are tested and compared. Two null models: strict isolation (SI) and panmixia (PAN). Three models of migration: isolation with constant migration (IM), ancient migration (AM) and secondary contact (SC). The following parameters are shared by all models: τdiv, the number of generations since divergence; θ, θ1, θ2, the effective population sizes of the ancestral population, of L. fluviatilis and of L. planeri, respectively. τisol is the number of generations since the two ecotypes stopped exchanging genes. τsc is the number of generations since the two ecotypes entered into secondary contact after a period of isolation. M12 and M21 represent the number of migrants per generation, expressed in 4Nm units, with m the proportion of a population made up of migrants from the other population.
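The scaling of the migration parameters can be made concrete with a short calculation. The sketch below is illustrative only: the function name `scaled_migration` and the input values are hypothetical and are not taken from the study. It simply applies M = 4Nm as described in the caption.

```python
def scaled_migration(effective_size: float, migrant_fraction: float) -> float:
    """Return M = 4 * N * m, the scaled number of migrants per generation."""
    if effective_size < 0 or not 0.0 <= migrant_fraction <= 1.0:
        raise ValueError("N must be non-negative and m must lie in [0, 1]")
    return 4.0 * effective_size * migrant_fraction


# Hypothetical values chosen for illustration, not estimates from the paper.
m12 = scaled_migration(effective_size=1000, migrant_fraction=0.002)
print(f"M12 = {m12:.2f}")
```

Because M12 and M21 are separate parameters, the same function would be called once per direction with the receiving population's N and m.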
Priors for all models.
θ, θ1, θ2 = effective mutation rate for the ancestral, river lamprey and brook lamprey populations respectively. M1, M2, MAnc = effective migration rate for the ancestral, river lamprey and brook lamprey populations respectively. τ = divergence time; τisol, τsc = divergence time under the ancient migration model and time of secondary contact, respectively.
| Parameters | Models | Prior |
|---|---|---|
| | SI, IM, AM, SC | Uniform [0–3] |
| | SI, IM, AM, SC, PAN | Uniform [0–3] |
| | SI, IM, AM, SC | Uniform [0–( |
| | IM, SC | Uniform [0–20] |
| | AM | Uniform [0–20] |
| | SI, IM, AM, SC | Uniform [0–25] |
| | AM | Uniform [0– |
| | SC | Uniform [0– |
Notes.
SI, strict isolation; IM, isolation with migration; AM, ancient migration; PAN, panmixia; SC, secondary contact model.
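As a rough illustration of how such uniform priors feed an ABC simulation, the sketch below draws parameter sets for a secondary-contact-like model. Several things here are assumptions rather than facts from the table: the parameter column did not survive, so the pairing of each bound with a parameter (θ with [0–3], migration with [0–20], divergence time with [0–25]) is inferred from the legend, and the upper bound of the secondary contact prior is truncated in the table, so it is bounded here by the sampled divergence time only because secondary contact cannot predate divergence. The function name `draw_sc_priors` is hypothetical.

```python
import random


def draw_sc_priors(rng: random.Random) -> dict:
    """Draw one parameter set from uniform priors (bounds assumed, see text)."""
    tau_div = rng.uniform(0.0, 25.0)  # divergence time, Uniform [0-25]
    return {
        "theta": rng.uniform(0.0, 3.0),       # scaled population size, Uniform [0-3]
        "migration": rng.uniform(0.0, 20.0),  # scaled migration, Uniform [0-20]
        "tau_div": tau_div,
        # ASSUMPTION: upper bound is truncated in the table; tau_div is used here.
        "tau_sc": rng.uniform(0.0, tau_div),
    }


rng = random.Random(42)  # fixed seed so the sketch is reproducible
draws = [draw_sc_priors(rng) for _ in range(1000)]
print(draws[0])
```

Drawing the dependent time after the divergence time keeps every simulated history internally consistent, which is the main design point of the sketch.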
Figure 3. Curves of out-of-bag error rates and estimation of variable importance.
Data based on one random forest composed of 1,000 trees and trained on a set of 50,000 simulated predictor variables (summary statistics). The response variable is the demographic model. Example taken from the Aa river. Estimates for the remaining rivers were similar and are presented in Table S2 and Fig. S1.
ABC classification (posterior probability) and random forest (RF) prediction of each model of speciation in each river.
| River | SI (ABC) | SI (RF) | IM (ABC) | IM (RF) | AM (ABC) | AM (RF) | SC (ABC) | SC (RF) | PAN (ABC) | PAN (RF) |
|---|---|---|---|---|---|---|---|---|---|---|
| AA | 0 | 0 | 0.27 | 0.39 | 0.01 | 0.06 | 0 | 0 | ||
| BET | 0 | 0 | 0 | 0.02 | 0.35 | 0.08 | 0.06 | |||
| BRE | 0.01 | 0.02 | 0.12 | 0.24 | 0.14 | 0.12 | 0 | 0 | ||
| HEM | 0 | 0 | 0.42 | 0.01 | 0.05 | 0.42 | 0 | 0 | ||
| RIS | 0 | 0 | 0.46 | 0 | 0 | 0 | 0 | |||
| OIR | 0 | 0 | 0.15 | 0 | 0.02 | 0.14 | 0.05 | |||
| Average | 0.00 | 0.00 | 0.31 | 0.44 | 0.03 | 0.05 | 0.53 | 0.49 | 0.13 | 0.02 |
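To show what an ABC posterior probability for a model means in practice, here is a minimal sketch of plain rejection ABC for model choice. It is not claimed to be the algorithm used in the study: the simulator `simulate_summary`, its means and spread, the observed value and the tolerance are all hypothetical stand-ins for a real coalescent simulator and real summary statistics. The posterior probability of a model is estimated as its share among the simulations closest to the observed statistic.

```python
import random


def simulate_summary(model: str, rng: random.Random) -> float:
    """Toy simulator returning one summary statistic (hypothetical values)."""
    mean = {"SI": 0.30, "IM": 0.05}[model]
    return rng.gauss(mean, 0.05)


def abc_model_posterior(observed, models, n_sims, keep_fraction, rng):
    """Rejection ABC: keep the closest simulations, count models among them."""
    draws = []
    for _ in range(n_sims):
        model = rng.choice(models)  # uniform prior over models
        stat = simulate_summary(model, rng)
        draws.append((abs(stat - observed), model))
    draws.sort(key=lambda d: d[0])  # smallest distance first
    kept = draws[: max(1, int(n_sims * keep_fraction))]
    return {m: sum(1 for _, km in kept if km == m) / len(kept) for m in models}


rng = random.Random(1)
posterior = abc_model_posterior(0.06, ["SI", "IM"], 20000, 0.01, rng)
print(posterior)
```

With an observed statistic close to what the toy IM simulator produces, almost all retained simulations come from IM, so its estimated posterior probability is close to 1.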
Random forest out-of-bag confusion matrix and classification error.
Data based on six random forests, each composed of 1,000 trees and trained on a set of 50,000 simulated predictor variables (summary statistics). The response variable is the demographic model. Proportions of correctly classified demographic models are in bold. The grey italic values represent models with high error rates. Simulations differed among rivers only in the number of individual loci simulated and produced very similar values, which were subsequently averaged over each demographic model.
| Simulated model | Predicted AM | Predicted I | Predicted IM | Predicted PAN | Predicted SC | Averaged OOB error rate |
|---|---|---|---|---|---|---|
| AM | | 16.7% | 2.4% | 0.0% | 2.9% | 21.99% |
| I | 25.2% | | 0.6% | 0.0% | 0.6% | 26.38% |
| IM | 1.6% | 0.2% | | 0.8% | | |
| PAN | 0.0% | 0.0% | 0.2% | | 0.1% | 0.30% |
| SC | 2.1% | 0.3% | | 0.6% | | |
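The per-model error rate in the last column can be checked from the off-diagonal entries of a row. The sketch below does this for the AM row, under the assumption that the cell missing from each row of the flattened table is the diagonal (correctly classified) entry; the function name `oob_error_rate` is hypothetical.

```python
def oob_error_rate(misclassified: dict, true_label: str) -> float:
    """Sum the percentages assigned to any model other than the true one."""
    return sum(pct for label, pct in misclassified.items() if label != true_label)


# Off-diagonal percentages of the AM row, as read from the table above.
am_row = {"I": 16.7, "IM": 2.4, "PAN": 0.0, "SC": 2.9}
error_am = oob_error_rate(am_row, "AM")
print(f"AM error rate: {error_am:.1f}%")
```

The sum is about 22.0%, in line with the 21.99% reported for AM; the small gap is consistent with rounding of the individual cells.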
Estimates of demographic parameters under the models of ongoing migration (IM) and secondary contact (SC) in each river.
| River | Model | N | N | N | Migration from | Migration from | Split time | Time secondary contact |
|---|---|---|---|---|---|---|---|---|
| | | median [95HPD] | median [95HPD] | median [95HPD] | median [95HPD] | median [95HPD] | median [95HPD] | median [95HPD] |
| | IM | 1310[930–2020] | 410[290–680] | 2290[2260–2350] | 0.0025[0.0018–0.0032] | 0.003[0.0027–0.003] | 268000[246000–282000] | |
| | SC | 1480[760–2550] | 390[270–620] | 1850[1600–2110] | 0.0022[0.002–0.0025] | 0.0032[0.003–0.0033] | 191200[168000–230000] | 89200[61400–108000] |
| | IM | 1620[1260–2350] | 940[430–1540] | 1650[1500–1780] | 0.0029[0.0027–0.0032] | 0.0035[0.0033–0.0037] | 29800[24200–34200] | |
| | SC | 1930[1280–2630] | 940[570–1420] | 1160[1030–1300] | 0.0024[0.002–0.0027] | 0.0033[0.0032–0.0038] | 322000[265600–396400] | 164600[116000–212800] |
| | IM | 1020[440–2450] | 310[120–790] | 1360[440–2370] | 0.0004[0.0004–0.0015] | 0.002[0.0006–0.0039] | 274000[99800–453800] | |
| | SC | 1610[910–2720] | 740[220–1500] | 1440[320–2620] | 0.0017[0.0004–0.0041] | 0.0025[0.0007–0.0046] | 268000[140800–446400] | 20400[3800–124600] |
| | IM | 1000[710–194] | 190[130–310] | 2600[2480–2700] | 0.002[0.0017–0.0027] | 0.0042[0.0039–0.0044] | 240000[191200–270000] | |
| | SC | 860[660–1280] | 350[80–880] | 1680[1200–1880] | 0.0031[0.0028–0.0034] | 0.0024[0.002–0.0029] | 278000[231400–328800] | 99400[85400–110000] |
| | IM | 840[600–1540] | 620[620–880] | 840[770–930] | 0.0021[0.0015–0.003] | 0.0025[0.0018–0.0031] | 197000[181800–209400] | |
| | SC | 1360[840–2320] | 640[640–1030] | 1010[660–1520] | 0.0036[0.0034–0.004] | 0.0037[0.0036–0.0038] | 226000[168600–320400] | 91200[42600–151800] |
| Average | IM | 1158[788–1710] | 494[318–840] | 1748[1490–2026] | 0.0020[0.0016–0.0027] | 0.003[0.0025–0.0036] | 201760[148600–249880] | |
| | SC | 1448[890–2300] | 612[356–1090] | 1428[962–1886] | 0.0026[0.0021–0.0033] | 0.003[0.0025–0.0037] | 257040[194880–344400] | 92960[61840–141440] |
| | PAN | 2050[1940–2180] | | | | | | |
Notes.
N, effective population size; Lf, Lampetra fluviatilis; Lp, Lampetra planeri.
The PAN model is controlled by a single parameter: the effective population size of the single population (comprising both Lf and Lp backgrounds).
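The Average rows of the parameter table appear to be plain arithmetic means of the per-river medians. The sketch below checks this for two columns of the IM rows; the variable names are hypothetical, and the values are copied from the table in row order.

```python
from statistics import mean

# Medians from the IM rows: first N column and split time column.
im_first_n = [1310, 1620, 1020, 1000, 840]
im_split_time = [268000, 29800, 274000, 240000, 197000]

avg_n = mean(im_first_n)
avg_split = mean(im_split_time)
print(avg_n, avg_split)
```

Both means agree with the IM entries of the Average row (1158 and 201760), which suggests the interval bounds in that row were averaged in the same way.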