Laure Olazcuaga1, Anne Loiseau1, Hugues Parrinello2, Mathilde Paris3, Antoine Fraimout1, Christelle Guedot4, Lauren M Diepenbrock5, Marc Kenis6, Jinping Zhang7, Xiao Chen8, Nicolas Borowiec9, Benoit Facon10, Heidrun Vogt11, Donald K Price12, Heiko Vogel13, Benjamin Prud'homme3, Arnaud Estoup1, Mathieu Gautier1.
Abstract
Evidence is accumulating that evolutionary changes are not only common during biological invasions but may also contribute directly to invasion success. The genomic basis of such changes is still largely unexplored. Yet, understanding the genomic response to invasion may help to predict the conditions under which invasiveness can be enhanced or suppressed. Here, we characterized the genomic response of the spotted wing drosophila Drosophila suzukii during the worldwide invasion of this pest insect species, by conducting a genome-wide association study to identify genes involved in adaptive processes during invasion. Genomic data from 22 population samples were analyzed to detect genetic variants associated with the status (invasive versus native) of the sampled populations, based on a newly developed statistic, which we call C2, that contrasts allele frequencies corrected for population structure. We evaluated this new statistical framework using simulated data sets and implemented it in an upgraded version of the program BayPass. We identified a relatively small set of single-nucleotide polymorphisms that show a highly significant association with the invasive status of D. suzukii populations. In particular, two genes, RhoGEF64C and cpo, contained single-nucleotide polymorphisms significantly associated with the invasive status in the two separate main invasion routes of D. suzukii. Our methodological approaches can be applied to any other invasive species, and more generally to any evolutionary model for species characterized by nonequilibrium demographic conditions for which binary covariables of interest can be defined at the population level.
Keywords: BayPass; Drosophila suzukii; GWAS; Pool-Seq; biological invasions
Year: 2020 PMID: 32302396 PMCID: PMC7403613 DOI: 10.1093/molbev/msaa098
Source DB: PubMed Journal: Mol Biol Evol ISSN: 0737-4038 Impact factor: 16.240
Fig. 1. Evaluation of the performance of the C2 contrast statistic on simulated data and comparison with the BF for association and two XtX SNP-specific differentiation estimators. (A) Schematic representation of the demographic scenario used for the simulation. It consists of two successive phases: 1) a neutral divergence phase with migration (only some illustrative migration combinations being represented) leading to the differentiation of an ancestral population into 16 populations after four successive fission events (at generations t = 50, t = 150, t = 200, and t = 300) and 2) an adaptive phase (lasting 200 generations) during which individuals were subjected to selective pressures exerted by two environmental constraints (ec1 and ec2) each having two possible modalities (a or b) according to their population of origin (i.e., eight possible environments in total). Out of the 5,000 simulated SNPs, the fitness of individuals in the environment of their population of origin was determined by their genotypes at 25 SNPs for the ec1 constraint and 25 SNPs for the ec2 constraint. In total, 100 data sets were simulated. (B and C) The ROC curves associated with the ec1 and ec2 C2 contrasts and the two corresponding BF for association are plotted together with those associated with the two XtX estimators (i.e., the posterior mean estimator XtX and the new calibrated estimator XtX⋆). The FPRs associated with each statistic were obtained from the corresponding neutral SNP estimates combined over the 100 simulated data sets (n = 495,000 values in total, i.e., 4,950 neutral SNPs per data set). Similarly, the TPRs were estimated from either the n = 2,500 combined ec1 (B) or ec2 (C) selected SNPs. ROC AUC values are given in parentheses.
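The ROC comparison described in this caption can be illustrated with a minimal Python sketch. It assumes that each SNP carries a single score (for example a C2 value or a BF) and that larger scores indicate stronger evidence. The function names `roc_points` and `roc_auc` are hypothetical and are not part of BayPass.

```python
def roc_points(neutral_scores, selected_scores):
    """Return (FPR, TPR) pairs obtained by sweeping a threshold over all scores.

    FPR is the fraction of neutral SNPs with score >= threshold,
    TPR is the fraction of selected SNPs with score >= threshold.
    """
    thresholds = sorted(set(neutral_scores) | set(selected_scores), reverse=True)
    points = [(0.0, 0.0)]  # threshold above every score: nothing is called
    for t in thresholds:
        fpr = sum(s >= t for s in neutral_scores) / len(neutral_scores)
        tpr = sum(s >= t for s in selected_scores) / len(selected_scores)
        points.append((fpr, tpr))
    return points


def roc_auc(points):
    """Area under the ROC curve, computed with the trapezoidal rule."""
    return sum((x1 - x0) * (y0 + y1) / 2.0
               for (x0, y0), (x1, y1) in zip(points, points[1:]))


# Toy scores (made up for illustration, not data from the study).
neutral = [0.1, 0.2, 0.3]
selected = [0.8, 0.9]
auc = roc_auc(roc_points(neutral, selected))
```

An AUC of 1 means the statistic ranks every selected SNP above every neutral one, and an AUC of 0.5 means it does no better than chance.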
Fig. 2. Description of the invasion scenarios Inv1 (A) and Inv2 (B) with values of the historical and demographic parameters used to simulate evolutionarily neutral SNP data sets. The coordinates of the 22 populations on the first two axes of variation (following singular-value decomposition with the plot.omega function available in the BayPass package) of the scaled covariance matrices of allele frequencies, estimated using BayPass from the SNP data simulated under Inv1 and Inv2, are plotted in (C) and (D), respectively.
Proportion of SNPs (FPR) Simulated under the Invasion Scenarios Inv1 (n = 165,020 SNPs) and Inv2 (n = 152,321 SNPs) Displaying Outlying Differentiation (based on the XtX⋆ statistic) or Showing a Signal of Association with the Population Invasive Status (based on the BF criterion or the C2 statistic).
| | Scenario Inv1: All | Scenario Inv1: G1 and G2 | Scenario Inv2: All | Scenario Inv2: G1 and G2 |
|---|---|---|---|---|
| BF >20 (>15) | | | | |
| | 0.0 (0.0) | 0.0 (0.0) | 0.0 (0.0) | |
| | 0.0 (0.0) | | | |
Note.—Support for association was evaluated using the BayPass regression models (BF criterion) or the q value derived from the estimated contrast statistics C2 for the three different tests comparing the allele frequencies of the six native populations to those of: 1) all 16 invasive populations (“all”); 2) the eight invasive populations from the first group (“G1”); or 3) the eight invasive populations from the second group (“G2”) (supplementary fig. S3, Supplementary Material online). Results from the two latter tests were combined to compute FPRs (columns “G1 and G2” in the table). Note that the BF threshold of 20 dB (respectively, 15 dB) corresponds to decisive (respectively, very strong) evidence in favor of association according to Jeffreys’ rule (Jeffreys 1961). To account for the bilateral nature of the underlying test (SNPs might be over- or underdifferentiated if under directional or balancing selection), the P values derived from the XtX⋆ statistic were computed as $p = 1 - 2\left|\Phi_{\chi^2_J}(XtX^\star) - 0.5\right|$, where $\Phi_{\chi^2_J}$ represents the cumulative distribution function of the $\chi^2$ distribution with J degrees of freedom (here J = 22).
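The bilateral P value and the dB scale mentioned in this note can be sketched in Python. This is an illustration under stated assumptions, not the BayPass implementation: the reference distribution is assumed to be chi-squared, the two-sided P value is written in the common form 1 − 2|F(x) − 0.5|, and the closed-form CDF used here holds only for even degrees of freedom (which covers J = 22). All function names are hypothetical.

```python
import math


def chi2_cdf_even_df(x, df):
    """Chi-squared CDF for even df, using the closed form
    F(x) = 1 - exp(-x/2) * sum_{k=0}^{df/2 - 1} (x/2)^k / k!"""
    if df <= 0 or df % 2 != 0:
        raise ValueError("this closed form requires a positive even df")
    if x <= 0:
        return 0.0
    half = x / 2.0
    term = 1.0   # (x/2)^0 / 0!
    total = 1.0
    for k in range(1, df // 2):
        term *= half / k   # builds (x/2)^k / k! incrementally
        total += term
    return 1.0 - math.exp(-half) * total


def two_sided_p_value(stat, df):
    """Bilateral P value: small when stat falls in either tail."""
    cdf = chi2_cdf_even_df(stat, df)
    return 1.0 - 2.0 * abs(cdf - 0.5)


def bf_to_decibans(bf):
    """Express a Bayes factor on the deciban (dB) scale: 10 * log10(BF)."""
    return 10.0 * math.log10(bf)


# Both an unusually low and an unusually high statistic give a small P value.
p_low = two_sided_p_value(5.0, 22)
p_high = two_sided_p_value(60.0, 22)
```

On the deciban scale, the 20 dB threshold corresponds to a Bayes factor of 100 and the 15 dB threshold to a Bayes factor of about 31.6.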
Fig. 3. Whole-genome scan for association with invasion success in Drosophila suzukii. (A) Geographic location of the 22 D. suzukii population samples genotyped using a pool-sequencing methodology. Population samples from the native range are in blue and those from the invaded range are in red (American invasion route) or light red (European invasion route) (Fraimout et al. 2017). See supplementary table S2, Supplementary Material online, for details on each population sample. (B) Manhattan plot of the SNP q values, on a −log10 scale, derived from the estimated C2 statistics for the native versus invasive status contrast of the 22 worldwide D. suzukii populations. SNPs are ordered by their position on their contig of origin, displayed in alternating dark blue and light blue when autosomal and in dark green and light green when X-linked. The horizontal dashed line indicates the 1% q-value threshold, which gives the expected false-discovery rate, that is, the expected proportion of false positives among the 110 SNPs (highlighted in the plot) above this threshold.
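To make the link between P values and a false-discovery-rate threshold concrete, here is a minimal Python sketch. It uses the Benjamini-Hochberg adjustment as a stand-in: the caption does not say how the q values were computed, so this is a substituted technique and should not be read as the authors' procedure. The function names are hypothetical.

```python
def bh_adjust(p_values):
    """Benjamini-Hochberg adjusted P values, returned in the input order.

    For the P value of rank r (1 = smallest) among n tests, the adjusted
    value is min over ranks r' >= r of p_(r') * n / r'.
    """
    n = len(p_values)
    order = sorted(range(n), key=lambda i: p_values[i])
    adjusted = [0.0] * n
    running_min = 1.0
    for rank in range(n, 0, -1):          # walk from largest to smallest P value
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * n / rank)
        adjusted[i] = running_min
    return adjusted


def count_discoveries(p_values, threshold=0.01):
    """Number of tests whose adjusted P value is at or below the threshold."""
    return sum(q <= threshold for q in bh_adjust(p_values))


# Toy P values (made up for illustration, not data from the study).
toy_p = [0.5, 0.001, 0.03, 0.01]
toy_q = bh_adjust(toy_p)
```

With a 1% threshold and 110 SNPs above it, the expected number of false positives is about 0.01 × 110 ≈ 1.1.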
Fig. 4. Pairwise comparison of the q values derived from the C2 statistics of the three contrasts: the European-route contrast (native vs. invasive Drosophila suzukii populations of the European invasion route) versus the worldwide contrast (native vs. worldwide invasive populations) (A), the American-route contrast (native vs. invasive populations of the American invasion route) versus the worldwide contrast (B), and the two route-specific contrasts compared with each other (C). In (A), (B), and (C), the dashed vertical and horizontal lines indicate the 1% q-value threshold for the C2-derived q values. (D) Venn diagram of the number of SNPs significant at the 1% q-value threshold among the three contrast analyses. Values for the autosomal (X-linked) SNPs are plotted in purple (green).
Description of the 26 Orthologous Drosophila melanogaster Genes Represented by At Least Two of the 204 SNPs Found Significant for One of the Three Contrast Analyses: Worldwide (six native vs. 16 invasive populations), European Route (six native vs. eight invasive populations of the European invasion route), and American Route (six native vs. eight invasive populations of the American invasion route).
| Gene | Position on dmel6 (in kb) | Number of Significant SNPs: All (maximal spacing in bp) | Worldwide contrast | European-route contrast | American-route contrast |
|---|---|---|---|---|---|
| Der-1 (Derlin-1) | chr2L:1,974–1,975 | 2 (236) | 1 | — | 1 |
| Gdi (GDP dissociation inhibitor) | chr2L:9,492–9,495 | 4 (342) | 4 | 4 | — |
| lncRNA:CR45693 (long noncoding RNA) | chr2L:14,51–14,512 | 2 (14) | 2 | 1 | — |
| Tpr2 (tetratricopeptide repeat protein 2) | chr2L:16,492–16,507 | 2 (8) | — | 2 | — |
| Ret (Ret oncogene) | chr2L:21,182–21,199 | 2 (70) | 2 | — | — |
| tou (toutatis) | chr2R:11,579–11,616 | 2 (18) | 1 | — | 2 |
| jeb (jelly belly) | chr2R:12,091–12,119 | 2 (14) | 2 | — | — |
| CG5065 | chr2R:16,608–16,625 | 2 (13) | — | 2 | — |
| bab2 (bric a brac 2) | chr3L:1,140–1,177 | 2 (11,189) | 1 | — | 1 |
| axo (axotactin) | chr3L:4,630–4,687 | 2 (25,886) | — | 1 | 1 |
| RhoGEF64C (Rho guanine nucleotide exchange factor at 64C) | chr3L:4,693–4,796 | 2 (8) | 2 | 1 | 1 |
| CG7509 | chr3L:4,803–4,805 | 2 (5) | — | 2 | — |
| Con (connectin) | chr3L:4,938–4,976 | 2 (616) | 1 | 1 | — |
| Ets65A (Ets at 65A) | chr3L:6,098–6,124 | 2 (27,998) | 1 | 1 | — |
| lncRNA:CR45759 (long noncoding RNA) | chr3L:6,787–6,787 | 4 (106) | — | — | 4 |
| ome (omega) | chr3L:14,673–14,748 | 2 (1) | 2 | — | — |
| sa (spermatocyte arrest) | chr3L:21,405–21,407 | 2 (61) | 1 | 1 | — |
| yellow-e (yellow-e) | chr3R:13,410–13,415 | 3 (33) | 3 | — | 1 |
| cv-c (crossveinless c) | chr3R:14,392–14,482 | 4 (2,737) | 1 | — | 3 |
| osa (osa) | chr3R:17,688–17,718 | 2 (29) | — | — | 2 |
| cpo (couch potato) | chr3R:17,944–18,016 | 3 (193) | 3 | 2 | 3 |
| Rh3 (rhodopsin 3) | chr3R:20,081–20,082 | 2 (5,709) | 2 | 1 | — |
| Ctl2 (choline transporter-like 2) | chr3R:29,123–29,128 | 2 (3) | — | — | 2 |
| Syt12 (synaptotagmin 12) | chrX:13,359–13,368 | 3 (65) | 1 | — | 2 |
| Ac13E (adenylyl cyclase 13E) | chrX:15,511–15,554 | 4 (19) | — | — | 4 |
| Axs (abnormal X segregation) | chrX:16,680–16,684 | 2 (11) | — | — | 2 |
Note.—The third column gives the overall number of significant SNPs (at the 1% q-value threshold) and their maximal spacing in bp (on the D. suzukii assembly). Columns 4–6 give the number of significant SNPs for each of the three contrast analyses.