María Bogaerts-Márquez, Sara Guirao-Rico, Mathieu Gautier, Josefa González.
Abstract
While several studies in a diverse set of species have shed light on the genes underlying adaptation, our knowledge of the selective pressures that explain the observed patterns lags behind. Drosophila melanogaster is a valuable organism to study environmental adaptation because this species originated in Southern Africa and has recently expanded worldwide, and also because it has a functionally well-annotated genome. In this study, we aimed to decipher which environmental variables are relevant for adaptation of D. melanogaster natural populations in Europe and North America. We analysed 36 whole-genome pool-seq samples of D. melanogaster natural populations collected in 20 European and 11 North American locations. We used the BayPass software to identify single nucleotide polymorphisms (SNPs) and transposable elements (TEs) showing signatures of adaptive differentiation across populations, as well as significant associations with 59 environmental variables related to temperature, rainfall, evaporation, solar radiation, wind, daylight hours, and soil type. We found that in addition to temperature and rainfall, wind-related variables are also relevant for D. melanogaster environmental adaptation. Interestingly, 23%-51% of the genes that showed significant associations with environmental variables were not found to be overly differentiated across populations. In addition to SNPs, we also identified 10 reference transposable element insertions associated with environmental variables. Our results showed that genome-environment association analysis can identify adaptive genetic variants that are undetected by population differentiation analysis while also allowing the identification of candidate environmental drivers of adaptation.
Keywords: Drosophila melanogaster; allele frequency; genetic adaptation; genome-environment; transposable elements
Year: 2021 PMID: 33350518 PMCID: PMC7986194 DOI: 10.1111/mec.15783
Source DB: PubMed Journal: Mol Ecol ISSN: 0962-1083 Impact factor: 6.185
FIGURE 1 Drosophila melanogaster samples used in this study. Samples were collected across Europe and the east coast of North America (see Table S1) in four main climate zones and seven subclimate zones (depending on precipitation and level of heat), according to the Köppen‐Geiger climate classification (Kottek et al., 2006)
Summary of the four data sets used in this analysis: three European and one North American data set
| Data set | No. of populations | Autosomes: SNPs | Autosomes: TEs | X chromosome: SNPs | X chromosome: TEs | Total: SNPs | Total: TEs |
|---|---|---|---|---|---|---|---|
| Europe | 20 | 2,846,701 | 249 | 119,228 | 53 | 2,965,929 | 302 |
| North America | 11 | 2,147,276 | 280 | 291,632 | 64 | 2,438,908 | 344 |
| Europe Fall | 10 | 2,663,700 | 227 | 115,147 | 49 | 2,778,847 | 276 |
| Europe Summer | 14 | 2,725,176 | 222 | 117,332 | 42 | 2,842,508 | 264 |
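
The Total columns in the table above should equal the sum of the autosomal and X-linked counts. The following minimal sketch checks that arithmetic and computes the share of SNPs located on the X chromosome; the counts are copied from the table, while the names `DATASETS`, `totals_match`, and `x_snp_share` are introduced here purely for illustration.

```python
# (SNPs, TEs) per data set, copied from the summary table.
DATASETS = {
    "Europe": {"autosomes": (2_846_701, 249), "x": (119_228, 53), "total": (2_965_929, 302)},
    "North America": {"autosomes": (2_147_276, 280), "x": (291_632, 64), "total": (2_438_908, 344)},
    "Europe Fall": {"autosomes": (2_663_700, 227), "x": (115_147, 49), "total": (2_778_847, 276)},
    "Europe Summer": {"autosomes": (2_725_176, 222), "x": (117_332, 42), "total": (2_842_508, 264)},
}


def totals_match(row):
    """Return True if autosomes + X chromosome equals the reported total for SNPs and TEs."""
    summed = tuple(a + x for a, x in zip(row["autosomes"], row["x"]))
    return summed == row["total"]


def x_snp_share(row):
    """Return the fraction of all SNPs in a data set that lie on the X chromosome."""
    return row["x"][0] / row["total"][0]


for name, row in DATASETS.items():
    print(f"{name}: totals consistent = {totals_match(row)}, "
          f"X share of SNPs = {x_snp_share(row):.1%}")
```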
FIGURE 2 Genome‐wide distribution of the XtX* values (population differentiation) associated with single nucleotide polymorphisms (SNPs) and transposable elements (TEs) in the four data sets. Nonsignificant SNPs and TEs are plotted in grey and black, respectively. Significant SNPs located inside and outside of inversions are plotted in dark blue and blue, respectively. Significant TEs are plotted in red. Genes for the five most significant SNPs for each data set are highlighted
Candidate genes showing the most significant population differentiation patterns
| Gene | SNP location | Data set | XtX* | Phenotype |
|---|---|---|---|---|
| | Gene body/Upstream | NA | 89.63 | –/Alcohol, Starvation |
| | Upstream | NA | 88.85 | Aggressiveness; Diapause; Immunity; Starvation |
| | Gene body | NA | 88.40 | Olfactory |
| | Gene body | NA | 83.40 | Circadian; Starvation |
| | Upstream | NA | 77.60 | – |
| | Upstream | EuS | 171.38 | Alcohol; Oxidative |
| | Upstream | EuF | 117.21 | Immunity; Starvation |
| | Gene body | EuF | 105.60 | Alcohol, Circadian behavior, Oxidative, Xenobiotic |
| | Gene body | EuF | 104.07 | Olfactory, Oxidative |
| | Gene body | EuF | 116.21 | – |
| | Gene body | Eu | 197.83 | Diapause, Insecticide resistance, Olfactory, Starvation |
| | Gene body | EuS | 166.55 | |
| | Gene body | Eu | 213.90 | – |
| | Gene body | EuS | 194.73 | |
| | Gene body | Eu | 228.63 | – |
| | Gene body | EuS | 172.06 | |
| | Upstream/Gene body | Eu | 192.62 | Starvation |
| | | EuS | 171.38 | |
| | Gene body | Eu | 211.90 | Aggressiveness |
| | Gene body | EuF | 110.64 | |
For each data set, the table lists the top five genes with SNPs located in the gene body or upstream region (<1 kb) that have the highest significant XtX* values, together with their associated phenotypes (see Table S12).
Abbreviations: Eu, Europe; EuF, Europe Fall; EuS, Europe Summer; NA, North America.
GO enrichment analysis of candidate genes for local adaptation
| Data set | Significant SNPs | Significant genes | GO enrichment terms |
|---|---|---|---|
| Europe | 719 | 410 | Neuron development; eye development; signalling; organ morphogenesis; growth |
| North America | 1,164 | 583 | Response to stimulus; organ development; regulation of growth; nervous system development; localization and transport |
| Europe Summer | 752 | 396 | Learning/memory; eye development; neuron development; sensory perception of pain; organ morphogenesis |
| Europe Fall | 821 | 412 | Signalling; localization/transport; organ morphogenesis; neuron development; heart morphogenesis |
For each data set, the table lists the number of significant single nucleotide polymorphisms (SNPs) located in the gene body or upstream region (<1 kb), the number of genes containing them, and the top five most enriched GO terms (significance >1.3). SNP significance was determined from the empirical distribution of the calibrated XtX* values (top 0.05%), which corresponds to q‐value thresholds of 7.56e‐10 in Europe, 3.70e‐07 in Europe Summer, 1.10e‐05 in Europe Fall, and 9.90e‐06 in North America for autosomes, and to q‐value thresholds of 1.49e‐05 in Europe, 0.0003 in Europe Summer, 0.000441 in Europe Fall, and 0.03 in North America for the X chromosome.
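
The empirical cutoff described in this note (top 0.05% of XtX* values) can be illustrated with a short sketch. It is not the authors' pipeline: the function names, the nearest-rank convention for the quantile, and the toy XtX* values are all assumptions made for the example.

```python
import math
from fractions import Fraction


def empirical_cutoff(values, top_fraction):
    """Return the smallest value belonging to the top `top_fraction` of `values`.

    Nearest-rank convention (an assumption): the k largest values are kept,
    with k = max(1, floor(n * top_fraction)). `top_fraction` is a Fraction
    so that the product is computed exactly.
    """
    if not values:
        raise ValueError("values must not be empty")
    if not 0 < top_fraction <= 1:
        raise ValueError("top_fraction must be in (0, 1]")
    ranked = sorted(values, reverse=True)
    k = max(1, math.floor(len(ranked) * top_fraction))
    return ranked[k - 1]


def select_top(values, top_fraction):
    """Return the values at or above the empirical cutoff."""
    cutoff = empirical_cutoff(values, top_fraction)
    return [v for v in values if v >= cutoff]


# Toy stand-in for genome-wide XtX* values (hypothetical, not real data).
toy_xtx = [float(i) for i in range(1, 10_001)]
TOP = Fraction("0.0005")  # top 0.05%, as stated in the table note

cutoff = empirical_cutoff(toy_xtx, TOP)
hits = select_top(toy_xtx, TOP)
print(cutoff, len(hits))
```

Passing the fraction as `Fraction("0.0005")` rather than a float avoids rounding surprises when the number of variants times the fraction is close to an integer.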
Summary of results obtained for environmental association
| Type of environmental variable | Europe | NA | Europe Summer | Europe Fall |
|---|---|---|---|---|
| Temperature | | | | |
| Wind | | 83 | 17 | |
| Rainfall | 29 | | | |
| Evaporation | 36 | 79 | | 7 |
| Solar radiation | | | 19 | 3 |
| Soil | – | 4 | – | – |
| Daylight hours | 18 | 52 | 8 | 1 |
| Total | 296 | 382 | 155 | 64 |
Number of genes with significant SNPs (BF ≥ 30) located in the gene body or upstream region (<1 kb) for each type of environmental variable. In bold, the three types of environmental variables with the most genes in each data set.
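
As a rough illustration of how such per-category gene counts could be tallied from SNP-level results, the sketch below applies the BF ≥ 30 cutoff from the note and counts each gene once per category. The record layout, the function name `genes_per_category`, and the placeholder gene labels are hypothetical. If the Total row counts unique genes (an assumption here), a gene associated with variables from two categories contributes to both rows but only once to the total.

```python
from collections import defaultdict

BF_THRESHOLD = 30.0  # cutoff stated in the table note


def genes_per_category(records, threshold=BF_THRESHOLD):
    """Group genes by environmental-variable category.

    `records` is an iterable of (gene, category, bf) tuples, one per SNP
    association. A gene is counted once per category, however many of its
    SNPs pass the threshold.
    """
    genes = defaultdict(set)
    for gene, category, bf in records:
        if bf >= threshold:
            genes[category].add(gene)
    return dict(genes)


# Hypothetical SNP-level records; gene labels are placeholders.
records = [
    ("geneA", "Temperature", 84.9),
    ("geneA", "Temperature", 41.0),  # second SNP in the same gene
    ("geneB", "Wind", 49.6),
    ("geneC", "Wind", 12.3),         # below the threshold, ignored
    ("geneA", "Wind", 35.0),         # geneA also associated with wind
]

by_category = genes_per_category(records)
counts = {category: len(genes) for category, genes in by_category.items()}
total_unique = len(set().union(*by_category.values()))
print(counts, total_unique)
```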
FIGURE 3 Overlap between genes with single nucleotide polymorphisms (SNPs) significantly associated with environmental variables. For each data set, the figure depicts genes with SNPs in the gene body or upstream region (<1 kb) that are significantly associated with the three groups of environmental variables having the most associated genes
Candidate genes associated with environmental variables
| Gene Name | SNP location | Data set | Strongest association variable | BF | Phenotype |
|---|---|---|---|---|---|
| | Gene body | Eu | Annual mean temperature | 84.89 | Diapause, insecticide resistance, olfactory, starvation |
| | Gene body | Eu | Annual mean solar radiation | 62.74 | Starvation |
| | Gene body | Eu | Annual mean temperature | 72.05 | ‐ |
| | Gene body | Eu | Mean temperature of warmest quarter | 60.59 | ‐ |
| | Gene body | Eu | Mean evaporation of warmest quarter | 70.75 | ‐ |
| | Gene body | NA | Annual mean solar radiation/Solar rad mean diurnal range | 64.34 | Circadian, starvation |
| | Gene body | NA | Solar rad mean diurnal range | 61.76 | ‐ |
| | Gene body | NA | Temperature seasonality | 58.18 | ‐ |
| | Gene body | NA | Annual mean solar radiation | 51.11 | Alcohol, desiccation, pigmentation |
| | Upstream | NA | Wind mean diurnal range | 50.57 | Aggressiveness, diapause, immunity, starvation |
| | Gene body | EuS | Annual mean temperature | 74.38 | Diapause, olfactory, starvation |
| | Gene body | EuS | Max temperature of warmest month | 68.08 | ‐ |
| | Gene body/Upstream | EuS | Max temperature of warmest month | 58.37 | ‐/Hypoxia, immunity, xenobiotic |
| | Upstream | EuS | Precipitation of driest quarter | 58.48 | Xenobiotic |
| | Gene body | EuS | Mean evaporation of warmest quarter | 62.11 | ‐ |
| | Gene body | EuF | Wind variability index | 49.59 | Alcohol, circadian behaviour, oxidative, xenobiotic |
| | Gene body | EuF | Wind variability index | 44.49 | ‐ |
| | Upstream | EuF | Temperature seasonality | 65.35 | Immunity, starvation |
| | Gene body | EuF | Wind seasonality | 48.51 | Hypoxia |
| | Gene body | EuF | Wind seasonality | 42.35 | ‐ |
For each data set, the table lists the top five genes with significant single nucleotide polymorphisms (SNPs) located in the gene body or upstream region (<1 kb) that have the highest Bayes factor (BF) scores, together with the environmental variable showing the strongest association and the associated phenotype (see Table S12). All significant genes can be found in Table S10.
Abbreviations: Eu, Europe; EuF, Europe Fall; EuS, Europe Summer; NA, North America.
Significant transposable element insertions found in the population differentiation analysis
| Transposable element | Family | Location | Gene | Data set | Evidence of selection |
|---|---|---|---|---|---|
| | | First intron | | Eu, EuF | iHS, H12, nSL (Rech et al., |
| | | First intron | | Eu | Population differentiation (González et al., |
| | | First intron | | EuF | Population differentiation (González et al., |
| | S‐element | Second intron | | EuF | CSTV (Lerat et al., |
| | | 432 bp downstream | | Eu, EuS | – |
| | BS | 507 bp downstream | | NA | – |
| | hopper | Third intron/first intron | | NA | – |
| | G2 | First intron | | NA | – |
| | hobo | 52 bp upstream/529 bp downstream | | NA | – |
TE insertions were considered significant if their XtX* values fell within the top 1% of the empirical distribution of XtX* values and their q‐value was <0.05. When a TE insertion is located in an intergenic region, the nearby genes are reported (Table S11A).
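
The two criteria in this note must hold jointly. A minimal sketch of that filter follows; the cutoff value, the TE labels, and the function name `is_significant_te` are placeholders invented for the example, and the XtX* cutoff is assumed to have been derived beforehand from the empirical distribution.

```python
def is_significant_te(xtx, qvalue, xtx_cutoff, q_max=0.05):
    """Return True only if both criteria hold.

    The XtX* value must exceed the top-1% empirical cutoff and the
    q-value must be below `q_max`.
    """
    return xtx > xtx_cutoff and qvalue < q_max


# Placeholder cutoff; in practice it would come from the empirical XtX* distribution.
XTX_CUTOFF = 40.0

# Hypothetical TE insertions: (label, XtX*, q-value).
candidates = [
    ("TE_1", 42.0, 0.001),  # passes both criteria
    ("TE_2", 45.5, 0.20),   # high XtX* but q-value too large
    ("TE_3", 18.3, 0.010),  # small q-value but XtX* below the cutoff
]

kept = [label for label, xtx, q in candidates
        if is_significant_te(xtx, q, XTX_CUTOFF)]
print(kept)
```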
Abbreviations: CSTV, correlation with spatiotemporal variables; Eu, Europe; EuF, Europe Fall; EuS, Europe Summer; NA, North America.
Significant candidate TE insertions associated with environmental variables (BF ≥ 20)
| Transposable element | Environmental variable | Significant XtX* | BF | Data set |
|---|---|---|---|---|
| | Isothermality | No | 30.53 | Eu |
| | Min temperature of coldest month | Yes | 43.38 | Eu |
| | Temperature annual range | Yes | 24.79 | Eu |
| | Evaporation mean diurnal range | No | 20.52 | Eu |
| | Precipitation seasonality | No | 22.12 | Eu |
| | Isothermality | Yes | 23.69 | Eu |
| | Annual mean wind | Yes | 43.95 | NA |
| | Solar radiation variability index | No | 26.83 | NA |
| | Precipitation of wettest quarter | Yes | 26.02 | NA |
| | Mean evaporation of coldest quarter | No | 28.57 | NA |
For each TE insertion, the environmental variable with the highest BF score is reported (Table S11).
Abbreviations: Eu, Europe; NA, North America.