Venkat Talla, Lucile Soler, Takeshi Kawakami, Vlad Dincă, Roger Vila, Magne Friberg, Christer Wiklund, Niclas Backström.
Abstract
The relative role of natural selection and genetic drift in evolution is a major topic of debate in evolutionary biology. Most knowledge springs from a small group of organisms and dates from before it was possible to generate genome-wide data on genetic variation. Hence, it is necessary to extend descriptive and hypothesis-based research to a larger number of taxonomic groups in order to understand the proximate and ultimate mechanisms underlying both levels of genetic polymorphism and the efficiency of natural selection. In this study, we used data from 60 whole-genome resequenced individuals of three cryptic butterfly species (Leptidea sp.), together with novel gene annotation information and population recombination data. We characterized the overall prevalence of natural selection and investigated the effects of mutation and linked selection on regional variation in nucleotide diversity. Our analyses showed that genome-wide diversity and rate of adaptive substitutions were comparatively low, whereas nonsynonymous to synonymous polymorphism and substitution levels were comparatively high in Leptidea, suggesting small long-term effective population sizes. Still, negative selection on linked sites (background selection) has resulted in reduced nucleotide diversity in regions with relatively high gene density and low recombination rate. We also found a significant effect of mutation rate variation on levels of polymorphism. Finally, there were considerable population differences in levels of genetic diversity and pervasiveness of selection against slightly deleterious alleles, in line with expectations from differences in estimated effective population sizes.
Keywords: Leptidea; Lepidoptera; adaptation; cryptic species; selection; speciation
Year: 2019 PMID: 31580421 PMCID: PMC6795238 DOI: 10.1093/gbe/evz212
Source DB: PubMed Journal: Genome Biol Evol ISSN: 1759-6653 Impact factor: 3.416
Fig. 1.—Boxplots showing the nucleotide diversity at different site categories calculated using all polymorphisms (vertical panel a), or only weak-to-weak (A/T) and strong-to-strong (G/C) polymorphisms (vertical panel b).
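As a point of reference for the nucleotide diversity shown in the caption above, the following is a minimal sketch of the textbook definition of π (mean pairwise differences per site) together with a filter for weak-to-weak (A/T) and strong-to-strong (G/C) polymorphisms. The function names and the toy sequences are hypothetical, and the study itself may have used a different estimator and pipeline.

```python
from itertools import combinations


def nucleotide_diversity(seqs):
    """Mean number of pairwise differences per site (pi).

    Assumes equal-length, aligned sequences with no missing data.
    """
    if len(seqs) < 2:
        raise ValueError("need at least two sequences")
    length = len(seqs[0])
    if length == 0 or any(len(s) != length for s in seqs):
        raise ValueError("sequences must be non-empty and of equal length")
    pairs = list(combinations(seqs, 2))
    # Count mismatching positions for every pair of sequences.
    total_diffs = sum(
        sum(a != b for a, b in zip(s1, s2)) for s1, s2 in pairs
    )
    return total_diffs / (len(pairs) * length)


def is_gc_conservative(alleles):
    """True for weak-to-weak (A/T) or strong-to-strong (G/C) polymorphisms."""
    return set(alleles) in ({"A", "T"}, {"G", "C"})


# Hypothetical toy alignment of three haplotypes, four sites each.
toy = ["ACGT", "ACGA", "TCGA"]
print(nucleotide_diversity(toy))
print(is_gc_conservative("AT"), is_gc_conservative("AG"))
```

Restricting the calculation to A/T and G/C polymorphisms leaves GC content unchanged whichever allele is present, which is presumably why the caption treats this category separately.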
Fig. 2.—The distribution of GC-content (%) across 100 kb windows for protein coding and noncoding sequences (a) and for separate site categories (b).
Fig. 3.—Scatter plots showing the differences in average pN/pS ratios in low- (brown) and high- (green) diversity regions for each species. Vertical colored lines show the mean pN/pS of low- (red) and high- (blue) diversity regions, respectively. Note that diversity is calculated as the window-based estimate relative to the genomic average (x axis) and data points are therefore centered at 0.
Ratios of Nonsynonymous to Synonymous Polymorphisms (pN/pS) in Low- (low) and High-Diversity (high) Regions across the Genome in All Species
| Species | Low | High | P value |
|---|---|---|---|
| | 0.22±0.29 | 0.16±0.21 | 1.8×10⁻²⁰ |
| | 0.24±0.20 | 0.18±0.13 | 5.8×10⁻¹⁹ |
| | 0.22±0.28 | 0.18±0.20 | 4.8×10⁻⁴ |
Note.—P values for the Mann–Whitney U test are given for each respective comparison.
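The comparison named in the note can be sketched as follows. This is a minimal, standard-library version of a two-sided Mann–Whitney U test using the normal approximation with a tie correction and no continuity correction; it is not the authors' code, and the pN/pS values in the usage lines are made up for illustration. For real analyses an established implementation is preferable, and it may return slightly different P values depending on the method and corrections applied.

```python
import math


def mann_whitney_u(x, y):
    """Return (U for sample x, two-sided P value).

    Normal approximation with tie correction, no continuity correction.
    """
    n1, n2 = len(x), len(y)
    if n1 == 0 or n2 == 0:
        raise ValueError("both samples must be non-empty")
    pooled = sorted([(v, 0) for v in x] + [(v, 1) for v in y])
    ranks = [0.0] * len(pooled)
    tie_term = 0
    i = 0
    while i < len(pooled):
        j = i
        while j < len(pooled) and pooled[j][0] == pooled[i][0]:
            j += 1
        # Tied values share the average of ranks i+1 .. j.
        mid_rank = (i + 1 + j) / 2
        for k in range(i, j):
            ranks[k] = mid_rank
        t = j - i
        tie_term += t**3 - t
        i = j
    rank_sum_x = sum(r for r, (_, g) in zip(ranks, pooled) if g == 0)
    u_x = rank_sum_x - n1 * (n1 + 1) / 2
    n = n1 + n2
    mean_u = n1 * n2 / 2
    var_u = n1 * n2 / 12 * ((n + 1) - tie_term / (n * (n - 1)))
    if var_u == 0:
        return u_x, 1.0
    z = (u_x - mean_u) / math.sqrt(var_u)
    p_two_sided = math.erfc(abs(z) / math.sqrt(2))
    return u_x, p_two_sided


# Hypothetical per-window pN/pS values, for illustration only.
low_diversity = [0.31, 0.25, 0.22, 0.40, 0.18, 0.27]
high_diversity = [0.12, 0.20, 0.15, 0.17, 0.10, 0.21]
u, p = mann_whitney_u(low_diversity, high_diversity)
print(f"U = {u}, P = {p:.4f}")
```

A rank-based test suits this comparison because the reported standard deviations are as large as the means, so the window-level ratios are unlikely to be normally distributed.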
Ratios of Nonsynonymous to Synonymous Substitutions (dN/dS or ω) in Low- (low) and High-Diversity (high) Regions across the Genome in All Species
| Species | Low | High | P value |
|---|---|---|---|
| | 0.15±0.26 | 0.11±0.25 | 3.9×10⁻³ |
| | 0.21±0.31 | 0.18±0.28 | 4.2×10⁻⁴ |
| | 0.19±0.36 | 0.13±0.28 | 1.8×10⁻⁴ |
Note.—P values for the Mann–Whitney U test are given for each respective comparison.
Fig. 4.—The relationship between nucleotide diversity at 4-fold degenerate sites (π4D) and gene density (proportion of protein coding/exonic sites in a window in %) calculated across 100 kb windows in all three species.
Summary of the Multiple Linear Regression Analysis Where Base Composition (GC), Recombination Rate (ρ), Gene Density (GD), and Mutation Rate (dS) Were Used as Explanatory Variables for Variation in Genetic Diversity at 4-Fold Degenerate Coding Positions (π4D)
| Population | Parameter | Estimate | SE | t value | Pr(>\|t\|) |
|---|---|---|---|---|---|
| LsSwe | GC | −7.94×10⁻⁵ | 4.94×10⁻⁴ | −0.16 | 0.872 |
| | ρ | 2.12×10⁻³ | 5.06×10⁻⁴ | 4.18 | 2.98×10⁻⁵*** |
| | GD | −7.40×10⁻⁴ | 5.32×10⁻⁴ | −1.39 | 0.165 |
| | dS | −2.41×10⁻² | 2.72×10⁻² | −0.89 | 0.374 |
| LsKaz | GC | 1.51×10⁻⁴ | 4.98×10⁻⁴ | 0.30 | 0.762 |
| | ρ | 9.59×10⁻⁴ | 5.09×10⁻⁴ | 1.88 | 0.060 |
| | GD | −7.45×10⁻⁴ | 5.35×10⁻⁴ | −1.39 | 0.164 |
| | dS | −2.29×10⁻² | 2.71×10⁻² | −0.84 | 0.399 |
| LsSpa | GC | 3.52×10⁻⁴ | 4.86×10⁻⁴ | 0.73 | 0.469 |
| | ρ | 5.22×10⁻³ | 5.01×10⁻⁴ | 10.42 | <2.0×10⁻¹⁶*** |
| | GD | 2.78×10⁻⁴ | 5.22×10⁻⁴ | 0.53 | 0.594 |
| | dS | −6.19×10⁻² | 2.68×10⁻² | −2.32 | 2.1×10⁻²* |
| LrSpa | GC | −1.05×10⁻⁴ | 6.67×10⁻⁴ | −0.16 | 0.875 |
| | ρ | 1.53×10⁻³ | 6.62×10⁻⁴ | 2.31 | 2.09×10⁻²* |
| | GD | −1.28×10⁻³ | 6.97×10⁻⁴ | −1.84 | 6.62×10⁻² |
| | dS | −1.08×10⁻¹ | 1.85×10⁻² | −5.81 | 7.17×10⁻⁹*** |
| LjKaz | GC | −7.35×10⁻⁴ | 5.34×10⁻⁴ | −1.38 | 0.169 |
| | ρ | 4.30×10⁻³ | 5.64×10⁻⁴ | 7.63 | 3.52×10⁻¹⁴ |
| | GD | −1.32×10⁻³ | 5.74×10⁻⁴ | −2.30 | 2.15×10⁻² |
| | dS | −8.69×10⁻² | 2.12×10⁻² | −4.09 | 4.42×10⁻⁵ |
| LjIre | GC | −4.64×10⁻⁴ | 5.35×10⁻⁴ | −0.868 | 0.385 |
| | ρ | 6.90×10⁻⁴ | 5.70×10⁻⁴ | 1.21 | 0.226 |
| | GD | −1.14×10⁻³ | 5.69×10⁻⁴ | −2.003 | 0.045* |
| | dS | −6.06×10⁻² | 2.11×10⁻² | −2.875 | 0.004** |
Note.—Variance inflation factors for explanatory variables and interaction effects are presented in supplementary table 7, Supplementary Material online. Significance levels are indicated by asterisks: ‘***’ highly significant, ‘**’ moderately significant, and ‘*’ slightly significant.
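The values in the column following SE are consistent with the usual regression t statistic, that is, the coefficient estimate divided by its standard error. The short sketch below checks this for a few GC and ρ rows copied from the table. It assumes that reading of the column, and small discrepancies are expected because the published estimates and standard errors are rounded.

```python
# (population, parameter, estimate, SE, reported t value), copied from the table.
rows = [
    ("LsSwe", "GC", -7.94e-5, 4.94e-4, -0.16),
    ("LsSwe", "rho", 2.12e-3, 5.06e-4, 4.18),
    ("LsSpa", "rho", 5.22e-3, 5.01e-4, 10.42),
    ("LjKaz", "rho", 4.30e-3, 5.64e-4, 7.63),
]


def t_ratio(estimate, standard_error):
    """t statistic of a regression coefficient: estimate / SE."""
    return estimate / standard_error


for population, parameter, estimate, se, reported_t in rows:
    recomputed = t_ratio(estimate, se)
    print(
        f"{population} {parameter}: recomputed t = {recomputed:.2f}, "
        f"reported t = {reported_t}"
    )
```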