Magdalena Bohutínská, Vinzenz Handrick, Levi Yant, Roswitha Schmickl, Filip Kolář, Kirsten Bomblies, Pirita Paajanen.
Abstract
A sudden shift in environment or cellular context necessitates rapid adaptation. A dramatic example is genome duplication, which leads to polyploidy. In such situations, the waiting time for new mutations might be prohibitive; theoretical and empirical studies suggest that rapid adaptation will largely rely on standing variation already present in source populations. Here, we investigate the evolution of meiosis proteins in Arabidopsis arenosa, some of which were previously implicated in adaptation to polyploidy and, in a diploid lineage, to habitat. A striking and unexplained feature of prior results was the large number of amino acid changes in multiple interacting proteins, especially in the relatively young tetraploid. We ask whether selection on meiosis genes is found in other lineages, how the polyploid may have accumulated so many differences, and whether derived variants were selected from standing variation. We use a range-wide sample of 145 resequenced genomes of diploid and tetraploid A. arenosa, with new genome assemblies. We confirm signals of positive selection in the polyploid and diploid lineages in which they were previously reported and find additional meiosis genes with evidence of selection. We show that the polyploid lineage stands out both qualitatively and quantitatively. Compared with diploids, meiosis proteins in the polyploid have more amino acid changes, and a higher proportion of these changes affect more strongly conserved sites. We find evidence that in tetraploids, positive selection may have commonly acted on de novo mutations. Several tests provide hints that coevolution and, in some cases, multinucleotide mutations might contribute to the rapid accumulation of changes in meiotic proteins.
Keywords: coevolution; de novo mutations; meiosis; polyploidy; standing variation
Year: 2021 PMID: 33502506 PMCID: PMC8097281 DOI: 10.1093/molbev/msab001
Source DB: PubMed Journal: Mol Biol Evol ISSN: 0737-4038 Impact factor: 16.240
Fig. 1. Meiosis proteins showing signatures of positive selection in Arabidopsis arenosa lineages. (A) Our sampling of A. arenosa populations in Europe. Dots show the 14 diploid populations (red) and 11 tetraploid populations without signs of introgression from diploids (blue) studied here. Distribution ranges of all known A. arenosa lineages are shown as colored areas, indicating that our sampling covers the complete diversity of diploid lineages (based on Kolář et al. 2016; Monnahan et al. 2019). The tetraploid distribution range covers areas occupied by populations without signs of introgression from diploids (Monnahan et al. 2019). (B) Phylogeny of A. arenosa (based on Kolář et al. 2016; Monnahan et al. 2019) with candidate meiosis proteins placed on the branch where they exhibit signatures of selective sweeps (identified as FST and FineMAV overlap, see the main text). Width of the branches corresponds to the number of meiosis proteins that are identified as positive selection candidates. Only the Pannonian and tetraploid lineages had more meiosis proteins showing signatures of positive selection than expected by chance. Lineages with no evidence for positive selection on meiosis proteins are indicated as “None.” Proteins are ordered from those having the highest number of candidate AASs to the lowest (supplementary tables S5 and S6, Supplementary Material online). Three proteins found independently as candidates in two lineages are written in bold. Time axis below the tree indicates median estimates of lineage divergence times (based on Arnold et al. 2015; Kolář et al. 2016). (C) Principal component analysis based on allele frequencies of candidate AASs in the three parallel candidate meiosis proteins. Each dot represents one individual, colored by lineage as in panel A. (D) Positive selection targeted more conserved amino acids in tetraploids (blue) than in diploids (red; summarizing candidate AASs identified in all diploid lineages except the Pannonian, which is shown in orange). Each violin plot summarizes alignment identity (calculated across 17 plant reference genomes; higher values indicate a more conserved site) over all candidate AASs identified in the corresponding lineage. **P = 0.002, Wilcoxon rank sum test.
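As a minimal sketch of the per-site conservation score in panel D, the snippet below computes an identity value for one alignment column. The caption does not define alignment identity, so two assumptions are made here: identity is the fraction of sequences that share the most common residue, and gaps ("-") are ignored. The function name and the example columns are hypothetical.

```python
from collections import Counter


def column_identity(column: str) -> float:
    """Fraction of non-gap residues in a column that match the most common residue.

    Assumption: this definition and the gap handling are illustrative only;
    the source text does not specify how alignment identity was computed.
    """
    residues = [r for r in column if r != "-"]
    if not residues:
        return 0.0
    most_common_count = Counter(residues).most_common(1)[0][1]
    return most_common_count / len(residues)


# Hypothetical columns from a 17-sequence alignment (one character per genome).
fully_conserved = "L" * 17
partly_conserved = "L" * 13 + "MMMV"

score_full = column_identity(fully_conserved)
score_partial = column_identity(partly_conserved)
```

Under these assumptions a fully conserved column scores 1.0, and a column in which 13 of 17 genomes agree scores 13/17.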
Fig. 2. Limited standing variation across Arabidopsis arenosa diploids in protein candidates for tetraploid meiotic adaptation. (A) Lack of tetraploid-specific haplotypes in diploid populations sampled across the total range of A. arenosa. Haplotypes were combined across linked candidate AASs within each protein. A set of bar plots for each of ten candidate proteins (horizontal lines) shows frequencies of diploid-specific, Pannonian-specific (if different from the widespread diploid haplotype), and tetraploid-specific haplotypes (y axis) in each of 14 diploid and 11 tetraploid populations (x axis, grouped by lineage and ploidy). Frequencies of minor-frequency haplotypes found in either or both ploidies are summed in a gray column. (B) A hypothetical maximal variation among haplotypes of meiosis proteins in diploids and tetraploids, quantified by Hamming distances. The diameters of the red and blue circles denote the full range of potential variability of haplotypes reconstructed from all combinations of AASs among all diploid and tetraploid individuals, respectively. The relative distance between the red and blue circles denotes the genetic distance between the diploid and tetraploid haplotypes. Overlap of the two circles suggests that the tetraploid haplotype could plausibly have existed within the observed variation in diploids, even if the exact tetraploid haplotype was not found in our diploid sampling. The filled area of the tetraploid circle that does not overlap with the diploid circle represents the tetraploid haplotype space that cannot be explained by, and would not be expected to exist within, extant diploid AAS variation. The upper six proteins show evidence that their tetraploid haplotypes most likely accumulated additional mutations after diploid/tetraploid divergence.
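The Hamming-distance comparison in panel B can be illustrated with a small sketch. It is not the authors' pipeline: the haplotype strings are invented toy data with one character per candidate AAS, and using the maximum pairwise distance as the "spread" of a ploidy and the minimum cross-ploidy distance as the separation are simplifying assumptions.

```python
from itertools import combinations


def hamming(a: str, b: str) -> int:
    """Number of candidate sites at which two equal-length haplotypes differ."""
    if len(a) != len(b):
        raise ValueError("haplotypes must cover the same candidate sites")
    return sum(x != y for x, y in zip(a, b))


def max_pairwise_distance(haplotypes: list[str]) -> int:
    """Largest Hamming distance between any two haplotypes in a set."""
    return max((hamming(a, b) for a, b in combinations(haplotypes, 2)), default=0)


# Hypothetical haplotypes over five candidate sites (toy data, not from the study).
diploid = ["AAAAA", "AAAAT", "AATAT"]
tetraploid = ["TTTTT", "TTTAT"]

diploid_spread = max_pairwise_distance(diploid)        # variability within diploids
tetraploid_spread = max_pairwise_distance(tetraploid)  # variability within tetraploids
# Smallest distance between any diploid and any tetraploid haplotype.
between = min(hamming(d, t) for d in diploid for t in tetraploid)
```

In this toy case the diploid spread is 2, the tetraploid spread is 1, and the closest diploid and tetraploid haplotypes differ at 2 sites. A cross-ploidy distance that is large relative to the diploid spread would correspond to the nonoverlapping region of the tetraploid circle.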
Fig. 3. Evidence for meiosis protein coevolution in tetraploids. (A) Cartoon of the cohesin complex with associated proteins and variability in the relative order of their selective sweeps, inferred from haplotype length and number of accumulated SNPs (see Materials and Methods for details). Shown are schemes of candidate protein structures (outlined in black) and other core complex protein structures for illustration (gray). We propose that REC8/SYN1 (bold) might be the core driver of coevolution, as it is the central protein with one of the oldest sweeps. (B) Illustrative examples of the pattern of allele frequency decay at loci with an old (REC8/SYN1) and a young (ASY3) selective sweep (as inferred in A). Plotted is the AFD between diploid and tetraploid individuals for all genic variants in and around the gene. Red dots are candidate AASs identified here; the blue line corresponds to 10 kb. (C) Coordinated structural changes in protein-binding sites. Cartoons of secondary protein structures from diploid A. arenosa meiosis proteins (upper lane; orange = helix elements, yellow = sheet elements, black line = disordered protein regions). The pairwise comparison of predicted secondary protein structures from sequences of diploid and tetraploid A. arenosa lineages (middle lane, Structure identity plots) and the identity of their amino acid sequences (lower lane, AA identity plots). Gaps are sites with zero identity. Protein-binding sites and functional domains identified in other eukaryotes are shown as violet bars above the secondary structure plot. Reciprocal structure identity changes in the corresponding binding sites of REC8/SYN1 and SCC3, and to a lesser degree in the REC8/SYN1- and PDS5b-binding sites (highlighted in light red), might indicate coevolution of these proteins.
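The AFD plotted in panel B can be sketched as follows. The caption only says that AFD is computed between diploid and tetraploid individuals, so taking the absolute difference in alternate-allele frequency is an assumption, and the function names and counts below are hypothetical.

```python
def allele_frequency(alt_count: int, total_alleles: int) -> float:
    """Frequency of the alternate allele among all sampled allele copies."""
    if total_alleles <= 0:
        raise ValueError("total_alleles must be positive")
    return alt_count / total_alleles


def afd(dip_alt: int, dip_total: int, tet_alt: int, tet_total: int) -> float:
    """Absolute allele frequency difference between the two ploidies (assumed definition)."""
    return abs(allele_frequency(dip_alt, dip_total) - allele_frequency(tet_alt, tet_total))


# Hypothetical variant: 10 diploid individuals carry 2 allele copies each (20 total),
# 10 tetraploid individuals carry 4 allele copies each (40 total).
example_afd = afd(dip_alt=2, dip_total=20, tet_alt=36, tet_total=40)
```

Here the alternate allele is at frequency 0.1 in diploids and 0.9 in tetraploids, giving an AFD of about 0.8. Values near 1 mark variants that are nearly fixed for different alleles in the two ploidies.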