
Radiation and hybridization underpin the spread of the fire ant social supergene.

Quentin Helleu1, Camille Roux2, Kenneth G Ross3, Laurent Keller1.   

Abstract

Supergenes are clusters of tightly linked genes that jointly produce complex phenotypes. Although they are widespread in nature, how such genomic elements are formed and how they spread are in most cases unclear. In the fire ant Solenopsis invicta and closely related species, a social supergene controls whether a colony maintains one or multiple queens. Here, we show that the three inversions constituting the Social b (Sb) supergene emerged sequentially during the separation of the ancestral lineages of S. invicta and Solenopsis richteri. The first two inversions arose in the ancestral population of both species, while the third one arose in the S. richteri lineage. Once completely assembled in the S. richteri lineage, the supergene first introgressed into S. invicta, and from there into the other species of the socially polymorphic group of South American fire ant species. Surprisingly, the introgression of this large and important genomic element occurred despite recent hybridization being uncommon between several of the species. These results highlight how supergenes can readily move across species boundaries, possibly because of fitness benefits they provide and/or expression of selfish properties favoring their transmission.

Keywords:  fire ants; hybridization; introgression; social polymorphism; supergene

Year:  2022        PMID: 35969752      PMCID: PMC9407637          DOI: 10.1073/pnas.2201040119

Source DB:  PubMed          Journal:  Proc Natl Acad Sci U S A        ISSN: 0027-8424            Impact factor:   12.779


Understanding how complex traits requiring multiple novel mutations arise and are maintained in populations is a long-standing question in evolutionary biology (1–3). Supergenes are clusters of multiple, tightly linked genes that collectively produce complex phenotypes (2). They have evolved independently in many taxa and are responsible for intraspecific polymorphism in several important morphological and behavioral traits. The most prominent examples of supergenes are heteromorphic sex chromosomes, which drive the alternate development of males or females. Other examples of supergenes underpinning alternate phenotypes include Batesian mimicry in numerous butterfly species (4–7), self-incompatibility and floral heteromorphy in plants (S locus) (8–11), male meiotic drive (sperm killers) (12), bird plumage color polymorphism (13), and alternative social organization in ant colonies (14–17). In most known cases, the structural integrity of the supergene results from chromosomal rearrangements that suppress local recombination and thereby prevent dissociation of the genetic elements responsible for the integrated expression of complex character suites (5, 10, 13, 16). The first supergene producing alternative social organization was identified in the fire ant Solenopsis invicta (16). In this species, colonies contain either a single egg-laying queen (monogyne social form) or multiple queens (polygyne social form), a fundamental distinction associated with a suite of important individual- and colony-level phenotypic differences (15, 18–21). Studies of invasive populations in the United States have revealed that this supergene comprises two haplotypes, the Social b (Sb) and the Social B (SB), which regulate the polymorphism (16, 22). In S. invicta, monogyne colonies invariably have a single homozygous SB/SB queen and only SB/SB workers, while polygyne colonies always have multiple heterozygous (SB/Sb) queens as well as workers of all three genotypes (15). 
Moreover, in invasive US populations, the Sb haplotype is responsible for a selfish “green beard” effect whereby SB/Sb (polygyne) workers recognize the cuticular chemical profiles of SB/SB queens and selectively eliminate them as they mature sexually and initiate reproduction (23, 24). Recent studies revealed that social organization also is regulated by the presence/absence of workers with the Sb haplotype in six closely related species of the Solenopsis saevissima species group (i.e., Solenopsis richteri, Solenopsis megergates, Solenopsis macdonaghi, Solenopsis quinquecuspis, the undescribed Solenopsis AdRX, and the undescribed Solenopsis near interrupta) (22, 25, 26). These species collectively are referred to as the socially polymorphic South American fire ants (22).

The extant Sb haplotype invariably consists of three inversions that together span a region of ∼11.4 Mb on chromosome 16 containing >476 described genes. As a result, recombination is greatly reduced between the Sb and the SB haplotypes (16, 22, 27). The largest inversion, In(16)1, spans 9.48 Mb of chromosome 16. A second inversion, In(16)2, spans 0.84 Mb between In(16)1 and a third inversion, In(16)3. In(16)2 likely emerged after In(16)1, given that a small fragment of In(16)1 (telomeric in SB but central in Sb) is inverted again in the In(16)2 Sb haplotype (22). In(16)3 (1.07 Mb) is located ∼25 kb distant from In(16)2 in a pericentric region of chromosome 16 (22, 28). Previous analyses of the divergence between the Sb and SB haplotypes within species suggested that the three inversions arose over an evolutionarily short time span (22).

Reconstructing the evolutionary history of supergene evolution is essential to understanding how such remarkable genomic elements originate and, possibly, the sequence of incorporation of key phenotypic elements into complex traits. The socially polymorphic species are hypothesized to have diverged from the outgroup species S. saevissima and Solenopsis metallica ∼0.75 to 1.75 million y ago and then radiated within the past ∼500,000 y (22, 29). While a single origin of the Sb haplotype is robustly supported (30, 31), it remains unclear whether this unique element originated in the ancestral population before the radiation, or whether it has invaded the known socially polymorphic species through introgression events (22, 29, 32). Multiple studies have documented recent hybridization involving three of the socially polymorphic Solenopsis species in both the native and introduced ranges (33–37), highlighting a demographic context conducive to such genomic invasion in at least some cases.

Here, we present a comprehensive analysis of the history of the fire ant supergene intended to shed light on when and in what order each inversion occurred, and whether introgression of the Sb haplotype between species best explains observed patterns in the comparative genomic data. Using complementary approaches, we conclude that In(16)3 is the oldest inversion and that it likely emerged in the common ancestor of S. richteri and the S. invicta/AdRX lineage. In(16)1 emerged next, during the divergence between the S. richteri and S. invicta/AdRX lineages. Finally, the youngest [In(16)2] inversion emerged in the S. richteri lineage. The supergene comprising all three inversions thus emerged in S. richteri, only to spread to the five most closely related species and confer the social polymorphism to them.

Results

Emergence of the Supergene Inversions in the S. invicta/AdRX–S. richteri Subgroup.

To understand the evolutionary history of the Sb haplotype, we first reconstructed the species phylogeny using whole-genome data from 107 SB and 58 Sb haploid males from six species that are socially polymorphic (S. invicta, the undescribed S. AdRX, S. richteri, S. megergates, S. macdonaghi, and the undescribed S. nr. interrupta) as well as 20 haploid males from two outgroup species (S. saevissima and S. metallica). We applied two phylogenetic approaches, a maximum-likelihood phylogenetic analysis of a concatenated alignment of the 185 individuals’ chromosomes 1 to 15 (supergene-bearing chromosome 16 not included) and a gene-tree coalescent-based method using randomly selected nonoverlapping genomic fragments (Fig. 1). The two approaches yielded concordant topologies in which the six species fall into five nested monophyletic groups. However, the phylogenomic analyses also revealed widespread phylogenetic discordance due to incomplete lineage sorting (ILS) and/or introgression (see below and Fig. 1).
Fig. 1.

Phylogenetic hypotheses for six socially polymorphic fire ant species and their three Sb supergene inversions. Rooted cladograms (Top) and unrooted phylogenies (Bottom) of chromosomes 1 to 15 (A) and of inversion In(16)3 (B), inversion In(16)2 (C), and inversion In(16)1 (D) based on complete genomes of 185 (haploid) males. For each inversion phylogeny, Sb haplotypes are circled with a thick black line. Sequences from the outgroup species are not circled. Bootstrap support values <100 on inner branches between species lineages are shown in red type.

To reconstruct the evolutionary history of the Sb haplotype, we examined the phylogenetic relationships of the Sb and SB haplotypes by conducting maximum-likelihood analyses of complete alignments of each inversion in the 185 sequenced haploid male genomes. Each of the three inversions yielded a different, well-supported topology, yet for all three inversions the Sb haplotypes always formed a well-supported monophyletic group (Fig. 1).

Examination of these topologies along with the species tree revealed that In(16)3 is likely to be the oldest inversion; specifically, the Sb haplotype clade with this inversion forms a sister group to the SB haplotypes of S. invicta/AdRX and S. richteri, the branching pattern of which mirrors the branching pattern in the species tree (Fig. 1). This result is consistent with inversion In(16)3 having emerged before the initial divergence of the three species lineages commenced.

In(16)1 was likely the next inversion to appear. This inversion haplotype forms a sister group to the SB haplotypes of S. AdRX and S. invicta, but with low statistical support for this inner branch (bootstrap value 78; Fig. 1) and generally shallow node depths in this part of the tree. To investigate whether the low support for the focal branch was an artifact of concatenating segments of In(16)1 having different evolutionary histories, we used a sliding window–based phylogenetic analysis.
This analysis indicates that In(16)1 is not a mosaic of genomic segments differing in their evolutionary histories. Instead, the poorly resolved node(s) in this area of the In(16)1 tree may be the product(s) of other historical processes, including ILS (expected with short branch lengths) and/or recombination between the SB and Sb haplotypes.

Finally, In(16)2 appears to be the youngest inversion. Haplotypes of In(16)2 form a sister group to the S. richteri SB haplotypes, suggesting that it emerged in the lineage leading to S. richteri (Fig. 1). It is worth noting that the internal branch leading to the S. richteri SB haplotypes and all Sb haplotypes, although very short, is strongly supported by the bootstrapping results (bootstrap value 99; Fig. 1). Importantly, the S. richteri SB haplotype lineage forms a sister group to the S. megergates–S. macdonaghi–S. invicta/AdRX SB haplotypes lineage rather than to the S. invicta/AdRX SB haplotype lineage (as predicted from the species tree). The fact that the In(16)2 tree disagrees substantially with the species tree suggests that SB haplotypes of In(16)2 experienced a different evolutionary history than the other inversions as well as chromosomes 1 to 15, particularly with regard to its passage in S. richteri. Such a pattern could be the product of ancestral balancing selection occurring in the S. richteri lineage during consolidation of the supergene via emergence of the In(16)2 inversion, which effectively locked all three inversions together.

To assess whether these results derived from the phylogenetic analyses might be biased because of the large variation in sample sizes among species and haplotypes in our dataset (range of 1 to 60 individuals per haplotype/species combination), we conducted rarefaction analyses by generating phylogenetic trees based on one randomly selected SB and Sb individual from each species as well as two randomly selected individuals from the two outgroup species S. saevissima and S. metallica (1,000 resampling iterations). The topology of these trees was highly congruent with the topology obtained from the complete dataset, indicating that our phylogenies are not biased by the uneven sample sizes.

The topology of the SB haplotypes was largely congruent with the species topology (Fig. 1), indicating that the SB variant of chromosome 16 shares a similar evolutionary history with the remaining 15 chromosomes comprising the rest of the genome. In contrast, and similar to what was found previously for the entire supergene (22), the topologies of each of the three Sb inversions display different lineage relationships from that of the species phylogeny (Fig. 1). Surprisingly, phylogenies from each of the three Sb inversion haplotypes were almost identical to one another. They featured two main lineages, one of which included all the S. richteri Sb haplotypes while the other comprised the Sb haplotypes from all the other species. The fact that the Sb phylogenies are very different from the species phylogeny does not support the conclusion that this supergene originated in the common ancestor of the socially polymorphic species and traversed multiple speciation events to confer the social polymorphism across the group (18, 22). This raises the important issue of what evolutionary process can explain the current distribution of the Sb supergene in fire ants as well as the observed patterns of variation in the genomic data.

To further test the reliability of the phylogenetic conclusions, we used the D statistics (ABBA-BABA) method to independently quantify the relationships between the SB and Sb haplotypes from each species at each inversion. Specifically, we determined whether the haplotype of each Sb inversion in each species shares derived alleles more commonly with its conspecific SB haplotype or with a heterospecific SB haplotype (Fig. 2). These analyses revealed that the haplotypes of Sb inversions In(16)3 and In(16)1 in S. macdonaghi, S. megergates, and S. nr. interrupta share significant excesses of derived alleles with the S. invicta/AdRX and S. richteri SB haplotypes relative to conspecific SB haplotypes (Fig. 2; positive D values, P value < 0.001). Such excesses of derived alleles mean that S. macdonaghi, S. megergates, and S. nr. interrupta Sb haplotypes are evolutionarily closer to S. invicta/AdRX and S. richteri SB haplotypes than to conspecific SB haplotypes. This finding supports the view that these two inversions emerged in the ancestral S. invicta/AdRX–S. richteri lineage and that there has been only limited intraspecific recombination between SB and Sb haplotypes (Fig. 2). Moreover, in line with the view that In(16)2 emerged in the S. richteri lineage, our analyses further show that Sb inversion In(16)2 haplotypes of all species share a greater number of derived alleles with S. richteri SB haplotypes than with conspecific SB haplotypes.
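The ABBA-BABA logic described above can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the function name `patterson_d`, the use of single aligned haploid sequences, and the assignment of P1 to a conspecific SB haplotype, P2 to a heterospecific SB haplotype, and P3 to an Sb haplotype are assumptions chosen so that a positive D matches the sign convention of Fig. 2. Significance testing (e.g., a block jackknife) is omitted.

```python
def patterson_d(p1, p2, p3, outgroup):
    """Return (D, n_abba, n_baba) for four aligned haploid sequences.

    Hypothetical roles for this illustration:
      p1       = conspecific SB haplotype
      p2       = heterospecific SB haplotype
      p3       = Sb haplotype of the focal species
      outgroup = sequence used to polarize alleles (ancestral state "A")
    D = (ABBA - BABA) / (ABBA + BABA); D > 0 means p3 shares more
    derived alleles with p2 than with p1.
    """
    if not (len(p1) == len(p2) == len(p3) == len(outgroup)):
        raise ValueError("sequences must be aligned to the same length")
    valid = set("ACGT")
    n_abba = n_baba = 0
    for a, b, c, o in zip(p1, p2, p3, outgroup):
        site = (a, b, c, o)
        # Skip missing data and sites that are not strictly biallelic.
        if any(base not in valid for base in site) or len(set(site)) != 2:
            continue
        # Only sites where p3 carries the derived allele are informative.
        if c == o:
            continue
        if a == o and b == c:
            n_abba += 1  # p2 and p3 share the derived allele
        elif b == o and a == c:
            n_baba += 1  # p1 and p3 share the derived allele
    informative = n_abba + n_baba
    if informative == 0:
        raise ValueError("no ABBA or BABA sites found")
    return (n_abba - n_baba) / informative, n_abba, n_baba
```

With real data, site counts would be aggregated over many individuals and windows; the single-sequence version only shows how the two site patterns are counted and combined.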
Fig. 2.

D statistics analyses of the fire ant SB and Sb supergene haplotypes. (A) Schematic representation of the method. Only polymorphic sites with the ABBA and BABA combination of alleles in the focal species were analyzed. (B) Summary depiction of the D statistics for the In(16)3 inversion (Left), In(16)2 inversion (Middle), and In(16)1 inversion (Right) for all combinations of SB and Sb haplotypes across species (species tree, Far Left). Increasingly negative (blue) D values indicate that Sb haplotypes share increasingly more of the derived alleles with SB haplotypes from the same species than with SB haplotypes of other species. Increasingly positive (yellow) values indicate increasingly more alleles are shared with the other species than with conspecifics. Red boxes highlight comparisons for which the highest D values are predicted for each inversion based on the phylogenetic results.

The phylogenetic analyses support a scenario in which the three inversions of the Sb supergene emerged in the following order: In(16)3→In(16)1→In(16)2. The Sb haplotypes of the In(16)3 inversion form a branch basal to the S. invicta/AdRX–S. richteri SB haplotype lineage, suggesting its emergence before the separation of these three lineages. For the In(16)1 inversion, we obtained two alternative topologies. The Sb haplotypes group with S. invicta/AdRX SB haplotypes in the first one and with S. richteri SB haplotypes in the second, suggesting its emergence during the period of divergence of these species. Finally, the Sb haplotypes of the In(16)2 inversion form a sister clade to all S. richteri SB haplotypes, suggesting its relatively late emergence in S. richteri, after separation of this lineage from S. invicta/AdRX. This scenario of the order of emergence is also supported by the D statistics and Twisst analyses conducted for each inversion (Fig. 2). Together, the analyses imply that the Sb supergene was completely assembled in its current form with three inversions in the S. richteri lineage and that it later introgressed into the other species as a fully functional, modular supergene.

Two lines of evidence suggest that the complete supergene moved first from S. richteri to S. invicta and then to the other species. First, the separate phylogenetic analyses of each inversion show that S. macdonaghi and S. nr. interrupta Sb haplotypes are more similar to the S. invicta/AdRX Sb haplotypes than to the S. richteri Sb haplotypes, despite the fact that the former two species are equally related evolutionarily to all three of the latter species according to the species tree. Second, the D statistics reveal that, for each inversion, the S. nr. interrupta and S. macdonaghi Sb haplotypes share a greater number of derived alleles with the S. invicta/AdRX Sb haplotypes than with the S. richteri Sb haplotypes. Data from S. megergates are not sufficient to reach any conclusion about the origin of its Sb haplotype that is consistent between the phylogenetic and D statistics analyses.

Genomewide Traces of Hybridization among the Socially Polymorphic Fire Ant Species.

We assessed potential opportunities for hybridization and introgression between the study species by inferring their native ranges using both the sampling locations of the individuals used in this study (from ref. 26) and additional sample data from the Global Biodiversity Information Facility (GBIF) and AntWeb databases (22, 38). Remarkably, the ranges of S. macdonaghi, S. megergates, and S. AdRX are completely encompassed by the range of S. invicta and partially overlap that of S. richteri (Fig. 3) (see also refs. 26 and 39). Moreover, the distribution of S. nr. interrupta partially overlaps that of S. macdonaghi, S. invicta/AdRX, and S. richteri. Given that the socially polymorphic species exhibit largely overlapping native ranges, we next searched for traces of hybridization between species in both the mitochondrial and nuclear genomes. A phylogeny of the complete mitochondrial genomes revealed that most species are highly para- or polyphyletic for this maternally inherited, cytoplasmic organelle DNA (see also ref. 36). For instance, multiple mitochondrial sequences from S. megergates and S. richteri are embedded in a monophyletic clade dominated mostly by S. invicta sequences. This and multiple other examples of mitochondrial paraphyly with respect to the species designations support the conclusion that hybridization occurred at least occasionally between S. invicta/AdRX and the three species S. macdonaghi, S. megergates, and S. richteri, as well as between S. richteri and the pair of species S. megergates and S. nr. interrupta.
Fig. 3.

Geographic distributions of the six socially polymorphic fire ant species and evidence of gene flow among them. (A) Estimated native species ranges. (B) Frequencies of the 12 most commonly recovered topologies inferred from topology weighting using windows of 1,000 SNPs in the nuclear genomes (chromosomes 1 to 15). (C) Heatmap summarizing the D statistics results for different combinations of socially polymorphic species using S. metallica and S. saevissima as outgroups (S. AdRX samples are included with S. invicta). Higher D values (warmer colors) indicate higher estimates of introgression (admixture). Error bars represent SD. (D) Admixture graphs representing the evolutionary history of the socially polymorphic species. The three graphs show all proposed admixture events (orange dashed lines) included in the six best-fitting admixture graphs.

We next quantified phylogenetic heterogeneity along chromosomes 1 to 15 (supergene-bearing chromosome 16 not included); such heterogeneity can be caused by introgressive hybridization following lineage separation as well as by ILS during speciation. We used a topology weighting method with 1,000-SNP (single-nucleotide polymorphism) sliding windows (40). While the most frequently recovered tree has the same topology as the species phylogeny (Fig. 1), it accounts for only 8% of the windows (Fig. 3). Surprisingly, the second most frequent topology, in which S. invicta/AdRX is sister group to S. macdonaghi/S. megergates, is nearly as frequent as the species topology but occurs far less frequently with a weighting of 1 [i.e., maximal weighting for a single topology (41)] (Fig. 3). Among other common topologies, S. invicta/AdRX is frequently sister group to S. macdonaghi, or S. richteri is sister group to S. nr. interrupta.
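The two summaries reported above (how much weight a topology receives overall, and how often it receives the maximal weighting of 1) can be sketched in Python as follows. This is only an illustration and not the topology weighting method itself: it assumes the per-window weightings have already been computed by a tool such as Twisst and normalized to sum to 1 per window, and the function name `summarize_weightings` is hypothetical.

```python
def summarize_weightings(weights, tol=1e-9):
    """Summarize per-window topology weightings.

    weights: list of windows; each window is a list with one normalized
             weighting per candidate topology (assumed to sum to 1).
    Returns (mean_weight, frac_complete):
      mean_weight[t]   = average weighting of topology t across windows
      frac_complete[t] = fraction of windows in which topology t has a
                         weighting of 1 (within tolerance tol)
    """
    if not weights:
        raise ValueError("need at least one window")
    n_windows = len(weights)
    n_topologies = len(weights[0])
    mean_weight = []
    frac_complete = []
    for t in range(n_topologies):
        column = [window[t] for window in weights]
        mean_weight.append(sum(column) / n_windows)
        complete = sum(1 for w in column if w >= 1.0 - tol)
        frac_complete.append(complete / n_windows)
    return mean_weight, frac_complete
```

A topology can therefore have a mean weighting close to that of the species topology while almost never being fully resolved in any single window, which is the pattern described for the second most frequent topology.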
Overall, the topology weighting analyses revealed widespread phylogenetic discordance, suggesting potential hybridization between multiple lineages across the socially polymorphic fire ant clade. To establish that such phylogenetic discordance along chromosomes 1 to 15 results from introgression events rather than ILS, we measured the proportion of shared derived alleles between the different species, again using D statistics (42, 43). Significantly positive D values were obtained between S. invicta/AdRX and S. macdonaghi (D = 0.18, P < 0.001; Fig. 3) and, to a lesser extent, between S. invicta/AdRX and S. megergates (D = 0.03, P < 0.001; Fig. 3), strongly supporting the conclusion of historical gene flow between these species. Moreover, the D statistics results suggest possible gene flow between S. nr. interrupta and both S. richteri and S. invicta/AdRX (Fig. 3).

In order to place the hybridization events on the species tree, we used f statistics to fit admixture graphs to the tree-based shared-alleles frequency data (42, 43). The best-fitting admixture graphs all suggest gene flow between the S. macdonaghi and S. invicta/AdRX lineages but differ on whether introgression is inferred to have occurred between S. megergates and S. invicta/AdRX or between S. megergates and S. macdonaghi (after hybridization with S. invicta). The admixture graphs also disagree on whether gene flow occurred between the S. nr. interrupta and S. invicta/S. richteri lineages or between the S. nr. interrupta and S. invicta lineages (Fig. 3). Given this uncertainty, we used an approximate Bayesian computation (ABC) approach to identify the most likely evolutionary history by comparing explicit alternative models of speciation, with and without gene flow between the species. Our ABC analysis, using the D statistic and another 17 summary statistics, strongly supports the idea that gene flow occurred between S. invicta and several other species (S. macdonaghi, S. megergates, and S. nr. interrupta), in accord with one of the six best-fitting admixture graphs.

Thus, four distinct but complementary analyses (mitochondrial DNA [mtDNA] phylogeny reconstruction, nuclear genome topology weighting, D statistics analyses, and ABC tests) jointly implicate some level of historical gene flow between S. invicta/AdRX and both S. macdonaghi and S. megergates. The analyses provide weaker support as well for historical introgression events involving several other pairs of lineages.
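As background to the f statistics used for the admixture graphs, the following Python sketch computes a basic f4 statistic from per-site allele frequencies. It is a simplified illustration under stated assumptions: the function name `f4` and the plain-list input are hypothetical, no sample-size correction or standard error is computed, and fitting an admixture graph to many such statistics is not shown.

```python
def f4(p_a, p_b, p_c, p_d):
    """Basic f4 statistic: mean over sites of (pA - pB) * (pC - pD).

    Each argument is a list of allele frequencies (0 to 1) for the same
    allele at the same sites in populations A, B, C, and D.
    If (A, B) and (C, D) form separate clades with no gene flow between
    them, the expected value is zero; a consistent departure from zero
    suggests shared drift, for example through admixture.
    """
    n_sites = len(p_a)
    if n_sites == 0:
        raise ValueError("need at least one site")
    if not (len(p_b) == len(p_c) == len(p_d) == n_sites):
        raise ValueError("all populations must cover the same sites")
    total = 0.0
    for a, b, c, d in zip(p_a, p_b, p_c, p_d):
        total += (a - b) * (c - d)
    return total / n_sites
```

In practice, such statistics are computed genome-wide with block-based standard errors before any graph is fitted; the sketch only shows the quantity being compared across population quartets.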

Discussion

We investigated the evolutionary history of the supergene that regulates colony social organization, a fundamental feature of social life in ants, in S. invicta and closely related species of South American fire ants. Based on the results from multiple complementary approaches, we conclude that the Sb supergene underlying the social polymorphism was formed by sequential incorporation of three inversions. The two initial inversions, In(16)3 and In(16)1, occurred in the ancestral population of S. invicta/AdRX and S. richteri, and the third, In(16)2, arose in the derived S. richteri lineage (Fig. 4). Once assembled in its current form, the supergene was first transferred from the S. richteri lineage to its sister S. invicta/AdRX lineage, presumably via hybridization, and from there to three other lineages (represented by modern-day S. nr. interrupta, S. macdonaghi, and S. megergates). This complex evolutionary history of the supergene, which spanned multiple lineages across the socially polymorphic fire ant clade and involved repeated bouts of hybridization, left telltale traces across the genomes of these ants such as excesses of shared derived alleles in remotely related species and numerous discordant genealogies at different genomic locations.
Fig. 4.

Hypothesized evolutionary history of the Sb supergene haplotype. The green line represents the evolutionary history of Sb: dashed for the intermediate stages, and solid for the complete haplotype with the three inversions. Arrows illustrate introgression events between the socially polymorphic species inferred from Sb inversion phylogenies. The blue line represents the evolutionary history of the SB haplotype. The estimation of divergence time is from refs. 22 and 29.

Our results concerning the origin of the supergene contrast with those from three previous studies. The first study suggested that the Sb haplotype arose in the most recent common ancestor of the socially polymorphic fire ant clade and spread across the radiating species via normal cladogenetic lineage splitting (22). The discrepancy between the conclusions of that study and ours stems partly from ref. 22 using a strong linkage disequilibrium (LD) filter that removed blocks of polymorphisms that represent highly informative phylogenetic signals for the supergene region. By relaxing the LD filtering parameter, ref. 22 obtained phylogenetic results for the supergene more in line with ours (ref. 22, extended data fig. 8).

Two more recent studies suggested, instead, that the supergene evolved in the ancestor of one of the extant species and then spread to other species. Specifically, based on the analysis of S. invicta and S. richteri, ref. 32 concluded that the supergene probably arose in S. richteri and introgressed into S. invicta, while ref. 29 concluded, from an analysis including more species, that the supergene evolved in S. invicta/S. macdonaghi (note that the authors incorrectly labeled samples of S. AdRX as S. macdonaghi in this paper). A potential shortcoming of both studies is that they were based on sequences from a limited number of markers across the supergene and a highly fragmented genome assembly. Most importantly, these two studies, as well as the study in ref. 22, did not analyze the three inversions separately but instead treated them as a single locus. This approach cannot produce a clear picture of the evolutionary history of the supergene because it merges mixed signals from the different inversions. For example, the vast majority of the genes (87 out of 97) analyzed by ref. 29 are localized on In(16)1. Thus, this study effectively reconstructed the evolutionary history of the largest inversion [In(16)1], which our analyses indeed revealed to be a (moderately supported) sister group to the SB haplotypes of S. AdRX and S. invicta. It therefore appears that, while requiring more computational resources, the approach of leveraging the abundant genetic information contained in each of the three inversions is necessary to understand the distinct events that generated the supergene. Similarly, considering genetic information separately for each inversion has been crucial to understanding the complex evolutionary history of the Heliconius butterfly mimicry supergene (6).

The finding that all study individuals had either all three inversions or none at all, as has been found for all socially polymorphic fire ants appropriately sequenced (22, 25, 29, 44, 45), is in line with the view that many supergenes may originate by the sequential addition of beneficial “modifier” mutations when adding new inversions to the nonrecombining region such that the intermediate states of only one or two inversions are strongly selectively disadvantageous relative to the fully formed element. Nonetheless, multiple examples of other supergenes formed through the sequential accumulation of inversions exist in which intermediate states are fully adaptive alternate variants. The best-known example is the Batesian mimicry supergene identified in the butterfly Heliconius numata, in which various combinations of the three inversions yield different wing-color patterns (6, 46).
On the other hand, gradually formed supergene systems in which intermediate states are absent in natural populations are common, such as the supergene controlling mate choice in the seaweed fly, Coelopa frigida, which comprises three overlapping inversions (47). The supergene in the white-throated sparrow, Zonotrichia albicollis, regulates alternative mating syndromes and is composed of at least two nested pericentric inversions (48), while the social supergene in the socially polymorphic ant Formica selysi is formed by at least four successive inversions (49). Large sequential structural change in a chromosome is likely an efficient process giving rise to such complex phenotypes, because it allows stepwise optimization of the phenotype by “freezing” in place large numbers of favorable allelic combinations that contribute cumulatively to the novel alternate complex phenotype. It does, however, raise a long-standing question—how do the intermediate stages in the evolution of complex structures, which presumably are not highly fit because they display some sort of intermediate phenotype, persist sufficiently long in enough individuals to allow the appearance of the requisite subsequent mutations for expression of the fully formed alternate phenotype? Studies such as this one that define discrete intermediate stages in the acquisition of the necessary components of such complex trait regulators can play pivotal roles in addressing this fundamental evolutionary question once gene inventories and functional annotations for each inversion are available. Recent studies showed that the clade of socially polymorphic South American fire ants diversified very rapidly over the last 0.1 to 2.1 million y (22, 32). 
Such rapid species radiation is frequently associated with substantial gene flow between incipient independent lineages because adjacent or overlapping geographical distributions generate the opportunities for interbreeding, and genomic divergence causing incompatibilities and hybrid breakdown is not yet sufficiently developed [e.g., wild and domestic cattle (50), Anopheles mosquitoes (51, 52), Panthera cats (53), cichlid fishes (54, 55), and Heliconius butterflies (6, 56–58)]. Our analyses indeed revealed evidence of gene flow between several of the studied species, as suggested also by earlier research (35, 36). However, the extent of hybridization apparently has been quite limited between some species, in particular between S. richteri and S. invicta/AdRX (see also refs. 35, 37, and 59) as well as S. invicta/AdRX and S. nr. interrupta. A possible reason for only limited traces of hybridization between these species is that introgression of the Sb haplotype occurred via an intermediate species, such as S. quinquecuspis, a species that has experienced extensive historical introgression of heterospecific nuclear DNA and mtDNA from both S. invicta and S. richteri in some parts of its range (35, 36). Unfortunately, this hypothesis could not be tested because we do not have access to sequenced individuals from relevant populations of this species. Alternatively, a possible explanation for the transfer of the supergene between these species that left limited traces along the remainder of the nuclear genome is simply that it was subjected to strong positive selection after the transfer while the remainder of the genome had become increasingly resistant to introgression because of growing genomic incompatibilities (52, 60, 61). 
Such adaptive introgression in the face of well-developed species barriers has been suggested to be common for supergenes because they comprise combinations of coadapted genes that may readily provide benefits across multiple closely related species (2, 62). In the case of fire ants, the benefits are probably related to the ecological advantages that polygyny provides in massively disturbed, harsh habitats and/or highly competitive environments (63, 64). Such advantages include insurance against loss of colony queens and bypassing of the risky independent colony-founding phase of the life cycle characterizing the monogyne form. Moreover, it is also possible that the selfish effects of the Sb haplotype, which have been described in introduced populations of S. invicta, also manifest themselves in the related species and thus promote the spread of the Sb haplotype. The selective elimination of SB/SB queens by workers bearing the Sb haplotype (23, 24), as well as a bias toward sexualization of the female brood bearing it (65), results in increased reproductive success of SB/Sb queens and thus in the number of copies of the Sb haplotype transmitted each generation, similar to the effects of meiotic drive elements that invade a population by distorting the expected outcomes of meiosis (66). Such selective advantages to the Sb haplotype at the level of the colony or of the gene thus may explain the presumed rapid expansion of this haplotype despite hybridization being limited. A supergene regulating social organization evolved independently in the ant F. selysi, a species in a lineage that diverged from the lineage giving rise to Solenopsis about 110 million y ago. The inversions in the two genera are fundamentally distinct, with few genes in common and different chromosomal locations (17). Phylogenetic and population genomic analyses revealed that homologous haplotypes of the F. 
selysi supergene (Sp haplotypes) also regulate social organization in four other Formica species (49). In contrast to the Solenopsis supergene, the Formica supergene apparently did not spread between species via introgression but appeared in a common ancestor of these five socially polymorphic species 20 to 40 million y ago. Importantly, the Formica supergene also exhibits selfish properties, with the Sp haplotype producing an embryo-killing maternal effect. As a result, Sp/Sm queens never produce Sm/Sm daughters (67). This dynamic is very similar to the selective advantage provided by the selfish Sb haplotype in S. invicta and may also help account for the presence of this supergene in several Formica species. Our conclusion that the Sb haplotype has introgressed across species adds to the increasing number of such instances documented. In Heliconius butterflies, introgression of an inversion triggered the formation of the P supergene (6). Similarly, the 2/2 supergene controlling disassortative mating between male morphs in the white-throated sparrow may originate from a past hybridization event with other Zonotrichia species (13). The S locus supergene in Arabidopsis, which regulates production of two alternative floral morphs (pin and thrum), also shows traces of repeated introgressions between two closely related species (68). In summary, this study strongly supports a scenario in which the fire ant Sb supergene was formed rapidly by sequential recruitment of three inversions, with the complete element subsequently introgressing into other closely related species despite limited evidence of hybridization between some of them. Following introgression, the same supergene has been maintained in each of the species of the socially polymorphic South American fire ant clade, showing that there are important selective forces associated with polygyny (balancing selection and/or selfish genetic effects) at work in these ants.

Methods

Sequence Data.

Collection, DNA extraction, and sequencing methods of the 185 studied haploid fire ant males are described in ref. 22 (National Center for Biotechnology Information [NCBI] Sequence Read Archive [SRA] BioProjects PRJNA421367 and PRJNA821075).

Mapping, Calling, and Filtering.

We mapped the Illumina paired-end whole-genome sequences of the 185 individuals to the S. invicta SB reference genome GCA_009650705 (22) using bwa-mem (v.0.7.17) (69). We then used SAMtools to manipulate, convert, and sort output files from bwa-mem (70). We added sample names and read-group tags to BAM alignments using bamaddrg. SNP calling was done using FreeBayes (v.1.3.2) (71) with --ploidy 1 to identify sequence polymorphisms. We processed all individual BAMs simultaneously using --bam-list; we used a parallel approach, with each thread analyzing a distinct 150-kb region. Locus VCF files were then concatenated using BCFtools concat (v.1.10.2) and haplotypes were disassembled with vt (v.0.5772) (70, 72) to dissociate indels from multiple-nucleotide variants. Finally, using VCFtools (v.0.1.16), we removed indels and sites with a phred-scaled quality score <30 or missing individuals >0.5 (73).
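The parallel calling scheme described above, in which each thread handles its own 150-kb region, can be illustrated with a short Python sketch. This is not the authors' script: the function name `split_into_regions` and the contig names and lengths are hypothetical, and the 0-based half-open coordinates and the `contig:start-end` string format are assumptions that should be checked against the variant caller's documentation before use.

```python
def split_into_regions(contig_lengths, window=150_000):
    """Yield (contig, start, end) tuples covering each contig.

    Coordinates are 0-based and half-open (an assumption of this sketch).
    The last window of a contig is truncated at the contig end.
    """
    for contig, length in contig_lengths.items():
        for start in range(0, length, window):
            yield contig, start, min(start + window, length)


# Hypothetical contig lengths, for illustration only.
contigs = {"chr1": 400_000, "chr2": 150_000}

regions = list(split_into_regions(contigs))

# One region string per parallel job; the string format is an assumption.
region_strings = [f"{contig}:{start}-{end}" for contig, start, end in regions]
```

Each region string would then be passed to one calling job, and the per-region VCF files concatenated afterward in the same order.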

Phylogeography.

We obtained the sample locations of the male specimens whose sequences were analyzed in this study from ref. 26 and other specimens from the GBIF and AntWeb databases [AntWeb version 8.41, accessed 24 July 2020 (38)]. We collected the geographic coordinates of the samples and mapped the minimal inferred ranges of the study species using the geom_sf function from the R ggplot2 package (v.3.3.2; R Core Team, 2020).

Whole-Genome Phylogenetic Analyses.

Because phylogenetic analyses conducted on SNP matrices (i.e., without invariant sites and singletons) are not optimal for estimating branch lengths and topologies (74), we conducted the phylogenetic analyses using complete sequences, excluding only indels. We reconstituted the DNA multisequence alignments including monomorphic sites from the VCF file using alignment-from-vcf.py (https://github.com/qhelleu/SocialChrom_Phylo). All maximum-likelihood (ML) analyses were conducted with IQ-TREE (versions 1.7.8 and 2.0.4), with branch support calculated from 1,000 iterations of the ultrafast bootstrap algorithm (75–78). We used multiple strategies to reconstruct the species tree: First, we conducted phylogenetic analyses on a concatenated alignment using IQ-TREE and the GTR+I+F+R3 and GTR+F+R3 substitution models, as well as the substitution model that had the smallest Bayes information criterion (BIC) score using ModelFinder (79). To take into account heterogeneity in rates of evolution across the genome, we partitioned the data into nonoverlapping windows of 100 kb (80). To account for stationary, reversible, and homogeneous (SRH) model violations, we used the symmetric test implemented in IQ-TREE to identify and remove partitions that diverge from the SRH assumptions (81). We reran the phylogenetic analysis using an edge-proportional partition model, using GTR+I+F+R3, GTR+F+R3, or the substitution model that had the smallest BIC score from ModelFinder. To confirm that the tree obtained was not biased by introgression between the species/populations analyzed, we randomly sampled 25% of the partitions throughout the genome, using the AMAS replicate tool (82). To assess the robustness of the identified species tree, we applied a resampling approach using nonoverlapping windows of 10, 20, 50, 80, and 200 kb. We analyzed each dataset with IQ-TREE and the parameters described above. 
To account for ILS along the genome, we also analyzed the multiple trees for each partitioned dataset with ASTRAL (v.5.7.1), using DiscoVista to plot the results (83–86). Tree visualization and manipulation were done in iTOL (87). We performed topology weighting along the genome using the Twisst method (40) on 50-, 100-, 200-, 500-, and 1,000-SNP sliding windows to explore how the inferred relationships of the focal species vary across the genome. Individual-locus trees were built using PhyML (v.3.3) with the neighbor-joining method (88).
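The step of reconstituting full alignments (including monomorphic sites) from SNP calls for haploid males can be sketched as follows. This is a minimal illustration, not the alignment-from-vcf.py script cited above. It assumes that every site without a variant record matches the reference, that each haploid sample carries a single allele per site, and that missing genotypes are written as "N". The function name, sample names, and toy data are hypothetical.

```python
def build_alignment(reference, snps, samples):
    """Rebuild one full-length sequence per haploid sample.

    reference: reference sequence for the region (str).
    snps: {0-based position: {sample: allele or None}}, where None means
          a missing genotype.
    samples: list of sample names.
    Sites without an entry in `snps` are copied from the reference
    (an assumption of this sketch).
    """
    seqs = {sample: list(reference) for sample in samples}
    for pos, calls in snps.items():
        for sample in samples:
            allele = calls.get(sample)
            seqs[sample][pos] = allele if allele is not None else "N"
    return {sample: "".join(chars) for sample, chars in seqs.items()}


# Toy example: two haploid males and two variable sites.
reference = "ACGTACGT"
snps = {
    2: {"male1": "G", "male2": "T"},
    5: {"male1": "C", "male2": None},  # male2 has no call at this site
}
alignment = build_alignment(reference, snps, ["male1", "male2"])
```

Keeping invariant sites in the alignment is what allows branch lengths to be estimated on a per-site scale rather than per variable site.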

Supergene Phylogenetic Analyses.

We conducted supergene phylogenetic analyses on the complete sequences of the three inversions, again including invariant sites and excluding indels. We analyzed each inversion separately using IQ-TREE (versions 1.7.8 and 2.0.4), with the GTR+I+F+R3 and GTR+F+R3 substitution models, as well as the substitution model that had the smallest BIC score calculated using ModelFinder (76, 78, 79). We also used a partitioning approach to identify the most frequent alternative topologies to the ML tree. Individual-locus trees were built using windows of 1,000 SNPs using PhyML with the neighbor-joining method (88). Then, we estimated the frequency of alternative species tree topologies using the Twisst method (40) on 50-, 100-, 200-, 500-, and 1,000-SNP sliding windows. Tree topology exploration from the Twisst analyses, including identification of sister clades, was conducted using the R packages ape and phytools (89, 90). In parallel, to take into account the variable number of SB and Sb haplotypes in each species, we applied a rarefaction procedure involving randomized selection of individual sequences: One sequence of each haplotype (SB or Sb) was randomly selected for each socially polymorphic species, and two sequences were selected for each outgroup species. We repeated this randomization procedure 1,000 times for each inversion. For each group of random sequences, we analyzed the alignment using IQ-TREE with the GTR+I+F+R3 substitution model and 1,000 iterations of the ultrafast bootstrap algorithm (75, 77, 78).
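The rarefaction procedure described above can be sketched in a few lines of Python. The sample identifiers and the data layout are hypothetical and only two polymorphic species are shown; the sketch illustrates the sampling rule (one SB and one Sb sequence per socially polymorphic species, two sequences per outgroup species, 1,000 replicates), not the authors' actual code.

```python
import random


def draw_rarefied_sample(haplotypes, outgroups, rng):
    """Pick one SB and one Sb sequence per polymorphic species and
    two sequences per outgroup species."""
    chosen = []
    for species, by_haplotype in haplotypes.items():
        for hap in ("SB", "Sb"):
            chosen.append(rng.choice(by_haplotype[hap]))
    for species, ids in outgroups.items():
        chosen.extend(rng.sample(ids, 2))  # two distinct individuals
    return chosen


# Hypothetical sample identifiers, for illustration only.
haplotypes = {
    "S. invicta": {"SB": ["inv_SB_1", "inv_SB_2"], "Sb": ["inv_Sb_1"]},
    "S. richteri": {"SB": ["ric_SB_1"], "Sb": ["ric_Sb_1", "ric_Sb_2"]},
}
outgroups = {
    "S. saevissima": ["sae_1", "sae_2", "sae_3"],
    "S. metallica": ["met_1", "met_2"],
}

rng = random.Random(42)  # fixed seed so the draws are reproducible
replicates = [draw_rarefied_sample(haplotypes, outgroups, rng) for _ in range(1000)]
```

Each replicate would then be used to subset the inversion alignment before running the tree inference, so that species with many sequenced haplotypes do not dominate the topology.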

Inferences of SB/Sb Haplotype Relationships Using D Statistics.

We used D statistics (ABBA-BABA method) to quantify for each species and for each inversion the number of derived alleles shared between Sb and either conspecific or heterospecific SB haplotypes. In a hypothetical four-taxon phylogeny, an outgroup (O) is utilized to estimate numbers of shared derived alleles between one sequence from the focal lineage (P3) and a second sequence from either the same (P1) or a different (P2) lineage. We used AdmixTools 6.0 in the wrapper admixr (43, 91) to calculate the D statistics. We compared only four individuals at a time to measure the number of derived alleles shared between the SB and Sb haplotypes: two conspecific individuals from a socially polymorphic species carrying either the SB or Sb haplotype (P1 and P3), a third individual carrying an SB haplotype from a different socially polymorphic species (P2), and a single outgroup individual from one of the two species lacking the Sb supergene (S. saevissima or S. metallica). To avoid biases due to the different numbers of haplotypes sequenced among species, we first randomly selected three different species (two SB/Sb–carrying and one outgroup species), and then randomly selected individuals within each species for each run, using the python3 random module. We ran 5,000 iterations for each inversion. The significance of D statistics was assessed using the pairwise Dunn test with a Benjamini–Yekutieli correction (92).
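For a single comparison of four haploid sequences, the D statistic can be written as D = (nABBA − nBABA) / (nABBA + nBABA). The sketch below counts the two site patterns directly from aligned sequences. It assumes the common ((P1, P2), P3), O arrangement, in which a positive D means that P3 shares more derived alleles with P2 than with P1; this sign convention is an assumption, and AdmixTools may order populations differently. The function name and toy sequences are hypothetical.

```python
def d_statistic(p1, p2, p3, outgroup):
    """Return (D, n_abba, n_baba) for four aligned haploid sequences.

    The outgroup allele is treated as ancestral. Sites containing
    anything other than A, C, G, or T are skipped.
    """
    abba = baba = 0
    for a, b, c, o in zip(p1, p2, p3, outgroup):
        if any(base not in "ACGT" for base in (a, b, c, o)):
            continue
        if a == o and b == c and b != o:
            abba += 1  # P2 and P3 share the derived allele
        elif b == o and a == c and a != o:
            baba += 1  # P1 and P3 share the derived allele
    total = abba + baba
    d = (abba - baba) / total if total else float("nan")
    return d, abba, baba


# Toy alignment: two ABBA sites, one BABA site, three uninformative sites.
p1 = "AAGTCG"
p2 = "ATGTAA"
p3 = "ATCTAG"
out = "AAGTCA"
d_value, n_abba, n_baba = d_statistic(p1, p2, p3, out)
```

In the design described above, the Sb sequence is compared against a conspecific and a heterospecific SB sequence, so the sign of D indicates which SB haplotype the Sb inversion shares more derived alleles with.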

Genomewide Tests for Introgression.

We further used D statistics to help distinguish introgression from ILS as the primary cause of incongruence in inferred trees. We compared all triplet combinations of species using AdmixTools v.6.0 in the wrapper admixr (43, 91), with S. metallica and S. saevissima individuals comprising the outgroup. With ILS only, the focal species (P3) should share an equal number of derived alleles (D = 0) with both compared species (P1 and P2), while an excess of shared derived alleles (D value significantly different from 0) is indicative of past gene flow between two species and is not expected under a scenario of ILS without gene flow. We kept only triplets for comparisons in which the calculated D statistic fell within the interval [0, 1] and the species triplet order was concordant with the species phylogeny, meaning that P1 and P2 were sister clades with respect to P3 for all comparisons. Among the remaining triplets, we retained only comparisons involving the same pair of species in P2 and P3 with the highest D value. The significance of D statistics was assessed using a jackknife procedure (42) on blocks of 100 SNPs. The function p.adjust in R v.3.6.3 (R Core Team, 2020) was used to apply a Benjamini–Yekutieli correction (92). Results were plotted using the Ruby script “plot_d.rb” from https://github.com/mmatschiner/tutorials/. To build the admixture graphs for the socially polymorphic fire ant species, we used qpgraph from AdmixTools v.6.0 through the wrappers qpBrute and admixturegraph (43, 93–95).
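The block jackknife used to assess whether D differs from 0 can be illustrated with a simple delete-one version operating on per-block ABBA and BABA counts (for example, one block per 100 SNPs). This is a sketch under stated assumptions: it weights all blocks equally, whereas AdmixTools uses its own jackknife implementation, so the numbers are not expected to match that software exactly. The function name and the block counts are hypothetical.

```python
import math


def jackknife_d(block_counts):
    """Delete-one block jackknife for D.

    block_counts: list of (n_abba, n_baba) tuples, one per block.
    Returns (D over all blocks, jackknife standard error, Z score).
    """
    n = len(block_counts)
    total_abba = sum(a for a, _ in block_counts)
    total_baba = sum(b for _, b in block_counts)
    d_full = (total_abba - total_baba) / (total_abba + total_baba)

    # Recompute D with each block left out in turn.
    leave_one_out = []
    for a, b in block_counts:
        numerator = (total_abba - a) - (total_baba - b)
        denominator = (total_abba - a) + (total_baba - b)
        leave_one_out.append(numerator / denominator)

    mean_loo = sum(leave_one_out) / n
    variance = (n - 1) / n * sum((d - mean_loo) ** 2 for d in leave_one_out)
    se = math.sqrt(variance)
    z = d_full / se if se > 0 else float("inf")
    return d_full, se, z


# Hypothetical per-block counts, for illustration only.
blocks = [(12, 8), (10, 10), (15, 5), (9, 11), (14, 6)]
d_full, se, z = jackknife_d(blocks)
```

The Z score can be converted to a P value and the set of P values across triplets then corrected for multiple testing, as described above.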

Mitochondrial Genome De Novo Assembly and Phylogeny Reconstruction.

We used the seed-and-extend assembler NOVOPlasty (v.4.3) (96) to assemble the mitochondrial genome of each individual, excluding potential sequences transposed into the nuclear DNA. The seed is iteratively extended bidirectionally until the assembly circularizes in the expected size range (97). Then, we aligned the circularized sequences using MARS and MAFFT (v.7.475) (98, 99). Finally, we used IQ-TREE (v.2.0.4) to infer the ML phylogenetic tree of all aligned sequences with 1,000 ultrafast bootstrap iterations and the best-fitting substitution model from ModelFinder (78, 79). Tree visualization and manipulation were done in iTOL (87). We plotted the nuclear and mitochondrial phylogenetic trees face to face with links using the cophyloplot function in the ape R package [v.5.5 (89)].

Approximate Bayesian Computation Analyses with Four Populations.

We used an ABC framework with a supervised machine-learning model classification procedure (random forest) to identify possible secondary contacts and interbreeding between the nascent socially polymorphic species or ancestral populations of their lineage. In all simulations, we used the coalescent simulator msnsam (October 2007 version), a modified version of the ms program, allowing variation in sample sizes among loci under an infinite-sites mutation model (100). We ran simulations for a large set of genomic models (combinations of strict isolation and secondary contacts between the four species considered in each simulation) and a set of demographic models (heterogeneous or homogeneous effective population size and/or migration rate) to examine the effects of selection against migrants. We produced a simulated dataset of 108 average values from 18 summary statistics and their SDs computed over loci for each four-species combination of tested genomic and demographic models [https://github.com/popgenomics/ABC_4pop (101)]. To limit computation time and avoid overfitting, we used a subsample of 1,000 randomly sampled 50-kb bins from chromosomes 1 to 15 for each of the 10 replicates. We then used the simulations to train a random forest, using the abcrf R package (102). The trained forest was next used to predict the model that best explains the observed dataset. We built the random forest in abcrf from 1,000 decision trees using nonnull summary statistics, and then we ran the prediction also with 1,000 decision trees, to estimate the posterior probability of each model. Each prediction was scored using the number of decision trees supporting each model and the associated posterior probability for the best one. We also implemented a post hoc test to quantify the strength (robustness) of the prediction, which judges the ability of a trained random forest to discriminate between the results from two sets of simulations. 
To do this, we used simulations from the best-identified model corresponding to an observed dataset, combined with the simulation output from one alternative model. First, we used 15% of the simulated data from both models to train a random forest. Next, we used the random forest to predict the models with which the 85% remaining simulated data showed the highest concordance. We then measured the proportion of simulations accurately identified. If more than 10% of the simulations from the best-identified model of the two were wrongly predicted, we considered the results from the ABC analysis as being ambiguous (101).
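The logic of this post hoc check (train on 15% of the simulations from two models, predict the remaining 85%, and flag the result as ambiguous if more than 10% of the best-model simulations are misassigned) can be sketched as follows. To keep the example self-contained, a nearest-centroid classifier stands in for the random forest; this substitution is an assumption of the sketch and is not what abcrf does. The function name and the simulated summary statistics are hypothetical.

```python
import random


def holdout_error(sims_best, sims_alt, train_fraction=0.15, seed=0):
    """Fraction of held-out best-model simulations assigned to the
    alternative model by a nearest-centroid stand-in classifier."""
    rng = random.Random(seed)

    def split(sims):
        shuffled = sims[:]
        rng.shuffle(shuffled)
        k = max(1, round(train_fraction * len(shuffled)))
        return shuffled[:k], shuffled[k:]

    def centroid(rows):
        return [sum(col) / len(rows) for col in zip(*rows)]

    def dist2(x, c):
        return sum((xi - ci) ** 2 for xi, ci in zip(x, c))

    train_best, test_best = split(sims_best)
    train_alt, _ = split(sims_alt)
    c_best, c_alt = centroid(train_best), centroid(train_alt)

    wrong = sum(dist2(x, c_best) > dist2(x, c_alt) for x in test_best)
    return wrong / len(test_best)


# Hypothetical summary statistics for two well-separated models.
data_rng = random.Random(1)
best_model = [[data_rng.gauss(0, 1), data_rng.gauss(0, 1)] for _ in range(200)]
alt_model = [[data_rng.gauss(10, 1), data_rng.gauss(10, 1)] for _ in range(200)]

error = holdout_error(best_model, alt_model)
ambiguous = error > 0.10  # threshold taken from the text above
```

With well-separated models the held-out error is close to zero; models whose summary statistics overlap would push the error above the 10% threshold and be reported as ambiguous.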
  89 in total

1.  Theoretical genetics of Batesian mimicry II. Evolution of supergenes.

Authors:  D Charlesworth; B Charlesworth
Journal:  J Theor Biol       Date:  1975-12       Impact factor: 2.691

2.  DILS: Demographic inferences with linked selection by using ABC.

Authors:  Christelle Fraïsse; Iva Popovic; Clément Mazoyer; Bruno Spataro; Stéphane Delmotte; Jonathan Romiguier; Étienne Loire; Alexis Simon; Nicolas Galtier; Laurent Duret; Nicolas Bierne; Xavier Vekemans; Camille Roux
Journal:  Mol Ecol Resour       Date:  2021-01-15       Impact factor: 7.090

Review 3.  Supergenes and complex phenotypes.

Authors:  Tanja Schwander; Romain Libbrecht; Laurent Keller
Journal:  Curr Biol       Date:  2014-03-31       Impact factor: 10.834

Review 4.  Adaptive introgression: a plant perspective.

Authors:  Adriana Suarez-Gonzalez; Christian Lexer; Quentin C B Cronk
Journal:  Biol Lett       Date:  2018-03       Impact factor: 3.703

5.  MAFFT multiple sequence alignment software version 7: improvements in performance and usability.

Authors:  Kazutaka Katoh; Daron M Standley
Journal:  Mol Biol Evol       Date:  2013-01-16       Impact factor: 16.240

Review 6.  The selfish Segregation Distorter gene complex of Drosophila melanogaster.

Authors:  Amanda M Larracuente; Daven C Presgraves
Journal:  Genetics       Date:  2012-09       Impact factor: 4.562

7.  The social supergene dates back to the speciation time of two Solenopsis fire ant species.

Authors:  Pnina Cohen; Eyal Privman
Journal:  Sci Rep       Date:  2020-07-14       Impact factor: 4.379

8.  Fast and accurate short read alignment with Burrows-Wheeler transform.

Authors:  Heng Li; Richard Durbin
Journal:  Bioinformatics       Date:  2009-05-18       Impact factor: 6.937

Review 9.  Supergenes and their role in evolution.

Authors:  M J Thompson; C D Jiggins
Journal:  Heredity (Edinb)       Date:  2014-03-19       Impact factor: 3.821

10.  ASTRAL: genome-scale coalescent-based species tree estimation.

Authors:  S Mirarab; R Reaz; Md S Bayzid; T Zimmermann; M S Swenson; T Warnow
Journal:  Bioinformatics       Date:  2014-09-01       Impact factor: 6.937
