Literature DB >> 30624681

A Single SNP Turns a Social Honey Bee (Apis mellifera) Worker into a Selfish Parasite.

Denise Aumer1, Eckart Stolle1, Michael Allsopp2, Fiona Mumoki3, Christian W W Pirk3, Robin F A Moritz1,3,4.   

Abstract

The evolution of altruism in complex insect societies is arguably one of the major transitions in evolution and inclusive fitness theory plausibly explains why this is an evolutionary stable strategy. Yet, workers of the South African Cape honey bee (Apis mellifera capensis) can reverse to selfish behavior by becoming social parasites and parthenogenetically producing female offspring (thelytoky). Using a joint mapping and population genomics approach, in combination with a time-course transcript abundance dynamics analysis, we show that a single nucleotide polymorphism at the mapped thelytoky locus (Th) is associated with the iconic thelytokous phenotype. Th forms a linkage group with the ecdysis-triggering hormone receptor (Ethr) within a nonrecombining region under strong selection in the genome. A balanced detrimental allele system plausibly explains why the trait is specific to A. m. capensis and cannot easily establish itself into genomes of other honey bee subspecies.
© The Author(s) 2019. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.

Entities:  

Keywords:  balancing selection; inclusive fitness; social evolution; social parasitism; thelytoky; worker reproduction

Mesh:

Year:  2019        PMID: 30624681      PMCID: PMC6389321          DOI: 10.1093/molbev/msy232

Source DB:  PubMed          Journal:  Mol Biol Evol        ISSN: 0737-4038            Impact factor:   16.240


Introduction

The evolution of insect societies is one of the most remarkable evolutionary transitions regarding the complexity of biotic structures. Inclusive fitness theory (Hamilton 1964) convincingly explains why the altruistic behavior of female workers that sacrifice their individual fitness in favor of a closely related highly fecund queen is an evolutionary stable strategy. However, there are various exceptions in the social Hymenoptera, where workers have regained reproductive control and in some cases even evolved into social parasites. The best studied example of worker social parasitism is that of the Cape honey bee, Apis mellifera capensis (Neumann and Moritz 2002). Typically, reproduction in honey bee colonies is confined to the queen and under strict pheromonal control, especially that of the queen (Slessor et al. 1988; Plettner et al. 1993). However, in the case of a queen loss, workers can activate their ovaries and produce queen-like pheromones, becoming “false queens” or “pseudoqueens” (Sakagami 1958; Crewe and Velthuis 1980). In general, these pseudoqueens produce male offspring, as these workers are not mated and hence lay unfertilized haploid eggs that develop into drones (arrhenotoky; Ruttner and Hesse 1981). This is different in A. m. capensis workers, which produce diploid female offspring (thelytoky; de Villiers 1883; Onions 1914) via central fusion of meiotic products (automixis; Verma and Ruttner 1983). In addition, A. m. capensis pseudoqueens can rapidly activate their ovaries (usually within 8 days [Ruttner and Hesse 1981; Hepburn et al. 1991] and occasionally even in the presence of a queen [Pirk et al. 2002]) and can produce much higher quantities of queen-like pheromones than workers of other subspecies (Hemmling et al. 1979; Crewe and Velthuis 1980; Okosun et al. 2017). The combination of thelytokous reproduction, swift ovary activation, and queen-like pheromone secretions (thelytoky syndrome [Lattorff and Moritz 2013]) of A. m. capensis workers allows them to exploit an additional life history trajectory as social parasites. After entering a foreign colony, they are able to establish themselves as reproductive dominant pseudoqueens, to the detriment of the resident queen (Neumann and Moritz 2002). These social parasites are particularly successful and commonly observed in the NE of South Africa, where a single worker lineage (Baudry et al. 2004) infests colonies of the adjacent subspecies A. m. scutellata, causing substantial colony losses to beekeepers since decades (Allsopp 1992; Pirk et al. 2014). In addition, thelytokous worker reproduction enables queenless colonies to rapidly requeen themselves from diploid laying worker’s offspring (Holmes et al. 2010), increasing the chance of colony survival, facilitated by a genetically related individual. This ability has been suggested as reason for the evolution of thelytoky in A. m. capensis as the endemic region of this subspecies is characterized by suddenly changing (Tribe 1983) and very windy weather, which might make it difficult for queens to return to their colonies after mating flights (Moritz 1986). Given these selective advantages and fitness benefits, it has been difficult to explain why thelytoky did not spread further into other honey bee populations (Moritz 1986; Greeff 1996). This was particularly puzzling because the syndrome showed Mendelian segregation in experimental crosses, suggesting a single locus control (Ruttner 1988; Lattorff et al. 2005, 2007; Aumer et al. 2017). Previous studies using microsatellite quantitative trait locus mapping (Lattorff et al. 2007) and RNAi knockdown experiments (Jarosch et al. 2011) suggested that differential splicing of the transcription factor gemini is controlling thelytokous reproduction. However, subsequent studies (Chapman et al. 2015; Aumer et al. 2017) showed that even though gemini is a key regulating gene for worker reproduction in general, it was not the ultimate gene switch controlling the mode of parthenogenesis and thus the entire thelytoky syndrome. To determine the actual genetic locus controlling thelytoky, we used an A. m. capensis mapping population (Aumer et al. 2017), consisting of workers produced by a single queen that had naturally mated in the wild with a large number of drones. This allowed to study both, 1) the frequency of the parental male alleles for a population-wide analysis and 2) the segregation of the heterozygous (Aumer et al. 2017) maternal alleles in a classical genetic mapping study by comparing thelytokous and arrhenotokous workers, providing a multifaceted basis for the discovery and analysis of the functional loci involved in this iconic reproductive trait (Schaid et al. 2018).

Results

The Thelytoky-Associated Locus

To identify the thelytoky-associated locus, we analyzed 21 thelytokous and 21 arrhenotokous (from 13 individual patrilines each) robustly phenotyped A. m. capensis workers of the mapping population using a data set of 7,238,467 biallelic single nucleotide polymorphisms (SNPs) in a genome-wide association study. A single genomic region on chromosome 1 (Group1.23: 480,000–509,450 bp) that harbors two genes, mycosubtilin synthase subunit C (mycC, LOC409109, the ortholog of Drosophila melanogaster ebony) and the 3′ part of the uncharacterized gene LOC409096, showed highly elevated genetic differentiation between the two phenotypic groups (fig. 1 and supplementary fig. 1, Supplementary Material online). Conversely, the adjacent region (Group1.23: 509,450–570,000 bp) with the 5′ part of LOC409096 and the gene ecdysis-triggering hormone receptor (Ethr, LOC724495) almost completely lacked any genetic differentiation (supplementary fig. 1, Supplementary Material online) as well as recombination (fig. 2). Both thelytokous and arrhenotokous workers were nearly entirely homozygous within the stretch of 60,550 bp (median of 39 [26-73] heterozygous SNPs per individual, fig. 1 and supplementary figs. 1, Supplementary Material online). Thus, the gene Ethr is represented by a single functional allele in A. m. capensis (one polymorphic nonsynonymous SNP at the 3′ end caused marginal putative changes of the protein, supplementary fig. 3, Supplementary Material online and supplementary table 1, Supplementary Material online). In contrast, the gene mycC showed increased levels of heterozygosity in thelytokous and arrhenotokous workers (median of 216 [185-236] heterozygous SNPs in thelytokous workers and median of 53 [1-95] heterozygous SNPs in arrhenotokous workers within 9,576 bp gene sequence, fig. 1 and supplementary figs. 1, Supplementary Material online). With increasing distance to the almost entirely homozygous region around Ethr, heterozygosity levels within mycC decreased in thelytokous workers, but increased in arrhenotokous workers toward the genomic average, consistent with a reduction in linkage disequilibrium due to recombination (figs. 12). Accordingly, considerable allelic variance was found within the mapping population (17 different alleles were identified based on 27 polymorphic nonsynonymous SNPs, supplementary table 1, Supplementary Material online). The lack of allelic covariance with the worker reproductive phenotype suggests that neither Ethr nor mycC control the mode of parthenogenesis. However, the alleles of LOC409096 showed consistent differentiation patterns between the two phenotypic groups. The arrhenotokous workers were almost completely monomorphic for the entire gene sequence (median of 5 [1-14] heterozygous SNPs per individual within 14,929 bp, fig. 1 and supplementary fig. 2, Supplementary Material online) and homozygous for a single functional allele (coding sequence as well as associated 3′ and 5′ untranslated regions [UTRs], supplementary table 1, Supplementary Material online). The thelytokous workers were similarly monomorphic in the larger 5′ part of the gene (median of 8 [4-13] heterozygous SNPs per individual within 13,435 bp) but were highly heterozygous in the 857 bp section at the 3′ end (median of 15 [12-17] heterozygous SNPs per individual, fig. 1 and supplementary fig. 2, Supplementary Material online), including a small part of exon 7 and the complete exon 8. As a consequence, all thelytokous workers were heterozygous with two functional alleles: one identical to the allele found in all arrhenotokous workers, the other allele specific to heterozygous thelytokous workers (supplementary table 1, Supplementary Material online). Thus, the thelytoky-specific allele appears to be dominant rather than recessive, as previously inferred in several studies (Ruttner 1988; Lattorff et al. 2005, 2007; Aumer et al. 2017). An independent data set of workers from the socially parasitic lineage (thelytokous A. m. capensis) in NE South Africa (n = 4) confirmed the consistent pattern of heterozygosity at this locus (supplementary figs. 1, Supplementary Material online and supplementary table 1, Supplementary Material online). Due to the unambiguous association of a single allele with the mode of parthenogenesis, we suggest to term LOC409096 Thelytoky (Th), as the major regulator of thelytokous reproduction. Structural and conserved domain analyses of the Th amino acid sequence revealed that Th encodes a receptor protein with a transmembrane helix (AA430–480) and a signal peptide (AA1–21) at the extracellular N-terminus, indicating that it is linked to the secretory pathway (Blobel and Dobberstein 1975).
. 1.

The thelytoky locus in Apis mellifera capensis. (a) Genetic differentiation (FST) across the whole genome (chromosome 1–16) between the thelytokous (th) and arrhenotokous (ar) workers, shown as mean per 100 SNP sliding window (50 SNP steps, blue solid line: 99.99th percentile). (b) Observed heterozygosity (Hobs per individual) of arrhenotokous (red) and thelytokous (blue) workers (robustly phenotyped n = 13 per group) within the genomic region of mycC, Th, and Ethr, shown as mean per 100 SNP sliding window with 50 SNP steps (dashed lines: genome-wide means of both phenotypic groups, dotted vertical lines: borders of the heterozygous and homozygous regions). (c) Observed heterozygosity (Hobs per individual) of the heterozygous thelytokous (blue) workers (robustly phenotyped n = 13) within the genomic region of Th shown per SNP (dashed line: genome-wide mean). The heterozygous nonsynonymous substitution at 509,225 bp leading to the two distinct alleles (Th and ar) is displayed larger and shown in bicolor (red and blue). (d and e) Modeled tertiary protein structures for both Th-alleles: arrhenotoky allele (d) and thelytoky allele (e) in rainbow colors from the N-terminus (blue) to the C-terminus (red).

. 2.

The linkage disequilibrium within the genomic locus harboring mycC, Th, and Ethr (480,000–580,000 bp). The linkage disequilibrium was assessed using the squared Pearson coefficient of correlation (r2) measurement (color scheme: black indicates r2 = 1 [complete linkage disequilibrium/nonrandom association of two loci]; shades of gray indicate 0 < r2 < 1 [the darker, the bigger r2]; white indicates r2 = 0 [complete linkage equilibrium/random association of two loci]). The analysis was done with SNPs at 500-bp intervals. Each square represents the comparison of two SNPs. The location of the three genes is indicated by bars above the figure (mycC = green, Th = blue, and Ethr = red).

The Gene Expression of Th in Fat Bodies of Thelytokous and Arrhenotokous Workers during Sexual Maturation. Note.—The average expression value in the fat bodies (±SD; normalized counts) at the early (day 3/4) and the late (day 7/8) sampled time points after emergence of thelytokous A. m. capensis and arrhenotokous A. m. scutellata workers. Given are the log 2-fold change and adjusted P value for both pairwise comparisons within (day3/4 vs. day7/8) and between the phenotypic groups. The thelytoky locus in Apis mellifera capensis. (a) Genetic differentiation (FST) across the whole genome (chromosome 1–16) between the thelytokous (th) and arrhenotokous (ar) workers, shown as mean per 100 SNP sliding window (50 SNP steps, blue solid line: 99.99th percentile). (b) Observed heterozygosity (Hobs per individual) of arrhenotokous (red) and thelytokous (blue) workers (robustly phenotyped n = 13 per group) within the genomic region of mycC, Th, and Ethr, shown as mean per 100 SNP sliding window with 50 SNP steps (dashed lines: genome-wide means of both phenotypic groups, dotted vertical lines: borders of the heterozygous and homozygous regions). (c) Observed heterozygosity (Hobs per individual) of the heterozygous thelytokous (blue) workers (robustly phenotyped n = 13) within the genomic region of Th shown per SNP (dashed line: genome-wide mean). The heterozygous nonsynonymous substitution at 509,225 bp leading to the two distinct alleles (Th and ar) is displayed larger and shown in bicolor (red and blue). (d and e) Modeled tertiary protein structures for both Th-alleles: arrhenotoky allele (d) and thelytoky allele (e) in rainbow colors from the N-terminus (blue) to the C-terminus (red). The linkage disequilibrium within the genomic locus harboring mycC, Th, and Ethr (480,000–580,000 bp). The linkage disequilibrium was assessed using the squared Pearson coefficient of correlation (r2) measurement (color scheme: black indicates r2 = 1 [complete linkage disequilibrium/nonrandom association of two loci]; shades of gray indicate 0 < r2 < 1 [the darker, the bigger r2]; white indicates r2 = 0 [complete linkage equilibrium/random association of two loci]). The analysis was done with SNPs at 500-bp intervals. Each square represents the comparison of two SNPs. The location of the three genes is indicated by bars above the figure (mycC = green, Th = blue, and Ethr = red). The pattern of divergence and heterozygosity at the thelytoky locus is consistent with the absence of recombination between the two queen alleles within the mapping population (n = 70, first detected recombination event upstream of mycC at about 471,500 bp, first detected recombination event downstream of Ethr at about 593,500 bp). Since recombination suppression is often associated with structural variants such as inversions (Kirkpatrick 2010), we examined the short read sequencing data from the mapping population as well as additional Oxford Nanopore long read sequence data of two thelytokous pseudoqueens of the socially parasitic lineage for the presence of inversions and other structural variants. No evidence for an inversion was found in this region of the genome (supplementary figs. 4–10, Supplementary Material online). The strongly reduced recombination rate that is typical for thelytokous A. m. capensis workers (Baudry et al. 2004) as well as the close proximity of the thelytoky locus to the centromere, where recombination is often reduced (Baudry et al. 2004; Stapley et al. 2017), might contribute to the maintenance of the observed heterozygosity patterns at this locus.

A Single SNP Is Associated with Thelytokous Worker Reproduction in A. m. capensis

Within the mapping population, 63 polymorphic SNPs were identified in the coding sequence of Th, including three nonsynonymous SNPs, predicted to affect the amino acid sequence (supplementary table 1, Supplementary Material online). Two of these SNPs corresponded to subfamily-specific polymorphisms which were rarely observed in both phenotypic groups, leading to changes within the same amino acid group and thus, small structural changes at the protein level. Only a single nonsynonymous SNP within the 3′ part of Th (position 509,225 bp in Group1.23) was consistently associated with the mode of parthenogenesis. This SNP was homozygous in all arrhenotokous workers (genotype G/G), but heterozygous in all thelytokous workers (G/A, supplementary table 1, Supplementary Material online). The independent samples of thelytokous A. m. capensis social parasites (n = 4) carried the identical heterozygous SNP genotype (genotype G/A) at this position (supplementary table 1, Supplementary Material online). This substitution causes a change from the polar amino acid threonine to the nonpolar amino acid isoleucine in the protein sequence (p.Thr400Ile), leading to substantial structural modifications and likely functional consequences. Various protein prediction models revealed profound differences between the tertiary structures of the arrhenotoky (fig. 1) and the thelytoky allele (fig. 1), including changes of the positions and the number of the predicted protein binding sites.

The Thelytoky Locus Is Under Strong Selection in A. m. capensis

While the genomic regions flanking the Th–Ethr linkage group (60,000 kb up- and downstream) did not deviate from the genome-wide average for nucleotide diversity or heterozygosity, the linkage group of the 5′ part of Th and the neighboring gene Ethr was characterized by extremely low levels of nucleotide diversity and heterozygosity over a region of approximately 60,550 bp (fig. 1 and supplementary fig. 1, Supplementary Material online). All individuals of the mapping population were mostly homozygous in this region, with the majority of SNPs (77.33%) being the derived allele (supplementary fig. 2, Supplementary Material online). These patterns were mirrored in the thelytokous socially parasitic A. m. capensis lineage of NE South Africa (n = 4). Such a drastic reduction of genetic differentiation together with the excessive linkage disequilibrium (fig. 2) typically represents exceptionally strong selection (Jackson et al. 2015). Indeed, a McDonald–Kreitman test confirmed that Ethr is under positive selection, consistent with a strong selective sweep with sharply reduced genetic diversity, characterized by an overabundance of fixed synonymous and nonsynonymous substitutions (supplementary table 2, Supplementary Material online). In contrast, the gene Th shows signatures of negative selection overall, but within thelytokous workers, the 3′ part of Th is under balancing selection with an excess of polymorphic sites (14 of 15 synonymous and nonsynonymous SNPs were polymorphic within 6% of the gene [626 bp]) and few nonsynonymous substitutions (2 within 6% of the gene). This and the observed elevated levels of heterozygosity (figs. 1 and 2 and supplementary fig. 2, Supplementary Material online) are indications for a locus with heterozygote advantage (overdominance). The adjacent gene mycC showed signatures of positive selection in arrhenotokous workers, but negative selection in thelytokous workers (supplementary table 2, Supplementary Material online) due to close linkage (and therefore increased heterozygosity) with the Th allele.

Population Genetics of Th in the Natural A. m. capensis Population

All 70 analyzed workers of the mapping population were offspring of a single heterozygous queen (Aumer et al. 2017) (Th/Th) that had mated with a large number of drones in the wild. Thus, the parental male genomes reflect a random sample of the natural A. m. capensis population from the Cape of Good Hope Nature Reserve, allowing us to estimate population-wide allele frequencies at Th. In total, we inferred the genotypes of 36 parental drones (based on the results of Aumer et al. [2017] and 100 SNPs). While the arrhenotoky allele Th was carried by 35 drones, the thelytoky allele Th was only found in a single drone. Considering both queen alleles (Th), the overall allele frequencies were p = 0.947 and p = 0.053. Given that we never observed homozygous Th workers (neither among the workers from the mapping population nor in the social parasitic workers from NE South Africa) and taking into account the balancing selection on the 3′ end of Th, we concluded that homozygosity of the Th allele is likely associated with substantial fitness disadvantages, due to either the failure of developing into adults or infertility. The observed low frequency of Th in parental males did, however, not reflect lethal effects on haploid drones. Nine out of 19 drone offspring of the heterozygous mapping population’s queen (Th/Th) carried the Th and 10 the alternative Th allele. As this perfectly matches a 1:1 segregation, we excluded that the hemizygous Th allele has any lethal effects on drones. This leaves potential fitness-reducing effects to the female sex. Assuming the most extreme, a fitness of zero in Th homozygous females (w = 0) and heterozygous Th having a higher relative fitness than homozygous Th females, we estimated a fitness difference of 5.6% between the two genotypes (w = 1, w = 0.944), considering the equilibrium frequency p = 0.947. Therefore, even in the most extreme case, we predict no large fitness differences between Th and Th queens that would be detected at the colony level.

Comparison to Other Subspecies

To determine whether other honey bee subspecies show signs of selection in the Th–Ethr region similar to A. m. capensis, we screened 53 previously published A. mellifera genomes (24 A. m. scutellata workers from Kenya [Cridland et al. 2017; Wallberg et al. 2017], 10 A. m. scutellata workers from South Africa, 9 A. m. carnica workers from Europe, and 10 A. m. yemenitica workers from the Arabic Peninsula [Harpur et al. 2014]). We found that in all subspecies, the levels of nucleotide diversity and heterozygosity at the Th–Ethr locus did not deviate from the genome-wide averages (supplementary fig. 1, Supplementary Material online) and no signs for extreme selection were detected. Furthermore, there was substantial variability within the Th–Ethr locus among the analyzed workers. A single South African A. m. scutellata worker was homozygous for the functional alleles of Th and Ethr found in arrhenotokous A. m. capensis workers within this locus. Another A. m. scutellata worker carried the A. m. capensis Ethr allele only. Sharing of alleles is not unexpected, as both subspecies form a stable hybrid zone, in which hybrid colonies can be observed (Hepburn and Crewe 1991). Such shared alleles also plausibly explain the successful crossing experiments that have been carried out between both subspecies in the past. Indeed, the segregation patterns observed in 28 previously published (back-) crosses between A. m. capensis and A. m. scutellata (Jordan et al. 2008; Oldroyd et al. 2014; Chapman et al. 2015) are in full agreement with the inferred balanced detrimental allele system at Th (supplementary table 3, Supplementary Material online).

Transcript Abundance Dynamics of Th Differ between Thelytokous and Arrhenotokous Workers

To test whether transcription patterns of mycC, Th, and Ethr show an association with the mode of parthenogenesis, we utilized previously established fat body transcriptome data (Aumer et al. 2018) and compared the gene expression between thelytokous A. m. capensis pseudoqueens (social parasitic lineage in NE South Africa) and arrhenotokous A. m. scutellata workers (NE South Africa), both sampled at four consecutive time points (3, 4, 7, and 8 days after emergence). In honey bees, the fat body is the key tissue involved in regulating ovary maturation (Amdam et al. 2011). After 8 days, all experimental A. m. capensis workers had fully activated ovaries, but the experimental A. m. scutellata workers were characterized by low levels of ovary development at all sampling time points (Aumer et al. 2018). Analyzing the gene expression over time, we found upregulation of Th by about three orders of magnitude after 8 days in the thelytokous A. m. capensis workers, while it remained at constantly low levels in the arrhenotokous A. m. scutellata workers (table 1). In addition, for both “early” (days 3/4) and “late” (days 7/8) time points, Th showed significantly higher expression in thelytokous (with both alleles Th and Th present) compared with arrhenotokous workers (table 1). Therefore, Th has highly thelytoky-specific transcript abundance dynamics. In addition to the reported coding sequence differences, the thelytoky-specific substitutions found in the 3′UTR of Th might have further regulatory influence (supplementary table 1, Supplementary Material online). The two other genes of interest in our comparison, mycC and Ethr, were not expressed at detectable levels in the fat bodies of the experimental workers at any sampled time point after emergence.
Table 1.

The Gene Expression of Th in Fat Bodies of Thelytokous and Arrhenotokous Workers during Sexual Maturation.

Day 3/4Day 7/8Log 2-Fold Change (within groups)Adjusted P Value (within groups)
Thelytokous workers (A. m. capensis)412.37 ± 162.86808.81 ± 201.351.90<0.01
Arrhenotokous workers (A. m. scutellata)43.56 ± 6.5031.41 ± 9.33−0.460.92
Log 2-fold change (between groups)3.244.68
Adjusted P value (between groups)<0.01<0.01

Note.—The average expression value in the fat bodies (±SD; normalized counts) at the early (day 3/4) and the late (day 7/8) sampled time points after emergence of thelytokous A. m. capensis and arrhenotokous A. m. scutellata workers. Given are the log 2-fold change and adjusted P value for both pairwise comparisons within (day3/4 vs. day7/8) and between the phenotypic groups.

Discussion

In the past, both a single recessive locus (Lattorff et al. 2005; Jarosch et al. 2011) and multiple loci (Chapman et al. 2015; Wallberg et al. 2016) had been suggested to control thelytokous worker reproduction in A. m. capensis. However, the assumption of a recessive locus on chromosome 13 (Lattorff et al. 2005) was based on a false positive quantitative trait locus signal, most likely the result of the relatively small sample size and the low number of analyzed microsatellite markers (546 simple sequence repeat markers genome-wide). The subsequently identified transcription factor gemini within the candidate locus on chromosome 13 is a key regulating gene for worker reproduction in general (Jarosch et al. 2011; Jarosch-Perlow et al. 2018) but does not control the mode of parthenogenesis per se (Chapman et al. 2015; Aumer et al. 2017). The two studies suggesting a multilocus control suffered from methodological flaws such as insufficient experimental control and incorrect microsatellite data analyses (Chapman et al. 2015, reanalysis of their data supported single locus control [Aumer et al. 2017]), or lacking phenotyping of the mode of parthenogenesis and conflation of population structure (Wallberg et al. 2016). Hence, although the genetic basis of the thelytoky syndrome had been intensely studied, it remained to be controversially discussed and ambiguous. The multifaceted basis of this study revealed that the genetic control of the thelytoky syndrome is regulated by a more complex genetic mechanism than previously assumed (Ruttner 1988; Lattorff et al. 2005; Jarosch et al. 2011; Chapman et al. 2015; Wallberg et al. 2016; Aumer et al. 2017). Indeed, thelytoky is controlled by a single locus (Ruttner 1988; Lattorff et al. 2005; Jarosch et al. 2011; Aumer et al. 2017), however the thelytoky allele (Th) is not a recessive, but a dominant allele which operates in combination with a complementing arrhenotoky allele (Th) that is under strong selection in A. m. capensis, to obtain workers that can reproduce thelytokously (Th/Th). It appears as though the arrhenotoky allele Th found in A. m. capensis provides specific complementary functions similar to a “rescue allele” for Th. In this case, Th would need to interact with the specific Th allele to result in thelytokous workers (Th/Th). Any other combination would distort the trait due to nonmatching alleles and would result in either nonfunctional (Th/Th) or fertile arrhenotokous phenotypes (Th/Th). This matches the mostly unsuccessful outcomes of attempted crossings of foreign honeybee subspecies with A. m. capensis, such as reported already in 1883 by Lord de Villiers, who never found fertile queens among the offspring of Italian queens that had been imported to South Africa (de Villiers 1883). Such a balanced detrimental allele system plausibly explains the stability of the hybrid zone between A. m. capensis and A. m. scutellata (Hepburn and Crewe 1991) and why thelytoky is not spreading into other honey bee populations, despite the various fitness benefits resulting from thelytokous worker reproduction. In addition, due to the observed tight linkage, the single functional variant of Ethr that was detected in A. m. capensis may also be involved in the expression of the full thelytoky syndrome (Lattorff and Moritz 2013). Ethr encodes the receptor for the ecdysis-triggering hormone that regulates ecdysis and juvenile hormone synthesis in insects (Roller et al. 2010; Areiza et al. 2014; Meiselman et al. 2017) and is central for the regulation of larval and ovarian development. Even though no expression of Ethr was detected in the fat bodies of adult A. m. capensis pseudoqueens, the A. m. capensis functional Ethr variant may well be involved in the development of the queen-like traits in A. m. capensis workers during larval growth, including the high number of ovarioles (Ruttner 1977), the presence of a spermatheca (Ruttner 1988) and the production of queen-like pheromones (Hemmling et al. 1979; Crewe and Velthuis 1980; Okosun et al. 2017). Similarly to Ethr, genotypes of the gene mycC in tight linkage with the Th allele might be involved in the expression of the full thelytoky syndrome, as it is the orthologous gene of ebony of Drosophila melanogaster. Ebony is known to interact with dopamine signaling in insects (Borycz et al. 2002) and there are correlations between brain dopamine levels and ovary development (Harris and Woodring 1995), as well as between dopamine receptor expression in antennae and attraction to queen pheromones (Vergoz et al. 2009) in honey bee workers. Hence, in A. m. capensis, mycC might play a role in the behavioral changes from a social honeybee worker to a selfish social parasite (Moritz et al. 2002; Neumann and Hepburn 2002). Furthermore, ebony is known to be involved in pigmentation (Wittkopp and Beldade 2009) and thus, its ortholog mycC might drive the generally dark color of A. m. capensis workers (Ruttner 1977). On a broader level, the identified genetic architecture of thelytoky in honey bees may serve as a model for other eusocial species with thelytokous reproduction, in particular for novel ant model systems, such as Platythyrea punctata and the clonal raider ant Ooceraea biroi. In conclusion, we show that thelytoky in A. m. capensis is controlled by a single heterozygous dominant locus located in a region under extreme selection. A single nonsynonymous SNP was inferred to be sufficient to lead to the striking phenotypic change from a social honey bee worker into a thelytokous social parasite. Thus, this study pertains to the rare examples of complex genetic structure influenced by strong purifying and balancing selection, and how a single substitution affects major phenotypic changes (Wang et al. 2015; Ito et al. 2018), distorting sociality in the colony.

Materials and Methods

Am. capensis Mapping Population

Freshly emerged worker offspring of a single multiply mated A. m. capensis queen (colony at the Cape of Good Hope section of the Table Mountain National Park (34°14′45.0′′S, 18°24′15.0′′E)) provided a mapping population (see Aumer et al. 2017). A single individually labeled A. m. capensis worker (numbered tag on the thorax and paint marks on the abdomen) was introduced into a queenless mini-colony comprising about 1,000 freshly emerged queenless A. m. scutellata host workers, kept in Apidea mating nucs provided with sugar candy as well as honey-, pollen- and empty comb in bee-proof tents to prevent contamination or parasitism by other bees. These colonies (n = 74) were inspected at 2-day intervals to add more host workers, sugar candy or pollen as required. In such queenless host colonies, the A. m. capensis worker develops into a laying pseudoqueen within a few days (Johannsmeier 1983; Koeniger and Würkner 1992; Martin et al. 2002; Neumann and Hepburn 2002). Once larvae were emerging, but no later than 14 days after initiating the experiment, the A. m. capensis workers and all brood (eggs and/or larvae) were collected and stored at −20 °C until phenolchloroform DNA extractions (Kirby 1957; Hunt and Page 1994). To confirm maternity and to determine the ploidy of the offspring (haploid = male, diploid = female), all A. m. capensis workers and their brood were genotyped at seven unlinked polymorphic microsatellite markers (A107, A079, A113, A014, A028, A088 and A035; Solignac et al. 2003), using a standard polymerase chain reaction (PCR) protocol (Solignac et al. 2003) and capillary sequencing (MegaBace 1000, GE Health Care, MegaBACE Fragment Profiler 1.3). Individuals with only one of the maternal alleles at all genotyped loci were classified as haploid male offspring. Individuals heterozygous with both maternal alleles at one or more loci were identified as diploid female offspring. Only individuals that produced reproducible genotypes for at least five of the seven marker loci (worker and all of its offspring) were included in the phenotype scoring, resulting in 21 thelytokous and 21 arrhenotokous workers that passed the genotyping threshold.

Whole Genome Sequencing and Variant Calling

The genomes of 71 A. m. capensis workers of the mapping population (42 with and 29 without confirmed phenotype) were sequenced on an Illumina HiSeq4000 Sequencing System (150 bp PE, TruSeq Nano DNA library preparation from 20 to 100 ng DNA, 350 bp insert size) to a coverage of ≥20× (raw reads are deposited in NCBI SRA, accession: PRJNA507348). We first performed quality assessment (fastqc v0.11.7 [bioinformatics.babraham.ac.uk/projects/fastqc]) of raw reads, followed by the removal of low quality reads and adapter contamination using skewer (Jiang et al. 2014) (v0.2.2, minimum window base quality of 20, minimum end-quality of 15, minimum length of 100, filter against degenerated N containing reads). The resultant reads were aligned to the A. mellifera reference genome (Amel_4.5 scaffold assembly) (Elsik et al. 2014) (GCA_000002195.1, scaffolds GL630009–GL635652, we used this assembly instead of the placed scaffolds to avoid known scaffold orientation problems) with BWA-MEM (Li and Durbin 2010) (v0.7.17-r1188). Alignments were processed and filtered (mapping quality ≥45, alignment length ≥80) in parallel using sambamba (Tarasov et al. 2015) (v0.6.7) and samtools (Li et al. 2009) (v1.6). Optical and PCR duplicates were removed with sambamba (Tarasov et al. 2015) markdup (v0.6.7) and the clumpify.sh script (BBMap [Bushnell 2014] package v37.86), respectively. Alignments were assessed using Qualimap2 (Okonechnikov et al. 2016) (v2.2.1) and any individual with a coverage of <20× was excluded from further analyses (n = 1). Variants were identified using FreeBayes (Garrison and Marth 2012) (v1.1.0-54-g49413aa, –min-alternate-fraction 0.25 –min-alternate-total 2 –min-coverage 2). Variant calls from repetitive regions (tandem repeats finder [Benson 1999] v4.09) unreliable for read alignment were removed. The remaining variants were then decomposed and quality filtered (QUAL ≥ 20) and only biallelic SNPs that were present in at least 68 out of 70 individuals (N = 7,238,467) retained (vcflib [github.com/vcflib/vcflib, accessed January 23, 2018], vt [Tan et al. 2015] and vcftools [Danecek et al. 2011] [v0.1.15]). To determine the effect of nucleotide substitutions, we used SNPeff (Cingolani et al. 2012) (v4.3) to annotate all biallelic SNPs detected above. For this, we employed existing NCBI RefSeq annotations of the A. mellifera chromosome assembly (Amel_4.5, GCA_000002195.4) and lifted them over on our reference assembly (Amel_4.5, assembly GCA_000002195.1, scaffolds GL630009–GL635652, the most recent RefSeq annotations [GCA_000002195.4] are not available specifically for this scaffold assembly) using the software flo (github.com/wurmlab/flo, commit 8b7372d, accessed January 23, 2018).

Identification of the Thelytoky-Controlling Locus

We measured the genetic differentiation (FST) and absolute pairwise nucleotide differences (D) between arrhenotokous and thelytokous workers to identify alleles cosegregating with the two phenotypes. To avoid pseudoreplication, a subset of 13 arrhenotokous and 13 thelytokous workers, representing 13 unique patrilines within each phenotypic group, was used to determine nucleotide diversity (π) as well as the observed heterozygosity for both thelytokous and arrhenotokous workers. We estimated these parameters across the genome in overlapping windows of 100 SNPs (50 SNP steps, maximal window size 20 kb) using VCFtools (Danecek et al. 2011) (v0.1.15) and published scripts (Martin et al. 2016) (github.com/simonhmartin/genomics_general). Using all individuals of the mapping population mirrored these results (data not shown). The degree of linkage disequilibrium within the detected candidate region (scaffold Group1.23: 480,000–580,000 bp) was analyzed with HaploView (Barrett et al. 2005) (v4.2) using SNPs at 500-bp intervals. The minimum genotype threshold was set to 75% and markers with rare subfamily-specific alleles were excluded from the analysis to avoid false negative signals as the squared Pearson coefficient of correlation (r2) measurement was used. Based on the SNP genotypes, for each coding sequence within the detected candidate region and the neighboring regions up- and downstream of the candidate region, the queen alleles were inferred for each gene following Mendelian inference (Estoup et al. 1995). The genotypes of all individuals of the mapping population (n = 70) were manually assessed to detect recombination events between the queen alleles. To determine whether the detected region of divergence is part of an inversion, we analyzed the short read mapping data (above) using lumpy-SV (Layer et al. 2014) (v0.2.13), delly (Rausch et al. 2012) (v0.7.7), and samplot (https://github.com/ryanlayer/samplot). In addition, we sequenced genomic DNA of two thelytokous females from the parasitic lineage on a single Oxford Nanopore flowcell each (LSK108 1D ligation library preparation, R9.4.1 flowcell, raw reads are deposited in NCBI SRA, accession: PRJNA507349). Basecalling and adapter trimming was done with Albacore (OxfordNanoporeTechnologies) (v2.3.3) and porechop (github.com/rrwick/Porechop) (v0.2.3). One individual underperformed (∼5× genomic coverage) and was thus excluded from further analyses. The other individual yielded 2,481,636 reads and 5,836,823,564 bases (∼25× genomic coverage) of long reads in the “pass” quality category (mean read length = 2,352 bp, longest read = 104,006 bp, and read N50 = 4,532 bp). The trimmed reads were aligned to the Amel_4.5 scaffold assembly, and separately also to the recently available Amel_HAv3.1 assembly (GenBank GCA_03254395.1, Wallberg et al. 2018), with ngmlr (Sedlazeck et al. 2018) (v0.2.7) and structural variants were called with sniffles (Sedlazeck et al. 2018) (v1.0.9) with a threshold of minimum five reads supporting a variant. Further, within the candidate region and its neighboring regions (400,000–460,000 and 600,000–660,000 bp) as well as within candidate genes, the number and proportion of heterozygous, homozygous, and fixed SNPs were evaluated. For candidate genes, the direction of selection was assessed using the McDonald–Kreitman test (McDonald and Kreitman 1991). Coding SNPs causing nonsynonymous substitutions as well as SNPs in the 5′UTRs and 3′UTRs of candidate genes were analyzed in more detail by assessing the individual genotypes to determine whether they are associated with the mode of parthenogenesis or fixed in the population. Based on these analyses, the haplotype sequences of the thelytoky and the arrhenotoky allele of Th as well as of Ethr were inferred. In addition to our mapping population, we analyzed the genomes of four A. m. capensis individuals, offspring of social parasitic workers (thelytokous socially parasitic lineage of A. m. capensis in NE South Africa), collected from a parasitized A. m. scutellata colony in Pretoria (University of Pretoria, experimental farm) (sequencing methods and sequence analyses as outlined above, raw reads are deposited in NCBI SRA, accession: PRJNA507349). We also included published A. mellifera genome sequences into our analyses: 9 A. m. carnica (NCBI SRA accession: PRJNA216922; Harpur et al. 2014), 34 A. m. scutellata (NCBI SRA accessions: PRJNA294105, PRJNA237819; Harpur et al. 2014; Cridland et al. 2017; Wallberg et al. 2017), and 10 A. m. yemenitica workers (NCBI SRA accession: PRJNA294105; Harpur et al. 2014). For variant calling of the additional samples, the filtered biallelic variants detected in our mapping population were used as defined input, thus we only genotyped the additional individuals at polymorphic loci of the mapping population.

Protein Structure Analyses

The potential protein structure and function of the putative thelytoky gene (Th) were analyzed using the amino acid sequence (XP_397545.2) as input for Phyre2 (Kelley et al. 2015) (v2.0), PredictProtein (Yachdav et al. 2014), CCTOP (Dobson et al. 2015), and interproscan (Jones et al. 2014) (v68.0). Potential protein structures and functions were only reported if they were reliably reproducible with at least two programs. After the thelytoky and the arrhenotoky allele of Th were determined (based on variants that were present in all individuals of each phenotypic group), the corresponding amino acid sequences were translated from the distinct haplotype sequences and were analyzed with Phyre2 (Kelley et al. 2015) (v2.0) to model the tertiary structures and PredictProtein (Yachdav et al. 2014) to assess the number and the position of potential polynucleotide binding sites of each haplotype sequence separately, thus detecting potential changes between haplotypes. Similarly, the amino acid sequence of Ethr (XP_006570145.1) and the amino acid sequences derived from distinct haplotype sequences were analyzed.

Population Genetics

The previously identified patrilines of the mapping population, based on microsatellite data (Aumer et al. 2017), were reanalyzed based on 100 SNPs (spanning 2.995 bp: Group1.23: 506,639–509,446 bp) including and next to the thelytoky-associated variant (at 509,225 bp), following Mendelian inference (Estoup et al. 1995). The allele frequencies of the thelytoky (p) and the arrhenotoky allele (p) in the parental generation of the mapping population were determined. Based on these calculations, the relative fitness w and the selection coefficients s of the different genotypes (Th, Th, Th were assessed based on the following assumptions: heterozygous workers (Th) reproduce thelytokously and can develop into pseudoqueens (Neumann and Moritz 2002), resulting in a maximum relative fitness (w = 1, s = 0). Since individuals that are homozygous for the thelytoky allele (Th) were not detected, we could not exclude that Th homozygous individuals are either lethal, infertile or experience significant fitness disadvantages. We therefore assumed a balanced detrimental allele system, where w = 0 and s = 1, resulting in the stable equilibrium frequency of p = s/(s + s).

Allele Frequencies in Adult Drones

Because Th homozygous individuals were not detected in our data set, we tested whether both Th and Th are present in the drone offspring of the heterozygous mapping population’s queen (Th). Nineteen drones were collected in the colony and stored in ethanol at −20 °C. One hind-leg per drone was used for DNA extraction with 100 μl Chelex (Walsh et al. 1991) solution (5%) and 0.1 mg proteinase K, following the standard Chelex thermocycler protocol (Walsh et al. 1991). One microliter of total DNA per drone was subsequently used for PCR amplification of a 383 bp DNA fragment (508,685–509,068 bp) downstream of Th (one reaction contained 0.2 μM of each primer [primer 1: GGTTCTCTGTCGCTGCCTAT, primer 2: CCATGGTCAGTTCTTTGTTTCGT], 0.2 mM dNTPs, 0.25 units peqGOLD Taq-DNA-Polymerase and 10× reaction buffer S [both from Peqlab] in a total volume of 10 μl) following a standard PCR protocol (5 min at 94 °C, 38 cycles of 30 s at 94 °C, 30 s at 60 °C, 60 s at 72 °C, 15 min at 72 °C). PCR products were purified (QIAquick PCR Purification Kit, Qiagen) and Sanger sequenced (Eurofins Genomics, Germany) in one direction. Sequences were analyzed in BioEdit (Hall 1999) (v7.0.4) regarding a 3-bp deletion (508,880–50,882 bp) for which the presence (associated with Th) or the absence (associated with Th) was determined.

Thelytoky-Associated Transcript Abundance Dynamic Analysis

To assess transcript abundance dynamics of Th, mycC, and Ethr in thelytokous and arrhenotokous workers, we used previously published fat body RNAseq data (NCBI SRA accession: SRP135683; Aumer et al. 2018) from A. m. capensis workers of the socially parasitic lineage (thelytokous, obtained from infested A. m. scutellata colonies in NE South Africa) and A. m. scutellata workers (arrhenotokous, obtained from queenright A. m. scutellata colonies in NE South Africa). To follow the temporal transcriptomic changes during the first days after emergence, at least three replicate samples for both subspecies were taken at 3, 4, 7, and 8 days after starting the experimental cages (per time point six replicates for A. m. capensis, three replicates for A. m. scutellata). Usually, about 7–8 days after emergence, A. m. capensis workers have fully activated ovaries (Ruttner and Hesse 1981; Hepburn and Crewe 1991; Martin et al. 2002). For each time point and subspecies, three replicates were combined for full transcriptome sequencing (100 bp PE) on an Illumina HiSeq4000 Sequencing System (BGI, Hong Kong). Adapters were trimmed using Trimmomatic (Bolger et al. 2014) (v0.36) and all libraries (65,900,726 ± 6,190,602 [SD] reads per library [Aumer et al. 2018]) were aligned to the honeybee reference scaffold assembly Amel_4.5 (Elsik et al. 2014) (assembly GCA_000002195.1, scaffolds GL630009–GL635652) using HISAT2 (Kim et al. 2015) (v2.1.0) with default settings, followed by creation of a counts matrix including all samples and genes using HTSeq (Anders et al. 2015) (v0.7.1). To identify the differential expression levels of Th, Ethr, and mycC between thelytokous A. m. capensis and arrhenotokous A. m. scutellata workers, DeSeq2 (Love et al. 2014; Anders et al. 2015) (v1.20.0) was used on “early” replicated samples (days 3 and 4) and “late” replicated samples (days 7 and 8). Temporal transcript abundance dynamics of Th were determined using Next maSigPro (Nueda et al. 2014) (v1.52.0), which identifies genes that are significantly differentially expressed over time and between experimental groups. To assess whether both the thelytoky (Th) and the arrhenotoky allele (Th) of Th were present in the transcripts, we assessed genomic RNAseq alignments manually in IGV (Thorvaldsdóttir et al. 2013; Robinson et al. 2017). Here, the gene expression reanalysis was restricted to the candidate loci (Th, Ethr, and mycC); a complete transcriptome analysis can be found in Aumer et al. (2018).

Analyses of Previously Published Back-Crosses

To obtain a further additional and independent verification of our model of the genetic thelytoky control, we tested whether the balanced detrimental allele system is in agreement with the observed segregation patterns of previously published data sets on (back-) crosses between honey bee lineages of thelytokous and arrhenotokous laying workers (Lattorff et al. 2005; Jordan et al. 2008; Oldroyd et al. 2014; Chapman et al. 2015), following Mendelian inference (Estoup et al. 1995).

Supplementary Material

Supplementary data are available at Molecular Biology and Evolution online. Click here for additional data file.
  61 in total

1.  Haploview: analysis and visualization of LD and haplotype maps.

Authors:  J C Barrett; B Fry; J Maller; M J Daly
Journal:  Bioinformatics       Date:  2004-08-05       Impact factor: 6.937

2.  Chelex 100 as a medium for simple extraction of DNA for PCR-based typing from forensic material.

Authors:  P S Walsh; D A Metzger; R Higuchi
Journal:  Biotechniques       Date:  1991-04       Impact factor: 1.993

3.  Parasitic Cape honeybee workers, Apis mellifera capensis, evade policing.

Authors:  Stephen J Martin; Madeleine Beekman; Theresa C Wossler; Francis L W Ratnieks
Journal:  Nature       Date:  2002-01-10       Impact factor: 49.962

4.  Tandem repeats finder: a program to analyze DNA sequences.

Authors:  G Benson
Journal:  Nucleic Acids Res       Date:  1999-01-15       Impact factor: 16.971

5.  Unified representation of genetic variants.

Authors:  Adrian Tan; Gonçalo R Abecasis; Hyun Min Kang
Journal:  Bioinformatics       Date:  2015-02-19       Impact factor: 6.937

6.  Evidence That the Origin of Naked Kernels During Maize Domestication Was Caused by a Single Amino Acid Substitution in tga1.

Authors:  Huai Wang; Anthony J Studer; Qiong Zhao; Robert Meeley; John F Doebley
Journal:  Genetics       Date:  2015-05-04       Impact factor: 4.562

7.  Peripheral modulation of worker bee responses to queen mandibular pheromone.

Authors:  Vanina Vergoz; H James McQuillan; Lisa H Geddes; Kiri Pullar; Brad J Nicholson; Michael G Paulin; Alison R Mercer
Journal:  Proc Natl Acad Sci U S A       Date:  2009-11-23       Impact factor: 11.205

8.  HTSeq--a Python framework to work with high-throughput sequencing data.

Authors:  Simon Anders; Paul Theodor Pyl; Wolfgang Huber
Journal:  Bioinformatics       Date:  2014-09-25       Impact factor: 6.937

9.  DELLY: structural variant discovery by integrated paired-end and split-read analysis.

Authors:  Tobias Rausch; Thomas Zichner; Andreas Schlattl; Adrian M Stütz; Vladimir Benes; Jan O Korbel
Journal:  Bioinformatics       Date:  2012-09-15       Impact factor: 6.937

Review 10.  Variation in recombination frequency and distribution across eukaryotes: patterns and processes.

Authors:  Jessica Stapley; Philine G D Feulner; Susan E Johnston; Anna W Santure; Carole M Smadja
Journal:  Philos Trans R Soc Lond B Biol Sci       Date:  2017-12-19       Impact factor: 6.237

View more
  3 in total

1.  Increased response to sequential infections of honeybee, Apis mellifera scutellata, colonies by socially parasitic Cape honeybee, A. m. capensis, workers.

Authors:  Peter Neumann; Christian W W Pirk
Journal:  Sci Rep       Date:  2019-05-20       Impact factor: 4.379

2.  Attack of the dark clones the genetics of reproductive and color traits of South African honey bees (Apis mellifera spp.).

Authors:  Laura Patterson Rosa; Amin Eimanifar; Abigail G Kimes; Samantha A Brooks; James D Ellis
Journal:  PLoS One       Date:  2021-12-14       Impact factor: 3.240

3.  Genetic Variations in Metallothionein Genes and Susceptibility to Hypertensive Disorders of Pregnancy: A Case-Control Study.

Authors:  Shudan Wei; Xiangyuan Yu; Xiaolan Wen; Min Zhang; Qi Lang; Ping Zhong; Bo Huang
Journal:  Front Genet       Date:  2022-06-06       Impact factor: 4.772

  3 in total

北京卡尤迪生物科技股份有限公司 © 2022-2023.