
Adaptive evolution of genes involved in the regulation of germline stem cells in Drosophila melanogaster and D. simulans.

Heather A Flores, Vanessa L Bauer DuMont, Aalya Fatoo, Diana Hubbard, Mohammed Hijji, Daniel A Barbash, Charles F Aquadro.

Abstract

Population genetic and comparative analyses in diverse taxa have shown that numerous genes involved in reproduction are adaptively evolving. Two genes involved in germline stem cell regulation, bag of marbles (bam) and benign gonial cell neoplasm (bgcn), have been shown previously to experience recurrent, adaptive evolution in both Drosophila melanogaster and D. simulans. Here we report a population genetic survey on eight additional genes involved in germline stem cell regulation in D. melanogaster and D. simulans that reveals all eight of these genes reject a neutral model of evolution in at least one test and one species after correction for multiple testing using a false-discovery rate of 0.05. These genes play diverse roles in the regulation of germline stem cells, suggesting that positive selection in response to several evolutionary pressures may be acting to drive the adaptive evolution of these genes.
Copyright © 2015 Flores et al.

Keywords:  adaptive evolution; germline stem cells; oogenesis; positive selection; spermatogenesis

Year:  2015        PMID: 25670770      PMCID: PMC4390574          DOI: 10.1534/g3.114.015875

Source DB:  PubMed          Journal:  G3 (Bethesda)        ISSN: 2160-1836            Impact factor:   3.154


Reproduction and fertility are among the most important traits for organismal fitness. Many models and theoretical studies have proposed that germline and fertility-related genes will be targeted for selection, and empirical evidence has documented rapid evolution and in many cases positive selection on numerous genes known or proposed to be involved in male fertility (Tsaur ; Begun ; Swanson , 2004; Clark and Swanson 2005; Haerty ), female reproductive tract function (Lawniczak and Begun 2007; Prokupek ; Kelleher and Markow 2009), host defense against segregation distorters (Presgraves 2007; Phadnis and Orr 2009), and sperm-egg interactions (Swanson and Vacquier 1995; Swanson ; Aagaard ). Most of these genes are expressed at the latter stages of gametogenesis and often are associated with meiosis or interactions between gametes. However, Civetta and Bauer DuMont independently discovered that two genes expressed in the earliest stages of gametogenesis, specifically germline stem cell (GSC) regulation, also show evidence of adaptive evolution. One of these genes, (), is under intensely strong positive selection with an astonishing 59 nonsynonymous substitutions among 442 codons between two closely related fruit fly species, Drosophila melanogaster and D. simulans (Civetta ; Bauer DuMont ). A second gene, (), which acts together with as a key “switch” to initiate GSC differentiation, is also evolving under positive selection in these two species (Bauer DuMont ). These discoveries raise a fundamental question: what is the selective pressure(s) driving these adaptive changes at early gametogenesis loci? There have been several genome-wide, next-generation sequencing surveys of variation in D. melanogaster and D. simulans that have reported departures from an equilibrium neutral model in directions consistent with natural selection for GSC-related gene ontology categories or at/near several GSC genes (Begun ; Langley ; Pool ). 
It remains informative to examine specific genes, particularly using parallel assays on population data from both D. melanogaster and D. simulans. Here, we report high-quality Sanger resequencing from population samples of both species for eight genes involved in GSC regulation (cyclin A, , , P-element induced wimpy testis (aka ), , , , and ), test for evidence of selection using polymorphism-based methods and reanalyze longer-term sequence evolution at these genes using phylogenetic analysis by maximum likelihood (PAML). These eight genes include those whose products genetically and/or physically interact with and/or and are likely to have shared functions, and those that appear to have non-/-related roles in GSC regulation. Figure 1 illustrates the roles of these loci within the female germline, wherein the functions and interactions of these genes are more thoroughly understood. We note that several of these genes function somewhat differently in the male germline (Fuller and Spradling 2007; Gilboa and Lehmann 2004; Gonczy ; Insco ; Kawase ; Song ).
Figure 1

Schematic of the Drosophila ovarian germline stem cell (GSC) niche with genes analyzed. Adapted from Wong . The GSC (light blue cell) is present in a niche environment (green cells are somatic cap and terminal filament cells, yellow cells are escort stem cells) required to maintain its stem cell state. Bam is repressed in the GSC. Only when the GSC moves away from the niche is Bam expressed and this cell starts to differentiate (tan cell). Yb is involved in the maintenance of GSCs and regulating their division. Piwi acts cell nonautonomously to help in the repression of Bam in the GSC. Zpg is an adherens junction protein that functions in cell signaling. Nos and Pum act as translational repressors of genes that will promote differentiation. Mei-p26 acts in concert with the miRNA machinery (miRISC in the figure) to also repress transcripts (indicated by red squiggly lines), some of which are shared with Nos and Pum. Bgcn is required for Bam to cause GSCs to differentiate. Bam and Bgcn antagonize the Nos/Pum complex. Stwl represses Bam-independent differentiation pathways and thus maintains GSC self-renewal. The cystoblast (tan cell) will undergo four mitotic divisions. CycA participates in the regulation of these mitotic divisions but is not shown in this diagram.

GSCs produce the cells that will develop to form either eggs or sperm throughout a fly’s life. GSCs are maintained in a microenvironment called the stem cell niche that is located in the proximal end of the Drosophila ovary or the apical end of the testis (Figure 1). acts, together with , as a switch to allow for female GSC differentiation, and therefore its expression is repressed in the GSCs (McKearin and Ohlstein 1995; Lavoie ; Ohlstein ) by extrinsic signals from the stem cell niche (Song ). However, this signaling quickly dissipates and thus repression only occurs in cells that are in physical contact with the stem cell niche (Wong ; Xia ). 
To receive these extrinsic signals, GSCs remain physically attached to the niche through adherens junctions (Song ). The gap junction protein Zero population growth (Zpg) is present in the cytoplasmic membrane of both GSCs and niche cells and is required for the maintenance of GSCs through the sharing of small molecules and signals between the niche and GSC (Tazuke ; Gilboa ). Repression of expression in the GSC is also controlled by the genes female-sterile(1)Yb (also abbreviated as Yb) and P-element induced wimpy testis () (King ; Szakmary ). Intrinsic mechanisms within the GSC play an important role in its maintenance as well, at the levels of transcription and translation. The chromatin-associated protein Stonewall (Stwl) represses genes that promote differentiation (Maines ), whereas Mei-P26 antagonizes the miRNA pathway and represses transcripts that will promote differentiation in the cystoblast (Neumuller ; Li ). At the translational level, Nanos (Nos) and Pumilio (Pum) bind to mRNAs that promote differentiation and inhibit their translation (Lin and Spradling 1997; Wang and Lin 2004). is also required to promote cystoblast differentiation (Tazuke ; Gilboa ). So depending on the context, and both inhibit and promote GSC differentiation. Finally, the cystoblast will undergo four mitotic divisions. is thought to regulate the number of mitotic divisions, and genetic interaction assays have suggested that interacts with the cell cycle factor, cyclin A (cycA) in this process (Lilly ). We report here that all eight genes show a statistically significant departure from an equilibrium neutral model for at least one polymorphism-based statistical test. Additionally, Yb and also reject neutrality by the McDonald-Kreitman (MK) test, suggesting an excess of nonsynonymous fixations between species consistent with positive selection. 
These eight genes together with and have various molecular functions and are expressed in a range of cell types including GSCs, cysts, and surrounding somatic cells suggesting that multiple evolutionary forces are acting throughout the early germline to drive the adaptive evolution of these genes.

Materials and Methods

Fly stocks

When possible, African populations of Drosophila melanogaster and D. simulans were used to minimize the effects of demography in our ability to detect selection (Begun and Aquadro 1993). In some cases, different populations were used for different genes because of the availability of stocks with extracted chromosomes, which allowed us to sequence homozygous lines in D. melanogaster for the X, second, or third chromosomes. For D. simulans populations, inbred lines were used. For , , , and a D. melanogaster population from Uganda, Africa (Pool and Aquadro 2006) and a D. simulans population from Lake Kariba, Zimbabwe, Africa (Pool and Aquadro 2006) were used. For Yb and , a D. melanogaster population collected from Sengua Wildlife Research Institute in Zimbabwe, Africa (Begun and Aquadro 1994) and a D. simulans population from Lake Kariba, Zimbabwe (Pool and Aquadro 2006) were used. For cyclin A and , a D. melanogaster population sample collected from Lake Kariba, Zimbabwe, Africa (Pool and Aquadro 2006) and an inbred D. simulans population sample from North Carolina (Aquadro ) were used.

Sequencing

Genomic DNA was extracted from approximately 20 adult flies using Puregene Core Kit A DNA isolation kits (QIAGEN). Polymerase chain reaction and sequencing primer sequences for each gene are listed in Supporting Information, Table S1. Sanger sequencing (both strands) was performed by the Cornell University Genomics Core DNA Sequencing Facility (http://cores.lifesciences.cornell.edu/brcinfo/?f=1) using ABI chemistry and 3730XL DNA Analyzers. Sequences were assembled and edited using Sequencher 4.9 (Gene Codes), aligned using MEGA 4 (Tamura ) with default settings, and checked manually to ensure that the reading frame was retained. Sequences have been deposited in GenBank under accession numbers JX647382-JX647689. For , a single 4.8-kb sequence that includes all exons was amplified. This large fragment was problematic for direct sequencing, so it was cloned into the pCR-BluntII-TOPO plasmid (Invitrogen). Two clones of each sample were sequenced to control for PCR errors. If there was ambiguity between the two clones, a third was sequenced and the majority nucleotide was used. The locus spans over 160 kb, so four separate products were sequenced that include most of the exons (Figure S1A). The locus was amplified in two separate products that included both exons (Figure S1B). The cycA locus was also amplified in two separate products that include two groups of exons in the 5′ and 3′ region of the gene (Figure S1C). For , only exons 3−6 were amplified. Our results based on this region are consistent with other reports that has not been subject to recurrent, positive selection (Anderson ).

Polymorphism analysis

DnaSP (Librado and Rozas 2009) was used to estimate basic summary statistics of variation within and between species. To detect signatures of recent selection from polymorphism data we applied two distinct tests: OmegaPlus (Pavlidis ), which focuses on the linkage disequilibrium signature of selective sweeps, and SweeD (Pavlidis ), which assesses the fit of the site frequency spectrum (SFS) to a particular neutral model (it is a faster extension of the widely used SweepFinder method; Nielsen ). Statistical significance of OmegaPlus (dependent on linkage disequilibrium) and SweeD (dependent on SFS) test results was determined using neutral simulations with or without demography. We considered a region to be a significant outlier if it fell within the 5% quantile of the simulated datasets. These simulations were done using the program msABC (Pavlidis ). We surveyed variation from an African population of D. melanogaster which is within this species’ presumed ancestral range. There is mounting evidence that even African populations of this species have experienced changes in effective population size over time (Glinka ; Haddrill ; Hutter ; Li and Stephan 2006; Duchen ; Singh ) and/or migration (Pool ). Because inferring demographic parameters is challenging, we simulated three different scenarios: standard neutral model with constant population size, standard neutral model with exponential growth as estimated by Hutter , or standard neutral model with a 3-phase (“3 epoch”) bottleneck as estimated by Duchen . We supplied msABC with uniform prior distributions for theta and all demographic parameters. The theta prior distribution for D. melanogaster was obtained from Pool and ranged between 0.006 and 0.009 per site. Figure S2 shows the basic model of the demographic scenarios we considered and the demographic priors used in the simulations. To date, there are no comparable estimates of demographic parameters available for D. simulans. 
Given that the ancestral range of both of these species is in Africa and they are sympatric, we used the D. melanogaster demographic parameters as an approximation for D. simulans. For D. simulans, we used the theta range we observed across the eight GSC loci in this study, which ranged between 0.003 and 0.04 per site. The MK test (McDonald and Kreitman 1991) was used to test for recurrent, historical positive selection by contrasting pooled polymorphism for D. melanogaster and D. simulans to fixed differences between species using D. yakuba as an outgroup. We used the program DoFE (http://www.sussex.ac.uk/lifesci/eyre-walkerlab/resources) from Eyre-Walker and Keightley (2009) to calculate the proportion of amino acid fixations predicted to be due to positive selection (α). This method uses the site frequency spectrum to jointly estimate the selective effects of new deleterious mutations and the number of adaptive substitution for a selected class of mutations while also incorporating a generalized model of effective population size. For our analysis, we used the site frequency spectrum of fourfold (neutral class) and 0-fold (selected class) codon positions for both D. melanogaster and D. simulans. The sample size for each locus in our analysis varied for each species. We randomly selected nine and six alleles in D. melanogaster and D. simulans, respectively. These values correspond to the smallest sample size in each species.
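The MK comparison described above can be illustrated with a minimal Python sketch. It runs a two-sided Fisher's exact test on the 2×2 table of synonymous and nonsynonymous polymorphism and divergence counts, and reports the neutrality index (NI) together with the simple estimate α = 1 − NI. The counts are hypothetical, the use of Fisher's exact test is an assumption (the text does not state which test statistic was applied), and this simple α is not the DoFE estimate used in this study.

```python
from math import comb

def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of observing x in the top-left cell
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum the probabilities of all tables no more likely than the observed one
    total = sum(p for p in map(prob, range(lo, hi + 1)) if p <= p_obs * (1 + 1e-9))
    return min(1.0, total)

def mk_test(syn_poly, syn_div, nonsyn_poly, nonsyn_div):
    """Return the P-value, neutrality index, and simple alpha for one gene.

    Raises ZeroDivisionError if syn_poly or nonsyn_div is zero, in which
    case NI is undefined.
    """
    p_value = fisher_exact_two_sided(syn_poly, syn_div, nonsyn_poly, nonsyn_div)
    ni = (nonsyn_poly / syn_poly) / (nonsyn_div / syn_div)
    return {"p_value": p_value, "neutrality_index": ni, "alpha": 1 - ni}

# Hypothetical counts, for illustration only (not taken from this study)
result = mk_test(syn_poly=40, syn_div=20, nonsyn_poly=10, nonsyn_div=30)
print(result)
```

An NI below 1 (α above 0) indicates an excess of nonsynonymous fixed differences relative to polymorphism, which is the direction expected under recurrent positive selection.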

Divergence analysis

The has previously reported tests of long-term recurrent positive selection using PAML (Yang 1997, 2007) for , , , cycA, and and found none departed from a neutral model. Three genes (, , and Yb), had not been included in this previous study due to their strict criteria that ruled out genes with alignment ambiguities. We generated new multiple-sequence alignments using PRANK alignment software (Löytynoja and Goldman 2005) from single sequences of D. melanogaster, D. simulans, D. sechellia, D. yakuba, D. erecta, and D. ananassae downloaded from FlyBase. We did not use more divergent species due to the problems of saturation of synonymous site divergence (The ). Yb from D. ananassae has a large number of indels relative to the other five species (and has a much larger coding sequence and an additional intron). Therefore, we analyzed these Yb alignments with PAML in two ways: 1) excluding any region with an indel, and 2) excluding any region with an indel as well as with one codon on either side (to reduce the chance calling of “false” substitutions associated with alignment problems). For Yb, we also used the recently published improved reference genome sequence of D. simulans from Hu ). The models M0 vs. M3, M7 vs. M8, and M8 vs. M8a were compared. Consistent with the analyses from the , each run was performed using three tree topologies: Tree 1, D. yakuba and D. erecta as sister species; Tree 2, D. yakuba as an outgroup and Tree 3, D. erecta as an outgroup. Each model comparison was run under three different initial ω values to assure that convergence was to a global and not local maximum.

Adjusting for multiple testing

We adjusted our criteria for statistical significance by estimating the appropriate P-value threshold assuming an experiment-wide Benjamini and Hochberg (1995) false-discovery rate (FDR) of 0.05 using the p.adjust function in the R Project (www.r-project.org). The P-values of SweeD and MK tests were combined for correction for each species separately as both tests use the frequency or counts of each polymorphism. OmegaPlus only uses patterns of linkage disequilibrium across sites, and thus those P-values were corrected separately (again for each species alone).
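For readers who want to see what the Benjamini and Hochberg adjustment does, the following is a minimal Python sketch of the step-up procedure. The analysis in this study used the p.adjust function in R; this re-implementation and the example P-values are illustrative assumptions only.

```python
def bh_adjust(p_values):
    """Benjamini-Hochberg FDR-adjusted P-values, returned in the input order."""
    m = len(p_values)
    # Visit P-values from largest to smallest
    order = sorted(range(m), key=lambda i: p_values[i], reverse=True)
    adjusted = [0.0] * m
    running_min = 1.0
    for position, i in enumerate(order):
        rank = m - position  # 1-based rank of this P-value in ascending order
        # Scale by m / rank, then enforce monotonicity with a running minimum
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

# Hypothetical P-values, for illustration only
raw = [0.01, 0.04, 0.03, 0.005]
adjusted = bh_adjust(raw)
significant = [p < 0.05 for p in adjusted]
print(adjusted, significant)
```

Pooling P-values from several tests before adjustment, as done here for SweeD and MK, simply means passing the combined list to the function once per species.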

Results

Polymorphism-based analyses

Gene function and sample size data from African populations of D. melanogaster and either African or North American D. simulans are reported in Table 1, and standard summary statistics for each gene in Table 2. We find that D. simulans levels of nucleotide variability are generally higher than those seen in D. melanogaster, consistent with previous results (Aquadro ).
Table 1

Genes surveyed and sample sizes

Gene | Function | Number of Alleles Sampled, D. melanogaster | Number of Alleles Sampled, D. simulans
cyc A [a] | Regulation of cyst mitotic divisions | |
  Segment 1 | | 9 | 10
  Segment 2 | | 9 | 10
Yb | GSC maintenance and cystoblast differentiation | 19 | 9
mei-P26 [a] | GSC maintenance | 19 | 10
nos [a] | GSC maintenance | 9 | 10
pum [a] | GSC maintenance | |
  Segment 1 | | 17 | 9
  Segment 2 | | 11 | 10
  Segment 3 | | 19 | 9
  Segment 4 | | 18 | 7
piwi | GSC maintenance | 10 | 6
stwl | Chromatin factor, GSC maintenance | |
  Segment 1 | | 18 | 8
  Segment 2 | | 15 | 9
zpg [a] | GSC adherens junction and cystoblast differentiation | 18 | 10

[a] Indicates that gene has a genetic and/or physical interaction reported with bam. For pumilio, four separate regions were amplified and analyzed, labeled as 1-4. For stonewall and cycA two separate regions were amplified, labeled as 1 and 2. GSC, germline stem cell.

Table 2

Nucleotide polymorphism estimates for GSC genes

Gene | Species | S | θ | πTot | πSyn | πNon
cycA 1 | D. melanogaster | 14 | 0.0051 | 0.0046 | 0.0025 | 0.0054
cycA 1 | D. simulans | 10 | 0.0035 | 0.0044 | 0.0277 | 0.0000
cycA 2 | D. melanogaster | 15 | 0.0085 | 0.0074 | 0.0157 | 0.0010
cycA 2 | D. simulans | 11 | 0.0056 | 0.0061 | 0.0164 | 0.0000
mei-P26 | D. melanogaster | 26 | 0.0062 | 0.0061 | 0.0181 | 0.0000
mei-P26 | D. simulans | 21 | 0.0055 | 0.0031 | 0.0110 | 0.0000
nos | D. melanogaster | 21 | 0.0044 | 0.0045 | 0.0090 | 0.0009
nos | D. simulans | 35 | 0.0097 | 0.0093 | 0.0150 | 0.0042
piwi | D. melanogaster | 103 | 0.0079 | 0.0074 | 0.0196 | 0.0024
piwi | D. simulans | 196 | 0.0222 | 0.0204 | 0.0368 | 0.0025
pumilio 1 | D. melanogaster | 26 | 0.0040 | 0.0046 | 0.0033 | 0.0012
pumilio 1 | D. simulans | 103 | 0.0202 | 0.0169 | 0.0142 | 0.0003
pumilio 2 | D. melanogaster | 10 | 0.0052 | 0.0040 | 0.0072 | 0.0000
pumilio 2 | D. simulans | 33 | 0.0142 | 0.0144 | 0.0346 | 0.0005
pumilio 3 | D. melanogaster | 10 | 0.0040 | 0.0046 | 0.0172 | 0.0020
pumilio 3 | D. simulans | 72 | 0.0400 | 0.0388 | 0.0685 | 0.0014
pumilio 4 | D. melanogaster | 13 | 0.0021 | 0.0020 | 0.0070 | 0.000
pumilio 4 | D. simulans | 50 | 0.0095 | 0.0089 | 0.0207 | 0.0011
stwl 1 | D. melanogaster | 21 | 0.0092 | 0.0058 | 0.0000 | 0.0001
stwl 1 | D. simulans | 17 | 0.0088 | 0.0064 | 0.0119 | 0.0000
stwl 2 | D. melanogaster | 49 | 0.0051 | 0.0048 | 0.0123 | 0.0025
stwl 2 | D. simulans | 43 | 0.0053 | 0.0050 | 0.0097 | 0.0033
Yb | D. melanogaster | 88 | 0.0079 | 0.0060 | 0.0129 | 0.0028
Yb | D. simulans | 111 | 0.0128 | 0.0128 | 0.0259 | 0.0085
zpg | D. melanogaster | 41 | 0.0095 | 0.0113 | 0.0413 | 0.0001
zpg | D. simulans | 60 | 0.0164 | 0.0148 | 0.0286 | 0.0015

Each amplified region of cycA, pumilio, and stwl was analyzed separately; see the section Materials and Methods and Figure S1 for locations of each amplicon. S, segregating sites; θ, nucleotide diversity; πTot, total diversity; πsyn, synonymous diversity; πnon, nonsynonymous diversity. GSC, germline stem cell.
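The summary statistics reported in Table 2 were computed with DnaSP. As a rough illustration of what S, θ, and π measure, the Python sketch below computes them per site from a small ungapped alignment. It assumes that θ is Watterson's estimator (the table legend does not name the estimator), and the four short sequences are hypothetical.

```python
from itertools import combinations

def summary_stats(seqs):
    """Return (S, Watterson's theta per site, pi per site) for an ungapped alignment."""
    n, length = len(seqs), len(seqs[0])
    # S: number of alignment columns with more than one nucleotide state
    s = sum(1 for column in zip(*seqs) if len(set(column)) > 1)
    # Watterson's estimator: S divided by the harmonic number a1, per site
    a1 = sum(1 / i for i in range(1, n))
    theta_w = s / a1 / length
    # pi: mean number of pairwise differences, per site
    pairs = list(combinations(seqs, 2))
    total_diffs = sum(sum(x != y for x, y in zip(p, q)) for p, q in pairs)
    pi = total_diffs / len(pairs) / length
    return s, theta_w, pi

# Hypothetical alignment of four alleles, for illustration only
alleles = ["AAAA", "AAAT", "AATT", "AAAA"]
s, theta_w, pi = summary_stats(alleles)
print(s, round(theta_w, 4), round(pi, 4))
```

Restricting the alignment columns to synonymous or nonsynonymous sites before calling the function would give the corresponding class-specific values, although site counting in real coding sequence is more involved than this sketch suggests.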

Analysis of the polymorphism site frequency data using SweeD reveals significant departures from neutrality at 15 of 16 gene/species comparisons after multiple-testing correction (Table 3). For this tabulation, we consider a gene to be showing a significant departure from neutrality if at least one of the gene regions analyzed shows a significant departure (after multiple test correction) for all three demographic scenarios (standard neutral, exponential growth, and 3-epoch bottleneck). Only in D. simulans fits a neutral model under all three demographic scenarios.
Table 3

Site frequency tests of departures from neutral models for eight GSC genes in D. melanogaster and D. simulans

Test Details | SweeD, D. melanogaster, Orig P-Value | SweeD, D. melanogaster, FDR adj P-Value | SweeD, D. simulans, Orig P-Value | SweeD, D. simulans, FDR adj P-Value | OmegaPlus, D. melanogaster, Orig P-Value | OmegaPlus, D. melanogaster, FDR adj P-Value | OmegaPlus, D. simulans, Orig P-Value | OmegaPlus, D. simulans, FDR adj P-Value
CycA1.SN | 0.0007 | 0.0023 | 0.0015 | 0.0032 | 0.367 | 0.4037 | 0.4756 | 0.4756
CycA1.Ex | 0.0007 | 0.0023 | 0.0009 | 0.0023 | 0.4323 | 0.4390 | 0.4049 | 0.4711
CycA1.3Ep | 0.0007 | 0.0023 | 0.0012 | 0.0027 | 0.439 | 0.4390 | 0.4094 | 0.4711
CycA2.SN | 0.6203 | 0.7030 | 0.9933 | 0.9933 | 0.3726 | 0.4037 | 0.2594 | 0.4214
CycA2.Ex | 0.3460 | 0.4334 | 0.7363 | 0.8024 | 0.2809 | 0.3894 | 0.2622 | 0.4214
CycA2.3Ep | 0.3467 | 0.4334 | 0.7863 | 0.8460 | 0.2866 | 0.3894 | 0.2595 | 0.4214
meiP26.SN | 0.0010 | 0.0023 | 0.0009 | 0.0023 | 0.1894 | 0.3894 | 0.0333 | 0.4214
meiP26.Ex | 0.0008 | 0.0023 | 0.0010 | 0.0023 | 0.2081 | 0.3894 | 0.0999 | 0.4214
meiP26.3Ep | 0.0008 | 0.0023 | 0.0006 | 0.0023 | 0.2108 | 0.3894 | 0.0648 | 0.4214
nano.SN | 0.0090 | 0.0150 | 0.0008 | 0.0023 | 0.4055 | 0.4274 | 0.2278 | 0.4214
nano.Ex | 0.0020 | 0.0040 | 0.0010 | 0.0023 | 0.2560 | 0.3894 | 0.2329 | 0.4214
nano.3Ep | 0.0050 | 0.0092 | 0.0009 | 0.0023 | 0.3403 | 0.4037 | 0.2204 | 0.4214
piwi.SN | 0.0009 | 0.0023 | 0.9680 | 0.9795 | 0.1267 | 0.3894 | 0.1899 | 0.4214
piwi.Ex | 0.0070 | 0.0124 | 0.9370 | 0.9713 | 0.1096 | 0.3894 | 0.3899 | 0.4711
piwi.3Ep | 0.0010 | 0.0023 | 0.9500 | 0.9729 | 0.1030 | 0.3894 | 0.3909 | 0.4711
pum1.SN | 0.0008 | 0.0023 | 0.4174 | 0.4997 | 0.2302 | 0.3894 | 0.2198 | 0.4214
pum1.Ex | 0.0009 | 0.0023 | 0.1640 | 0.2213 | 0.2537 | 0.3894 | 0.2126 | 0.4214
pum1.3Ep | 0.0009 | 0.0023 | 0.1736 | 0.2306 | 0.2532 | 0.3894 | 0.2293 | 0.4214
pum2.SN | 0.0008 | 0.0023 | 0.0919 | 0.1281 | 0.3631 | 0.4037 | 0.2649 | 0.4214
pum2.Ex | 0.0009 | 0.0023 | 0.0278 | 0.0407 | 0.3630 | 0.4037 | 0.2599 | 0.4214
pum2.3Ep | 0.0008 | 0.0023 | 0.0227 | 0.0339 | 0.3389 | 0.4037 | 0.2605 | 0.4214
pum3.SN | 0.2073 | 0.2711 | 0.0153 | 0.0245 | 0.1728 | 0.3894 | 0.1664 | 0.4214
pum3.Ex | 0.1633 | 0.2213 | 0.0010 | 0.0023 | 0.0612 | 0.3894 | 0.2701 | 0.4214
pum3.3Ep | 0.0800 | 0.1133 | 0.0019 | 0.0039 | 0.0800 | 0.3894 | 0.2131 | 0.4214
pum4.SN | 0.0186 | 0.0287 | 0.8249 | 0.8765 | 0.2995 | 0.3894 | 0.4296 | 0.4711
pum4.Ex | 0.0067 | 0.0121 | 0.7174 | 0.7919 | 0.2694 | 0.3894 | 0.3909 | 0.4711
pum4.3Ep | 0.0080 | 0.0139 | 0.6710 | 0.7505 | 0.2690 | 0.3894 | 0.3414 | 0.4711
stwlReg1.SN | 0.0159 | 0.0250 | 0.0219 | 0.0332 | 0.0885 | 0.3894 | 0.0444 | 0.4214
stwlReg1.Ex | 0.0009 | 0.0023 | 0.0021 | 0.0042 | 0.0630 | 0.3894 | 0.1062 | 0.4214
stwlReg1.3Ep | 0.0043 | 0.0081 | 0.0010 | 0.0023 | 0.0516 | 0.3894 | 0.0827 | 0.4214
stwlReg2.SN | 0.0010 | 0.0023 | 0.8780 | 0.9214 | 0.2987 | 0.3894 | 0.4099 | 0.4711
stwlReg2.Ex | 0.0006 | 0.0023 | 0.4943 | 0.5756 | 0.2851 | 0.3894 | 0.4637 | 0.4756
stwlReg2.3Ep | 0.0007 | 0.0023 | 0.4688 | 0.5534 | 0.2869 | 0.3894 | 0.4349 | 0.4711
yb.SN | 0.0009 | 0.0023 | 0.0014 | 0.0031 | 0.0009 | 0.0351 | 0.4469 | 0.4711
yb.Ex | 0.0006 | 0.0023 | 0.0040 | 0.0077 | 0.0059 | 0.0767 | 0.4212 | 0.4711
yb.3Ep | 0.0008 | 0.0023 | 0.0007 | 0.0023 | 0.0033 | 0.0644 | 0.3702 | 0.4711
zpg.SN | 0.0008 | 0.0023 | 0.0009 | 0.0023 | 0.2129 | 0.3894 | 0.2093 | 0.4214
zpg.Ex | 0.0090 | 0.0150 | 0.0006 | 0.0023 | 0.2991 | 0.3894 | 0.1461 | 0.4214
zpg.3Ep | 0.0010 | 0.0023 | 0.0009 | 0.0023 | 0.2807 | 0.3894 | 0.1692 | 0.4214

Simulations to establish P-values were from the standard neutral model (SN), exponential growth model (Ex), or a 3-epoch model (3Ep; large, small, large population size) as described in the section Materials and Methods. FDR-adjusted P-values were determined as described in text. Significant results (P < 0.05) are in bold. FDR, false-discovery rate.
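The simulation-based P-values in Table 3 rest on comparing an observed test statistic with its distribution under neutral simulations. A minimal Python sketch of that comparison follows. The simulated values here are random stand-ins for msABC output, the observed value is hypothetical, and the add-one correction in the numerator and denominator is a common convention that is assumed here rather than taken from the text.

```python
import random

def empirical_p_value(observed, simulated):
    """One-sided empirical P-value: share of simulations at least as extreme.

    Assumes larger statistics are more extreme, as for a sweep likelihood
    ratio or an LD-based omega statistic. The +1 terms keep the estimate
    away from exactly zero (an assumed convention).
    """
    exceed = sum(1 for value in simulated if value >= observed)
    return (exceed + 1) / (len(simulated) + 1)

# Stand-in for statistics from neutral simulations (hypothetical values)
rng = random.Random(42)
neutral_stats = [rng.expovariate(1.0) for _ in range(10000)]

observed_stat = 7.5  # hypothetical observed maximum statistic for one region
p = empirical_p_value(observed_stat, neutral_stats)
print(p, p < 0.05)
```

Running the same comparison against simulations from each demographic scenario (SN, Ex, 3Ep) would yield the three P-values listed per region.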

OmegaPlus rejected the standard neutral model only for Yb in D. melanogaster after multiple test correction at the 0.05 FDR level (Table 3). The generally short size of the regions analyzed may have limited the statistical power of the OmegaPlus method, which relies on a unique structure of linkage disequilibrium generated by recent selective sweeps.

Polymorphism and divergence-based tests

The McDonald-Kreitman (MK) test rejected neutrality for both Yb and after correction for multiple testing (Table 4). The method of Bauer DuMont suggests that these MK test rejections are not due to selection on synonymous sites for either gene. High dN/dS ratios between species (0.627 for Yb, and 0.502 for ) compared with the genome-wide average of 0.0125 (Larracuente ), together with normal levels of dS for both genes (0.132 and 0.119, respectively), suggest that the MK test rejections are due to excesses of fixed nonsynonymous differences between species consistent with positive selection.
Table 4

MK tests of departures from a neutral model for eight GSC genes using polymorphism within both D. melanogaster and D. simulans and fixed differences between species

Gene | Synon Poly | Synon Div | Nonsyn Poly | Nonsyn Div | P-Value | FDR adj P-Value
cyc A1714460.4140.4997
mei-P26244400NANA
nos11117230.0460.0663
piwi845426220.4160.4997
pum77511150.5060.5812
stwl4562481240.0150.0245
Yb8662801490.000010.0010
zpg5314640.2300.2962

FDR adjusted P-values were determined as described in text. Significant results after FDR (P < 0.05) are in bold. MK, McDonald-Kreitman; GSC, germline stem cell; FDR, false-discovery rate.

Using the DoFE program of Eyre-Walker and Keightley (2009), we estimated, in both D. melanogaster and D. simulans, the overall proportion of amino acid substitutions fixed due to positive selection (α), and the 95% credibility interval around this estimate (supplemental method presented in Eyre-Walker and Keightley 2009). This analysis uses the site frequency spectrum across the eight GSC loci to estimate the distribution of fitness effects acting on new deleterious mutations, while incorporating a general model of effective population size change. The distribution of fitness effects is then used to determine the proportion of amino acid fixations that are due to positive selection. For the eight loci in our study, we estimate α to be 0.814 (95% credibility interval: 0.698−0.896) and 0.790 (95% credibility interval: 0.681−0.881) for D. melanogaster and D. simulans, respectively. We also analyzed the X and autosomal loci separately. For D. melanogaster we observe an α of 0.934 (95% credibility interval: 0.852−0.979) for the X chromosome and 0.672 (95% credibility interval: 0.413−0.836) for the autosomes. For D. simulans we observe an α of 0.856 (95% credibility interval: 0.695−0.957) for the X chromosome and 0.743 (95% credibility interval: 0.579−0.876) for the autosomes. The autosomal 95% credibility interval estimated for α from our D. melanogaster data encompasses the α estimate obtained from sequence data from 419 autosomal loci chosen randomly (0.52; Keightley and Eyre-Walker 2012). To date, this method to estimate α has not been applied to another D. simulans dataset. However, α has been calculated by other methods for D. simulans and estimates have ranged from 0.43 to 0.94 (reviewed in Eyre-Walker 2006), which is similar to the estimates we present here.

Divergence-based analyses

No evidence of recurrent, adaptive evolution at the same subset of codons across D. melanogaster, D. simulans, D. sechellia, D. yakuba, D. erecta, and D. ananassae was detected using PAML (Yang 1997, 2007) for seven of the eight genes cycA, , , , , , (our analyses and those presented in The ). However, we do find evidence of recurrent, positive selection at specific codons for Yb. Using both models M7 vs. M8, and M8 vs. M8a, we find that the data fit a model of selection significantly better than a neutral null model (likelihood ratio test statistics of 16.068 with P < 0.0003, and 6.321 with P < 0.01, respectively). This result is robust to alignment uncertainty in this highly diverged protein, including reanalysis removing all codons adjacent to predicted indels. Nineteen of the aligned codons at Yb are predicted by Bayes Empirical Bayes analysis to be in the selective class with an average codon-specific dN/dS (= ω) of 1.88. However, only two codons in this class have predicted posterior probabilities greater than 0.90, and they do not fall in areas of known domains.
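The likelihood ratio tests reported for Yb can be converted to P-values with a few lines of Python, shown here as a hedged sketch. It assumes the reference distributions commonly recommended for these PAML comparisons, namely a χ² distribution with 2 degrees of freedom for M7 vs. M8 and a 50:50 mixture of a point mass at zero and χ² with 1 degree of freedom for M8 vs. M8a. The text does not state which reference distributions were actually used, so the resulting values (roughly 3 × 10⁻⁴ and 6 × 10⁻³ under these assumptions) are indicative only.

```python
from math import erfc, exp, sqrt

def chi2_sf_df1(x):
    """Upper-tail probability of a chi-square variable with 1 degree of freedom."""
    return erfc(sqrt(x / 2))

def chi2_sf_df2(x):
    """Upper-tail probability of a chi-square variable with 2 degrees of freedom."""
    return exp(-x / 2)

# Likelihood ratio test statistics (2 * delta lnL) reported in the text for Yb
lrt_m7_m8 = 16.068
lrt_m8_m8a = 6.321

# Assumed reference distributions (not stated in the text)
p_m7_m8 = chi2_sf_df2(lrt_m7_m8)
p_m8_m8a = 0.5 * chi2_sf_df1(lrt_m8_m8a)

print(f"M7 vs M8:  P = {p_m7_m8:.2e}")
print(f"M8 vs M8a: P = {p_m8_m8a:.2e}")
```

The closed forms for 1 and 2 degrees of freedom avoid any dependency beyond the standard library; a general-purpose statistics package would be the usual choice for other degrees of freedom.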

Discussion

Previous genome-wide next-generation sequencing studies using both site frequency-based and MK tests of neutrality have reported an enrichment of putative adaptive evolution in Gene Ontology categories such as germ-cell development, cystoblast division, and germarium-derived oocyte fate determination (Begun ; Langley ; Mackay ). In this study, we performed high-quality Sanger sequencing of population samples from both D. melanogaster and D. simulans and found that all eight genes involved in GSC regulation studied here reject a neutral model of evolution in at least one test and species (Tables 3 and 4). Most of these rejections are due to the polymorphism-based SweeD analysis for which every locus, except , rejects the neutral model in both D. melanogaster and D. simulans. The locus only rejects neutrality by the SweeD test in D. melanogaster. Rejecting the neutral model with SweeD is suggestive of positive selection, but it could also be due to demographic history (Pavlidis ). We attempted to take the demographic history of these species into account by using simulated replicates of estimates of D. melanogaster African population dynamics (Hutter ; Duchen ) to determine our significant SweeD cutoff points. However, the true demographic history of these species is unknown. So, we stress that our SweeD rejections are restricted to the demographic scenarios we considered. The detection of outliers of a test statistic’s genomic distribution is another method used to determine statistical significance. Recently Pool applied SweeD (labeled SweepFinder in their manuscript) genome-wide for an African population of D. melanogaster and they list regions containing genomic outliers, assumed to be due to positive selection. As an attempt to determine if our SweeD rejections are more likely due to demography vs. selection, we checked to see if the eight GSC loci we analyzed fell within or near the Pool outliers. 
The protein coding regions (CDS) for three GSC loci (Yb, and ) are within an outlier region, suggesting that for these loci our SweeD rejections are due to positive selection. The CDS for two other GSC loci ( and nano) are within 50 kb of an outlier region. Simulations have shown that SweeD’s ability to pinpoint the target of selection is compromised if both selection and demographic perturbations have occurred (Pavlidis ), with the predicted target being tens of kilobases away from the actual location of selection. To determine whether one would expect by chance to observe three of eight loci within an outlier region, or five of eight loci within 50 kb of an outlier region, we randomly picked eight loci from the D. melanogaster genome. The loci were picked such that each random sample had the same distribution across the X, 2nd, and 3rd chromosomes as observed across the GSC loci. For both cases our observation is significant, with only 36 of 1000 bootstrapped samples having three or more loci within an outlier region or five or more loci within 50 kb of one, respectively (thus P-value = 0.036 for our observation). Therefore, for five of the eight GSC loci we analyzed, two different datasets (using two different methods for determining the significance cutoffs) suggest that their frequency spectra do not match the neutral model in D. melanogaster. For D. simulans, making a distinction between demography and selection is more tenuous, especially given that there are no comparable estimates of the demographic history within Africa for this species. Yb is the only gene to show significant departures from neutrality consistent with natural selection for the site frequency test SweeD as well as for both the MK and PAML tests that can detect recurrent historical selection. This combination of test results suggests that the recent sweeps at Yb detected by SweeD are just the latest of many selective fixations of nonsynonymous substitutions that have occurred among these six species. 
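The resampling check described above (drawing eight random loci with the same chromosomal distribution as the GSC loci and counting how often at least as many fall in or near an outlier region) can be sketched as follows. This is a minimal sketch under stated assumptions, not the authors' implementation: the gene list, outlier regions, and per-chromosome counts are invented toy data, "within" is interpreted as any overlap between a gene span and a region, and all function and variable names are hypothetical.

```python
import random

def near_outlier(gene, regions, flank=0):
    """True if the gene span overlaps, or lies within `flank` bp of, an outlier region."""
    chrom, start, end = gene
    return any(start <= r_end + flank and end >= r_start - flank
               for r_start, r_end in regions.get(chrom, []))

def resampling_p_value(genome, regions, chrom_counts, observed_hits,
                       flank=0, n_reps=1000, seed=0):
    """Fraction of chromosome-matched random samples with >= observed_hits hits."""
    rng = random.Random(seed)
    by_chrom = {}
    for gene in genome:
        by_chrom.setdefault(gene[0], []).append(gene)
    extreme = 0
    for _ in range(n_reps):
        sample = []
        # Draw the same number of loci per chromosome as in the observed set.
        for chrom, k in chrom_counts.items():
            sample.extend(rng.sample(by_chrom[chrom], k))
        hits = sum(near_outlier(g, regions, flank) for g in sample)
        if hits >= observed_hits:
            extreme += 1
    return extreme / n_reps

# Toy data (hypothetical): 100 genes of 2 kb on each of three chromosomes.
toy_genome = [(c, s, s + 2_000)
              for c in ("X", "2", "3")
              for s in range(0, 1_000_000, 10_000)]
toy_regions = {"X": [(100_000, 150_000)], "2": [(400_000, 420_000)], "3": []}
toy_counts = {"X": 2, "2": 3, "3": 3}  # hypothetical split of eight loci

if __name__ == "__main__":
    p_within = resampling_p_value(toy_genome, toy_regions, toy_counts,
                                  observed_hits=3)
    p_near = resampling_p_value(toy_genome, toy_regions, toy_counts,
                                observed_hits=5, flank=50_000)
    print(p_within, p_near)
```

With real data, `genome` would hold annotated gene coordinates and `regions` the published outlier windows; the proportion returned plays the role of the empirical P-value quoted in the text (36 of 1000 samples giving 0.036).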
Using the method of Eyre-Walker and Keightley (2009), we estimate that 81% of the amino acid differences fixed in these eight genes in the D. melanogaster lineage and 79% of the amino acid differences fixed in the D. simulans lineage have been driven by positive selection. Estimates for X-linked genes were slightly, although not significantly, larger than those for autosomes. This proportion is at the upper end of that estimated for other groups of genes in these species. The pattern of evidence for recent or recurrent positive selection that we observe and the diverse functions and expression patterns of these genes suggest that there are likely multiple selective pressures driving the adaptive evolution of genes important in GSC regulation. For example, three of the eight genes examined that reject the neutral model have no known interaction or functional dependence on one another (, , and Yb: Chen and McKearin 2005; Li ). Yb is expressed in the stem cell niche (King and Lin 1999; King ), whereas binds chromatin (Clark and McKearin 1996; Maines ), making it less likely that the same specific selective pressures act on both. The hypotheses of sexual selection and sexual conflict (Swanson and Vacquier 2002) cannot be formally rejected but seem implausible for genes functioning in GSCs. For example, most theories of sexual selection predict strong effects on premating traits, which are highly unlikely to be influenced by the genes we have examined. Likewise, sexual conflict, whereby one sex manipulates the reproductive fitness of the other sex, is much more likely to occur for molecules that are transmitted between males and females, a function that is implausible for any of the GSC regulatory genes in this study. Several other mechanistic and evolutionary hypotheses have been proposed to explain the evolutionary causes of positive selection inferred for and . Some of these selective pressures also may drive the adaptive evolution of other genes involved in GSC regulation. 
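To make concrete what "the proportion of amino acid differences driven by positive selection" means, the sketch below computes the simpler estimator based directly on a McDonald-Kreitman table, α = 1 − (Ds·Pn)/(Dn·Ps). This is explicitly not the Eyre-Walker and Keightley (2009) method used above, which additionally accounts for slightly deleterious mutations; it is shown only to illustrate the quantity being estimated. The counts in the example are hypothetical, not data from this study.

```python
def mk_alpha(dn, ds, pn, ps):
    """Simple MK-table estimate of the fraction of adaptive amino acid substitutions.

    dn, ds: nonsynonymous and synonymous fixed differences (divergence)
    pn, ps: nonsynonymous and synonymous polymorphisms
    """
    if dn == 0 or ps == 0:
        raise ValueError("dn and ps must be nonzero for alpha to be defined")
    return 1.0 - (ds * pn) / (dn * ps)

if __name__ == "__main__":
    # Hypothetical counts chosen only for illustration.
    print(mk_alpha(dn=50, ds=40, pn=10, ps=40))
```

An excess of nonsynonymous divergence relative to nonsynonymous polymorphism pushes α toward 1, whereas segregating slightly deleterious variants inflate Pn and bias this simple estimator downward, which is one motivation for the more elaborate method cited in the text.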
Civetta proposed that species-specific changes in rates of proteolysis could drive protein sequence divergence. This proposal was supported by the observation that ’s expression is transient and by previous studies in C. elegans that have shown that transiently expressed genes have elevated rates of protein evolution (Cutter and Ward 2005). Although this could influence the molecular evolution of , and potentially which is also transiently expressed (Ohlstein ), it is unlikely to explain all selection acting on GSC gene evolution since , Yb, , and have much broader patterns and timings of expression (Clark and McKearin 1996; Forbes and Lehmann 1998; Cox ; Szakmary ).
We had previously hypothesized that coevolution with external pathogens infecting the germline could underlie the elevated nonsynonymous divergence in and along the D. melanogaster and D. simulans lineages (Bauer DuMont ). Two maternally-inherited bacterial endosymbionts (Wolbachia and Spiroplasma) have been detected in some but not all species of Drosophila (Mateos ; Watts ). Infection by Wolbachia can have beneficial effects in some species by increasing resistance to viral infections, which may explain their widespread presence (Chrostek ; Hedges ; Teixeira ). However, Wolbachia infection can also reduce fecundity due to cytoplasmic incompatibilities in crosses between infected and uninfected individuals (Fry ). Overreplication of Wolbachia also has been linked to shortening life-span and rupture of host cells (Min and Benzer 1997). There is likely to be a delicate balance in controlling endosymbiont proliferation within a cell so that the host can receive benefits from the endosymbiont but minimize any deleterious effects (Chrostek ). Maintaining such a balance could contribute to an “arms race” between GSC regulatory genes and endosymbionts (e.g., Werren 2005; Bauer DuMont ).
The expression patterns and known pleiotropic functions of Yb, , (Aravin ; Brennecke ; Clark and McKearin 1996; Maines ) suggest that other pressures may be acting on them. One possible selective pressure is intracellular parasites such as transposons. Transposons are selfish genetic elements that can propagate throughout the genome, resulting in deleterious effects on their host. Recent studies demonstrated that many taxa, including Drosophila, have a small RNA silencing pathway, termed the piRNA pathway, that is active in the germline and provides an adaptive defense against transposons (Aravin ). Many piRNA pathway genes also have been shown to adaptively evolve (Obbard ; Kolaczkowski ). and Yb are required for the proper silencing of transposons (Aravin ; Olivieri ; Saito ). Therefore, the adaptive evolution seen in these two proteins may reflect their involvement in silencing transposons as previously suggested for (Obbard ; Kolaczkowski ). Additionally, it is possible that selective pressure to repress transposons may be driving the adaptive evolution of since some other chromatin-associated proteins are involved in transposon silencing (Klattenhoff ; Rangan ).
Species-specific changes in life history and the timing of reproduction could also pose changing selective pressures on the germline (Schmidt and Paaby 2008), though our limited knowledge of the ages of reproduction for natural populations of Drosophila limits our ability to test this hypothesis.
In the future, it will be important to test whether these positively selected GSC genes function in the specific biological processes that we hypothesize are driving their adaptive evolution. For example, do or play a role in regulating the transmission of bacterial endosymbionts, or does act in the repression of transposons? Additional insight may come from sampling these genes from additional Drosophila species to determine whether they have experienced a long-term selective pressure across many Drosophila or whether it is specific to D. melanogaster and D. simulans.
  95 in total

1.  Demography and natural selection have shaped genetic variation in Drosophila melanogaster: a multi-locus approach.

Authors:  Sascha Glinka; Lino Ometto; Sylvain Mousset; Wolfgang Stephan; David De Lorenzo
Journal:  Genetics       Date:  2003-11       Impact factor: 4.562

Review 2.  How different is Venus from Mars? The genetics of germ-line stem cells in Drosophila females and males.

Authors:  Lilach Gilboa; Ruth Lehmann
Journal:  Development       Date:  2004-10       Impact factor: 6.868

3.  Searching for footprints of positive selection in whole-genome SNP data from nonequilibrium populations.

Authors:  Pavlos Pavlidis; Jeffrey D Jensen; Wolfgang Stephan
Journal:  Genetics       Date:  2010-04-20       Impact factor: 4.562

4.  History and structure of sub-Saharan populations of Drosophila melanogaster.

Authors:  John E Pool; Charles F Aquadro
Journal:  Genetics       Date:  2006-09-01       Impact factor: 4.562

5.  Roles for the Yb body components Armitage and Yb in primary piRNA biogenesis in Drosophila.

Authors:  Kuniaki Saito; Hirotsugu Ishizu; Miharu Komai; Hazuki Kotani; Yoshinori Kawamura; Kazumichi M Nishida; Haruhiko Siomi; Mikiko C Siomi
Journal:  Genes Dev       Date:  2010-10-21       Impact factor: 11.361

6.  Accumulation of a differentiation regulator specifies transit amplifying division number in an adult stem cell lineage.

Authors:  Megan L Insco; Arlene Leon; Cheuk Ho Tam; Dennis M McKearin; Margaret T Fuller
Journal:  Proc Natl Acad Sci U S A       Date:  2009-12-14       Impact factor: 11.205

7.  Reproductive diapause and life-history clines in North American populations of Drosophila melanogaster.

Authors:  Paul S Schmidt; Annalise B Paaby
Journal:  Evolution       Date:  2008-02-21       Impact factor: 3.694

8.  Pervasive adaptive evolution in primate seminal proteins.

Authors:  Nathaniel L Clark; Willie J Swanson
Journal:  PLoS Genet       Date:  2005-09       Impact factor: 5.917

9.  The Yb protein defines a novel organelle and regulates male germline stem cell self-renewal in Drosophila melanogaster.

Authors:  Akos Szakmary; Mary Reedy; Hongying Qi; Haifan Lin
Journal:  J Cell Biol       Date:  2009-05-11       Impact factor: 10.539

10.  Wolbachia variants induce differential protection to viruses in Drosophila melanogaster: a phenotypic and phylogenomic analysis.

Authors:  Ewa Chrostek; Marta S P Marialva; Sara S Esteves; Lucy A Weinert; Julien Martinez; Francis M Jiggins; Luis Teixeira
Journal:  PLoS Genet       Date:  2013-12-12       Impact factor: 5.917

