R2d2 Drives Selfish Sweeps in the House Mouse.

John P Didion, Andrew P Morgan, Liran Yadgary, Timothy A Bell, Rachel C McMullan, Lydia Ortiz de Solorzano, Janice Britton-Davidian, Carol J Bult, Karl J Campbell, Riccardo Castiglia, Yung-Hao Ching, Amanda J Chunco, James J Crowley, Elissa J Chesler, Daniel W Förster, John E French, Sofia I Gabriel, Daniel M Gatti, Theodore Garland, Eva B Giagia-Athanasopoulou, Mabel D Giménez, Sofia A Grize, İslam Gündüz, Andrew Holmes, Heidi C Hauffe, Jeremy S Herman, James M Holt, Kunjie Hua, Wesley J Jolley, Anna K Lindholm, María J López-Fuster, George Mitsainas, Maria da Luz Mathias, Leonard McMillan, Maria da Graça Morgado Ramalhinho, Barbara Rehermann, Stephan P Rosshart, Jeremy B Searle, Meng-Shin Shiao, Emanuela Solano, Karen L Svenson, Patricia Thomas-Laemont, David W Threadgill, Jacint Ventura, George M Weinstock, Daniel Pomp, Gary A Churchill, Fernando Pardo-Manuel de Villena.

Abstract

A selective sweep is the result of strong positive selection driving newly occurring or standing genetic variants to fixation, and can dramatically alter the pattern and distribution of allelic diversity in a population. Population-level sequencing data have enabled discoveries of selective sweeps associated with genes involved in recent adaptations in many species. In contrast, much debate but little evidence addresses whether "selfish" genes are capable of fixation, thereby leaving signatures identical to classical selective sweeps, despite being neutral or deleterious to organismal fitness. We previously described R2d2, a large copy-number variant that causes nonrandom segregation of mouse Chromosome 2 in females due to meiotic drive. Here we show population-genetic data consistent with a selfish sweep driven by alleles of R2d2 with high copy number (R2d2(HC)) in natural populations. We replicate this finding in multiple closed breeding populations from six outbred backgrounds segregating for R2d2 alleles. We find that R2d2(HC) rapidly increases in frequency, and in most cases becomes fixed in significantly fewer generations than can be explained by genetic drift. R2d2(HC) is also associated with significantly reduced litter sizes in heterozygous mothers, making it a true selfish allele. Our data provide direct evidence of populations actively undergoing selfish sweeps, and demonstrate that meiotic drive can rapidly alter the genomic landscape in favor of mutations with neutral or even negative effects on overall Darwinian fitness. Further study will reveal the incidence of selfish sweeps, and will elucidate the relative contributions of selfish genes, adaptation and genetic drift to evolution.
© The Author 2016. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.

Keywords: House Mouse; Meiotic Drive; R2d2; Selective Sweep; Selfish Genes

Year:  2016        PMID: 26882987      PMCID: PMC4868115          DOI: 10.1093/molbev/msw036

Source DB:  PubMed          Journal:  Mol Biol Evol        ISSN: 0737-4038            Impact factor:   16.240


Introduction

Population-level sequencing data have enabled analyses of positive selection in many species, including mice (Staubach et al. 2012) and humans (Williamson et al. 2007; Grossman et al. 2013; Colonna et al. 2014). These studies seek to identify genetic elements, such as single nucleotide variants and copy number variants, that are associated with phenotypic differences between populations that share a common origin (Fu and Akey 2013; Bryk and Tautz 2014). A marked difference in local genetic diversity between closely related taxa might indicate that one lineage has undergone a sweep. During a sweep, a variant under strong positive selection rises in frequency and carries with it linked genetic variation (“genetic hitch-hiking”), thereby reducing local haplotype diversity (Maynard Smith and Haigh 1974; Kaplan et al. 1989). In genomic scans for sweeps, it is typically assumed that the driving allele will have a strong positive effect on organismal fitness. Prominent examples of sweeps for which this assumption holds true (i.e., classic selective sweeps) include alleles at the Vkorc1 locus, which confer rodenticide resistance in the brown rat (Pelz et al. 2005), and enhancer polymorphisms conferring lactase persistence in human beings (Bersaglieri et al. 2004). However, we and others have suggested that selfish alleles that strongly promote their own transmission irrespective of their effects on overall fitness could give rise to genomic signatures indistinguishable from those of classic selective sweeps (Sandler and Novitski 1957; White 1978; Henikoff and Malik 2002; Derome et al. 2004; Pardo-Manuel de Villena 2004; Brandvain and Coop 2011).

Suggestive evidence that sweeps may be driven by selfish alleles comes from studies in Drosophila. Incomplete sweeps have been identified at the segregation distorter (SD) locus (Presgraves et al. 2009) and in at least three X-chromosome systems (Babcock and Anderson 1996; Dyer et al. 2007; Derome et al. 2008; Kingan et al. 2010), all of which drive through the male germline. In addition, genomic conflict has been proposed as a possible driver of two nearly complete sweeps in Drosophila mauritiana (Nolte et al. 2013). Incomplete sweeps were also detected in natural populations of Mimulus (monkeyflower); the cause was identified as female meiotic drive of the centromeric D locus (Fishman and Saunders 2008). The fact that all evidence of selfish sweeps derives from two genera is to some extent reflective of an observational bias, but may also indicate a difference in the incidence or effect of selfish alleles between these taxa and equally well-studied mammalian species (e.g., humans and mice). Furthermore, the lack of completed selfish sweeps reported in the literature may be due to an unexpected strength of balancing selection, in which the deleterious effects of selfish alleles prevent them from driving to fixation, or due to insufficient methods of detection.

Here, we investigate whether a selfish allele can sweep in natural and laboratory populations of the house mouse, Mus musculus domesticus. We recently described R2d2, a meiotic drive responder locus on mouse Chromosome 2 (Didion et al. 2015). R2d2 is a variable-size copy number gain of a 127-kb core element that contains a single annotated gene, Cwc22 (a spliceosomal protein). Females heterozygous for R2d2 preferentially transmit to their offspring an allele with high copy number (R2d2HC) relative to an allele with low copy number (R2d2LC), where “high copy number” is the minimum copy number with evidence of distorted transmission in existing experiments (approximately 7 copies of the core element). In contrast to many meiotic drive systems, in which the component elements are tightly linked, the action of R2d2 is dependent on unlinked modifier loci whose frequencies, modes of action, and effect sizes are unknown. These modifier loci modulate the degree of transmission distortion. This explains why distorted transmission is present in some laboratory crosses segregating for R2d2 alleles, but absent in others (Siracusa et al. 1991; Montagutelli et al. 1996; Swallow et al. 1998; Eversley et al. 2010; Kelly, Nehrenberg, Peirce, et al. 2010; Didion et al. 2015). R2d2 genotype is also either uncorrelated or negatively correlated with litter size (a major component of absolute fitness in mice), depending on the presence of meiotic drive. R2d2 therefore behaves as a selfish genetic element. In this study, we provide evidence of a recent sweep at R2d2 in wild M. m. domesticus mice and we show that R2d2 has repeatedly driven selfish sweeps in closed-breeding mouse populations.

Results and Discussion

Evidence for a Selfish Sweep in Wild Mouse Populations

A recent study showed extreme copy number variation at Cwc22 in a sample of 26 wild M. m. domesticus mice (Pezer et al. 2015). To determine whether this was indicative of R2d2 copy number variation in the wild, we assayed an additional 396 individuals sampled from 14 European countries and the United States (supplementary table S1, Supplementary Material online, and fig. 1). We found that R2d2HC alleles are segregating at a wide range of frequencies in natural populations (0.00–0.67; table 1).
Fig. 1.

Wild mouse populations used in this study. (A) Geographic distribution of samples used in this study. Samples are colored by taxonomic origin: Blue for Mus musculus domesticus, green for Mus musculus castaneus. Those with standard karyotype (2n = 40) are indicated by closed circles; samples with Robertsonian fusion karyotypes (2n < 40) are indicated by open circles. Populations from Floreana Island (Galapagos Islands, Ecuador; “EC”), Farallon Island (off the coast of San Francisco, California, United States; “USW”), and Maryland, United States (“USE”) are not shown. (B, C) MDS (k = 3 dimensions) reveals population stratification consistent with geography. Mus musculus domesticus populations are labeled by country of origin. Outgroup samples of M. m. castaneus origin cluster together (cas). (D) Population graph estimated from autosomal allele frequencies by TreeMix. Black edges indicate ancestry, while colored edges indicate gene flow by migration or admixture (with yellow to red indicating increasing probability of migration). Topography of the population graph is consistent with MDS result and with the geographic origins of the samples.

Table 1.

R2d2 Allele Frequencies in Wild Mus musculus domesticus Populations.

Population    R2d2HC Allele Frequency    2 × (Number of Individuals)
BE            0.50                       6
CH            0.32                       28
CY            0.00                       14
DE            0.67                       6
DK            0.06                       18
EC            0.00                       24
ES            0.22                       18
FR            0.15                       26
GR            0.08                       106
IT            0.09                       34
LB            0.25                       8
PT            0.13                       54
TN            0.00                       4
UK            0.00                       6
USE           0.21                       102
USW           0.00                       24

Note.—Populations are given as ISO country codes, except for USE (US East Coast, Maryland) and USW (US West Coast, Farallon Island).

To test for a selfish sweep at R2d2, we genotyped the wild-caught mice on the MegaMUGA array (Rogala et al. 2014; Morgan and Welsh 2015; Morgan et al. 2015) and examined patterns of haplotype diversity. In the case of strong positive selection, unrelated individuals are more likely to share extended segments that are identical by descent in the vicinity of the selected locus (Albrechtsen et al. 2010) compared with a population subject only to genetic drift. Consistent with this prediction, we observed an extreme excess of shared identity by descent (IBD) across populations around R2d2 (fig. 2): R2d2 falls in the top 0.25% of IBD-sharing scores across the autosomes. In all cases, the shared haplotype has high copy number and this haplotype appears to have a single origin in European mice (supplementary fig. S1, Supplementary Material online). Strong signatures of selection are also evident at a previously identified target of positive selection, the Vkorc1 locus (distal Chromosome 7) (Song et al. 2011). The 12 loci in the top 1% of IBD-sharing scores are shown in table 2.
Fig. 2.

Haplotype sharing at R2d2 provides evidence of a selective sweep in wild mice of European origin. (A) Weighted haplotype-sharing score (supplementary methods, Supplementary Material online) computed in 500 kb bins across autosomes, within which individuals are drawn from the same population (lower panel) or different populations (upper panel). Peaks of interest overlay R2d2 (Chromosome 2; see supplementary fig. S2, Supplementary Material online, for zoomed-in view) and Vkorc1 (distal Chromosome 7). The position of the closely linked t-haplotype and MHC loci is also marked. (B) Decay of EHH (Sabeti et al. 2002) on the R2d2HC-associated (blue) versus the R2d2LC-associated (red) haplotype. EHH is measured outward from the index SNP at chr2:83,790,275 and is bounded between 0 and 1. (C) Haplotype bifurcation diagrams for the R2d2HC (top panel, blue) and R2d2LC (bottom panel, red) haplotypes at the index SNP (open circle). Darker colors and thicker lines indicate higher haplotype frequencies. Haplotypes extend 100 sites in each direction from the index SNP.

Table 2.

The 12 Loci above the 99th Percentile of IBD-sharing Scores.

Chromosome    Start (Mb)    End (Mb)    Locus               Peak IBD-Sharing Score    Staubach et al. (2012)
2             79.75         85.75       R2d2                0.108
4             3.25          7.75                            0.051
4             149           149.5                           0.045
5             113           113.5                           0.045
7             35            36                              0.049
7             132.75        137.25      Vkorc1              0.154                     *
8             116.5         118                             0.076
10            86.25         89                              0.098
13            70            71.75                           0.068
17            26.75         27.75       MHC, t-haplotype    0.05                      *
18            12.5          13.75                           0.049
18            33            35.5                            0.216

Note.—Chromosome locations are given based on mouse genome build GRCm38/mm10. Loci identified as targets of positive selection are named and candidate targets of selection identified in wild mice in a previous study (Staubach et al. 2012) are marked with an asterisk. MHC, major histocompatibility complex.

In principle, the strength and age of a sweep can be estimated from the extent of loss of genetic diversity around the locus under selection. From the SNP data, we identified an ∼1 Mb haplotype with significantly greater identity between individuals with R2d2HC alleles compared with the surrounding sequence. We used published sequencing data from 26 wild mice (Pezer et al. 2015) to measure local haplotype diversity around R2d2 and found that the haplotypes associated with R2d2HC alleles are longer than those associated with R2d2LC (fig. 2). This pattern of extended haplotype homozygosity (EHH) is consistent with positive selection over an evolutionary timescale as short as 450 generations (see Methods). However, due to the extremely low rate of recombination in the vicinity of R2d2 (Didion et al. 2015), this is most likely an underestimate of the true age of the mutation. Work is ongoing to better understand the evolutionary history of R2d2.

It is important to note that the excess IBD we observe at R2d2 (fig. 2) arises from segments shared “between” geographically distinct populations (fig. 1). When considering sharing “within” populations only (supplementary fig. S2, Supplementary Material online), R2d2 is no longer an outlier. Therefore, it was unsurprising that we failed to detect a sweep around R2d2 using statistics that are designed to identify population-specific differences in selection, like hapFLK (Fariello et al. 2013), or selection in aggregate, like iHS (Voight et al. 2006) (supplementary fig. S3 and methods, Supplementary Material online).
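To make the EHH statistic concrete, the following minimal Python sketch computes it from phased haplotypes using the general definition of Sabeti et al. (2002): the probability that two randomly chosen chromosomes carrying a given core allele are identical at every site between the core and a given position. The function names and the toy haplotypes are hypothetical and are not taken from the analysis pipeline used in this study.

```python
from collections import Counter
from math import comb


def ehh(haplotypes, core, upto):
    """EHH over the sites between `core` and `upto` (inclusive, either direction).

    Counts pairs of identical haplotypes and divides by the number of all pairs.
    """
    lo, hi = min(core, upto), max(core, upto)
    n = len(haplotypes)
    if n < 2:
        raise ValueError("need at least two haplotypes")
    classes = Counter(tuple(h[lo:hi + 1]) for h in haplotypes)
    return sum(comb(c, 2) for c in classes.values()) / comb(n, 2)


def ehh_decay(haplotypes, core, allele):
    """EHH at every site, restricted to haplotypes carrying `allele` at the core."""
    carriers = [h for h in haplotypes if h[core] == allele]
    return [ehh(carriers, core, j) for j in range(len(haplotypes[0]))]


# Toy data (hypothetical): the core site is index 2; "1" stands in for the
# swept allele and "0" for the alternative allele.
toy = ["01110", "01110", "01110", "11110", "00001", "10010", "01011"]
print("swept allele:      ", ehh_decay(toy, 2, "1"))
print("alternative allele:", ehh_decay(toy, 2, "0"))
```

In this toy example the carriers of the swept allele remain identical over more sites than the carriers of the alternative allele, which is the qualitative pattern described for the two R2d2 haplotype classes.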

A Selfish Sweep in an Outbred Laboratory Population

We validated the ability of R2d2 to drive a selfish sweep by examining R2d2 allele frequencies in multiple closed-breeding laboratory populations for which we had access to samples from the founder populations. The Diversity Outbred (DO) is a randomized outbreeding population derived from eight inbred mouse strains that is maintained under conditions designed to minimize the effects of both selection and genetic drift (Svenson et al. 2012). Expected time to fixation or loss of an allele present in the founder generation (with initial frequency of 1/8) is ∼900 generations. The WSB/EiJ founder strain contributed an R2d2HC allele which underwent more than a 3-fold increase (from 0.18 to 0.62) in 13 generations (P < 0.001 by simulation; range 0.03–0.26 after 13 generations in 1,000 simulation runs) (fig. 3), accompanied by significantly distorted allele frequencies (P < 0.01 by simulation) across an ∼100 Mb region linked to the allele (fig. 3).
Fig. 3.

An R2d2HC allele rises to high frequency despite negative effect on litter size in the DO. (A) R2d2 drives 3-fold increase in WSB/EiJ allele frequency in 13 generations in the DO population. Circle sizes reflect number of chromosomes genotyped (2N); error bars are ±2 standard error. (B) Allele frequencies across Chromosome 2 (averaged in 1 Mb bins) at generation 13 of the DO, classified by founder strain. Gray shaded region is the candidate interval for R2d2. (C) Mean litter size among DO females according to R2d2 genotype: LL, R2d2LC/LC; LH − TRD, R2d2LC/HC without transmission ratio distortion; LH + TRD, R2d2LC/HC with transmission ratio distortion; HH, R2d2HC/HC. Circle sizes reflect number of females tested; error bars are 95% confidence intervals from a linear mixed model which accounts for parity and repeated measures on the same female (supplementary methods, Supplementary Material online). (D) Mean absolute number of R2d2HC alleles transmitted in each litter by heterozygous females with (LH + TRD) or without (LH − TRD) transmission ratio distortion. LH + TRD females transmit more R2d2HC alleles despite their significantly reduced litter size.


R2d2 Has an Underdominant Effect on Fitness

The fate of a selfish sweep depends on the fitness costs associated with the different genotypic classes at the selfish genetic element. For example, maintenance of intermediate frequencies of the M. musculus t-complex (Lyon 1991) and Drosophila SD (Hartl 1973) chromosomes in natural populations is thought to result from decreased fecundity associated with those selfish elements. To assess the fitness consequences of R2d2HC, we treated litter size as a proxy for absolute fitness (fig. 3). We determined whether each female had distorted transmission of R2d2 using a one-sided exact binomial test for deviation from the expected Mendelian genotype frequencies in her progeny. Average litter size among DO females homozygous for R2d2LC (“LL” in fig. 3: 8.1; 95% confidence interval [CI] 7.8–8.3; N = 339) is not different from that of females homozygous for R2d2HC (“HH”: 8.1; 95% CI 7.4–8.7; N = 47) or heterozygous females without distorted transmission of R2d2HC (“LH − TRD”: 8.1; 95% CI 7.7–8.5; N = 89). However, in the presence of meiotic drive, litter size is markedly reduced (“LH + TRD”: 6.5; 95% CI 5.9–7.2; N = 38; P = 3.7 × 10^−5 for test of difference vs. all other classes). The relative fitness of heterozygous females with distorted transmission is w = 0.81, resulting in a selection coefficient of s = 1 – w = 0.19 (95% CI 0.10–0.23) against the heterozygote. Despite this underdominant effect, the absolute number of R2d2HC alleles transmitted by heterozygous females in each litter is significantly higher in the presence of meiotic drive than in its absence (P = 0.032; fig. 3). The rising frequency of R2d2HC in the DO thus represents a truly selfish sweep.
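The two calculations in this section can be sketched in a few lines of Python. The first function is a one-sided exact binomial test against the Mendelian expectation of 0.5; the second gives the expected number of driving alleles per litter as litter size multiplied by transmission ratio. The progeny counts and the transmission ratio of 0.7 used below are illustrative assumptions, whereas the litter sizes 8.1 and 6.5 are the means reported above. The function names are hypothetical.

```python
from math import comb


def binom_upper_tail(k, n, p=0.5):
    """One-sided exact binomial p-value: P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


def expected_driver_alleles(mean_litter_size, transmission_ratio):
    """Expected number of driving alleles a heterozygous female passes on per litter."""
    return mean_litter_size * transmission_ratio


# Hypothetical female: 30 of 40 genotyped progeny inherited the high-copy allele.
print("p-value for 30/40:", binom_upper_tail(30, 40))

# Reported mean litter sizes, with an assumed transmission ratio of 0.7 under drive.
print("without drive:", expected_driver_alleles(8.1, 0.5))
print("with drive:   ", expected_driver_alleles(6.5, 0.7))

# Transmission ratio at which the smaller litter carries as many driving alleles
# as a Mendelian litter of 8.1 pups.
print("break-even ratio:", expected_driver_alleles(8.1, 0.5) / 6.5)
```

Under these assumptions, any transmission ratio above the break-even value lets a heterozygous female transmit more driving alleles per litter despite the smaller litter, which is the sense in which the sweep is selfish.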

Selfish Sweeps in Other Laboratory Populations

We also observed selfish sweeps in selection lines derived from the ICR:Hsd outbred population (Swallow et al. 1998) in which R2d2HC alleles are segregating (fig. 4). Three of the 4 lines selectively bred for high voluntary wheel running (HR lines) and 2 of the 4 control lines (10 breeding pairs per line per generation in both conditions) went from starting R2d2HC frequencies of ∼0.75 to fixation in 60 generations or fewer: 2 lines were fixed by generation 20, and 3 more by generation 60. In simulations mimicking this breeding design and neutrality (fig. 4), median time to fixation was 46 generations (5th percentile: 9 generations). Although the R2d2HC allele would be expected to eventually fix by drift in six of the eight lines given its high starting frequency, the observed rates of fixation were not expected (P = 0.003 by simulation). In a related advanced intercross segregating for high and low copy number alleles at R2d2 (HR8xC57BL/6J; Kelly, Nehrenberg, Hua, et al. 2010), we observed that R2d2HC increased from a frequency of 0.5 to 0.85 in just 10 generations and fixed by 15 generations (fig. 4) versus a median 184 generations in simulations (P < 0.001; fig. 4). The increase in R2d2HC allele frequency in the DO and advanced intercross populations occurred at least an order of magnitude faster than what is predicted by drift alone.
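The logic of such a neutral comparison can be illustrated with a simple Wright-Fisher sketch in Python. It assumes random mating among 2N = 40 chromosomes (10 breeding pairs) and a starting frequency of 0.75. This is a deliberate simplification and not the simulation used in this study, which mimicked the actual breeding design, so the resulting numbers need not match those reported above. The function name and the random seed are hypothetical.

```python
import random
from statistics import median


def time_to_absorption(p0, n_chrom, rng, max_gen=10_000):
    """One neutral Wright-Fisher replicate.

    Returns (generations elapsed, True if the focal allele fixed).
    A replicate still segregating after max_gen is reported as not fixed.
    """
    count = round(p0 * n_chrom)
    gen = 0
    while 0 < count < n_chrom and gen < max_gen:
        p = count / n_chrom
        # Binomial sampling of the next generation's allele count.
        count = sum(rng.random() < p for _ in range(n_chrom))
        gen += 1
    return gen, count == n_chrom


rng = random.Random(2016)
runs = [time_to_absorption(0.75, 40, rng) for _ in range(1000)]
fix_times = [g for g, fixed in runs if fixed]

print("fraction of replicates fixed:", len(fix_times) / len(runs))
print("median generations to fixation:", median(fix_times))
print("fraction fixed by generation 20:", sum(g <= 20 for g in fix_times) / len(runs))
```

Comparing the observed number of lines fixed by a given generation with the distribution from such replicates is one way to obtain a simulation-based P value of the kind quoted in this section.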
Fig. 4.

R2d2 alleles rapidly increase in frequency in ICR:Hsd-derived laboratory populations. (A) R2d2 allele frequency during breeding of four HR selection lines and four control lines. Trajectories are colored by their fate: Blue, R2d2 fixed by generation 20; red, R2d2 fixed by generation 60; gray, R2d2 not fixed. Circle sizes reflect number of chromosomes (2N) genotyped. (B) Cumulative distribution of time to fixation (blue) or loss (gray) of the focal allele in 1,000 simulations of an intercross line mimicking the HR breeding scheme. Dotted line indicates median fixation time. (C) R2d2 allele frequency during breeding of an (HR8xC57BL/6J) advanced intercross line. Circle sizes reflect number of chromosomes (2N) genotyped. (D) Cumulative distribution of time to fixation (blue) or loss (gray) of the focal allele in 1,000 simulations of an advanced intercross line mimicking the HR8xC57BL/6J advanced intercross line (AIL). Dotted line indicates median fixation time.

Using archival tissue samples, we were able to determine R2d2 allele frequencies in the original founder populations of 6 (out of ∼60) wild-derived inbred strains available for laboratory use (Didion and Pardo-Manuel de Villena 2013). In four strains, WSB/EiJ, WSA/EiJ, ZALENDE/EiJ, and SPRET/EiJ, R2d2HC alleles were segregating in the founders and are now fixed in the inbred populations. In the other two strains, LEWES/EiJ and TIRANO/EiJ, the founders were not segregating for R2d2 copy number and the inbred populations are fixed, as expected, for R2d2LC (supplementary fig. S4, Supplementary Material online). This trend in wild-derived strains is additional evidence of the tendency for R2d2HC to go to fixation in closed breeding populations when segregating in the founder individuals.

On the Distribution and Frequency of R2d2 Alleles in the Wild

Considering the degree of transmission distortion in favor of R2d2HC (up to 95%; Didion et al. 2015) and that R2d2HC repeatedly goes to fixation in laboratory populations, the moderate frequency of R2d2HC in the wild (0.14 worldwide; table 1) is initially surprising. Additionally, we do not find any obvious association between geography and R2d2HC allele frequency that might indicate the mutation’s origin or its pattern of gene flow (table 1 and fig. 1). Several observations may explain these results. First, relative to the effective size of M. m. domesticus (82,500–165,000; Geraldes et al. 2011), our sample size was small. Our sampling was also geographically sparse and nonuniform. Thus, our allele frequency estimates may differ substantially from the true population allele frequencies at R2d2. Second, the reduction in litter size associated with R2d2HC may have a greater impact on R2d2HC allele frequency in a natural population than in the controlled laboratory populations we studied. In these breeding schemes, each mating pair contributes the same number of offspring to the next generation so that most fitness differences are effectively erased. Third, R2d2HC alleles may be unstable and lose the ability to drive upon reverting to low copy number. Such reversion has been reported previously (Didion et al. 2015). Fourth, in a large population (i.e., in the wild), the dynamics of an underdominant meiotic drive allele depend only on the relationship between the degree of transmission distortion (m) and the strength of selection against heterozygotes (s; although this is not the standard interpretation of the parameter usually denoted s, we chose it to be consistent with the notation in Hedrick 1981). This relationship can be expressed by the quantity q (see Methods), for which q > 1 indicates increasing probability of fixation of the driving allele, q < 1 indicates increasing probability that the allele will be purged, and q ≈ 1 leads to maintenance of the allele at an (unstable) equilibrium frequency (Hedrick 1981). The fate of the driving allele in a finite population additionally depends on the population size: the smaller the population, the greater the likelihood that genetic drift will fix a mutation with q < 1 (fig. 5). We note that R2d2HC appears to exist close to the q ≈ 1 boundary (s ≈ 0.2, m ≈ 0.7, and thus q ≈ 0.96).
Fig. 5.

Population dynamics of a meiotic drive allele. (A) Phase diagram for a meiotic drive system like R2d2 with respect to transmission ratio (m) and selection coefficient against the heterozygote (s). Regions of the parameter space for which there is directional selection for the driving allele are shown in black; regions in which there are unstable equilibria or directional selection against the driving allele are shown in gray. (B) Probability of fixing the driving allele as a function of m, s, and population size (N). Notice that, in the area corresponding to the gray region of panel A, fixation probability declines rapidly as population size increases. (C) Probability of fixing the driving allele in simulations of meiotic drive dependent on no modifier (light gray) or a single modifier locus (dark gray) with varying allele frequency; N = 100, s = 0.2, maximum m = 0.8, initial driver frequency = 1/2N. Estimates are given ± 2 standard error. Gray dashed line corresponds to fixation probability for a neutral allele (1/2N). (D) Time to fixation of the driving allele. Values represent 100 fixation events in each condition. (E) Example of allele-frequency trajectories from a “collapsed” selfish sweep. Although the modifier allele is present at intermediate frequency, the driving allele sweeps to a frequency of ∼0.75. After the modifier allele is lost, the driver drifts out of the population as well.

Finally, the action of R2d2 is dependent on unlinked modifier loci. It is therefore difficult to predict the effect of the modifiers on R2d2 allele frequencies in the wild. We used forward-in-time simulations to explore the effect of a single unlinked modifier locus on fixation probability of a driving allele. Under an additive model (m = 0.80 for modifier genotype AA, 0.65 for genotype Aa, and 0.50 for genotype aa), fixation probability is reduced and time to fixation is increased by the presence of the modifier locus (fig. 5). As the modifier allele becomes rarer, fixation probability approaches the neutral expectation (1/2N, where N is population size). Importantly, the driving allele tends to sweep until the modifier allele is lost, and then drifts either to fixation or loss (fig. 5). Drift at modifier loci thus creates a situation akin to selection in a varying environment, one outcome of which is balancing selection (Gillespie 2010). This is consistent with the maintenance of R2d2HC at intermediate frequencies in multiple populations separated by space and time, as we observe in wild mice. Work is ongoing to map the locations and determine the frequencies, effect sizes, and modes of action of these modifier loci.

Concluding Remarks

Most analyses of positive selection in the literature assume that the likelihood of a newly arising mutation becoming established, increasing in frequency and even going to fixation within a population is positively correlated with its effect on organismal fitness. Here, we have shown that a selfish genetic element has repeatedly driven sweeps in which the change in allele frequency and the effect on organismal fitness are decoupled. Our results suggest that evolutionary studies should employ independent evidence to determine whether loci implicated as drivers of selective sweeps are adaptive or selfish. Although a selfish sweep has clear implications for such experimental populations as the DO and the Collaborative Cross (Didion et al. 2015), the larger evolutionary implications of selfish sweeps are less obvious. On the one hand, sweeps may be relatively rare, as appears to be the case for classic selective sweeps in recent human history (Hernandez et al. 2011). On the other hand, theory and comparative studies indicate that selfish genetic elements may be a potent force during speciation (White 1978; Hedrick 1981; Pardo-Manuel de Villena and Sapienza 2001; Henikoff and Malik 2002; Brandvain and Coop 2011). With the growing appreciation for the potential importance of non-Mendelian genetics in evolution and the increasing tractability of population-scale genetic analyses, we anticipate that the effects of selfish elements such as R2d2 in natural populations, including their contributions to events of positive selection, will soon be elucidated. Improved understanding of the mechanism of meiotic drive at R2d2 may also enable practical applications of selfish genetic elements. As demonstrated by the recent use of RNA-guided genome editing to develop gene drive systems in mosquitos and fruit flies (Esvelt et al. 2014; Gantz and Bier 2015; Hammond et al. 2016), experimental manipulation of chromosome segregation is now feasible. 
R2d2 is an attractive option for the development of a mammalian gene drive system because, as we have shown here, it has already proven capable of driving to fixation in multiple independent genetic backgrounds. Furthermore, there are multiple unlinked modifiers of R2d2 that, when identified, might be exploited for fine-grained manipulation of transmission ratios.

Materials and Methods

Mice

Wild M. m. domesticus was trapped at a large number of sites across Europe and the Americas (fig. 1 [upper panel] and supplementary table S1, Supplementary Material online). A set of 29 Mus musculus castaneus mice trapped in northern India and Taiwan (fig. 1, lower panel) were included as an outgroup (Yang et al. 2011). Trapping was carried out in concordance with local laws and either did not require approval or was carried out with the approval of the relevant regulatory bodies (depending on the locality and institution). All DO mice were bred at The Jackson Laboratory. Litter sizes were counted within 24 h of birth. Individual investigators purchased mice for unrelated studies and contributed either tissue samples or genotype data to this study (supplementary table S2, Supplementary Material online). High running (HR) selection and intercross lines were developed as previously described (Swallow et al. 1998; Kelly, Nehrenberg, Peirce, et al. 2010; Leamy et al. 2012). Mouse tails were archived from three generations of the HR selection lines (−2, +22, and +61) and from every generation of the HR8xC57BL/6J advanced intercross. Progenitors of wild-derived strains have various origins (supplementary methods, Supplementary Material online), and were sent to Eva M. Eicher at The Jackson Laboratory for inbreeding in the early 1980s. Frozen tissues from animals in the founder populations were maintained at The Jackson Laboratory by Muriel Davidson until 2014, when they were transferred to the Pardo-Manuel de Villena laboratory at the University of North Carolina at Chapel Hill. All laboratory mice were handled in accordance with the IACUC protocols of the investigators’ respective institutions.

Genotyping

Microarray Genotyping and Quality Control

Whole-genomic DNA was isolated from tail, liver, muscle, or spleen using Qiagen Gentra Puregene or DNeasy Blood & Tissue kits according to the manufacturer’s instructions. All genome-wide genotyping was performed using the Mouse Universal Genotyping Array (MUGA) and its successor, MegaMUGA (GeneSeek, Lincoln, NE) (Collaborative Cross Consortium 2012; Morgan and Welsh 2015). Genotypes were called using Illumina BeadStudio (Illumina, Inc., Carlsbad, CA). We excluded all markers and all samples with missingness greater than 10%. We also computed the sum intensity for each marker i: S_i = X_i + Y_i, where X_i and Y_i are the normalized hybridization intensities of the two allelic probes. We determined the expected distribution of sum intensity values using a large panel of control samples. We excluded any array for which the set of intensities I = {S_1, S_2, …, S_n} was not normally distributed or whose mean was significantly left shifted from the reference distribution (one-tailed t-test with P < 0.05).

PCR Genotyping

The R2d2 element has been mapped to a 900 kb critical region on Chromosome 2: 83,631,096–84,541,308 (mm9 build), referred to herein as the “candidate interval” (Didion et al. 2015). We designed primers to amplify two regions within the candidate interval. “Primer Set A” targets a 318 bp region (chr2: 83,673,604–83,673,921) with two distinct haplotypes in linkage with either the R2d2(LC) allele or the R2d2(HC) allele: 5′-CCAGCAGTGATGAGTTGCCATCTTG-3′ (forward) and 5′-TGTCACCAAGGTTTTCTTCCAAAGGGAA-3′ (reverse). “Primer Set B” amplifies a 518 bp region (chr2: 83,724,728–83,725,233); the amplicon is predicted, based on whole-genome sequencing, to contain a 169 bp deletion in HR8 relative to the C57BL/6J reference genome: 5′-GAGATTTGGATTTGCCATCAA-3′ (forward) and 5′-GGTCTACAAGGACTAGAAACAG-3′ (reverse). Primers were designed using IDT PrimerQuest (https://www.idtdna.com/Primerquest/Home/Index, last accessed February 24, 2016). Crude whole-genomic DNA for PCR reactions was extracted from mouse tails. The tissues were heated in 100 μl of 25 mM NaOH/0.2 mM ethylenediaminetetraacetic acid at 95 °C for 60 min followed by the addition of 100 μl of 40 mM Tris–HCl. The mixture was then centrifuged at 2,000 × g for 10 min and the supernatant used as polymerase chain reaction (PCR) template. PCR reactions were performed in a 10 μl volume and contained 0.25 mM dNTPs (deoxyribonucleotide triphosphate mixture), 0.3 mM of each primer, and 0.5 U of GoTaq polymerase (Promega). Cycling conditions were 95 °C for 2–5 min; 35 cycles of 95 °C, 55 °C, and 72 °C for 30 s each; and a final extension at 72 °C for 7 min. For Primer Set A, products were sequenced at the University of North Carolina Genome Analysis Facility on an Applied Biosystems 3730XL Genetic Analyzer. Chromatograms were analyzed with the Sequencher software package (Gene Codes Corporation, Ann Arbor, MI). For Primer Set B, products were visualized and scored on 2% agarose gels. 
Assignment to haplotypes was validated by comparing the results to quantitative PCR (qPCR) assays for the single protein-coding gene within R2d2, Cwc22 (see “Copy-number assays” below). For generation +61, haplotypes were assigned based on MegaMUGA genotypes and validated by the normalized per-base read depth from whole-genome sequencing (see below), calculated with samtools mpileup (Li et al. 2009). The concordance between qPCR, read depth, and haplotypes assigned by MegaMUGA or Sanger sequencing is shown in supplementary fig. S5, Supplementary Material online.

Assays

Wild mice were genotyped on MegaMUGA (supplementary table S1, Supplementary Material online). DO mice were genotyped on MUGA and MegaMUGA (supplementary table S2, Supplementary Material online). HR selection lines were genotyped at three generations, one before (−2) and two during (+22 and +61) artificial selection. We genotyped 185 randomly selected individuals from generation −2 and 157 individuals from generation +22 using Primer Set A. An additional 80 individuals from generation +61 were genotyped with the MegaMUGA array (see Microarray Genotyping and Quality Control). The HR8xC57BL/6J advanced intercross line was genotyped with Primer Set B in tissues from breeding stock at generations 3, 5, 8, 9, 10, 11, 12, 13, 14, and 15.

Copy-Number Assays and Assignment of R2d2 Status

Copy number at R2d2 was determined by qPCR for Cwc22, the single protein-coding gene in the R2d repeat unit, as described in detail in Didion et al. (2015). Briefly, we used commercially available TaqMan kits (Life Technologies assay numbers Mm00644079_cn and Mm00053048_cn) to measure the copy number of Cwc22 relative to the reference genes Tfrc (cat. no. 4458366, for target Mm00053048_cn) or Tert (cat. no. 4458368, for target Mm00644079_cn). Cycle thresholds (Ct) were determined for each target using ABI CopyCaller v2.0 software with default settings, and relative cycle threshold was calculated as ΔCt = Ct(reference) − Ct(target). We normalized the ΔCt across batches by fitting a linear mixed model with batch and target-reference pair as random effects. Estimation of integer diploid copy numbers greater than ∼3 by qPCR is infeasible without many technical and biological replicates, especially in the heterozygous state. We took advantage of R2d2 diploid copy-number estimates from whole-genome sequencing for the inbred strains C57BL/6J (0), CAST/EiJ (2), and WSB/EiJ (66), and the (WSB/EiJxC57BL/6J)F1 (33) to establish a threshold for declaring a sample “high copy.” For each of the two TaqMan target-reference pairs, we calculated the sample mean (μ) and standard deviation (σ) of the normalized ΔCt among CAST/EiJ controls and wild M. m. castaneus individuals together. We designated as high copy any individual with normalized ΔCt greater than μ + 2σ, that is, any individual with approximately >95% probability of having diploid copy number >2 at R2d2. Individuals with high copy number and evidence of local heterozygosity (a heterozygous call at any of the 13 markers in the R2d2 candidate interval) were declared heterozygous R2d2(HC/LC), and those with high copy number and no heterozygous calls in the candidate interval were declared homozygous R2d2(HC/HC).
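The high-copy call described above can be illustrated with a minimal sketch. It assumes that the relative cycle threshold is the reference Ct minus the target Ct (so that more target copies give a larger value) and that the cutoff is the control mean plus two sample standard deviations. The function names and the control values are hypothetical, and the batch normalization by linear mixed model is not reproduced here.

```python
from statistics import mean, stdev


def delta_ct(ct_reference, ct_target):
    """Relative cycle threshold; assumed to be reference minus target."""
    return ct_reference - ct_target


def high_copy_threshold(control_delta_cts, n_sd=2.0):
    """Control mean plus n_sd sample standard deviations (n_sd is an assumption)."""
    return mean(control_delta_cts) + n_sd * stdev(control_delta_cts)


def is_high_copy(sample_delta_ct, threshold):
    """Call a sample high copy if its normalized delta-Ct exceeds the threshold."""
    return sample_delta_ct > threshold


# Hypothetical normalized delta-Ct values for two-copy controls
controls = [0.9, 1.0, 1.1, 1.0, 0.95, 1.05]
cutoff = high_copy_threshold(controls)

print(is_high_copy(3.2, cutoff), is_high_copy(1.05, cutoff))
```

Under a normal approximation, a one-sided cutoff two standard deviations above the control mean leaves only a small fraction of true two-copy samples above the line, which is the sense in which a call of "high copy" is conservative.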

Exploration of Population Structure in Wild Mice

Scans for signatures of positive selection based on patterns of haplotype sharing assume that individuals are unrelated. We identified pairs of related individuals using the IBS2* ratio (Stevens et al. 2011), defined as HETHET/(HOMHOM + HETHET), where HETHET and HOMHOM are the count of nonmissing markers for which both individuals are heterozygous (share two alleles) and homozygous for opposite alleles (share zero alleles), respectively. Pairs with IBS2* < 0.75 were considered unrelated. Among individuals who were a member of one or more related pairs, we iteratively removed one sample at a time until no related pairs remained, and additionally excluded markers with minor allele frequency <0.05 or missingness >0.10. The resulting data set contains genotypes for 396 mice at 58,283 markers. Several of our analyses required that samples be assigned to populations. Because mice in the wild breed in localized demes and disperse only over short distances (on the order of hundreds of meters) (Pocock et al. 2005), it is reasonable to delineate populations on the basis of geography. We assigned samples to populations based on the country in which they were trapped. To confirm that these population labels correspond to natural clusters we performed two exploratory analyses of population structure. First, classical multidimensional scaling (MDS) of autosomal genotypes was performed with PLINK (Purcell et al. 2007) (--mds-plot --autosome). The result is presented in figure 1, in which samples are colored by population. Second, we used TreeMix (Pickrell and Pritchard 2012) to generate a population tree allowing for gene flow using the set of unrelated individuals. Autosomal markers were first pruned to reach a set in approximate linkage equilibrium (plink --indep 25 1). TreeMix was run on the resulting set using the M. m. castaneus samples as an outgroup and allowing up to 10 gene flow edges (treemix -root “cas” -k 10) (fig. 1). 
The clustering of samples by population evident by MDS and the absence of long-branch attraction in the population tree together indicate that our choices of population labels are biologically reasonable.
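As a rough illustration of the relatedness filter, the sketch below computes the IBS2* ratio for a pair of samples and then prunes related pairs. Genotypes are assumed to be coded as alternate-allele counts (0, 1, 2) with None for missing calls. The source does not state which member of a related pair was removed first, so the greedy rule used here (drop the sample involved in the most related pairs) is an assumption, as are all function names.

```python
def ibs2_star(geno_a, geno_b):
    """IBS2* = HETHET / (HOMHOM + HETHET) over nonmissing markers."""
    het_het = 0  # both individuals heterozygous (share two alleles)
    hom_hom = 0  # homozygous for opposite alleles (share zero alleles)
    for a, b in zip(geno_a, geno_b):
        if a is None or b is None:
            continue
        if a == 1 and b == 1:
            het_het += 1
        elif {a, b} == {0, 2}:
            hom_hom += 1
    total = het_het + hom_hom
    return het_het / total if total else float("nan")


def prune_related(related_pairs):
    """Greedily remove the sample in the most related pairs until none remain."""
    pairs = {frozenset(p) for p in related_pairs}
    removed = []
    while pairs:
        counts = {}
        for pair in pairs:
            for sample in pair:
                counts[sample] = counts.get(sample, 0) + 1
        # sorted() makes tie-breaking deterministic (alphabetical)
        worst = max(sorted(counts), key=counts.get)
        removed.append(worst)
        pairs = {p for p in pairs if worst not in p}
    return removed


# Hypothetical genotypes for two mice at seven markers
mouse_a = [1, 1, 0, 2, 1, None, 0]
mouse_b = [1, 1, 2, 0, 1, 1, 0]
ratio = ibs2_star(mouse_a, mouse_b)
print(ratio, "unrelated" if ratio < 0.75 else "related")

print(prune_related([("m1", "m2"), ("m2", "m3"), ("m4", "m5")]))
```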

Scans for Selection in Wild Mice

Two complementary statistics, hapFLK (Fariello et al. 2013) and standardized iHS score (Voight et al. 2006), were used to examine wild-mouse genotypes for signatures of selection surrounding R2d2. The hapFLK statistic is a test of differentiation of local haplotype frequencies between hierarchically structured populations. It can be interpreted as a generalization of Wright’s FST which exploits local linkage disequilibrium (LD). Its model for haplotypes is that of fastPHASE (Scheet 2006) and requires a user-specified value for the parameter K, the number of local haplotype clusters. We computed hapFLK in the set of unrelated individuals using M. m. castaneus samples as an outgroup for K = {4, 8, 12, 16, 20, 24, 28, 32} (hapflk --outgroup cas -k {K}) and default settings otherwise. The iHS score (and its allele frequency standardized form |iHS|) is a measure of extended haplotype homozygosity (EHH) on a derived haplotype relative to an ancestral one. For consistency with the hapFLK analysis, we used fastPHASE on the same genotypes over the same range of K with 10 random starts and 25 iterations of expectation maximization (fastphase -K{K} -T10 -C25) to generate phased haplotypes. We then used selscan (Szpiech and Hernandez 2014) to compute iHS scores (selscan --ihs) and standardized the scores in 25 equally sized bins (selscan-norm --bins 25). Values in the upper tail of the genome-wide distribution of hapFLK or |iHS| represent candidates for regions under selection. We used percentile ranks directly and did not attempt to calculate approximate or empirical P values.
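The standardization step can be sketched as follows. This is an illustration of the general idea of standardizing scores within allele-frequency bins, not a reimplementation of selscan. The sketch assumes that bins are equal-width intervals of derived-allele frequency and that scores are centered and scaled by the mean and population standard deviation of their bin. The function name and the example values are hypothetical.

```python
from statistics import mean, pstdev


def standardize_in_bins(scores, freqs, n_bins=25):
    """Center and scale each score using the mean and SD of its frequency bin."""
    bins = {}
    for idx, freq in enumerate(freqs):
        # Map a frequency in [0, 1] to a bin index in [0, n_bins - 1]
        b = min(int(freq * n_bins), n_bins - 1)
        bins.setdefault(b, []).append(idx)

    out = [float("nan")] * len(scores)
    for members in bins.values():
        values = [scores[i] for i in members]
        mu = mean(values)
        sd = pstdev(values)
        for i in members:
            # A bin with no spread carries no information; report 0
            out[i] = (scores[i] - mu) / sd if sd > 0 else 0.0
    return out


# Hypothetical unstandardized scores and derived-allele frequencies
raw_scores = [1.0, 3.0, 10.0, 20.0]
derived_freqs = [0.10, 0.11, 0.50, 0.51]
print(standardize_in_bins(raw_scores, derived_freqs))
```

Binning by frequency matters because the expected length of a haplotype depends on the frequency of the allele it carries, so raw scores are only comparable among markers of similar frequency.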

Detection of IBD in Wild Mice

As an alternative test for selection, we computed density of IBD sharing using the RefinedIBD algorithm of BEAGLE v4.0 (r1399) (Browning BL and Browning SR 2013), applying it to the full set of 500 individuals. The haplotype model implemented in BEAGLE uses a tuning parameter (the “scale” parameter) to control model complexity: larger values enforce a more parsimonious model, increasing sensitivity and decreasing computational cost at the expense of accuracy. The authors recommend a value of 2.0 for ∼1 M SNP arrays in humans. We increased the scale parameter to 5.0 to increase detection power given 1) our much sparser marker set and 2) the relatively weaker local LD in mouse versus human populations (Laurie et al. 2007). We trimmed one marker from the ends of candidate IBD segments to reduce edge effects (java -jar beagle.jar ibd=true ibdscale=5 ibdtrim=1). We retained those IBD segments shared between individuals in the set of 396 unrelated mice. In order to limit noise from false-positive IBD segments, we further removed segments with LOD (logarithm of odds) score < 5.0 or width < 0.5 cM. An empirical IBD sharing score was computed in 500 kb bins with 250 kb overlap. In the score for bin n, the sum in the numerator is taken over all IBD segments overlapping the bin, and s_ij is an indicator variable which takes the value 1 if individuals i,j share a haplotype IBD in bin n and 0 otherwise. The weighting factor w_ij is defined in terms of n_i and n_j, the numbers of unrelated individuals in the populations to which individuals i and j belong, respectively. This weighting scheme accounts for the fact that we oversample some geographic regions (for instance, Portugal and Maryland) relative to others. To explore differences in haplotype sharing within versus between populations, we introduce an additional indicator p_ij. Within-population sharing is computed by setting p_ij = 1 if individuals i,j are drawn from the same population and p_ij = 0 otherwise. 
Between-population sharing is computed by reversing the values of p. The result is displayed in figure 2.

Analysis of Local Sequence Diversity in Whole-Genome Sequence from Wild Mice

We obtained raw sequence reads for 26 unrelated wild mice (European Nucleotide Archive project accession PRJEB9450; Pezer et al. 2015); samples are listed in supplementary table S3, Supplementary Material online. Details of the sequencing protocol are given in the indicated reference. Briefly, paired-end libraries with mean insert size 230 bp were prepared from genomic DNA using the Illumina TruSeq kit. Libraries were sequenced on the Illumina HiSeq 2000 platform with 2 × 100 bp reads to an average coverage of 20× per sample (populations AHZ, CLG, and FRA) or 12× per sample (population HGL). We realigned the raw reads to the mouse reference genome (GRCm38/mm10 build) using BWA MEM (Li and Durbin, unpublished data) with default parameters. SNPs relative to the reference sequence of Chromosome 2 were called using samtools mpileup v0.1.19-44428cd with maximum per-sample depth of 200. Genotype calls with root-mean-square mapping quality <30 or genotype quality <20 were treated as missing. Sites were used for phasing if they had a minor-allele count ≥2 and at most two missing calls. BEAGLE v4.0 (r1399) was used to phase the samples conditional on each other, using 20 iterations for phasing and default settings otherwise (java -jar beagle.jar phasing-its=20). Sites were assigned a genetic position by linear interpolation on the most recent genetic map for the mouse (Liu et al. 2010, 2014). We note that, unlike for humans, a large panel of reference haplotypes does not exist for mice. Using sample haplotypes as templates for phasing results in higher rates of switching errors, especially when the sample size is small. Switching errors introduce bias toward the null hypothesis in EHH- and iHS-type tests, which compare the length of haplotypes linked to the derived versus the ancestral allele at a specific locus (Voight et al. 2006). The R2d2 candidate interval spans positions 83,790,939–84,701,151 in the mm10 reference sequence. 
We used as the R2d2 index SNP the marker with strongest nominal association with R2d2 copy number (as estimated by Pezer et al. 2015) within 1 kb of the proximal boundary of the candidate interval. That SNP is chr2:83,790,275T > C. The C allele is associated with high copy number and is therefore presumed to be the derived allele. We computed the EHH statistic (Sabeti et al. 2002) in the phased data set over a 1 Mb window on each side of the index SNP using selscan (selscan --ehh --ehh-win 1000000). The result is presented in figure 2. Decay of haplotypes away from the index SNP was visualized as a bifurcation diagram (figure 2) using code adapted from the R package rehh (https://cran.r-project.org/package=rehh, last accessed February 24, 2016).

Estimation of Age of R2d2 Alleles in Wild Mice

To obtain a lower bound for the age of R2d2 and its associated haplotype, we used the method from Stephens et al. (1998). Briefly, this method approximates the probability P that a haplotype remains unaffected by recombination or mutation during the G generations since its origin as P = e^(−G(μ + r)), where μ and r are the per-generation rates of mutation and recombination, respectively. Assuming μ ≪ r and taking P′ (the observed proportion of ancestral [nonrecombined] haplotypes in a sample) as an estimator of P, we obtain the following expression for G: G = −ln(P′)/r. We enumerated haplotypes in our sample of 52 chromosomes at three SNPs spanning the R2d2 candidate interval. The most proximal SNP is the index SNP for the EHH analyses (chr2:83,790,275T > C); the most distal SNP is the SNP most associated with copy number within 1 kbp of the boundary of the candidate interval (chr2:84,668,280T > C); and the middle SNP was randomly chosen to fall approximately halfway between (chr2:84,079,970C > T). The three SNPs span genetic distance of 0.154 cM (corresponding to r = 0.00154). The most common haplotype among samples with high copy number according to Pezer et al. (2015) was assumed to be ancestral. Among 52 chromosomes, 22 carried at least part of the R2d2-associated haplotype; of those, 11 were ancestral and 11 recombinant (supplementary table S3, Supplementary Material online), so that P′ = 0.5. This gives an estimated age of 450 generations for R2d2. We note that the approximations underlying this model assume constant population size and neutrality. To the extent that haplotype homozygosity decays more slowly on a positively (or selfishly) selected haplotype, we will underestimate the true age of R2d2.
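The age estimate can be checked with a short calculation. The sketch assumes the exponential-decay form of the approximation, in which the proportion of intact ancestral haplotypes after G generations is e^(−G(μ + r)), so that G = −ln(P′)/(μ + r). The function name is hypothetical, and the mutation rate is set to zero to reflect the stated assumption that μ is much smaller than r.

```python
import math


def haplotype_age(n_ancestral, n_total, r, mu=0.0):
    """Generations since origin, from the proportion of intact haplotypes."""
    p_observed = n_ancestral / n_total
    return -math.log(p_observed) / (r + mu)


# Values reported in the text: 11 of 22 haplotypes ancestral, r = 0.00154
age = haplotype_age(11, 22, r=0.00154)
print(round(age))  # 450
```

Because half of the carrier haplotypes are still intact, the estimate is simply ln(2)/r, the number of generations needed for recombination to break up half of the copies.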

Inference of Local Phylogeny at R2d2

To determine whether the R2d2 haplotype(s) shared among wild mice have a single origin, we constructed a phylogenetic tree from the 39 MegaMUGA SNPs in the region flanking R2d2 (Chromosome 2: 82–85 Mb). We first excluded individuals heterozygous in the region and then constructed a matrix of pairwise distances from the proportion of alleles shared identical-by-state between samples. A tree was inferred from the distance matrix using the neighbor-joining method implemented in the R package ape (http://cran.r-project.org/package=ape, last accessed February 24, 2016).

Haplotype Frequency Estimation in the Diversity Outbred

We inferred the haplotypes of DO individuals using probabilistic methods (Liu et al. 2010, 2014). We combined the haplotypes of DO individuals genotyped in this study with the Generation 8 individuals in Didion et al. (2015). As an additional QC step, we computed the number of historical recombination breakpoints per individual per generation (Svenson et al. 2012) and removed outliers (more than 1.5 standard deviations from the mean). We also excluded related individuals based on the distribution of haplotype sharing between related and unrelated individuals computed from simulations (mean 0.588 ± 0.045 for first-degree relatives; mean 0.395 ± 0.039 for second-degree relatives; and mean 0.229 ± 0.022 for unrelated individuals; see supplementary methods, Supplementary Material online). Finally, we computed in each generation the frequency of each founder haplotype at 250 kb intervals surrounding the R2d2 region (Chromosome 2: 78–86 Mb), and identified the greatest WSB/EiJ haplotype frequency.

Analyses of Fitness Effects of R2d2 in the Diversity Outbred

To assess the consequences of R2d2 for organismal fitness, we treated litter size as a proxy for absolute fitness. Using breeding data from 475 females from DO generations 13, 16, 18 and 19, we estimated mean litter size in four genotype groups: R2d2(LC/LC) homozygous females; R2d2(LC/HC) heterozygous females with transmission ratio distortion (TRD) in favor of the R2d2(HC) allele; R2d2(LC/HC) heterozygous females without TRD; and R2d2(HC/HC) homozygous females. The 126 heterozygous females were originally reported in Didion et al. (2015). Group means were estimated using a linear mixed model with parity and genotype as fixed effects and a random effect for each female using the lme4 package for R. Confidence intervals were obtained by likelihood profiling and post hoc comparisons were performed via F-tests, using the Kenward–Roger approximation for the effective degrees of freedom. The mean number of R2d2(HC) alleles transmitted per litter by heterozygous females with and without TRD was estimated from the data in Didion et al. (2015) with a weighted linear model, using the total number of offspring per female as weights. Litter sizes are presented in supplementary table S2, Supplementary Material online and estimates of group mean litter sizes in figure 3.

Whole-Genome Sequencing of HR Selection Lines

Ten individuals from generation +61 of each of the eight HR selection lines were subject to whole-genome sequencing. Briefly, high-molecular-weight genomic DNA was extracted using a standard phenol/chloroform procedure. Illumina TruSeq libraries were constructed using 0.5 μg starting material, with fragment sizes between 300 and 500 bp. Each library was sequenced on one lane of an Illumina HiSeq2000 flow cell in a single 2 × 100 bp paired-end run.

Null Simulations of Closed Breeding Populations

Widespread fixation of alleles due to drift is expected in small, closed populations such as the HR lines or the HR8xC57BL/6J advanced intercross line. But even in these scenarios, an allele under positive selection is expected to fix 1) more often than expected by drift alone in repeated breeding experiments using the same genetic backgrounds and 2) more rapidly than expected by drift alone. We used the R package simcross (https://github.com/kbroman/simcross, last accessed February 26, 2016) to obtain the null distribution of fixation times and fixation probabilities for an HR line under Mendelian transmission. We assume that the artificial selection applied for voluntary exercise in the HR lines (described in Swallow et al. 1998) was independent of R2d2 genotype. This assumption is justified for two reasons. First, 3 of the 4 selection lines and 2 of the 4 control (unselected) lines fixed R2d2. Second, at generations 4 and 10 of the HR8xC57BL/6J advanced intercross, no quantitative trait loci (QTL) associated with the selection criteria (total distance run on days 5 and 6 of a 6-day trial) were found on Chromosome 2. QTL for peak and average running speed were identified at positions linked to R2d2; however, HR8 alleles at those QTL were associated with decreased, not increased, running speed (Kelly, Nehrenberg, Peirce, et al. 2010; Leamy et al. 2012). Without artificial selection, an HR line reduces to an advanced intercross line maintained by avoidance of sibling mating. We therefore simulated 1,000 replicates of an advanced intercross with 10 breeding pairs and initial focal allele frequency of 0.75. Trajectories were followed until the focal allele was fixed or lost. As a validation, we confirmed that the focal allele was fixed in 754 of 1,000 runs, which is not different from the expected 750 (P = 0.62, binomial test). Simulated trajectories and the distribution of sojourn times are presented in figure 4. 
The HR8xC57BL/6J advanced intercross line was simulated as a standard biparental AIL with initial focal allele frequency of 0.5. Again, 1,000 replicates of an AIL with 20 breeding pairs were simulated and trajectories were followed until the focal allele was fixed or lost. The result is presented in figure 4.

Investigation of Population Dynamics of Meiotic Drive

We used two approaches to investigate the population dynamics of a female-limited meiotic drive system with selection against the heterozygote. First, we evaluated the fixation probability of a driving allele in relationship to transmission ratio (m), selection coefficient against the heterozygote (s), and population size (N) by modeling the population as a discrete-time Markov chain whose states are possible counts of the driving allele. Following Hedrick (1981), we write p_(t+1) for the expected frequency of the driving allele in generation t + 1 given its frequency in the previous generation (p_t). In an infinite population, the equilibrium behavior of the system is governed by a quantity q that depends on m and s. When q > 1, the driving allele always increases in frequency. For values of q ≈ 1 and smaller, the driving allele is either lost or reaches an unstable equilibrium frequency determined by m and s. Let M be the matrix of transition probabilities for the Markov chain with 2N + 1 states corresponding to possible counts of the driving allele in the population (0, …, 2N). The entry m_ij of M is the probability of moving from i to j copies of the driving allele in one generation. Given a vector p of starting probabilities, the probability distribution at generation t is obtained by iterated multiplication by M. We initiated the chain with a single copy of the driving allele (i.e., with all starting probability on the state with allele count 1). Because this Markov chain has absorbing states (namely allele counts 0 and 2N), we approximated steady-state probabilities by iterating the chain until the change in probabilities between successive generations was <10^−4. Fixation probability is given by the value of the entry for allele count 2N at convergence. We evaluated all possible combinations of m (in steps of 0.1) and s (in steps of 0.05). To investigate the effects of modifier loci on the frequency trajectory of a driving allele, we implemented in Python forward-in-time simulations under a Wright–Fisher model with selection. Simulations assumed a constant population size of 2N = 200 chromosomes, each 100 cM long, with balanced sex ratio. 
At the beginning of each run a driving allele was introduced (at 50 cM) on a single, randomly chosen chromosome. Modifier alleles were introduced into the population independently at a specified frequency, at position 0.5 cM (i.e., unlinked to the driving allele). To draw the next generation, an equal number of male and female parents were selected (with replacement) from the previous generation according to their fitness. Among females heterozygous for the driving allele, transmission ratio (m) was calculated according to genotype at the modifier loci (if any). For males and homozygous females, m = 0.5. Individuals were assigned a relative fitness of 1 if m = 0.5 and 0.8 if m > 0.5. Recombination was simulated under the Haldane model (i.e., a Poisson process along chromosomes with no crossover interference). Finally, for each individual in the next generation, one chromosome was randomly chosen from each parent with probability m. Simulation runs were restarted when the driving allele was fixed or lost, until 100 fixation events were observed in each condition of interest. Probability of fixation was estimated using the waiting time before each fixation event, assuming a geometric distribution of waiting times, using the fitdistr() function in the R package MASS. Simulations are summarized in figure 5.
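The Markov-chain approach described at the start of this section can be sketched in a few lines of Python. This is an illustration only and is not the authors' code. The recursion for the expected allele frequency is an assumption of the sketch: it is derived here for random mating with drive in females only, heterozygote fitness 1 − s in both sexes, and equal allele frequencies in the two sexes, and it may differ from the expression in Hedrick (1981). Binomial (Wright–Fisher) sampling of 2N gametes is likewise assumed, and all function names are hypothetical.

```python
import math


def expected_next_freq(p, m, s):
    """Expected driver frequency after one generation (assumed recursion)."""
    # Heterozygotes transmit the driver at rate m in females and 0.5 in males,
    # so the sex-averaged contribution per heterozygote is (m + 0.5) / 2.
    numerator = p * p + p * (1 - p) * (1 - s) * (m + 0.5)
    mean_fitness = 1 - 2 * s * p * (1 - p)
    return numerator / mean_fitness


def transition_matrix(n_diploid, m, s):
    """Binomial transition probabilities between allele counts 0..2N."""
    size = 2 * n_diploid
    rows = []
    for i in range(size + 1):
        q = expected_next_freq(i / size, m, s)
        rows.append([math.comb(size, j) * q**j * (1 - q) ** (size - j)
                     for j in range(size + 1)])
    return rows


def fixation_probability(n_diploid, m, s, tol=1e-4, max_iter=100000):
    """Iterate the chain from a single copy until probabilities stop changing."""
    size = 2 * n_diploid
    matrix = transition_matrix(n_diploid, m, s)
    probs = [0.0] * (size + 1)
    probs[1] = 1.0  # start with one copy of the driving allele
    for _ in range(max_iter):
        new = [sum(probs[i] * matrix[i][j] for i in range(size + 1))
               for j in range(size + 1)]
        change = max(abs(a - b) for a, b in zip(new, probs))
        probs = new
        if change < tol:
            break
    return probs[size]  # probability mass on allele count 2N


# Small population so the example runs quickly
print(fixation_probability(10, m=0.8, s=0.2))
print(fixation_probability(10, m=0.5, s=0.0))  # close to the neutral 1/(2N)
```

Under these assumptions a rare driver changes in frequency by a factor of roughly (1 − s)(m + 0.5) per generation, which gives one plausible reading of the threshold behavior described in the text. With Mendelian transmission and no selection the chain recovers the neutral fixation probability of 1/(2N), which serves as a sanity check on the implementation.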

Data Availability

All data are made available at http://csbio.unc.edu/r2d2/ (last accessed February 24, 2016). Simulation code is available at: https://github.com/andrewparkermorgan/r2d2-selfish-sweep (last accessed February 24, 2016).

Supplementary Material

Supplementary tables S1–S3, figures S1–S5, and methods are available at Molecular Biology and Evolution online (http://www.mbe.oxfordjournals.org/).
